A success-history based learning procedure to optimize server throughput in large distributed control systems

Abstract

Large distributed control systems can typically be modeled by a hierarchical structure with two physical layers: Console Level Computers (CLCs) and Front End Computers (FECs). The control system of the Relativistic Heavy Ion Collider (RHIC) at Brookhaven consists of more than 500 FECs, each acting as a server that provides services to a potentially unlimited number of clients. This can create a bottleneck, as heavy traffic can slow down or even crash a server, making it momentarily unresponsive. In this paper, we consider this problem from a game-theoretic perspective, focusing on the case where the server capacity varies over time. First, we model the problem as an integer programming problem. Second, we adopt a regret-based procedure as a baseline solution and then propose a success-history based scheme to better accommodate the dynamic server capacity. Finally, simulation results show that both algorithms perform well and significantly improve system performance. Moreover, compared with the regret-based procedure, the proposed success-history based scheme achieves higher server throughput and a lower crash probability in the dynamic environment.

Publication
In 16th International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS)
Jing Chen 陈婧
Professor