TY - JOUR
T1 - Efficient and scalable reinforcement learning for large-scale network control
AU - Ma, Chengdong
AU - Li, Aming
AU - Du, Yali
AU - Dong, Hao
AU - Yang, Yaodong
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/9/3
Y1 - 2024/9/3
AB - The primary challenge in the development of large-scale artificial intelligence (AI) systems lies in achieving scalable decision-making—extending the AI models while maintaining sufficient performance. Existing research indicates that distributed AI can improve scalability by decomposing complex tasks and distributing them across collaborative nodes. However, previous technologies suffered from compromised real-world applicability and scalability due to the massive requirement of communication and sampled data. Here we develop a model-based decentralized policy optimization framework, which can be efficiently deployed in multi-agent systems. By leveraging local observation through the agent-level topological decoupling of global dynamics, we prove that this decentralized mechanism achieves accurate estimations of global information. Importantly, we further introduce model learning to reinforce the optimal policy for monotonic improvement with a limited amount of sampled data. Empirical results on diverse scenarios show the superior scalability of our approach, particularly in real-world systems with hundreds of agents, thereby paving the way for scaling up AI systems.
UR - http://www.scopus.com/inward/record.url?scp=85203020555&partnerID=8YFLogxK
U2 - 10.1038/s42256-024-00879-7
DO - 10.1038/s42256-024-00879-7
M3 - Article
SN - 2522-5839
VL - 6
SP - 1006
EP - 1020
JO - Nature Machine Intelligence
JF - Nature Machine Intelligence
IS - 9
ER -