MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang; 24(150):1−12, 2023.

Abstract

Population-based multi-agent reinforcement learning (PB-MARL) combines dynamic population selection with multi-agent reinforcement learning (MARL) algorithms. Although PB-MARL has shown significant progress in complex multi-agent tasks, its sequential execution suffers from low computational efficiency due to its diverse computing patterns and policy combinations. To address this issue, we propose a solution that uses a stateless central task dispatcher and stateful workers to handle PB-MARL's subroutines, leveraging parallelism across the different components for efficient problem-solving. To implement this approach, we introduce MALib, a parallel framework that includes a task control model, independent data servers, and an abstraction of MARL training paradigms. The framework has been extensively tested and is available under the MIT license (https://github.com/sjtu-marl/malib).
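
To make the stateless-dispatcher/stateful-worker idea concrete, below is a minimal, hypothetical sketch of that pattern in plain Python. The names (`TaskDispatcher`, `RolloutWorker`, `Task`) and the threading-based implementation are illustrative assumptions and are not MALib's actual API; the real framework parallelizes across processes and machines.

```python
# Hypothetical sketch: a stateless dispatcher routing tasks to stateful workers.
# Names and structure are illustrative only, not MALib's API.
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Task:
    """One unit of work, e.g. a rollout for a particular policy combination."""
    task_id: int
    payload: dict = field(default_factory=dict)


class RolloutWorker(threading.Thread):
    """Stateful worker: keeps local state (e.g. env/policy handles) across tasks."""

    def __init__(self, worker_id, task_queue, result_queue):
        super().__init__(daemon=True)
        self.worker_id = worker_id
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.tasks_done = 0  # local state that persists between dispatched tasks

    def run(self):
        while True:
            task = self.task_queue.get()
            if task is None:  # poison pill -> shut down
                break
            self.tasks_done += 1
            # In a real system this would run simulations or gradient updates.
            self.result_queue.put((self.worker_id, task.task_id, self.tasks_done))


class TaskDispatcher:
    """Stateless dispatcher: only routes tasks and collects results."""

    def __init__(self, num_workers):
        self.task_queue = queue.Queue()
        self.result_queue = queue.Queue()
        self.workers = [
            RolloutWorker(i, self.task_queue, self.result_queue)
            for i in range(num_workers)
        ]
        for w in self.workers:
            w.start()

    def dispatch(self, tasks):
        for t in tasks:
            self.task_queue.put(t)
        return [self.result_queue.get() for _ in tasks]

    def shutdown(self):
        for _ in self.workers:
            self.task_queue.put(None)


if __name__ == "__main__":
    dispatcher = TaskDispatcher(num_workers=4)
    print(dispatcher.dispatch([Task(i) for i in range(8)]))
    dispatcher.shutdown()
```

Keeping the dispatcher stateless means it can be restarted or replicated freely, while the workers retain the expensive state (environments, policy parameters) between tasks.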
