🔧 Briefly, why is MARL so difficult?

🪨 So far, MARL has succeeded predominantly in simulation-only environments.

  • The sim-to-real gap hits MARL particularly hard, because collecting the right type of data in situ remains difficult [1].
    • Labelled data is expensive: for example, collecting underwater sonar data is limited by overdrifting, noise, and the presence of shadows.

🪨 Deciding how and when agents should communicate is a major issue.

  • However, sharing knowledge through parameters and transfer learning is important for overcoming the scalability issues in MARL [2]. Examples include:
    • mean-field approaches
    • curriculum learning
    • decentralized training with decentralized execution (DTDE) using deep Q-networks (DQNs)
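
As a rough illustration of sharing knowledge through parameters, the sketch below lets several agents read from and write to a single tabular Q-function. It uses a lookup table in place of a deep Q-network to stay short. The class name `SharedQTable`, the observation labels, and the hyperparameters `alpha` and `gamma` are all hypothetical choices made for this example, not taken from [2].

```python
from collections import defaultdict


class SharedQTable:
    """One Q-table used by every agent (hypothetical illustration)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # Unseen observations start with all-zero action values.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def greedy_action(self, obs):
        """Return the action with the highest current value for obs."""
        values = self.q[obs]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, obs, action, reward, next_obs):
        """Standard one-step Q-learning update on the shared table."""
        target = reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])


shared = SharedQTable(n_actions=2)

# Two agents experience the same kind of transition and both update
# the same table, so each one benefits from the other's experience.
shared.update(obs="near_goal", action=1, reward=1.0, next_obs="goal")  # agent 0
shared.update(obs="near_goal", action=1, reward=1.0, next_obs="goal")  # agent 1
```

Because the table does not grow with the number of agents, adding agents adds experience rather than parameters. This only makes sense when agents are similar enough to use the same policy.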

🪨 Generalization is difficult across scenarios and environments.

  • In gaming, a pretrained model that does well with its known set of agents often fails when a human player interacts with it.

🪨 Communication constraints, such as bandwidth and latency, need to be addressed early when designing the architecture.
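
One way to surface these constraints early is to route all inter-agent messages through a simulated channel during training. The sketch below is a minimal example under simple assumptions: latency is a fixed number of environment steps, and bandwidth is a cap on messages accepted per step. `ConstrainedChannel` and its parameters are hypothetical names introduced for illustration.

```python
from collections import deque


class ConstrainedChannel:
    """Toy channel with fixed latency and a per-step message cap (hypothetical)."""

    def __init__(self, latency_steps, max_msgs_per_step):
        self.latency_steps = latency_steps
        self.max_msgs_per_step = max_msgs_per_step
        self.in_flight = deque()  # entries are (deliver_at_step, message)
        self.step_count = 0
        self.sent_this_step = 0

    def send(self, message):
        """Queue a message; return False if the bandwidth cap is reached."""
        if self.sent_this_step >= self.max_msgs_per_step:
            return False  # message is dropped
        self.in_flight.append((self.step_count + self.latency_steps, message))
        self.sent_this_step += 1
        return True

    def step(self):
        """Advance one time step and return the messages that arrive now."""
        self.step_count += 1
        self.sent_this_step = 0
        delivered = []
        # Latency is constant, so the queue is already ordered by arrival time.
        while self.in_flight and self.in_flight[0][0] <= self.step_count:
            delivered.append(self.in_flight.popleft()[1])
        return delivered


channel = ConstrainedChannel(latency_steps=2, max_msgs_per_step=1)
first_accepted = channel.send("pose:agent0")   # accepted
second_accepted = channel.send("pose:agent1")  # dropped, cap of 1 per step
after_one_step = channel.step()                # nothing has arrived yet
after_two_steps = channel.step()               # the first message arrives
```

Agents trained against a channel like this have to cope with stale and missing messages from the start, rather than discovering the problem at deployment.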

🪨 Data heterogeneity is hard to understand and quantify, and algorithms that properly handle it still need to be developed [3].

References

[1] A. Labiosa and J. P. Hanna, "Multi-robot collaboration through reinforcement learning and abstract simulation," arXiv:2503.05092, 2025.

[2] Z. Ning and L. Xie, "A survey on multi-agent reinforcement learning and its application," J. Autom. Intell., vol. 3, no. 2, pp. 73–91, 2024, doi: 10.1016/j.jai.2024.02.003.

[3] T. Hu, Z. Pu, Y. Wang, T. Qiu, M. Chen, and X. Yu, "Heterogeneity in multi-agent reinforcement learning," in Proc. Int. Conf. Auton. Agents Multiagent Syst. (AAMAS), 2026, arXiv:2512.22941.