🪛 Bandwidth-Aware Federated Reinforcement Learning

Literature Review

Rahmati and Rahmati [1] addressed a limitation shared by most federated RL frameworks: the assumption that communication channels between agents and the aggregation server have stable, sufficient bandwidth. In practice, multi-agent autonomous systems operating in the field face fluctuating network capacity, latency spikes, and intermittent connectivity.

Standard federated RL frameworks that ignore these constraints produce synchronisation delays that degrade real-time performance. BA-FRL (Bandwidth-Aware Federated Reinforcement Learning) tackles this by integrating two mechanisms:

  • an adaptive synchronisation strategy that dynamically modulates update frequency and model-sharing volume based on real-time network conditions and environmental volatility, and
  • gradient sparsification that compresses transmitted updates to reduce bandwidth consumption.
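
To make the second mechanism concrete, the following is a minimal sketch of gradient sparsification. The source does not state which sparsification rule BA-FRL uses, so top-k magnitude selection is assumed here as one common choice; the function names `sparsify_top_k` and `densify` and the sample gradient are hypothetical.

```python
from typing import Dict, List


def sparsify_top_k(grad: List[float], k: int) -> Dict[int, float]:
    """Keep the k largest-magnitude entries of a gradient as {index: value}.

    Assumption: top-k selection stands in for whatever rule BA-FRL applies.
    """
    if k <= 0:
        return {}
    # Rank indices by absolute value, largest first; ties go to the lower index.
    ranked = sorted(range(len(grad)), key=lambda i: (-abs(grad[i]), i))
    # Only these (index, value) pairs would be transmitted.
    return {i: grad[i] for i in sorted(ranked[:k])}


def densify(sparse: Dict[int, float], size: int) -> List[float]:
    """Rebuild a dense vector on the receiver; dropped entries become 0.0."""
    dense = [0.0] * size
    for i, value in sparse.items():
        dense[i] = value
    return dense


grad = [0.02, -1.5, 0.3, 0.0, 0.9, -0.05]   # illustrative local gradient
sparse = sparsify_top_k(grad, k=2)           # 2 of 6 entries are sent
restored = densify(sparse, len(grad))
print(sparse)
print(restored)
```

In this sketch the agent sends two index-value pairs instead of six values. An adaptive scheme could choose `k` from the measured bandwidth, which is where the two mechanisms would meet.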

The BA-FRL framework is directly relevant to the communication-constrained operation of HFMARL. At both the intra-cluster and inter-cluster levels, bandwidth availability is neither guaranteed nor static; cross-exchange latency in financial systems is one example. BA-FRL's adaptive synchronisation principle maps onto HFMARL's per-agent communication triggers: rather than communicating at fixed intervals ($k \bmod U = 0$), agents modulate their participation based on measured channel quality, complementing the staleness-bounded aggregation from AFedPG [2] and the buffered asynchronous approach from FedBuff [3]. The gradient sparsification component also parallels RSM-MASAC's segment-based parameter exchange [4], where agents transmit parameter segments rather than full models to reduce bandwidth requirements.
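
The contrast between a fixed-interval rule (communicate every U steps) and a channel-aware trigger can be sketched as follows. This is an illustration under stated assumptions, not the trigger defined by BA-FRL or HFMARL: the bandwidth threshold, the staleness cap, the function names, and the bandwidth trace are all hypothetical.

```python
from typing import List


def fixed_trigger(step: int, interval: int) -> bool:
    """Fixed-interval rule: communicate when step mod U == 0."""
    return step % interval == 0


def adaptive_trigger(step: int, last_sync: int, bandwidth: float,
                     min_bandwidth: float, max_staleness: int) -> bool:
    """Hypothetical channel-aware rule.

    Communicate when the measured bandwidth is good enough, or when the
    agent has gone max_staleness steps without synchronising.
    """
    if step - last_sync >= max_staleness:
        return True  # staleness cap reached, sync regardless of channel
    return bandwidth >= min_bandwidth


def run_adaptive(trace: List[float], min_bandwidth: float,
                 max_staleness: int) -> List[int]:
    """Return the steps (1-indexed) at which the agent communicates."""
    last_sync, synced = 0, []
    for step, bandwidth in enumerate(trace, start=1):
        if adaptive_trigger(step, last_sync, bandwidth,
                            min_bandwidth, max_staleness):
            synced.append(step)
            last_sync = step
    return synced


# Illustrative bandwidth measurements (arbitrary units), one per step.
trace = [5.0, 1.0, 1.0, 8.0, 1.0, 1.0, 1.0, 1.0, 9.0]
fixed_steps = [s for s in range(1, len(trace) + 1) if fixed_trigger(s, 3)]
adaptive_steps = run_adaptive(trace, min_bandwidth=4.0, max_staleness=4)
print(fixed_steps)     # steps chosen by the fixed-interval rule
print(adaptive_steps)  # steps chosen by the channel-aware rule
```

With this trace the fixed rule communicates at steps 3, 6, and 9, two of which fall on poor-channel steps, while the adaptive rule communicates at steps 1, 4, 8, and 9: three good-channel steps plus one sync forced by the staleness cap.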

However, BA-FRL assumes a centralised server architecture and homogeneous agents. Its extension to decentralised, heterogeneous-algorithm federations remains open, and this is the gap the HFMARL framework addresses.

References

[1] M. Rahmati and N. Rahmati, "Bandwidth-Aware Federated Reinforcement Learning for Real-Time Multi-Agent Autonomous Systems under Dynamic Environments," Journal of Electrical Systems and Information Technology, vol. 13, art. 15, 2026. doi:10.1186/s43067-026-00323-3.

[2] G. Lan, D.-J. Han, A. Hashemi, V. Aggarwal, and C. G. Brinton, "Asynchronous federated reinforcement learning with policy gradient updates: Algorithm design and convergence analysis," in Proc. Int. Conf. Learn. Representations (ICLR), 2025, arXiv:2404.08003.

[3] J. Nguyen, K. Malik, H. Zhan, A. Yousefpour, M. Rabbat, M. Malek, and D. Huba, "Federated learning with buffered asynchronous aggregation," in Proc. Int. Conf. Artif. Intell. Statist. (AISTATS), 2022, arXiv:2106.06639.

[4] X. Yu, R. Li, C. Liang, and Z. Zhao, "Communication-efficient soft actor-critic policy collaboration via regulated segment mixture," arXiv:2312.10123, 2024.