🪨 Buffered Asynchronous Federated Aggregation
Literature Review
Nguyen et al. [1] proposed FedBuff, a buffered asynchronous aggregation method for federated learning (FL) that sits between synchronous and fully asynchronous FL. The core observation motivating FedBuff is that synchronous FL cannot scale efficiently beyond a few hundred clients training in parallel:
- increasing the number of concurrent clients yields diminishing returns in model performance and training speed, analogous to large-batch training in centralised settings.
Fully asynchronous FL resolves this scalability issue but introduces a different problem:
- aggregating individual client updates one at a time is incompatible with Secure Aggregation protocols, undermining privacy guarantees.
FedBuff addresses both concerns by buffering asynchronous client updates at the server. Clients train independently and submit updates whenever they finish local computation. The server collects these updates into a buffer of size $K$. Once $K$ updates have been received (from any combination of clients), the server aggregates them in a single batch and applies the result to the global model.
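The buffering step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation from [1]: the class name `BufferedServer`, the `server_lr` parameter, the plain unweighted mean, and the convention that an update is "local weights minus the weights the client started from" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BufferedServer:
    """Toy server that changes the global model only once K updates are buffered."""

    model: List[float]
    buffer_size: int            # K in the text
    server_lr: float = 1.0      # assumed server-side step size (hypothetical)
    buffer: List[List[float]] = field(default_factory=list)
    rounds_applied: int = 0

    def receive(self, delta: List[float]) -> bool:
        """Store one client update; return True if the global model was updated."""
        if len(delta) != len(self.model):
            raise ValueError("update length does not match the model")
        self.buffer.append(list(delta))
        if len(self.buffer) < self.buffer_size:
            return False  # keep waiting; no client is blocked in the meantime

        # Buffer is full: average the K updates coordinate-wise (unweighted).
        k = len(self.buffer)
        mean_delta = [sum(column) / k for column in zip(*self.buffer)]
        self.model = [w + self.server_lr * d for w, d in zip(self.model, mean_delta)]
        self.buffer.clear()
        self.rounds_applied += 1
        return True


# Usage: with K = 3, only the third arrival triggers an aggregation.
server = BufferedServer(model=[0.0, 0.0], buffer_size=3)
arrivals = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
flags = [server.receive(update) for update in arrivals]
print(flags)         # [False, False, True, False]
print(server.model)  # [1.0, 1.0]
```

The fourth update stays in the buffer and counts towards the next aggregation. The server never inspects which client an update came from, which is what allows the same client to contribute more than once per buffer.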
This design achieves three properties simultaneously:
- stragglers do not block training, since faster clients contribute more frequently;
- Secure Aggregation remains compatible, since updates are batched rather than processed individually; and
- differential privacy can be applied to the buffered aggregate.
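As an illustration of the last property, one common way to add differential privacy to an aggregate is to clip each update to a fixed L2 norm and add Gaussian noise to the clipped sum. The sketch below assumes that recipe; the source does not state which mechanism is used, the names `clip_norm` and `noise_multiplier` are hypothetical, and no privacy budget is computed here.

```python
import math
import random
from typing import List, Sequence


def clip_update(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale an update down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def noisy_buffer_mean(
    buffer: Sequence[Sequence[float]],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip every buffered update, sum them, add Gaussian noise, then average."""
    if not buffer:
        raise ValueError("buffer is empty")
    clipped = [clip_update(u, clip_norm) for u in buffer]
    k = len(clipped)
    # Noise scale is tied to the clipping bound (assumed Gaussian mechanism).
    sigma = noise_multiplier * clip_norm
    summed = [sum(column) for column in zip(*clipped)]
    return [(s + rng.gauss(0.0, sigma)) / k for s in summed]


# Usage: a seeded generator keeps the example reproducible.
rng = random.Random(0)
buffered = [[3.0, 4.0], [0.3, 0.4]]
private_mean = noisy_buffer_mean(buffered, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Because the noise is added once per buffer rather than once per update, the function only needs the clipped sum, which fits the batched processing described above.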
For the HFMARL framework, FedBuff serves as the baseline aggregation strategy at the intra-cluster level. However, FedBuff assumes a centralised server and homogeneous model architectures across clients. The inter-cluster level, which operates peer-to-peer between cluster heads running different algorithms, requires a decentralised aggregation strategy beyond FedBuff's scope.
References
[1] J. Nguyen, K. Malik, H. Zhan, A. Yousefpour, M. Rabbat, M. Malek, and D. Huba, "Federated learning with buffered asynchronous aggregation," in Proc. Int. Conf. Artif. Intell. Statist. (AISTATS), 2022, arXiv:2106.06639.