Abstract
Federated learning (FL) relies on the frequent exchange of model parameters between clients and an aggregator to achieve efficient model convergence. Network latency, however, poses a significant challenge, particularly in congested edge/IoT scenarios, and hinders the efficiency and effectiveness of distributed machine learning (ML). Existing solutions often depend on hard-coded topologies, yet addressing this challenge is critical to unlocking FL's full potential in real-world deployments. This paper proposes a novel approach to mitigating network latency through three complementary mechanisms: latency-aware client selection, latency-aware aggregator assignment, and consistent replication of training progress. Our proof of concept offers a scalable and robust solution that alleviates the impact of latency and improves the efficiency of distributed ML operations. Through this research, we aim to advance the field of FL with practical solutions that enhance performance and resilience in latency-sensitive environments.