
Asynchronous Federated Learning

Handling the complexity of Asynchronous Federated Learning in cyber-physical systems.



Master Project

Asynchronous Federated Learning is a variant of Federated Learning in which updates from participating devices or nodes do not have to be collected in lockstep. In traditional (synchronous) Federated Learning, the server typically waits for the updates of all devices selected for a round before aggregating them into a new global model, so the slowest device determines the pace of every round. Asynchronous Federated Learning relaxes this requirement: each device can submit its update to the global model independently and at its own pace.
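The timing difference can be illustrated with a small simulation. This is a minimal sketch, not part of the project description: the function names and the client training durations are hypothetical, and communication time is assumed to be zero.

```python
import heapq


def synchronous_schedule(durations, rounds):
    """Times at which the global model changes when every round
    waits for the slowest client (assumed: all clients join each round)."""
    round_time = max(durations.values())
    return [round_time * (r + 1) for r in range(rounds)]


def asynchronous_schedule(durations, horizon):
    """(time, client) pairs for every update that arrives up to `horizon`
    when each client retrains and submits as soon as it finishes."""
    events = [(d, name) for name, d in durations.items()]
    heapq.heapify(events)
    arrivals = []
    while events:
        t, name = heapq.heappop(events)
        if t > horizon:
            break  # the heap is time-ordered, so nothing earlier remains
        arrivals.append((t, name))
        # The client immediately starts its next local training pass.
        heapq.heappush(events, (t + durations[name], name))
    return arrivals


# Hypothetical training durations (time units per local training pass).
durations = {"fast": 1, "medium": 2, "slow": 5}
sync_updates = synchronous_schedule(durations, rounds=2)
async_updates = asynchronous_schedule(durations, horizon=5)
print(sync_updates)        # times of the synchronous global updates
print(len(async_updates))  # number of asynchronous updates within the horizon
```

In this toy setting the synchronous server produces one global update per slowest-client pass, while the asynchronous server receives many more updates in the same time, most of them from the fast client. That imbalance is one reason asynchronous schemes need to account for stale and unevenly distributed contributions.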

This type of Federated Learning is particularly suitable for cyber-physical scenarios involving live sensor data, where strict synchronization is impractical due to varying data availability, network conditions, or computational resources. It requires specialized algorithms and techniques to handle the asynchronous nature of updates while ensuring model convergence and consistency.
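One family of techniques from the literature down-weights an update according to how outdated the model it was trained on has become. The following is a minimal sketch of such staleness-weighted mixing, assuming model weights are flat lists of floats; the function names, `base_rate`, and `exponent` are illustrative choices, not values prescribed by this project.

```python
def staleness_weight(staleness, base_rate=0.5, exponent=0.5):
    """Mixing rate that shrinks polynomially as the update gets staler.

    `staleness` is the number of global updates applied since the client
    downloaded the model it trained on (0 means fully fresh).
    """
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base_rate / (1.0 + staleness) ** exponent


def apply_async_update(global_weights, client_weights, staleness,
                       base_rate=0.5, exponent=0.5):
    """Blend one client's model into the global model as soon as it arrives."""
    alpha = staleness_weight(staleness, base_rate, exponent)
    return [(1.0 - alpha) * g + alpha * c
            for g, c in zip(global_weights, client_weights)]


global_model = [0.0, 0.0]
client_model = [1.0, 2.0]
fresh = apply_async_update(global_model, client_model, staleness=0)
stale = apply_async_update(global_model, client_model, staleness=3)
print(fresh)  # fresh update moves the global model further
print(stale)  # stale update has less influence
```

The design choice here is that no update is discarded outright; a stale contribution still counts, but with a smaller weight, which limits the damage an outdated model can do to convergence.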

Research Topic focus

Asynchronous Federated Learning is still far from being an established, market-ready technology. This MSc project will investigate possible approaches to one or more of the following challenges:

  • Consistency: Ensuring model consistency across devices is more challenging in asynchronous settings. Some devices may have stale models, while others have more recent updates, potentially leading to convergence issues.
  • Communication overhead: Asynchronous updates may increase communication overhead, because each device contacts the central server whenever it has an update ready rather than once per coordinated round. This can result in more frequent, smaller transfers and less efficient use of the network.
  • Algorithm complexity: Asynchronous Federated Learning algorithms tend to be more complex than their synchronous counterparts, as they must handle the asynchronous nature of updates.
  • Concurrency control: To maintain consistency and avoid conflicting updates, asynchronous Federated Learning algorithms often require sophisticated concurrency control mechanisms, such as parameter-server-based architectures or model versioning.
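The consistency and concurrency-control challenges above can be made concrete with a small versioned model store. This is a sketch under stated assumptions, not a reference design: the class name `VersionedModelStore`, the `max_staleness` threshold, and the policy of rejecting overly stale updates are all hypothetical choices made for illustration.

```python
import threading


class VersionedModelStore:
    """Toy parameter store: a lock serializes writes, and a version
    counter lets the server measure how stale each update is."""

    def __init__(self, weights, max_staleness=4):
        self._weights = list(weights)
        self._version = 0
        self._max_staleness = max_staleness
        self._lock = threading.Lock()

    def pull(self):
        """Return (version, copy of weights) for a client to train on."""
        with self._lock:
            return self._version, list(self._weights)

    def push(self, base_version, delta, mixing_rate=0.5):
        """Apply a client's weight delta computed against `base_version`.

        Returns False (and changes nothing) if the update is too stale
        or refers to a version that does not exist yet.
        """
        with self._lock:
            staleness = self._version - base_version
            if staleness < 0 or staleness > self._max_staleness:
                return False
            self._weights = [w + mixing_rate * d
                             for w, d in zip(self._weights, delta)]
            self._version += 1
            return True


store = VersionedModelStore([0.0, 0.0], max_staleness=1)
version, _ = store.pull()
print(store.push(version, [1.0, 1.0]))  # fresh update
print(store.push(version, [1.0, 1.0]))  # one version behind
print(store.push(version, [1.0, 1.0]))  # two versions behind
print(store.pull())
```

A single lock is the simplest possible mechanism and would become a bottleneck with many devices; the project's challenge lies in finding schemes that preserve this kind of consistency with less coordination.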

Expected Results and Learning Outcome

This research aims to advance the understanding of Asynchronous Federated Learning, offering solutions to the challenges related to model convergence, communication efficiency, and scalability. The expected outcomes may include novel algorithms, protocols, and techniques that make Asynchronous Federated Learning more efficient, secure, and applicable to a wide range of real-world scenarios, thereby contributing to the progress of decentralized machine learning.

Qualifications:

  • Experience with ML and existing frameworks: TensorFlow, PyTorch, XGBoost, Hugging Face, etc.
  • Good programming skills in Python.
  • Good understanding of distributed systems, network topologies, and the IoT.
  • Experience with one of the Federated Learning frameworks: Flower, TensorFlow Federated, Substra, FATE, etc. (desirable).

References

  1. Li, Tian, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith. "Federated learning: Challenges, methods, and future directions."
  2. Xu, Chenhao, Youyang Qu, Yong Xiang, and Longxiang Gao. "Asynchronous federated learning on heterogeneous devices: A survey."