Decentralised Federated Learning

Making Federated Learning even more decentralised to achieve better fault tolerance and resilience.

Master Project

Federated Learning is a machine learning approach that enables the training of models across decentralized and privacy-sensitive data sources. Instead of centralizing the data, Federated Learning allows the model to be trained on individual devices or servers, keeping data local while aggregating model updates from multiple sources to improve the global model's performance. This approach enhances privacy by reducing the need to share raw data and is particularly useful in applications where data security and confidentiality are paramount, such as healthcare, finance, and edge computing.
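The aggregation step described above can be illustrated with a minimal sketch. It assumes sample-count-weighted averaging in the style of FedAvg, which this project description does not prescribe; the names `federated_average` and `client_updates` are hypothetical, and models are reduced to flat lists of floats for brevity.

```python
from typing import List, Sequence, Tuple

Weights = List[float]


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local sample count.

    Only model weights and sample counts reach the aggregator;
    the raw training data never leaves the clients.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0][0])
    global_weights = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must send weights of the same shape")
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total_samples
    return global_weights


# Two hypothetical clients: (locally trained weights, number of local samples).
client_updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
global_model = federated_average(client_updates)
```

In this sketch the second client holds three times as much data, so the resulting global weights lie closer to its local weights.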

Although Federated Learning decentralizes data storage, it still relies on a central aggregator to coordinate the model updates from the participating devices or servers. If the central aggregator becomes unavailable or compromised, the entire Federated Learning process is disrupted, which makes the aggregator a single point of failure for the whole system.

Research Topic Focus

This MSc project will investigate possible ways of mitigating this single-point-of-failure problem for better resilience and fault tolerance of next-generation Federated Learning architectures. Possible solutions could include (but are not limited to):

  • Peer-to-peer aggregation (e.g. Gossip Learning), where devices exchange and aggregate model updates directly with each other rather than through a central aggregator.
  • State replication (e.g. using a consensus algorithm such as Raft), such that every device in the federated system maintains a consistent, up-to-date copy of the training state. Any device can then take over as the new aggregator and continue from the latest checkpoint after a failure or disruption.
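To make the peer-to-peer option more concrete, the following is a minimal sketch of pairwise gossip averaging. It is a simplified stand-in and not the Gossip Learning protocol from the references, which also interleaves local training steps. The function `gossip_round`, the fully connected topology, and the toy one-parameter models are all assumptions made for illustration.

```python
import random
from typing import List


def gossip_round(models: List[List[float]], rng: random.Random) -> None:
    """Run one gossip round in place.

    Each node picks one random peer, and both replace their model
    with the element-wise average of the two. No central aggregator is involved.
    """
    n = len(models)
    if n < 2:
        raise ValueError("gossip needs at least two nodes")
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # shift the index so a node never selects itself
        averaged = [(a + b) / 2 for a, b in zip(models[i], models[j])]
        models[i] = list(averaged)
        models[j] = list(averaged)


rng = random.Random(0)  # fixed seed so the run is reproducible
models = [[float(k)] for k in range(8)]  # eight nodes, one parameter each
for _ in range(30):
    gossip_round(models, rng)

# Pairwise averaging preserves the network-wide mean while the nodes converge to it.
mean = sum(m[0] for m in models) / len(models)
spread = max(m[0] for m in models) - min(m[0] for m in models)
```

Because every exchange only involves two nodes, the loss of any single node does not stop the remaining nodes from converging, which is the property that motivates this direction. The price is that agreement is reached gradually over several rounds rather than in a single aggregation step.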

Expected Results and Learning Outcome

This research will contribute to the advancement of state-of-the-art distributed systems by integrating Federated Learning with decentralization techniques. The expected outcomes include improved resilience, fault tolerance, and privacy in distributed systems, as well as a comprehensive understanding of the benefits and trade-offs associated with this integration.

Qualifications

  • Experience with ML and existing frameworks: TensorFlow, PyTorch, XGBoost, Hugging Face, etc.
  • Good programming skills in Python
  • Good understanding of distributed systems, network topologies, and the IoT
  • Experience with one of the Federated Learning frameworks: Flower, TensorFlow Federated, Substra, FATE, etc. (desirable)

References

  1. Li, Tian, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith. "Federated learning: Challenges, methods, and future directions."
  2. Hegedűs, István, Gábor Danner, and Márk Jelasity. "Gossip learning as a decentralized alternative to federated learning."