Data distribution aware clustering for parallel split learning in healthcare applications

Abstract

Split learning, a promising approach in privacy-preserving machine learning, decentralizes model training by dividing it among client devices and a central server. In its vanilla form, however, split learning is slow, mainly because client devices are processed serially. Recent research has addressed this challenge by introducing parallelism, thereby accelerating the split learning process. Even so, existing split learning methodologies often overlook the critical aspect of data distribution among client devices. This paper introduces Data Distribution Aware Clustering-based Parallel Split Learning (DCSL), a scheme designed to address the complexities that arise when data among the client devices engaged in split learning are not independent and identically distributed (non-IID). In healthcare applications, understanding the data distribution is imperative for accurate analysis and decision-making, particularly given the non-IID nature of medical datasets. DCSL uses a novel clustering technique to group medical client devices according to the data distributions of their local datasets, and trains the model in parallel within the device clusters. By optimizing cluster formation, it improves model convergence and reduces training latency. Extensive experiments demonstrate that DCSL outperforms traditional split learning approaches, significantly improving accuracy and reducing training latency across various applications.
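
The abstract does not specify the clustering technique, so the following is only a minimal illustrative sketch of what distribution-aware grouping could look like, not the DCSL algorithm itself. It assumes each client can share a per-class label histogram, and it uses a hypothetical greedy heuristic that places each client in the cluster whose pooled label distribution ends up closest (in L1 distance) to the global one, with a cap on cluster size. The names `label_counts`, `l1_to_target`, and `cluster_clients` are invented for this example.

```python
from collections import Counter
from math import ceil


def label_counts(labels, num_classes):
    """Return per-class sample counts for one client's local labels."""
    counts = Counter(labels)
    return [counts.get(c, 0) for c in range(num_classes)]


def l1_to_target(counts, target):
    """L1 distance between normalized counts and a target distribution."""
    total = sum(counts)
    if total == 0:
        return sum(target)
    return sum(abs(c / total - t) for c, t in zip(counts, target))


def cluster_clients(client_labels, num_classes, num_clusters):
    """Greedily group clients (hypothetical heuristic, not the paper's method).

    Each client joins the non-full cluster whose pooled label distribution,
    after adding that client, is closest to the global label distribution.
    """
    per_client = {cid: label_counts(lbls, num_classes)
                  for cid, lbls in client_labels.items()}
    global_counts = [sum(col) for col in zip(*per_client.values())]
    total = sum(global_counts)
    if total == 0:
        raise ValueError("no labeled samples on any client")
    target = [c / total for c in global_counts]

    max_size = ceil(len(per_client) / num_clusters)  # keep clusters balanced
    clusters = [[] for _ in range(num_clusters)]
    pooled = [[0] * num_classes for _ in range(num_clusters)]

    # Place clients with more samples first; ties keep insertion order.
    for cid in sorted(per_client, key=lambda c: -sum(per_client[c])):
        counts = per_client[cid]
        open_ids = [k for k in range(num_clusters) if len(clusters[k]) < max_size]
        best = min(
            open_ids,
            key=lambda k: l1_to_target(
                [p + c for p, c in zip(pooled[k], counts)], target),
        )
        clusters[best].append(cid)
        pooled[best] = [p + c for p, c in zip(pooled[best], counts)]
    return clusters


# Toy example: four clients, each holding a single class (extreme non-IID).
client_labels = {
    "a": [0] * 10,
    "b": [1] * 10,
    "c": [0] * 10,
    "d": [1] * 10,
}
clusters = cluster_clients(client_labels, num_classes=2, num_clusters=2)
print(clusters)
```

In this toy setting each resulting cluster pairs a class-0 client with a class-1 client, so the pooled data within every cluster covers both classes. A real scheme would also need to account for privacy constraints on sharing label statistics and for device latency, which this sketch ignores.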

Category

Academic article

Language

English

Author(s)

  • Md. Tanvir Arafat
  • Md. Abdur Razzaque
  • Abdulhameed Alelaiwi
  • Md Zia Uddin
  • Mohammad Mehedi Hassan

Affiliation

  • SINTEF Digital / Sustainable Communication Technologies
  • University of Dhaka
  • King Saud University

Date

18.06.2025

Year

2025

Published in

Future Generation Computer Systems

ISSN

0167-739X

Volume

174

View this publication at Norwegian Research Information Repository