Reducing leakage in distributed deep learning for sensitive health data

May 6, 2019


Praneeth Vepakomma, Otkrist Gupta, Abhimanyu Dubey, Ramesh Raskar. Reducing Leakage in Distributed Deep Learning for Sensitive Health Data, ICLR 2019 AI for Social Good Workshop (2019).


For distributed machine learning with health data, we demonstrate how minimizing the distance correlation between raw data and intermediary representations (smashed data) reduces leakage of sensitive raw-data patterns during client communications while maintaining model accuracy. Leakage, the risk that intermediary representations can be inverted to recover the raw data (measured using the KL divergence between input and intermediate representation), can prevent resource-poor health organizations from using distributed deep learning services. We demonstrate that our method reduces leakage, in terms of distance correlation between raw data and communication payloads, from the order of 0.95 to 0.19 and from 0.92 to 0.33 during training with image datasets, while maintaining a similar classification accuracy.
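The quantity at the heart of this approach is the sample distance correlation between raw inputs and smashed data, which can be driven down as an auxiliary loss term during training. Below is a minimal NumPy sketch of the standard (Székely-style) sample distance correlation statistic; the function names and the toy data are illustrative, not the paper's actual implementation, which would typically be computed on mini-batches inside the training loop.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distance matrix between the rows of Z (shape n x d)."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_correlation(X, Y):
    """Sample distance correlation between paired samples X and Y.

    Returns a value in [0, 1]; 0 indicates independence in the large-sample
    limit, 1 indicates a strong (e.g. affine) dependence. Minimizing this
    between raw data X and smashed data Y reduces invertibility leakage.
    """
    A = pairwise_distances(X)
    B = pairwise_distances(Y)
    # Double-center each distance matrix: subtract row means and column
    # means, then add back the grand mean.
    A = A - A.mean(axis=0) - A.mean(axis=1)[:, None] + A.mean()
    B = B - B.mean(axis=0) - B.mean(axis=1)[:, None] + B.mean()
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar_x = (A * A).mean()         # squared distance variances
    dvar_y = (B * B).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y))

# Toy check: an affine transform of X is perfectly distance-correlated
# with X, while independent noise is only weakly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
print(distance_correlation(X, 2 * X + 1))          # close to 1.0
print(distance_correlation(X, rng.normal(size=(64, 8))))  # well below 1.0
```

In a split-learning setting, this statistic would be computed between the batch of raw inputs and the client-side layer's activations, scaled by a weight, and added to the task loss, so gradient descent trades off leakage against accuracy.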
