Differentially private distributed logistic regression using private and public data.

Title: Differentially private distributed logistic regression using private and public data.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Ji, Z, Jiang, X, Wang, S, Xiong, L, Ohno-Machado, L
Journal: BMC Med Genomics
Volume: 7 Suppl 1
Pagination: S14
Date Published: 2014
ISSN: 1755-8794
iDASH Category: Privacy Technology
Abstract:
BACKGROUND: Privacy protection is an important issue in medical informatics, and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced.
METHODOLOGY: In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data.
EXPERIMENTS AND RESULTS: We evaluate our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios.
CONCLUSION: Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
DOI: 10.1186/1755-8794-7-S1-S14
Alternate Journal: BMC Med Genomics
PubMed ID: 25079786
PubMed Central ID: PMC4101668
Grant List:
R00LM011392 / LM / NLM NIH HHS / United States
R01HS019913 / HS / AHRQ HHS / United States
U54HL108460 / HL / NHLBI NIH HHS / United States
UL1TR0001000 / TR / NCATS NIH HHS / United States
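
Illustrative sketch (not taken from the paper): the abstract's methodology describes modifying the Newton-Raphson update step so that public and private data are combined under differential privacy. The Python sketch below shows one way such an update could look, and it should not be read as the authors' algorithm. It rests on several assumptions that the record does not state: the curvature (Hessian) is computed from public features only, each private site releases a mean gradient perturbed by Laplace noise, every feature is clipped to [-1, 1], neighbouring data sets differ by replacing one record, and the privacy budget eps is split evenly across iterations. All function names and parameters (fit, noisy_gradient, public_hessian, ridge, iters) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def laplace(scale, rng):
    """Laplace(0, scale) sample: difference of two Exp(1) draws, rescaled."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def noisy_gradient(X, y, beta, eps, rng):
    """Mean log-likelihood gradient at one private site plus Laplace noise.

    Assumes |x_j| <= 1, so replacing one record moves the mean gradient by
    at most 2*d/n in L1 norm; the Laplace scale is that bound divided by eps.
    """
    n, d = len(X), len(beta)
    g = [0.0] * d
    for xi, yi in zip(X, y):
        r = yi - sigmoid(dot(beta, xi))
        for j in range(d):
            g[j] += r * xi[j]
    scale = 2.0 * d / (n * eps)  # evaluates to 0.0 when eps is infinite
    return [gj / n + laplace(scale, rng) for gj in g]


def public_hessian(X, beta, ridge):
    """Curvature from public features only, so it consumes no privacy budget."""
    m, d = len(X), len(beta)
    H = [[ridge if j == k else 0.0 for k in range(d)] for j in range(d)]
    for xi in X:
        p = sigmoid(dot(beta, xi))
        w = p * (1.0 - p) / m
        for j in range(d):
            for k in range(d):
                H[j][k] += w * xi[j] * xi[k]
    return H


def fit(public_X, private_sites, eps, iters=5, ridge=1e-2, seed=0):
    """Newton-style updates: public Hessian, noisy private gradients."""
    rng = random.Random(seed)
    beta = [0.0] * len(public_X[0])
    total = sum(len(X) for X, _ in private_sites)
    for _ in range(iters):
        g = [-ridge * b for b in beta]  # gradient of the ridge penalty
        for X, y in private_sites:
            gs = noisy_gradient(X, y, beta, eps / iters, rng)
            g = [a + len(X) / total * s for a, s in zip(g, gs)]
        step = solve(public_hessian(public_X, beta, ridge), g)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def make_data(n, true_beta, rng):
    """Synthetic records with features already inside [-1, 1]."""
    X = [[rng.uniform(-1, 1) for _ in true_beta] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(dot(true_beta, x)) else 0 for x in X]
    return X, y


rng = random.Random(42)
true_beta = [1.0, -2.0]
public_X, _ = make_data(200, true_beta, rng)  # public labels unused here
sites = [make_data(500, true_beta, rng) for _ in range(2)]
beta_noiseless = fit(public_X, sites, eps=float("inf"))
beta_private = fit(public_X, sites, eps=1.0)
print(beta_noiseless, beta_private)
```

In this sketch only the gradients touch private records, so noise enters a d-dimensional vector rather than a d-by-d matrix; whether the published method draws the line in the same place is not stated in this record. Because the sites hold disjoint records, each record is charged eps/iters per iteration and eps overall under sequential composition.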