<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Boxwala, Aziz A</style></author><author><style face="normal" font="default" size="100%">Kim, Jihoon</style></author><author><style face="normal" font="default" size="100%">Grillo, Janice M</style></author><author><style face="normal" font="default" size="100%">Ohno-Machado, Lucila</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Using statistical and machine learning to help institutions detect suspicious access to electronic health records.</style></title><secondary-title><style face="normal" font="default" size="100%">J Am Med Inform Assoc</style></secondary-title><alt-title><style face="normal" font="default" size="100%">J Am Med Inform Assoc</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Artificial Intelligence</style></keyword><keyword><style  face="normal" font="default" size="100%">Computer Security</style></keyword><keyword><style  face="normal" font="default" size="100%">Electronic Health Records</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Logistic Models</style></keyword><keyword><style  face="normal" font="default" size="100%">Management Audit</style></keyword><keyword><style  face="normal" font="default" size="100%">Pilot Projects</style></keyword><keyword><style  face="normal" font="default" size="100%">Sensitivity and Specificity</style></keyword><keyword><style  face="normal" font="default" size="100%">Software Validation</style></keyword><keyword><style  face="normal" font="default" size="100%">United States</style></keyword></keywords><dates><year><style  face="normal" font="default" 
size="100%">2011</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2011 Jul-Aug</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">18</style></volume><pages><style face="normal" font="default" size="100%">498-505</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">OBJECTIVE: To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs.

METHODS: From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors then evaluated the sensitivity of the final models on an external set of 58 events that had been identified as truly inappropriate and investigated independently of this study using standard operating procedures.

RESULTS: The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which had been determined to be truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM.

LIMITATIONS: The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, the approach to constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set, owing to the effort required for its construction.

CONCLUSION: The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs.</style></abstract><issue><style face="normal" font="default" size="100%">4</style></issue></record></records></xml>