TY  - JOUR
T1  - A Bayesian Approach to Graphical Record Linkage and Deduplication
JF  - Journal of the American Statistical Association
Y1  - 2016
A1  - Rebecca C. Steorts
A1  - Rob Hall
A1  - Stephen E. Fienberg
AB  - We propose an unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation involves the representation of the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate transitive linkage probabilities across records (and represent this visually), and propagate the uncertainty of record linkage into later analyses. Our method makes it particularly easy to integrate record linkage with post-processing procedures such as logistic regression, capture–recapture, etc. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previous record linkage approaches, despite the high-dimensional parameter space. We illustrate our method using longitudinal data from the National Long Term Care Survey and with data from the Italian Survey on Household and Wealth, where we assess the accuracy of our method and show it to be better in terms of error rates and empirical scalability than other approaches in the literature. Supplementary materials for this article are available online.
VL  - 111
UR  - http://dx.doi.org/10.1080/01621459.2015.1105807
ER  - 

TY  - JOUR
T1  - Achieving both valid and secure logistic regression analysis on aggregated data from different private sources
JF  - Journal of Privacy and Confidentiality
Y1  - 2012
A1  - Yuval Nardi
A1  - Robert Hall
A1  - Stephen E. Fienberg
VL  - 4
ER  - 

TY  - CONF
T1  - Counting the people
T2  - Nathan and Beatrice Keyfitz Lecture in Mathematics and the Social Sciences
Y1  - 2012
A1  - Stephen E. Fienberg
JF  - Nathan and Beatrice Keyfitz Lecture in Mathematics and the Social Sciences
PB  - Fields Institute
CY  - Toronto, Canada
ER  - 

TY  - JOUR
T1  - Differential Privacy for Protecting Multi-dimensional Contingency Table Data: Extensions and Applications
JF  - Journal of Privacy and Confidentiality
Y1  - 2012
A1  - Yang Xiaolin
A1  - Stephen E. Fienberg
A1  - Alessandro Rinaldo
VL  - 4
ER  - 

TY  - RPRT
T1  - A Generalized Fellegi-Sunter Framework for Multiple Record Linkage with Application to Homicide Records Systems
Y1  - 2012
A1  - Mauricio Sadinle
A1  - Stephen E. Fienberg
JF  - arXiv
UR  - https://arxiv.org/abs/1205.3217
ER  - 

TY  - JOUR
T1  - Privacy in a world of electronic data: Whom should you trust?
JF  - Notices of the AMS
Y1  - 2012
A1  - Stephen E. Fienberg
VL  - 59
ER  - 

TY  - JOUR
T1  - Privacy-preserving data sharing in high dimensional regression and classification settings
JF  - Journal of Privacy and Confidentiality
Y1  - 2012
A1  - Stephen E. Fienberg
A1  - Jiashun Jin
VL  - 4
ER  - 

TY  - CONF
T1  - Statistics in Service to the Nation
T2  - Presentation Samuel S. Wilks Lecture
Y1  - 2012
A1  - Stephen E. Fienberg
JF  - Presentation Samuel S. Wilks Lecture
CY  - Princeton, NJ
N1  - April 23, 2012
ER  - 

TY  - CONF
T1  - Teaching about Big Data: Curricular Issues
T2  - 2012 Joint Statistical Meetings
Y1  - 2012
A1  - Stephen E. Fienberg
JF  - 2012 Joint Statistical Meetings
CY  - San Diego, CA
ER  - 

TY  - CONF
T1  - Valid Statistical Inference on Automatically Matched Files
T2  - Privacy in Statistical Databases
Y1  - 2012
A1  - Robert Hall
A1  - Stephen E. Fienberg
ED  - Josep Domingo-Ferrer
ED  - Ilenia Tinnirello
JF  - Privacy in Statistical Databases
PB  - Springer
ER  - 

TY  - JOUR
T1  - Comment on Gates: Toward a Reconceptualization of Confidentiality Protection in the Context of Linkages with Administrative Records
JF  - Journal of Privacy and Confidentiality
Y1  - 2011
A1  - Stephen E. Fienberg
VL  - 3
ER  - 

TY  - JOUR
T1  - Secure multiparty linear regression based on homomorphic encryption
JF  - Journal of Official Statistics
Y1  - 2011
A1  - Robert Hall
A1  - Stephen E. Fienberg
A1  - Yuval Nardi
VL  - 27
ER  - 