A session on "Synthetic establishment microdata around the world" was organized at the International Statistical Institute (ISI)'s 60th World Statistics Congress – ISI2015 in Rio de Janeiro, Brazil by Lars Vilhuber (NCRN-Cornell):
Around the world, national statistical agencies face substantial challenges in attempting to release establishment-level business microdata to researchers. Doing so often represents too large a risk to establishments' confidentiality. The U.S. Census Bureau created a synthetic longitudinal business database, and released it for limited distribution. Subsequent efforts have assessed how well the approach translates to other countries' data and legal environments. The session will report on progress in the United States and in Germany, and on lessons learned about the utility of such an approach.
Justification: The synthetic data approach is of potential interest to many statistical agencies, and the session will provide valuable information about the utility, cost, and risk of such an approach. Discussants are from statistical agencies, and can speak to the policy and implementation issues associated with these approaches.
- Vishesh Karwa (CMU and Harvard)
- John M. Abowd (NCRN-Cornell) and Kevin L. McKinney (U.S. Census Bureau), "Noise Infusion as a Confidentiality Protection Measure for Graph-based Statistics" (available as CES WP-14-30)
- Lars Vilhuber (NCRN-Cornell) and Javier Miranda (U.S. Census Bureau), "Using partially synthetic data to replace suppression in the Business Dynamics Statistics"
- Jörg Drechsler (IAB Germany) and Lars Vilhuber (NCRN-Cornell), "Synthetic Longitudinal Business Databases for International Comparisons"
- Satkartar Kinney (NCRN-NISS), Jerry Reiter (NCRN-Duke), and Javier Miranda (U.S. Census Bureau), "Improving the Synthetic Longitudinal Business Database: Synthesizing Firms"
- Ian Schmutte (University of Georgia), "Differentially Private Publication of Data on Wages and Job Mobility"
- Stefan Bender (Deutsche Bundesbank, Germany)
Additional Involvement by NCRN Node Members at WSC 2015:
- Noel Cressie (Missouri Node) also spoke at the World Statistics Congress. His presentation was entitled “Spatio-Temporal Data Fusion for Big Data and its Application to Satellite Remote Sensing”.
- Mauricio Sadinle, formerly with CMU, now at Duke/NISS, also presented a poster, “Bayesian Estimation of Bipartite Matchings for Record Linkage”, at the World Statistics Congress.
- Larry Cox (Duke/NISS) taught a short course on “Data Anonymization Balancing Data Confidentiality and Data Quality”.