NCRN Virtual Seminar - Synthetic establishment and firm data

Speakers: Saki Kinney (RTI) and Lars Vilhuber (Cornell University)

Title: Synthetic Establishment and Firm Data


"Assessing the Data Quality of Public Use Tabulations Produced from Synthetic Data: Synthetic Business Dynamics Statistics" (Lars Vilhuber, Cornell)

We describe and analyze a method that blends records from both observed and synthetic microdata into public-use tabulations of establishment statistics. The resulting tables use synthetic data only in potentially sensitive cells. We describe several algorithms and present preliminary results from applying them to the Census Bureau's Business Dynamics Statistics and Synthetic Longitudinal Business Database, highlighting the accuracy and protection the method affords compared with existing public-use tabulations, which contain suppressions. (archived presentation)
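As a rough illustration of the idea of using synthetic data only in sensitive cells, the following Python sketch tabulates observed and synthetic establishment records and swaps in the synthetic value wherever an observed cell looks sensitive. It is a simplified, cell-level stand-in, not the algorithms presented in the talk: the record fields (`sector`, `year`, `emp`), the `MIN_ESTABS` threshold, and the functions `tabulate` and `blend` are all hypothetical names and assumptions made for this example.

```python
from collections import defaultdict

# Assumed sensitivity rule (not from the talk): a cell is sensitive when
# fewer than MIN_ESTABS observed establishments contribute to it.
MIN_ESTABS = 3


def tabulate(records):
    """Count establishments and sum employment per (sector, year) cell."""
    cells = defaultdict(lambda: {"estabs": 0, "emp": 0})
    for rec in records:
        key = (rec["sector"], rec["year"])
        cells[key]["estabs"] += 1
        cells[key]["emp"] += rec["emp"]
    return dict(cells)


def blend(observed, synthetic, min_estabs=MIN_ESTABS):
    """Publish observed cells, except sensitive ones, which use synthetic data."""
    obs_table = tabulate(observed)
    syn_table = tabulate(synthetic)
    blended = {}
    for key, cell in obs_table.items():
        if cell["estabs"] < min_estabs:
            # Sensitive cell: publish the synthetic tabulation instead
            # (zeros if the synthetic data has no records in this cell).
            source_cell = syn_table.get(key, {"estabs": 0, "emp": 0})
            blended[key] = {**source_cell, "source": "synthetic"}
        else:
            blended[key] = {**cell, "source": "observed"}
    return blended


# Made-up toy data: four retail establishments and a single mining establishment.
observed = [{"sector": "retail", "year": 2000, "emp": e} for e in (5, 8, 12, 20)]
observed.append({"sector": "mining", "year": 2000, "emp": 150})
synthetic = [{"sector": "retail", "year": 2000, "emp": e} for e in (6, 7, 13, 19)]
synthetic.append({"sector": "mining", "year": 2000, "emp": 140})

table = blend(observed, synthetic)
for key in sorted(table):
    print(key, table[key])
```

In this toy data the retail cell is published from observed records, while the single-establishment mining cell is filled from the synthetic records rather than suppressed.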

"Synthetic Data Generation for Firm Links" (Saki Kinney, RTI)

In most countries, national statistical agencies do not release establishment-level business microdata, because doing so poses too great a risk to establishments' confidentiality. Agencies can potentially manage these risks by releasing synthetic microdata, i.e., individual establishment records simulated from statistical models designed to mimic the joint distribution of the underlying observed data. Previously, we used this approach to generate a synthetic version, now available for public use, of the U.S. Census Bureau's Longitudinal Business Database (LBD), a longitudinal census of establishments dating back to 1976. While the synthetic LBD (SynLBD) has proven to be a useful product, we now seek to improve and expand it by using new synthesis models and adding features. This paper describes our efforts to create the second generation of the SynLBD, including synthesis procedures that we believe could be replicated in other contexts. (archived presentation)


Jan 06, 2016, 3:00pm to 4:30pm EST
Ithaca, NY 14853
United States