Bayesian multiple imputation for large-scale categorical data with structural zeros

Manrique-Vallier, D., and J. P. Reiter. Bayesian multiple imputation for large-scale categorical data with structural zeros. Duke University / National Institute of Statistical Sciences (NISS) Preprint 1813:34889, 2013, available at http://hdl.handle.net/1813/34889.
We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, USA.
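
The truncation idea in the abstract can be illustrated with a small sketch. It is not the authors' Gibbs sampler: it only shows how records drawn from an unrestricted mixture of multinomials become a "truncated sample" once draws that land in structurally impossible cells are discarded. The two toy variables (age group and marital status), the single structural zero (a married child), the parameter values, and the function names `draw_record` and `draw_truncated` are all assumptions made for this illustration.

```python
import random


def draw_record(rng, class_weights, class_probs):
    """Draw one record from a mixture of independent multinomials.

    class_weights[k] is the mixture weight of latent class k;
    class_probs[k][j] is the level-probability vector of variable j in class k.
    """
    k = rng.choices(range(len(class_weights)), weights=class_weights)[0]
    return tuple(
        rng.choices(range(len(p)), weights=p)[0] for p in class_probs[k]
    )


def draw_truncated(rng, n, class_weights, class_probs, is_structural_zero):
    """Sample from the unrestricted mixture until n feasible records are kept.

    Draws that fall in a structural-zero cell are counted and discarded, so
    the kept records follow the mixture truncated to the feasible cells.
    """
    kept, rejected = [], 0
    while len(kept) < n:
        x = draw_record(rng, class_weights, class_probs)
        if is_structural_zero(x):
            rejected += 1
        else:
            kept.append(x)
    return kept, rejected


# Hypothetical toy setup (not taken from the paper):
# variable 0 = age group (0 = child, 1 = adult)
# variable 1 = marital status (0 = single, 1 = married)
class_weights = [0.6, 0.4]
class_probs = [
    [[0.7, 0.3], [0.8, 0.2]],  # latent class 0
    [[0.1, 0.9], [0.3, 0.7]],  # latent class 1
]


def is_structural_zero(x):
    """Assumed impossible combination: a married child."""
    return x[0] == 0 and x[1] == 1


rng = random.Random(0)
sample, n_rejected = draw_truncated(
    rng, 1000, class_weights, class_probs, is_structural_zero
)
print(len(sample), "feasible records kept;", n_rejected, "draws discarded")
```

Under these assumed parameters the unrestricted mixture places probability 0.6 × 0.7 × 0.2 + 0.4 × 0.1 × 0.7 = 0.112 on the impossible cell, so roughly one draw in nine is discarded. A model-based approach like the one in the abstract would treat the discarded draws as unobserved and estimate the mixture parameters from the kept records rather than fixing them in advance.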