TY  - RPRT
T1  - Bayesian mixture modeling for multivariate conditional distributions
Y1  - 2016
A1  - Maria DeYoreo
A1  - Jerome P. Reiter
AB  - We present a Bayesian mixture model for estimating the joint distribution of mixed ordinal, nominal, and continuous data conditional on a set of fixed variables. The model uses multivariate normal and categorical mixture kernels for the random variables. It induces dependence between the random and fixed variables through the means of the multivariate normal mixture kernels and via a truncated local Dirichlet process. The latter encourages observations with similar values of the fixed variables to share mixture components. Using a simulation of data fusion, we illustrate that the model can estimate underlying relationships in the data and the distributions of the missing values more accurately than a mixture model applied to the random and fixed variables jointly. We use the model to analyze consumers' reading behaviors using a quota sample, i.e., a sample where the empirical distribution of some variables is fixed by design and so should not be modeled as random, conducted by the book publisher HarperCollins.
PB  - ArXiv
UR  - http://arxiv.org/abs/1606.04457
ER  - 

TY  - JOUR
T1  - Bayesian Simultaneous Edit and Imputation for Multivariate Categorical Data
JF  - Journal of the American Statistical Association
Y1  - 2016
A1  - Daniel Manrique-Vallier
A1  - Jerome P. Reiter
AB  - In categorical data, it is typically the case that some combinations of variables are theoretically impossible, such as a three-year-old child who is married or a man who is pregnant. In practice, however, reported values often include such structural zeros due to, for example, respondent mistakes or data processing errors. To purge data of such errors, many statistical organizations use a process known as edit-imputation. The basic idea is first to select reported values to change according to some heuristic or loss function, and second to replace those values with plausible imputations. This two-stage process typically does not fully utilize information in the data when determining locations of errors, nor does it appropriately reflect uncertainty resulting from the edits and imputations. We present an alternative approach to editing and imputation for categorical microdata with structural zeros that addresses these shortcomings. Specifically, we use a Bayesian hierarchical model that couples a stochastic model for the measurement error process with a Dirichlet process mixture of multinomial distributions for the underlying, error-free values. The latter model is restricted to have support only on the set of theoretically possible combinations. We illustrate this integrated approach to editing and imputation using simulation studies with data from the 2000 U.S. census, and compare it to a two-stage edit-imputation routine. Supplementary material is available online.
UR  - http://dx.doi.org/10.1080/01621459.2016.1231612
ER  - 

TY  - JOUR
T1  - Bayesian Latent Pattern Mixture Models for Handling Attrition in Panel Studies With Refreshment Samples
JF  - ArXiv
Y1  - 2015
A1  - Yajuan Si
A1  - Jerome P. Reiter
A1  - D. Sunshine Hillygus
KW  - Categorical
KW  - Dirichlet process
KW  - Multiple imputation
KW  - Non-ignorable
KW  - Panel attrition
KW  - Refreshment sample
AB  - Many panel studies collect refreshment samples: new, randomly sampled respondents who complete the questionnaire at the same time as a subsequent wave of the panel. With appropriate modeling, these samples can be leveraged to correct inferences for biases caused by non-ignorable attrition. We present such a model when the panel includes many categorical survey variables. The model relies on a Bayesian latent pattern mixture model, in which an indicator for attrition and the survey variables are modeled jointly via a latent class model. We allow the multinomial probabilities within classes to depend on the attrition indicator, which offers additional flexibility over standard applications of latent class models. We present results of simulation studies that illustrate the benefits of this flexibility. We apply the model to correct attrition bias in an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.
UR  - http://arxiv.org/abs/1509.02124
IS  - 1509.02124
ER  -