TY - JOUR T1 - A framework for sharing confidential research data, applied to investigating differential pay by race in the U. S. government Y1 - Submitted A1 - Barrientos, A. F. A1 - Bolton, A. A1 - Balmat, T. A1 - Reiter, J. P. A1 - Machanavajjhala, A. A1 - Chen, Y. A1 - Kneifel, C. A1 - DeLong, M. A1 - de Figueiredo, J. M. AB - Data stewards seeking to provide access to large-scale social science data face a difficult challenge. They have to share data in ways that protect privacy and confidentiality, are informative for many analyses and purposes, and are relatively straightforward to use by data analysts. We present a framework for addressing this challenge. The framework uses an integrated system that includes fully synthetic data intended for wide access, coupled with means for approved users to access the confidential data via secure remote access solutions, glued together by verification servers that allow users to assess the quality of their analyses with the synthetic data. We apply this framework to data on the careers of employees of the U. S. federal government, studying differentials in pay by race. The integrated system performs as intended, allowing users to explore the synthetic data for potential pay differentials and learn through verifications which findings in the synthetic data hold up in the confidential data and which do not. We find differentials across races; for example, the gap between black and white female federal employees' pay increased over the time period. We present models for generating synthetic careers and differentially private algorithms for verification of regression results. ER - TY - RPRT T1 - Computationally Efficient Multivariate Spatio-Temporal Models for High-Dimensional Count-Valued Data. (With Discussion). Y1 - 2017 A1 - Bradley, J.R. A1 - Holan, S.H. A1 - Wikle, C.K. 
KW - Aggregation KW - American Community Survey KW - Bayesian hierarchical model KW - Big Data KW - Longitudinal Employer-Household Dynamics (LEHD) program KW - Markov chain Monte Carlo KW - Non-Gaussian. KW - Quarterly Workforce Indicators AB - We introduce a Bayesian approach for multivariate spatio-temporal prediction for high-dimensional count-valued data. Our primary interest is when there are possibly millions of data points referenced over different variables, geographic regions, and times. This problem requires extensive methodological advancements, as jointly modeling correlated data of this size leads to the so-called "big n problem." The computational complexity of prediction in this setting is further exacerbated by acknowledging that count-valued data are naturally non-Gaussian. Thus, we develop a new computationally efficient distribution theory for this setting. In particular, we introduce a multivariate log-gamma distribution and provide substantial theoretical development including: results regarding conditional distributions, marginal distributions, an asymptotic relationship with the multivariate normal distribution, and full-conditional distributions for a Gibbs sampler. To incorporate dependence between variables, regions, and time points, a multivariate spatio-temporal mixed effects model (MSTM) is used. The results in this manuscript are extremely general, and can be used for data that exhibit fewer sources of dependency than what we consider (e.g., multivariate, spatial-only, or spatio-temporal-only data). Hence, the implications of our modeling framework may have a large impact on the general problem of jointly modeling correlated count-valued data. We show the effectiveness of our approach through a simulation study. Additionally, we demonstrate our proposed methodology with an important application analyzing data obtained from the Longitudinal Employer-Household Dynamics (LEHD) program, which is administered by the U.S. Census Bureau. 
JF - arXiv UR - https://arxiv.org/abs/1512.07273 ER - TY - CONF T1 - Differentially private regression diagnostics T2 - IEEE International Conference on Data Mining Y1 - 2017 A1 - Chen, Y. A1 - Machanavajjhala, A. A1 - Reiter, J. P. A1 - Barrientos, A. AB - Many data producers seek to provide users access to confidential data without unduly compromising data subjects' privacy and confidentiality. When intense redaction is needed to do so, one general strategy is to require users to do analyses without seeing the confidential data, for example, by releasing fully synthetic data or by allowing users to query remote systems for disclosure-protected outputs of statistical models. With fully synthetic data or redacted outputs, the analyst never really knows how much to trust the resulting findings. In particular, if the user did the same analysis on the confidential data, would regression coefficients of interest be statistically significant or not? We present algorithms for assessing this question that satisfy differential privacy. We describe conditions under which the algorithms should give accurate answers about statistical significance. We illustrate the properties of the methods using artificial and genuine data. JF - IEEE International Conference on Data Mining ER - TY - RPRT T1 - Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Secure the Future of the Federal Statistical System? Y1 - 2017 A1 - Weinberg, Daniel A1 - Abowd, John M. A1 - Belli, Robert F. A1 - Cressie, Noel A1 - Folch, David C. A1 - Holan, Scott H. A1 - Levenstein, Margaret C. A1 - Olson, Kristen M. A1 - Reiter, Jerome P. A1 - Shapiro, Matthew D. A1 - Smyth, Jolene A1 - Soh, Leen-Kiat A1 - Spencer, Bruce A1 - Spielman, Seth E. A1 - Vilhuber, Lars A1 - Wikle, Christopher AB -

The National Science Foundation-Census Bureau Research Network (NCRN) was established in 2011 to create interdisciplinary research nodes on methodological questions of interest and significance to the broader research community and to the Federal Statistical System (FSS), particularly the Census Bureau. The activities to date have covered both fundamental and applied statistical research and have focused at least in part on the training of current and future generations of researchers in skills of relevance to surveys and alternative measurement of economic units, households, and persons. This paper discusses some of the key research findings of the eight nodes, organized into six topics: (1) Improving census and survey data collection methods; (2) Using alternative sources of data; (3) Protecting privacy and confidentiality by improving disclosure avoidance; (4) Using spatial and spatio-temporal statistical modeling to improve estimates; (5) Assessing data cost and quality tradeoffs; and (6) Combining information from multiple sources. It also reports on collaborations across nodes and with federal agencies, new software developed, and educational activities and outcomes. The paper concludes with an evaluation of the ability of the FSS to apply the NCRN’s research outcomes and suggests some next steps, as well as the implications of this research-network model for future federal government renewal initiatives. 
This paper began as a May 8, 2015 presentation to the National Academies of Science’s Committee on National Statistics by two of the principal investigators of the National Science Foundation-Census Bureau Research Network (NCRN) – John Abowd and the late Steve Fienberg (Carnegie Mellon University). The authors acknowledge the contributions of the other principal investigators of the NCRN who are not co-authors of the paper (William Block, William Eddy, Alan Karr, Charles Manski, Nicholas Nagle, and Rebecca Nugent), the co-principal investigators, and the comments of Patrick Cantwell, Constance Citro, Adam Eck, Brian Harris-Kojetin, and Eloise Parker. We note with sorrow the deaths of Stephen Fienberg and Allan McCutcheon, two of the original NCRN principal investigators. The principal investigators also wish to acknowledge Cheryl Eavey’s sterling grant administration on behalf of the NSF. The conclusions reached in this paper are not the responsibility of the National Science Foundation (NSF), the Census Bureau, or any of the institutions to which the authors belong.

PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/52650 ER - TY - JOUR T1 - Multi-rubric Models for Ordinal Spatial Data with Application to Online Ratings from Yelp Y1 - 2017 A1 - Linero, A.R. A1 - Bradley, J.R. A1 - Desai, A. KW - Bayesian hierarchical model KW - Data augmentation KW - Nonparametric Bayes KW - ordinal data KW - recommender systems KW - spatial prediction. AB - Interest in online rating data has increased in recent years. Such data consists of ordinal ratings of products or local businesses provided by users of a website, such as Yelp or Amazon. One source of heterogeneity in ratings is that users apply different standards when supplying their ratings; even if two users benefit from a product the same amount, they may translate their benefit into ratings in different ways. In this article we propose an ordinal data model, which we refer to as a multi-rubric model, which treats the criteria used to convert a latent utility into a rating as user-specific random effects, with the distribution of these random effects being modeled nonparametrically. We demonstrate that this approach is capable of accounting for this type of variability in addition to usual sources of heterogeneity due to item quality, user biases, interactions between items and users, and the spatial structure of the users and items. We apply the model developed here to publicly available data from the website Yelp and demonstrate that it produces interpretable clusterings of users according to their rating behavior, in addition to providing better predictions of ratings and better summaries of overall item quality. UR - https://arxiv.org/abs/1706.03012 ER - TY - JOUR T1 - Regionalization of Multiscale Spatial Processes using a Criterion for Spatial Aggregation Error JF - Journal of the Royal Statistical Society -- Series B. Y1 - 2017 A1 - Bradley, J.R. A1 - Wikle, C.K. A1 - Holan, S.H. 
KW - American Community Survey KW - empirical orthogonal functions KW - MAUP KW - Reduced rank KW - Spatial basis functions KW - Survey data AB - The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can guide a regionalization over a spatial domain of interest. By "regionalization" we mean a specification of geographies that define the spatial support for areal data. This topic has been studied vigorously by geographers, but has been given less attention by spatial statisticians. Thus, we propose a criterion for spatial aggregation error (CAGE), which we minimize to obtain an optimal regionalization. To define CAGE we draw a connection between spatial aggregation error and a new multiscale representation of the Karhunen-Loeve (K-L) expansion. This relationship between CAGE and the multiscale K-L expansion leads to illuminating theoretical developments including: connections between spatial aggregation error, squared prediction error, spatial variance, and a novel extension of Obled-Creutin eigenfunctions. The effectiveness of our approach is demonstrated through an analysis of two datasets, one using the American Community Survey and one related to environmental ocean winds. UR - https://arxiv.org/abs/1502.01974 ER - TY - JOUR T1 - Bayesian Hierarchical Models with Conjugate Full-Conditional Distributions for Dependent Data from the Natural Exponential Family JF - Journal of the American Statistical Association - T&M. Y1 - 2016 A1 - Bradley, J.R. A1 - Holan, S.H. A1 - Wikle, C.K. AB - We introduce a Bayesian approach for analyzing (possibly) high-dimensional dependent data that are distributed according to a member from the natural exponential family of distributions. This problem requires extensive methodological advancements, as jointly modeling high-dimensional dependent data leads to the so-called "big n problem." 
The computational complexity of the "big n problem" is further exacerbated when allowing for non-Gaussian data models, as is the case here. Thus, we develop new computationally efficient distribution theory for this setting. In particular, we introduce something we call the "conjugate multivariate distribution," which is motivated by the univariate distribution introduced in Diaconis and Ylvisaker (1979). Furthermore, we provide substantial theoretical and methodological development including: results regarding conditional distributions, an asymptotic relationship with the multivariate normal distribution, conjugate prior distributions, and full-conditional distributions for a Gibbs sampler. The results in this manuscript are extremely general, and can be adapted to many different settings. We demonstrate the proposed methodology through simulated examples and analyses based on estimates obtained from the US Census Bureau's American Community Survey (ACS). UR - https://arxiv.org/abs/1701.07506 ER - TY - JOUR T1 - Bayesian Spatial Change of Support for Count-Valued Survey Data with Application to the American Community Survey JF - Journal of the American Statistical Association Y1 - 2016 A1 - Bradley, J.R. A1 - Wikle, C.K. A1 - Holan, S.H. AB - We introduce Bayesian spatial change of support methodology for count-valued survey data with known survey variances. Our proposed methodology is motivated by the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau that provides timely information on several key demographic variables. Specifically, the ACS produces 1-year, 3-year, and 5-year "period-estimates," and corresponding margins of errors, for published demographic and socio-economic variables recorded over predefined geographies within the United States. Despite the availability of these predefined geographies it is often of interest to data users to specify customized user-defined spatial supports. 
In particular, it is useful to estimate demographic variables defined on "new" spatial supports in "real-time." This problem is known as spatial change of support (COS), which is typically performed under the assumption that the data follows a Gaussian distribution. However, count-valued survey data is naturally non-Gaussian and, hence, we consider modeling these data using a Poisson distribution. Additionally, survey-data are often accompanied by estimates of error, which we incorporate into our analysis. We interpret Poisson count-valued data in small areas as an aggregation of events from a spatial point process. This approach provides us with the flexibility necessary to allow ACS users to consider a variety of spatial supports in "real-time." We demonstrate the effectiveness of our approach through a simulated example as well as through an analysis using public-use ACS data. UR - https://arxiv.org/abs/1405.7227 ER - TY - ABST T1 - Data management and analytic use of paradata: SIPP-EHC audit trails Y1 - 2016 A1 - Lee, Jinyoung A1 - Seloske, Ben A1 - Córdova Cazar, Ana Lucía A1 - Eck, Adam A1 - Kirchner, Antje A1 - Belli, Robert F. ER - TY - JOUR T1 - Multivariate Spatio-Temporal Survey Fusion with Application to the American Community Survey and Local Area Unemployment Statistics JF - Stat Y1 - 2016 A1 - Bradley, J.R. A1 - Holan, S.H. A1 - Wikle, C.K AB - There are often multiple surveys available that estimate and report related demographic variables of interest that are referenced over space and/or time. Not all surveys produce the same information, and thus, combining these surveys typically leads to higher quality estimates. That is, not every survey has the same level of precision nor do they always provide estimates of the same variables. In addition, various surveys often produce estimates with incomplete spatio-temporal coverage. 
By combining surveys using a Bayesian approach, we can account for different margins of error and leverage dependencies to produce estimates of every variable considered at every spatial location and every time point. Specifically, our strategy is to use a hierarchical modelling approach, where the first stage of the model incorporates the margin of error associated with each survey. Then, in a lower stage of the hierarchical model, the multivariate spatio-temporal mixed effects model is used to incorporate multivariate spatio-temporal dependencies of the processes of interest. We adopt a fully Bayesian approach for combining surveys; that is, given all of the available surveys, the conditional distributions of the latent processes of interest are used for statistical inference. To demonstrate our proposed methodology, we jointly analyze period estimates from the US Census Bureau's American Community Survey, and estimates obtained from the Bureau of Labor Statistics Local Area Unemployment Statistics program. Copyright © 2016 John Wiley & Sons, Ltd. UR - http://onlinelibrary.wiley.com/doi/10.1002/sta4.120/full ER - TY - RPRT T1 - NCRN Meeting Spring 2016: Attitudes Towards Geolocation-Enabled Census Forms Y1 - 2016 A1 - Brandimarte, Laura A1 - Chiew, Ernest A1 - Ventura, Sam A1 - Acquisti, Alessandro AB - Geolocation refers to the automatic identification of the physical locations of Internet users. In an online survey experiment, we studied respondent reactions towards different types of geolocation. After coordinating with US Census Bureau researchers, we designed and administered a replica of a census form to a sample of respondents. We also created slightly different forms by manipulating the type of geolocation implemented. 
Using the IP address of each respondent, we approximated the geographical coordinates of the respondent and displayed this location on a map on the survey. Across different experimental conditions, we manipulated the map interface between the three interfaces on the Google Maps API: default road map, Satellite View, and Street View. We also provided either a specific, pinpointed location, or a set of two circles of 1- and 2-mile radius. Snapshots of responses were captured at every instant information was added, altered, or deleted by respondents when completing the survey. We measured willingness to provide information on the typical Census form, as well as privacy concerns associated with geolocation technologies and attitudes towards the use of online geographical maps to identify one’s exact current location. Presented at the NCRN Meeting Spring 2016 in Washington DC on May 9-10, 2016; see http://www.ncrn.info/event/ncrn-spring-2016-meeting PB - Carnegie-Mellon University UR - http://hdl.handle.net/1813/43889 ER - TY - RPRT T1 - NCRN Meeting Spring 2016: Evaluating Data quality in Time Diary Surveys Using Paradata Y1 - 2016 A1 - Córdova Cazar, Ana Lucía A1 - Belli, Robert AB - Over the past decades, time use researchers have been increasingly interested in analyzing wellbeing in tandem with the use of time (Juster and Stafford, 1985; Krueger et al, 2009). Many methodological issues have arisen in this endeavor, including the concern about the quality of the time use data. Survey researchers have increasingly turned to the analysis of paradata to better understand and model data quality. In particular, it has been argued that paradata may serve as proxy of the respondents’ cognitive response process, and can be used as an additional tool to assess the impact of data generation on data quality. 
In this presentation, data quality in the American Time Use Survey (ATUS) will be assessed through the use of paradata and survey responses. Specifically, I will talk about a data quality index I have created, which includes measures of different types of ATUS errors (e.g. low number of reported activities, failures to report an activity), and paradata variables (e.g. response latencies, incompletes). The overall objective of this study is to contribute to data quality assessment in the collection of timeline data from national surveys by providing insights on those interviewing dynamics that most impact data quality. These insights will help to improve future instruments and training of interviewers, as well as to reduce costs. Presented at the NCRN Meeting Spring 2016 in Washington DC on May 9-10, 2016; see http://www.ncrn.info/event/ncrn-spring-2016-meeting PB - University of Nebraska UR - http://hdl.handle.net/1813/43896 ER - TY - RPRT T1 - NCRN Meeting Spring 2016: The ATUS and SIPP-EHC: Recent Developments Y1 - 2016 A1 - Belli, Robert F. AB - One of the main objectives of the NCRN award to the University of Nebraska node is to investigate data quality associated with timeline interviewing as conducted with the American Time Use Survey (ATUS) time diary and the Survey of Income and Program Participation event history calendar (SIPP-EHC). Specifically, our efforts are focused on the relationships between interviewing dynamics as extracted from analyses of paradata with measures of data quality. With the ATUS, our recent efforts have revealed that respondents differ in how they handle difficulty with remembering activities, with some overcoming these difficulties and others succumbing to them. With the SIPP-EHC, we are still in the initial stages of extracting variables from the paradata that are associated with interviewing dynamics. 
Our work has also involved the development of a CATI time diary in which we are able to analyze audio streams to capture interviewing dynamics. I will conclude this talk by discussing challenges that have yet to be overcome with our work, and our vision of moving forward with the eventual development of self-administered timeline instruments that will be respondent-friendly due to the assistance of intelligent-agent driven virtual interviewers. Presented at the NCRN Meeting Spring 2016 in Washington DC on May 9-10, 2016; see http://www.ncrn.info/event/ncrn-spring-2016-meeting PB - University of Nebraska UR - http://hdl.handle.net/1813/43893 ER - TY - JOUR T1 - Parallel associations and the structure of autobiographical knowledge JF - Journal of Applied Research in Memory and Cognition Y1 - 2016 A1 - Belli, R.F. A1 - T. Al Baghal KW - Autobiographical memory; Autobiographical knowledge; Autobiographical periods; Episodic memory; Retrospective reports AB - The self-memory system (SMS) model of autobiographical knowledge conceives that memories are structured thematically, organized both hierarchically and temporally. This model has been challenged on several fronts, including the absence of parallel linkages across pathways. Calendar survey interviewing shows the frequent and varied use of parallel associations in autobiographical recall. Parallel associations in these data are commonplace, and are driven more by respondents’ generative retrieval than by interviewers’ probing. Parallel associations represent a number of autobiographical knowledge themes that are interrelated across life domains. The content of parallel associations is nearly evenly split between general and transitional events, supporting the importance of transitions in autobiographical memory. Associations in respondents’ memories (both parallel and sequential), demonstrate complex interactions with interviewer verbal behaviors during generative retrieval. 
In addition to discussing the implications of these results to the SMS model, implications are also drawn for transition theory and the basic-systems model. VL - 5 IS - 2 ER - TY - JOUR T1 - Using Data Mining to Predict the Occurrence of Respondent Retrieval Strategies in Calendar Interviewing: The Quality of Retrospective Reports JF - Journal of Official Statistics Y1 - 2016 A1 - Belli, Robert F. A1 - Miller, L. Dee A1 - Baghal, Tarek Al A1 - Soh, Leen-Kiat AB - Determining which verbal behaviors of interviewers and respondents are dependent on one another is a complex problem that can be facilitated via data-mining approaches. Data are derived from the interviews of 153 respondents of the Panel Study of Income Dynamics (PSID) who were interviewed about their life-course histories. Behavioral sequences of interviewer-respondent interactions that were most predictive of respondents spontaneously using parallel, timing, duration, and sequential retrieval strategies in their generation of answers were examined. We also examined which behavioral sequences were predictive of retrospective reporting data quality as shown by correspondence between calendar responses with responses collected in prior waves of the PSID. The verbal behaviors of immediately preceding interviewer and respondent turns of speech were assessed in terms of their co-occurrence with each respondent retrieval strategy. Interviewers’ use of parallel probes is associated with poorer data quality, whereas interviewers’ use of timing and duration probes, especially in tandem, is associated with better data quality. Respondents’ use of timing and duration strategies is also associated with better data quality and both strategies are facilitated by interviewer timing probes. Data mining alongside regression techniques is valuable to examine which interviewer-respondent interactions will benefit data quality. 
VL - 32 IS - 3 ER - TY - JOUR T1 - Bayesian Spatial Change of Support for Count-Valued Survey Data with Application to the American Community Survey JF - Journal of the American Statistical Association Y1 - 2015 A1 - Bradley, Jonathan A1 - Wikle, C.K. A1 - Holan, S. H. AB - We introduce Bayesian spatial change of support methodology for count-valued survey data with known survey variances. Our proposed methodology is motivated by the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau that provides timely information on several key demographic variables. Specifically, the ACS produces 1-year, 3-year, and 5-year “period-estimates,” and corresponding margins of errors, for published demographic and socio-economic variables recorded over predefined geographies within the United States. Despite the availability of these predefined geographies it is often of interest to data-users to specify customized user-defined spatial supports. In particular, it is useful to estimate demographic variables defined on “new” spatial supports in “real-time.” This problem is known as spatial change of support (COS), which is typically performed under the assumption that the data follows a Gaussian distribution. However, count-valued survey data is naturally non-Gaussian and, hence, we consider modeling these data using a Poisson distribution. Additionally, survey-data are often accompanied by estimates of error, which we incorporate into our analysis. We interpret Poisson count-valued data in small areas as an aggregation of events from a spatial point process. This approach provides us with the flexibility necessary to allow ACS users to consider a variety of spatial supports in “real-time.” We show the effectiveness of our approach through a simulated example as well as through an analysis using public-use ACS data. 
UR - http://www.tandfonline.com/doi/abs/10.1080/01621459.2015.1117471 ER - TY - JOUR T1 - Bayesian Spatial Change of Support for Count-Valued Survey Data with Application to the American Community Survey JF - Journal of the American Statistical Association Y1 - 2015 A1 - Bradley, Jonathan R. A1 - Wikle, Christopher K. A1 - Holan, Scott H. KW - Aggregation KW - American Community Survey KW - Bayesian hierarchical model KW - Givens angle prior KW - Markov chain Monte Carlo KW - Multiscale model KW - Non-Gaussian. AB - We introduce Bayesian spatial change of support methodology for count-valued survey data with known survey variances. Our proposed methodology is motivated by the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau that provides timely information on several key demographic variables. Specifically, the ACS produces 1-year, 3-year, and 5-year “period-estimates,” and corresponding margins of errors, for published demographic and socio-economic variables recorded over predefined geographies within the United States. Despite the availability of these predefined geographies it is often of interest to data-users to specify customized user-defined spatial supports. In particular, it is useful to estimate demographic variables defined on “new” spatial supports in “real-time.” This problem is known as spatial change of support (COS), which is typically performed under the assumption that the data follows a Gaussian distribution. However, count-valued survey data is naturally non-Gaussian and, hence, we consider modeling these data using a Poisson distribution. Additionally, survey-data are often accompanied by estimates of error, which we incorporate into our analysis. We interpret Poisson count-valued data in small areas as an aggregation of events from a spatial point process. 
This approach provides us with the flexibility necessary to allow ACS users to consider a variety of spatial supports in “real-time.” We show the effectiveness of our approach through a simulated example as well as through an analysis using public-use ACS data. UR - http://www.tandfonline.com/doi/abs/10.1080/01621459.2015.1117471 ER - TY - JOUR T1 - Bayesian Spatial Change of Support for Count–Valued Survey Data JF - ArXiv Y1 - 2015 A1 - Bradley, J. R. A1 - Wikle, C.K. A1 - Holan, S. H. AB - We introduce Bayesian spatial change of support methodology for count-valued survey data with known survey variances. Our proposed methodology is motivated by the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau that provides timely information on several key demographic variables. Specifically, the ACS produces 1-year, 3-year, and 5-year "period-estimates," and corresponding margins of errors, for published demographic and socio-economic variables recorded over predefined geographies within the United States. Despite the availability of these predefined geographies it is often of interest to data users to specify customized user-defined spatial supports. In particular, it is useful to estimate demographic variables defined on "new" spatial supports in "real-time." This problem is known as spatial change of support (COS), which is typically performed under the assumption that the data follows a Gaussian distribution. However, count-valued survey data is naturally non-Gaussian and, hence, we consider modeling these data using a Poisson distribution. Additionally, survey-data are often accompanied by estimates of error, which we incorporate into our analysis. We interpret Poisson count-valued data in small areas as an aggregation of events from a spatial point process. This approach provides us with the flexibility necessary to allow ACS users to consider a variety of spatial supports in "real-time." 
We demonstrate the effectiveness of our approach through a simulated example as well as through an analysis using public-use ACS data. UR - http://arxiv.org/abs/1405.7227 IS - 1405.7227 ER - TY - JOUR T1 - Capturing multivariate spatial dependence: Model, estimate, and then predict JF - Statistical Science Y1 - 2015 A1 - Cressie, N. A1 - Burden, S. A1 - Davis, W. A1 - Krivitsky, P. A1 - Mokhtarian, P. A1 - Seusse, T. A1 - Zammit-Mangion, A. VL - 30 UR - http://projecteuclid.org/euclid.ss/1433341474 IS - 2 ER - TY - JOUR T1 - Change in Visible Impervious Surface Area in Southeastern Michigan Before and After the “Great Recession:” Spatial Differentiation in Remotely Sensed Land-Cover Dynamics JF - Population and Environment Y1 - 2015 A1 - Wilson, C. R. A1 - Brown, D. G. VL - 36 UR - http://link.springer.com/article/10.1007%2Fs11111-014-0219-y IS - 3 ER - TY - CONF T1 - Changing ‘Who’ or ‘Where’: Implications for Data Quality in the American Time Use Survey T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Deal, C.E. A1 - Kirchner, A. A1 - Cordova-Cazar, A.L. A1 - Ellyne, L. A1 - Belli, R.F. JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - JOUR T1 - Comparing and selecting spatial predictors using local criteria JF - Test Y1 - 2015 A1 - Bradley, J.R. A1 - Cressie, N. A1 - Shi, T. VL - 24 UR - http://dx.doi.org/10.1007/s11749-014-0415-1 IS - 1 ER - TY - CONF T1 - Determining Potential for Breakoff in Time Diary Survey Using Paradata T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Wettlaufer, D. A1 - Arunachalam, H. A1 - Atkin, G. A1 - Eck, A. A1 - Soh, L.-K. A1 - Belli, R.F. 
JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CHAP T1 - Evaluation of diagnostics for hierarchical spatial statistical models T2 - Geometry Driven Statistics Y1 - 2015 A1 - Cressie, N. A1 - Burden, S. ED - I.L. Dryden ED - J.T. Kent JF - Geometry Driven Statistics PB - Wiley CY - Chichester SN - 978-1118866573 UR - http://niasra.uow.edu.au/content/groups/public/@web/@inf/@math/documents/doc/uow169240.pdf ER - TY - JOUR T1 - Figures of merit for simultaneous inference and comparisons in simulation experiments JF - Stat Y1 - 2015 A1 - Cressie, N. A1 - Burden, S. VL - 4 UR - http://onlinelibrary.wiley.com/doi/10.1002/sta4.88/epdf IS - 1 ER - TY - CONF T1 - I Know What You Did Next: Predicting Respondent’s Next Activity Using Machine Learning T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Arunachalam, H. A1 - Atkin, G. A1 - Eck, A. A1 - Wettlaufer, D. A1 - Soh, L.-K. A1 - Belli, R.F. JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - JOUR T1 - Multiple imputation for harmonizing longitudinal non-commensurate measures in individual participant data meta-analysis JF - Statistics in Medicine Y1 - 2015 A1 - Siddique, J. A1 - Reiter, J. P. A1 - Brincks, A. A1 - Gibbons, R. A1 - Crespi, C. A1 - Brown, C. H. UR - http://onlinelibrary.wiley.com/doi/10.1002/sim.6562/abstract ER - TY - ICOMM T1 - Multiscale Analysis of Survey Data: Recent Developments and Exciting Prospects Y1 - 2015 A1 - Bradley, J.R. A1 - Wikle, C.K. A1 - Holan, S.H. 
JF - Statistics Views ER - TY - JOUR T1 - Multivariate Spatio-Temporal Models for High-Dimensional Areal Data with Application to Longitudinal Employer-Household Dynamics JF - ArXiv Y1 - 2015 A1 - Bradley, J. R. A1 - Holan, S. H. A1 - Wikle, C.K. AB - Many data sources report related variables of interest that are also referenced over geographic regions and time; however, there are relatively few general statistical methods that one can readily use that incorporate these multivariate spatio-temporal dependencies. Additionally, many multivariate spatio-temporal areal datasets are extremely high-dimensional, which leads to practical issues when formulating statistical models. For example, we analyze Quarterly Workforce Indicators (QWI) published by the US Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. QWIs are available by different variables, regions, and time points, resulting in millions of tabulations. Despite their already expansive coverage, by adopting a fully Bayesian framework, the scope of the QWIs can be extended to provide estimates of missing values along with associated measures of uncertainty. Motivated by the LEHD, and other applications in federal statistics, we introduce the multivariate spatio-temporal mixed effects model (MSTM), which can be used to efficiently model high-dimensional multivariate spatio-temporal areal datasets. The proposed MSTM extends the notion of Moran's I basis functions to the multivariate spatio-temporal setting. This extension leads to several methodological contributions including extremely effective dimension reduction, a dynamic linear model for multivariate spatio-temporal areal processes, and the reduction of a high-dimensional parameter space using a novel parameter model. 
UR - http://arxiv.org/abs/1503.00982 IS - 1503.00982 ER - TY - JOUR T1 - Multivariate Spatio-Temporal Models for High-Dimensional Areal Data with Application to Longitudinal Employer-Household Dynamics JF - Annals of Applied Statistics Y1 - 2015 A1 - Bradley, J.R. A1 - Holan, S.H. A1 - Wikle, C.K. AB - Many data sources report related variables of interest that are also referenced over geographic regions and time; however, there are relatively few general statistical methods that one can readily use that incorporate these multivariate spatio-temporal dependencies. Additionally, many multivariate spatio-temporal areal datasets are extremely high-dimensional, which leads to practical issues when formulating statistical models. For example, we analyze Quarterly Workforce Indicators (QWI) published by the US Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) program. QWIs are available by different variables, regions, and time points, resulting in millions of tabulations. Despite their already expansive coverage, by adopting a fully Bayesian framework, the scope of the QWIs can be extended to provide estimates of missing values along with associated measures of uncertainty. Motivated by the LEHD, and other applications in federal statistics, we introduce the multivariate spatio-temporal mixed effects model (MSTM), which can be used to efficiently model high-dimensional multivariate spatio-temporal areal datasets. The proposed MSTM extends the notion of Moran’s I basis functions to the multivariate spatio-temporal setting. This extension leads to several methodological contributions including extremely effective dimension reduction, a dynamic linear model for multivariate spatio-temporal areal processes, and the reduction of a high-dimensional parameter space using a novel parameter model. VL - 9 IS - 4 ER - TY - RPRT T1 - NCRN Meeting Spring 2015: Models for Multiscale Spatially-Referenced Count Data Y1 - 2015 A1 - Holan, Scott A1 - Bradley, Jonathan R. 
A1 - Wikle, Christopher K. AB - NCRN Meeting Spring 2015: Models for Multiscale Spatially-Referenced Count Data Holan, Scott; Bradley, Jonathan R.; Wikle, Christopher K. Presentation at the NCRN Meeting Spring 2015 PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/40176 ER - TY - RPRT T1 - NCRN Meeting Spring 2015: Regionalization of Multiscale Spatial Processes Using a Criterion for Spatial Aggregation Error Y1 - 2015 A1 - Wikle, Christopher K. A1 - Bradley, Jonathan A1 - Holan, Scott AB - NCRN Meeting Spring 2015: Regionalization of Multiscale Spatial Processes Using a Criterion for Spatial Aggregation Error Wikle, Christopher K.; Bradley, Jonathan; Holan, Scott Develop and implement a statistical criterion to diagnose spatial aggregation error that can facilitate the choice of regionalizations of spatial data. Presentation at NCRN Meeting Spring 2015 PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/40177 ER - TY - JOUR T1 - Perceptions, behaviors and satisfaction related to public safety for persons with disabilities in the United States JF - Criminal Justice Review Y1 - 2015 A1 - Brucker, D. VL - 1 IS - 18 ER - TY - RPRT T1 - Presentation: NADDI 2015: Crowdsourcing DDI Development: New Features from the CED2AR Project Y1 - 2015 A1 - Perry, Benjamin A1 - Kambhampaty, Venkata A1 - Brumsted, Kyle A1 - Vilhuber, Lars A1 - Block, William AB - Presentation: NADDI 2015: Crowdsourcing DDI Development: New Features from the CED2AR Project Perry, Benjamin; Kambhampaty, Venkata; Brumsted, Kyle; Vilhuber, Lars; Block, William Recent years have shown the power of user-sourced information evidenced by the success of Wikipedia and its many emulators. This sort of unstructured discussion is currently not feasible as a part of the otherwise successful metadata repositories. Creating and augmenting metadata is a labor-intensive endeavor. Harnessing collective knowledge from actual data users can supplement officially generated metadata. 
As part of our Comprehensive Extensible Data Documentation and Access Repository (CED2AR) infrastructure, we demonstrate a prototype of crowdsourced DDI, using DDI-C and supplemental XML. The system allows for any number of network connected instances (web or desktop deployments) of the CED2AR DDI editor to concurrently create and modify metadata. The backend transparently handles changes, and frontend has the ability to separate official edits (by designated curators of the data and the metadata) from crowd-sourced content. We briefly discuss offline edit contributions as well. CED2AR uses DDI-C and supplemental XML together with Git for a very portable and lightweight implementation. This distributed network implementation allows for large scale metadata curation without the need for a hardware intensive computing environment, and can leverage existing cloud services, such as Github or Bitbucket. Ben Perry (Cornell/NCRN) presents joint work with Venkata Kambhampaty, Kyle Brumsted, Lars Vilhuber, & William C. Block at NADDI 2015. PB - Cornell University UR - http://hdl.handle.net/1813/40172 ER - TY - JOUR T1 - Privacy and human behavior in the age of information JF - Science Y1 - 2015 A1 - Alessandro Acquisti A1 - Laura Brandimarte A1 - George Loewenstein KW - confidentiality KW - privacy AB - This Review summarizes and draws connections between diverse streams of empirical research on privacy behavior. We use three themes to connect insights from social and behavioral sciences: people’s uncertainty about the consequences of privacy-related behaviors and their own preferences over those consequences; the context-dependence of people’s concern, or lack thereof, about privacy; and the degree to which privacy concerns are malleable—manipulable by commercial and governmental interests. Organizing our discussion by these themes, we offer observations concerning the role of public policy in the protection of privacy in the information age. 
VL - 347 UR - http://www.sciencemag.org/content/347/6221/509 IS - 6221 ER - TY - JOUR T1 - Regionalization of Multiscale Spatial Processes using a Criterion for Spatial Aggregation Error JF - ArXiv Y1 - 2015 A1 - Bradley, J. R. A1 - Wikle, C.K. A1 - Holan, S. H. AB - The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can guide a regionalization over a spatial domain of interest. By "regionalization" we mean a specification of geographies that define the spatial support for areal data. This topic has been studied vigorously by geographers, but has been given less attention by spatial statisticians. Thus, we propose a criterion for spatial aggregation error (CAGE), which we minimize to obtain an optimal regionalization. To define CAGE we draw a connection between spatial aggregation error and a new multiscale representation of the Karhunen-Loeve (K-L) expansion. This relationship between CAGE and the multiscale K-L expansion leads to illuminating theoretical developments including: connections between spatial aggregation error, squared prediction error, spatial variance, and a novel extension of Obled-Creutin eigenfunctions. The effectiveness of our approach is demonstrated through an analysis of two datasets, one using the American Community Survey and one related to environmental ocean winds. UR - http://arxiv.org/abs/1502.01974 IS - 1502.01974 ER - TY - JOUR T1 - Rejoinder on: Comparing and selecting spatial predictors using local criteria JF - Test Y1 - 2015 A1 - Bradley, J.R. A1 - Cressie, N. A1 - Shi, T. VL - 24 UR - http://dx.doi.org/10.1007/s11749-014-0414-2 IS - 1 ER - TY - JOUR T1 - The SAR model for very large datasets: A reduced-rank approach JF - Econometrics Y1 - 2015 A1 - Burden, S. A1 - Cressie, N. A1 - Steel, D.G. 
VL - 3 UR - http://www.mdpi.com/2225-1146/3/2/317 IS - 2 ER - TY - JOUR T1 - Spatio-temporal change of support with application to American Community Survey multi-year period estimates JF - Stat Y1 - 2015 A1 - Bradley, Jonathan R. A1 - Wikle, Christopher K. A1 - Holan, Scott H. KW - Bayesian KW - change-of-support KW - dynamical KW - hierarchical models KW - mixed-effects model KW - Moran's I KW - multi-year period estimate AB - We present hierarchical Bayesian methodology to perform spatio-temporal change of support (COS) for survey data with Gaussian sampling errors. This methodology is motivated by the American Community Survey (ACS), which is an ongoing survey administered by the US Census Bureau that provides timely information on several key demographic variables. The ACS has published 1-year, 3-year, and 5-year period estimates, and margins of errors, for demographic and socio-economic variables recorded over predefined geographies. The spatio-temporal COS methodology considered here provides data users with a way to estimate ACS variables on customized geographies and time periods while accounting for sampling errors. Additionally, 3-year ACS period estimates are to be discontinued, and this methodology can provide predictions of ACS variables for 3-year periods given the available period estimates. The methodology is based on a spatio-temporal mixed-effects model with a low-dimensional spatio-temporal basis function representation, which provides multi-resolution estimates through basis function aggregation in space and time. This methodology includes a novel parameterization that uses a target dynamical process and recently proposed parsimonious Moran's I propagator structures. Our approach is demonstrated through two applications using public-use ACS estimates and is shown to produce good predictions on a hold-out set of 3-year period estimates. Copyright © 2015 John Wiley & Sons, Ltd. 
VL - 4 UR - http://dx.doi.org/10.1002/sta4.94 ER - TY - JOUR T1 - Understanding the Human Condition through Survey Informatics JF - IEEE Computer Y1 - 2015 A1 - Eck, A. A1 - Soh, L.-K. A1 - McCutcheon, A. L. A1 - Smyth, J.D. A1 - Belli, R.F. VL - 48 IS - 11 ER - TY - CONF T1 - The Use of Paradata to Evaluate Interview Complexity and Data Quality (in Calendar and Time Diary Surveys) T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Cordova-Cazar, A.L. A1 - Belli, R.F. JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - Using Data Mining to Examine Interviewer-Respondent Interactions in Calendar Interviews T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Belli, R.F. A1 - Miller, L.D. A1 - Soh, L.-K. A1 - T. Al Baghal JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - Using Machine Learning Techniques to Predict Respondent Type from A Priori Demographic Information T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) Y1 - 2015 A1 - Atkin, G. A1 - Arunachalam, H. A1 - Eck, A. A1 - Wettlaufer, D. A1 - Soh, L.-K. A1 - Belli, R.F. JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR) CY - Hollywood, Florida UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CHAP T1 - Autobiographical memory dynamics in survey research T2 - SAGE Handbook of Applied Memory Y1 - 2014 A1 - Belli, R. F. ED - T. J. Perfect ED - D. S. 
Lindsay JF - SAGE Handbook of Applied Memory PB - Sage UR - http://dx.doi.org/10.4135/9781446294703 ER - TY - CONF T1 - Call back later: The association of recruitment contact and error in the American Time Use Survey T2 - American Association for Public Opinion Research 2014 Annual Conference Y1 - 2014 A1 - Countryman, A. A1 - Cordova-Cazar, A.L. A1 - Deal, C.E. A1 - Belli, R.F. JF - American Association for Public Opinion Research 2014 Annual Conference CY - Anaheim, CA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - RPRT T1 - CED 2 AR: The Comprehensive Extensible Data Documentation and Access Repository Y1 - 2014 A1 - Lagoze, Carl A1 - Vilhuber, Lars A1 - Williams, Jeremy A1 - Perry, Benjamin A1 - Block, William C. AB - CED 2 AR: The Comprehensive Extensible Data Documentation and Access Repository Lagoze, Carl; Vilhuber, Lars; Williams, Jeremy; Perry, Benjamin; Block, William C. We describe the design, implementation, and deployment of the Comprehensive Extensible Data Documentation and Access Repository (CED 2 AR). This is a metadata repository system that allows researchers to search, browse, access, and cite confidential data and metadata through either a web-based user interface or programmatically through a search API, all the while reusing and linking to existing archive and provider generated metadata. CED 2 AR is distinguished from other metadata repository-based applications due to requirements that derive from its social science context. 
These include the need to cloak confidential data and metadata and manage complex provenance chains. Presented at 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL), Sept 8-12, 2014 PB - Cornell University UR - http://hdl.handle.net/1813/44702 ER - TY - RPRT T1 - Collaborative Editing of DDI Metadata: The Latest from the CED2AR Project Y1 - 2014 A1 - Perry, Benjamin A1 - Kambhampaty, Venkata A1 - Brumsted, Kyle A1 - Vilhuber, Lars A1 - Block, William AB - Collaborative Editing of DDI Metadata: The Latest from the CED2AR Project Perry, Benjamin; Kambhampaty, Venkata; Brumsted, Kyle; Vilhuber, Lars; Block, William Benjamin Perry's presentation on "Collaborative Editing and Versioning of DDI Metadata: The Latest from Cornell's NCRN CED²AR Software" at the 6th Annual European DDI User Conference in London, 12/02/2014. PB - Cornell University UR - http://hdl.handle.net/1813/38200 ER - TY - JOUR T1 - A Comparison of Spatial Predictors when Datasets Could be Very Large JF - ArXiv Y1 - 2014 A1 - Bradley, J. R. A1 - Cressie, N. A1 - Shi, T. KW - Statistics - Methodology AB -

In this article, we review and compare a number of methods of spatial prediction. To demonstrate the breadth of available choices, we consider both traditional and more-recently-introduced spatial predictors. Specifically, in our exposition we review: traditional stationary kriging, smoothing splines, negative-exponential distance-weighting, Fixed Rank Kriging, modified predictive processes, a stochastic partial differential equation approach, and lattice kriging. This comparison is meant to provide a service to practitioners wishing to decide between spatial predictors. Hence, we provide technical material for the unfamiliar, which includes the definition and motivation for each (deterministic and stochastic) spatial predictor. We use a benchmark dataset of CO2 data from NASA's AIRS instrument to address computational efficiencies that include CPU time and memory usage. Furthermore, the predictive performance of each spatial predictor is assessed empirically using a hold-out subset of the AIRS data.

UR - http://arxiv.org/abs/1410.7748 IS - 1410.7748 ER - TY - JOUR T1 - Dasymetric Modeling and Uncertainty JF - The Annals of the Association of American Geographers Y1 - 2014 A1 - Nagle, N. A1 - Buttenfield, B. A1 - Leyk, S. A1 - Spielman, S. E. VL - 104 UR - http://www.tandfonline.com/doi/abs/10.1080/00045608.2013.843439 ER - TY - CONF T1 - Designing an Intelligent Time Diary Instrument: Visualization, Dynamic Feedback, and Error Prevention and Mitigation T2 - UNL/SRAM/Gallup Symposium Y1 - 2014 A1 - Atkin, G. A1 - Arunachalam, H. A1 - Eck, A. A1 - Soh, L.-K. A1 - Belli, R.F. JF - UNL/SRAM/Gallup Symposium CY - Omaha, NE UR - http://grc.unl.edu/unlsramgallup-symposium ER - TY - CONF T1 - Designing an Intelligent Time Diary Instrument: Visualization, Dynamic Feedback, and Error Prevention and Mitigation T2 - American Association for Public Opinion Research 2014 Annual Conference Y1 - 2014 A1 - Atkin, G. A1 - Arunachalam, H. A1 - Eck, A. A1 - Soh, L.-K. A1 - Belli, R. JF - American Association for Public Opinion Research 2014 Annual Conference CY - Anaheim, CA. UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - ICOMM T1 - How to Make a Better Map—Using Neuroscience Y1 - 2014 A1 - Laura Bliss KW - Nicholas Nagle KW - Seth Spielman AB -

The work of Seth Spielman and Nicholas Nagle was noted in this article in City Lab, a publication from The Atlantic magazine, available at http://www.citylab.com/design/2014/11/how-to-make-a-better-map-according-to-science/382898/.

PB - Citylab UR - http://www.citylab.com/design/2014/11/how-to-make-a-better-map-according-to-science/382898/ ER - TY - CONF T1 - Interviewer variance and prevalence of verbal behaviors in calendar and conventional interviewing T2 - American Association for Public Opinion Research 2014 Annual Conference Y1 - 2014 A1 - Belli, R.F. A1 - Charoenruk, N. JF - American Association for Public Opinion Research 2014 Annual Conference CY - Anaheim, CA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - Interviewer variance of interviewer and respondent behaviors: A comparison between calendar and conventional interviewing T2 - XVIII International Sociological Association World Congress of Sociology Y1 - 2014 A1 - Belli, R.F. A1 - Charoenruk, N. JF - XVIII International Sociological Association World Congress of Sociology CY - Yokohama, Japan UR - https://isaconf.confex.com/isaconf/wc2014/webprogram/Paper34278.html ER - TY - CONF T1 - Making sense of paradata: Challenges faced and lessons learned T2 - American Association for Public Opinion Research 2014 Annual Conference Y1 - 2014 A1 - Eck, A. A1 - Stuart, L. A1 - Atkin, G. A1 - Soh, L-K A1 - McCutcheon, A.L. A1 - Belli, R.F. JF - American Association for Public Opinion Research 2014 Annual Conference CY - Anaheim, CA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - Making Sense of Paradata: Challenges Faced and Lessons Learned T2 - UNL/SRAM/Gallup Symposium Y1 - 2014 A1 - Eck, A. A1 - Stuart, L. A1 - Atkin, G. A1 - Soh, L-K A1 - McCutcheon, A.L. A1 - Belli, R.F. JF - UNL/SRAM/Gallup Symposium CY - Omaha, NE UR - http://grc.unl.edu/unlsramgallup-symposium ER - TY - JOUR T1 - Multiple imputation by ordered monotone blocks with application to the Anthrax Vaccine Adsorbed Trial JF - Journal of Computational and Graphical Statistics Y1 - 2014 A1 - Li, Fan A1 - Baccini, Michela A1 - Mealli, Fabrizia A1 - Zell, Elizabeth R. 
A1 - Frangakis, Constantine E. A1 - Rubin, Donald B VL - 23 UR - http://www.tandfonline.com/doi/abs/10.1080/10618600.2013.826583 ER - TY - RPRT T1 - NCRN Meeting Fall 2014: Change in Visible Impervious Surface Area in Southeastern Michigan Before and After the "Great Recession" Y1 - 2014 A1 - Wilson, Courtney A1 - Brown, Daniel G. AB - NCRN Meeting Fall 2014: Change in Visible Impervious Surface Area in Southeastern Michigan Before and After the "Great Recession" Wilson, Courtney; Brown, Daniel G. Presentation at Fall 2014 NCRN meeting PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/37446 ER - TY - RPRT T1 - NCRN Meeting Fall 2014: Mixed Effects Modeling for Multivariate-Spatio-Temporal Areal Data Y1 - 2014 A1 - Bradley, Jonathan A1 - Holan, Scott A1 - Wikle, Christopher AB - NCRN Meeting Fall 2014: Mixed Effects Modeling for Multivariate-Spatio-Temporal Areal Data Bradley, Jonathan; Holan, Scott; Wikle, Christopher Presentation from NCRN Fall 2014 meeting PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/37749 ER - TY - RPRT T1 - NCRN Meeting Spring 2014: Aiming at a More Cost-Effective Census Via Online Data Collection: Privacy Trade-Offs of Geo-Location Y1 - 2014 A1 - Brandimarte, Laura A1 - Acquisti, Alessandro AB - NCRN Meeting Spring 2014: Aiming at a More Cost-Effective Census Via Online Data Collection: Privacy Trade-Offs of Geo-Location Brandimarte, Laura; Acquisti, Alessandro presentation at NCRN Spring 2014 meeting PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/36397 ER - TY - RPRT T1 - NCRN Meeting Spring 2014: Integrating PROV with DDI: Mechanisms of Data Discovery within the U.S. Census Bureau Y1 - 2014 A1 - Block, William A1 - Brown, Warren A1 - Williams, Jeremy A1 - Vilhuber, Lars A1 - Lagoze, Carl AB - NCRN Meeting Spring 2014: Integrating PROV with DDI: Mechanisms of Data Discovery within the U.S. 
Census Bureau Block, William; Brown, Warren; Williams, Jeremy; Vilhuber, Lars; Lagoze, Carl presentation at NCRN Spring 2014 meeting PB - NCRN Coordinating Office UR - http://hdl.handle.net/1813/36392 ER - TY - CONF T1 - The Poisson Change of Support Problem with Applications to the American Community Survey T2 - Joint Statistical Meetings 2014 Y1 - 2014 A1 - Bradley, J.R. JF - Joint Statistical Meetings 2014 ER - TY - CONF T1 - Remembering where: A look at the American Time Use Survey T2 - Paper presented at the annual conference of the Midwest Association for Public Opinion Research Y1 - 2014 A1 - Deal, C. A1 - Cordova-Cazar, A.L. A1 - Countryman, A. A1 - Kirchner, A. A1 - Belli, R.F. JF - Paper presented at the annual conference of the Midwest Association for Public Opinion Research CY - Chicago, IL UR - http://www.mapor.org/conferences.html ER - TY - ABST T1 - SIPP: From Conventional Questionnaire to Event History Calendar Interviewing Y1 - 2014 A1 - Belli, R.F. N1 - Workshop on “Conducting Research using the Survey of Income and Program Participation (SIPP).” Presented at Duke University, Social Science Research Institute, Durham, NC ER - TY - CONF T1 - Spiny CACTOS: OSN Users Attitudes and Perceptions Towards Cryptographic Access Control Tools T2 - Proceedings of the Workshop on Usable Security (USEC) Y1 - 2014 A1 - Balsa, E. A1 - Brandimarte, L. A1 - Acquisti, A. A1 - Diaz, C. A1 - Gürses, S. JF - Proceedings of the Workshop on Usable Security (USEC) UR - https://www.internetsociety.org/doc/spiny-cactos-osn-users-attitudes-and-perceptions-towards-cryptographic-access-control-tools ER - TY - CONF T1 - Survey Fusion for Data that Exhibit Multivariate, Spatio-Temporal Dependencies T2 - Joint Statistical Meetings 2014 Y1 - 2014 A1 - Bradley, J.R. 
JF - Joint Statistical Meetings 2014 ER - TY - THES T1 - Towards an Understanding of Dynamics Between Race, Population Movement, and the Built Environment of American Cities (undergraduate honors thesis) Y1 - 2014 A1 - Bellman, B. PB - University of Colorado at Boulder ER - TY - CHAP T1 - The Untold Story of Multi-Mode (Online and Mail) Consumer Panels: From Optimal Recruitment to Retention and Attrition T2 - Online Panel Surveys: An Interdisciplinary Approach Y1 - 2014 A1 - McCutcheon, Allan L. A1 - Rao, K. A1 - Kaminska, O. ED - Callegaro, M. ED - Baker, R. ED - Bethlehem, J. ED - Göritz, A. ED - Krosnick, J. ED - Lavrakas, P. JF - Online Panel Surveys: An Interdisciplinary Approach PB - Wiley ER - TY - CONF T1 - The use of paradata (in time use surveys) to better evaluate data quality T2 - American Association for Public Opinion Research 2014 Annual Conference Y1 - 2014 A1 - Cordova-Cazar, A.L. A1 - Belli, R.F. JF - American Association for Public Opinion Research 2014 Annual Conference CY - Anaheim, CA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - JOUR T1 - What are You Doing Now? Activity Level Responses and Errors in the American Time Use Survey JF - Journal of Survey Statistics and Methodology Y1 - 2014 A1 - T. Al Baghal A1 - Belli, R.F. A1 - Phillips, A.L. A1 - Ruther, N. VL - 2 IS - 4 ER - TY - CONF T1 - Would a Privacy Fundamentalist Sell their DNA for $1000... if Nothing Bad Happened Thereafter? A Study of the Western Categories, Behavior Intentions, and Consequences T2 - Proceedings of the Tenth Symposium on Usable Privacy and Security (SOUPS) Y1 - 2014 A1 - Woodruff, A. A1 - Pihur, V. A1 - Acquisti, A. A1 - Consolvo, S. A1 - Schmidt, L. A1 - Brandimarte, L. 
JF - Proceedings of the Tenth Symposium on Usable Privacy and Security (SOUPS) PB - ACM CY - New York, NY UR - https://www.usenix.org/conference/soups2014/proceedings/presentation/woodruff N1 - IAPP SOUPS Privacy Award Winner ER - TY - CONF T1 - Bayesian learning of joint distributions of objects T2 - Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) 2013 Y1 - 2013 A1 - Banerjee, A. A1 - Murray, J. A1 - Dunson, D. B. AB -

There is increasing interest in broad application areas in defining flexible joint models for data having a variety of measurement scales, while also allowing data of complex types, such as functions, images and documents. We consider a general framework for nonparametric Bayes joint modeling through mixture models that incorporate dependence across data types through a joint mixing measure. The mixing measure is assigned a novel infinite tensor factorization (ITF) prior that allows flexible dependence in cluster allocation across data types. The ITF prior is formulated as a tensor product of stick-breaking processes. Focusing on a convenient special case corresponding to a Parafac factorization, we provide basic theory justifying the flexibility of the proposed prior and resulting asymptotic properties. Focusing on ITF mixtures of product kernels, we develop a new Gibbs sampling algorithm for routine implementation relying on slice sampling. The methods are compared with alternative joint mixture models based on Dirichlet processes and related approaches through simulations and real data applications.

Also at http://arxiv.org/abs/1303.0449

JF - Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) 2013 UR - http://jmlr.csail.mit.edu/proceedings/papers/v31/banerjee13a.html ER - TY - JOUR T1 - Data Management of Confidential Data JF - International Journal of Digital Curation Y1 - 2013 A1 - Carl Lagoze A1 - William C. Block A1 - Jeremy Williams A1 - John M. Abowd A1 - Lars Vilhuber AB - Social science researchers increasingly make use of data that is confidential because it contains linkages to the identities of people, corporations, etc. The value of this data lies in the ability to join the identifiable entities with external data such as genome data, geospatial information, and the like. However, the confidentiality of this data is a barrier to its utility and curation, making it difficult to fulfill US federal data management mandates and interfering with basic scholarly practices such as validation and reuse of existing results. We describe the complexity of the relationships among data that span a public and private divide. We then describe our work on the CED2AR prototype, a first step in providing researchers with a tool that spans this divide and makes it possible for them to search, access, and cite that data. VL - 8 N1 - Presented at 8th International Digital Curation Conference 2013, Amsterdam. See also http://hdl.handle.net/1813/30924 ER - TY - RPRT T1 - Encoding Provenance of Social Science Data: Integrating PROV with DDI Y1 - 2013 A1 - Lagoze, Carl A1 - Block, William C A1 - Williams, Jeremy A1 - Abowd, John A1 - Vilhuber, Lars AB - Encoding Provenance of Social Science Data: Integrating PROV with DDI Lagoze, Carl; Block, William C; Williams, Jeremy; Abowd, John; Vilhuber, Lars Provenance is a key component of evaluating the integrity and reusability of data for scholarship. 
While recording and providing access to provenance has always been important, it is even more critical in the web environment in which data from distributed sources and of varying integrity can be combined and derived. The PROV model, developed under the auspices of the W3C, is a foundation for semantically-rich, interoperable, and web-compatible provenance metadata. We report on the results of our experimentation with integrating the PROV model into the DDI metadata for a complex, but characteristic, example social science data. We also present some preliminary thinking on how to visualize those graphs in the user interface. Submitted to EDDI13 5th Annual European DDI User Conference December 2013, Paris, France PB - Cornell University UR - http://hdl.handle.net/1813/34443 ER - TY - CONF T1 - Encoding Provenance of Social Science Data: Integrating PROV with DDI T2 - 5th Annual European DDI User Conference Y1 - 2013 A1 - Carl Lagoze A1 - William C. Block A1 - Jeremy Williams A1 - Lars Vilhuber KW - DDI KW - eSocial Science KW - Metadata KW - Provenance AB - Provenance is a key component of evaluating the integrity and reusability of data for scholarship. While recording and providing access to provenance has always been important, it is even more critical in the web environment in which data from distributed sources and of varying integrity can be combined and derived. The PROV model, developed under the auspices of the W3C, is a foundation for semantically-rich, interoperable, and web-compatible provenance metadata. We report on the results of our experimentation with integrating the PROV model into the DDI metadata for a complex, but characteristic, example social science data. We also present some preliminary thinking on how to visualize those graphs in the user interface. 
JF - 5th Annual European DDI User Conference ER - TY - CONF T1 - Examining the relationship between error and behavior in the American Time Use Survey using audit trail paradata T2 - American Association for Public Opinion Research 2013 Annual Conference Y1 - 2013 A1 - Ruther, N. A1 - T. Al Baghal A1 - A. Eck A1 - L. Stuart A1 - L. Phillips A1 - R. Belli A1 - Soh, L-K JF - American Association for Public Opinion Research 2013 Annual Conference CY - Boston, MA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - JOUR T1 - Gone in 15 Seconds: The Limits of Privacy Transparency and Control JF - IEEE Security & Privacy Y1 - 2013 A1 - Acquisti, A. A1 - Adjerid, I. A1 - Brandimarte, L. VL - 11 ER - TY - RPRT T1 - Improving User Access to Metadata for Public and Restricted Use US Federal Statistical Files Y1 - 2013 A1 - Block, William C. A1 - Williams, Jeremy A1 - Vilhuber, Lars A1 - Lagoze, Carl A1 - Brown, Warren A1 - Abowd, John M. AB - Presentation at NADDI 2013. This record has also been archived at http://kuscholarworks.ku.edu/dspace/handle/1808/11093. PB - Cornell University UR - http://hdl.handle.net/1813/33362 ER - TY - CONF T1 - Is it the Typeset or the Type of Statistics? Disfluent Font and Self-Disclosure T2 - Proceedings of Learning from Authoritative Security Experiment Results (LASER) Y1 - 2013 A1 - Balebako, R. A1 - Pe'er, E. A1 - Brandimarte, L. A1 - Cranor, L. F. A1 - Acquisti, A.
JF - Proceedings of Learning from Authoritative Security Experiment Results (LASER) PB - USENIX Association CY - New York, NY UR - https://www.usenix.org/laser2013/program/balebako ER - TY - RPRT T1 - Managing Confidentiality and Provenance across Mixed Private and Publicly-Accessed Data and Metadata Y1 - 2013 A1 - Vilhuber, Lars A1 - Abowd, John A1 - Block, William A1 - Lagoze, Carl A1 - Williams, Jeremy AB - Social science researchers are increasingly interested in making use of confidential micro-data that contain linkages to the identities of people, corporations, etc. The value of this linking lies in the potential to join these identifiable entities with external data such as genome data, geospatial information, and the like. Leveraging these linkages is an essential aspect of “big data” scholarship. However, the utility of these confidential data for scholarship is compromised by the complex nature of their management and curation. This makes it difficult to fulfill US federal data management mandates and interferes with basic scholarly practices such as validation and reuse of existing results. We describe in this paper our work on the CED2AR prototype, a first step in providing researchers with a tool that spans the confidential/publicly-accessible divide, making it possible for researchers to identify, search, access, and cite those data. The particular points of interest in our work are the cloaking of metadata fields and the expression of provenance chains. For the former, we make use of existing fields in the DDI (Data Documentation Initiative) specification and suggest some minor changes to the specification.
For the latter problem, we investigate the integration of DDI with recent work by the W3C PROV working group that has developed a generalizable and extensible model for expressing data provenance. PB - Cornell University UR - http://hdl.handle.net/1813/34534 ER - TY - JOUR T1 - Memory, communication, and data quality in calendar interviews JF - Public Opinion Quarterly Y1 - 2013 A1 - Belli, R. F. A1 - Bilgen, I. A1 - T. Al Baghal VL - 77 ER - TY - JOUR T1 - Misplaced confidences: Privacy and the control paradox JF - Social Psychological and Personality Science Y1 - 2013 A1 - Laura Brandimarte A1 - Alessandro Acquisti A1 - George Loewenstein VL - 4 ER - TY - CONF T1 - Predicting the occurrence of respondent retrieval strategies in calendar interviewing: The quality of autobiographical recall in surveys T2 - Biennial conference of the Society for Applied Research in Memory and Cognition Y1 - 2013 A1 - Belli, R.F. A1 - Miller, L.D. A1 - Soh, L-K A1 - T. Al Baghal JF - Biennial conference of the Society for Applied Research in Memory and Cognition CY - Rotterdam, Netherlands UR - http://static1.squarespace.com/static/504170d6e4b0b97fe5a59760/t/52457a8be4b0012b7a5f462a/1380285067247/SARMAC_X_PaperJune27.pdf ER - TY - CONF T1 - Predicting the occurrence of respondent retrieval strategies in calendar interviewing: The quality of retrospective reports T2 - American Association for Public Opinion Research 2013 Annual Conference Y1 - 2013 A1 - Belli, R.F. A1 - Miller, L.D. A1 - Soh, L-K A1 - T. Al Baghal JF - American Association for Public Opinion Research 2013 Annual Conference CY - Boston, MA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - The process of turning audit trails from a CATI survey into useful data: Interviewer behavior paradata in the American Time Use Survey T2 - American Association for Public Opinion Research 2013 Annual Conference Y1 - 2013 A1 - Ruther, N. A1 - Phipps, P. A1 - Belli, R.F.
JF - American Association for Public Opinion Research 2013 Annual Conference CY - Boston, MA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - ABST T1 - A Reduced Rank Model for Analyzing Multivariate Spatial Datasets Y1 - 2013 A1 - Bradley, J.R. JF - University of Missouri-Kansas City PB - University of Missouri-Kansas City ER - TY - CONF T1 - Troubles with time-use: Examining potential indicators of error in the American Time Use Survey T2 - American Association for Public Opinion Research 2013 Annual Conference Y1 - 2013 A1 - Phillips, A.L. A1 - T. Al Baghal A1 - Belli, R.F. JF - American Association for Public Opinion Research 2013 Annual Conference CY - Boston, MA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - What are you doing now?: Audit trails, Activity level responses and error in the American Time Use Survey T2 - American Association for Public Opinion Research Y1 - 2013 A1 - T. Al Baghal A1 - Phillips, A.L. A1 - Ruther, N. A1 - Belli, R.F. A1 - Stuart, L. A1 - Eck, A. A1 - Soh, L-K JF - American Association for Public Opinion Research CY - Boston, MA UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx ER - TY - CONF T1 - Calendar interviewing in life course research: Associations between verbal behaviors and data quality T2 - Eighth International Conference on Social Science Methodology Y1 - 2012 A1 - Belli, R.F. A1 - Bilgen, I. A1 - T. Al Baghal JF - Eighth International Conference on Social Science Methodology CY - Sydney Australia UR - https://conference.acspri.org.au/index.php/rc33/2012/paper/view/366 ER - TY - RPRT T1 - Data Management of Confidential Data Y1 - 2012 A1 - Lagoze, Carl A1 - Block, William C. A1 - Williams, Jeremy A1 - Abowd, John M. 
A1 - Vilhuber, Lars AB - Social science researchers increasingly make use of data that is confidential because it contains linkages to the identities of people, corporations, etc. The value of this data lies in the ability to join the identifiable entities with external data such as genome data, geospatial information, and the like. However, the confidentiality of this data is a barrier to its utility and curation, making it difficult to fulfill US federal data management mandates and interfering with basic scholarly practices such as validation and reuse of existing results. We describe the complexity of the relationships among data that span a public and private divide. We then describe our work on the CED2AR prototype, a first step in providing researchers with a tool that spans this divide and makes it possible for them to search, access, and cite that data. PB - Cornell University UR - http://hdl.handle.net/1813/30924 ER - TY - RPRT T1 - An Early Prototype of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR) Y1 - 2012 A1 - Block, William C. A1 - Williams, Jeremy A1 - Abowd, John M. A1 - Vilhuber, Lars A1 - Lagoze, Carl AB - This presentation will demonstrate the latest DDI-related technological developments of Cornell University’s $3 million NSF-Census Research Network (NCRN) award, dedicated to improving the documentation, discoverability, and accessibility of public and restricted data from the federal statistical system in the United States. The current internal name for our DDI-based system is the Comprehensive Extensible Data Documentation and Access Repository (CED²AR).
CED²AR ingests metadata from heterogeneous sources and supports filtered synchronization between restricted and public metadata holdings. Currently-supported CED²AR “connector workflows” include mechanisms to ingest IPUMS, zero-observation files from the American Community Survey (DDI 2.1), and SIPP Synthetic Beta (DDI 1.2). These disparate metadata sources are all transformed into a DDI 2.5 compliant form and stored in a single repository. In addition, we will demonstrate an extension to DDI 2.5 that allows for the labeling of elements within the schema to indicate confidentiality. This metadata can then be filtered, allowing the creation of derived public use metadata from an original confidential source. This repository is currently searchable online through a prototype application demonstrating the ability to search across previously heterogeneous metadata sources. Presentation at the 4th Annual European DDI User Conference (EDDI12), Norwegian Social Science Data Services, Bergen, Norway, 3 December, 2012 PB - Cornell University UR - http://hdl.handle.net/1813/30922 ER - TY - CONF T1 - The Economics of Privacy T2 - The Oxford Handbook of the Digital Economy Y1 - 2012 A1 - Laura Brandimarte A1 - Alessandro Acquisti ED - Martin Peitz ED - Joel Waldfogel JF - The Oxford Handbook of the Digital Economy PB - Oxford University Press SN - 9780195397840 ER - TY - CHAP T1 - Entropy Estimations Using Correlated Symmetric Stable Random Projections T2 - Advances in Neural Information Processing Systems 25 Y1 - 2012 A1 - Ping Li A1 - Cun-Hui Zhang ED - P. Bartlett ED - F.C.N. Pereira ED - C.J.C. Burges ED - L. Bottou ED - K.Q. Weinberger JF - Advances in Neural Information Processing Systems 25 UR - http://books.nips.cc/papers/files/nips25/NIPS2012_1456.pdf ER - TY - CONF T1 - Exploring interviewer and respondent interactions: An innovative behavior coding approach T2 - Midwest Association for Public Opinion Research 2012 Annual Conference Y1 - 2012 A1 - Walton, L. 
A1 - Stange, M. A1 - Powell, R. A1 - Belli, R.F. JF - Midwest Association for Public Opinion Research 2012 Annual Conference CY - Chicago, IL UR - http://www.mapor.org/conferences.html ER - TY - CONF T1 - Interviewer variance of interviewer and respondent behaviors: A new frontier in analyzing the interviewer-respondent interaction T2 - Midwest Association for Public Opinion Research 2012 Annual Conference Y1 - 2012 A1 - Charoenruk, N. A1 - Parkhurst, B. A1 - Ay, M. A1 - Belli, R. F. JF - Midwest Association for Public Opinion Research 2012 Annual Conference CY - Chicago, IL UR - http://www.mapor.org/conferences.html N1 - Annual conference of the Midwest Association for Public Opinion Research, Chicago, Illinois. ER - TY - RPRT T1 - The NSF-Census Research Network: Cornell Node Y1 - 2012 A1 - Block, William C. A1 - Lagoze, Carl A1 - Vilhuber, Lars A1 - Brown, Warren A. A1 - Williams, Jeremy A1 - Arguillas, Florio AB - Cornell University has received a $3M NSF-Census Research Network (NCRN) award to improve the documentation and discoverability of both public and restricted data from the federal statistical system. The current internal name for this project is the Comprehensive Extensible Data Documentation and Access Repository (CED²AR). The diagram to the right provides a high level architectural overview of the system to be implemented. The CED²AR will be based upon leading metadata standards such as the Data Documentation Initiative (DDI) and Statistical Data and Metadata eXchange (SDMX) and be flexibly designed to ingest documentation from a variety of source files. It will permit synchronization between the public and confidential instances of the repository.
The scholarly community will be able to use the CED²AR as it would a conventional metadata repository, deprived only of the values of certain confidential information, but not their metadata. The authorized user, working on the secure Census Bureau network, could use the CED²AR with full information in authorized domains. PB - Cornell University UR - http://hdl.handle.net/1813/30925 ER - TY - CHAP T1 - One Permutation Hashing T2 - Advances in Neural Information Processing Systems 25 Y1 - 2012 A1 - Ping Li A1 - Art Owen A1 - Cun-Hui Zhang ED - P. Bartlett ED - F.C.N. Pereira ED - C.J.C. Burges ED - L. Bottou ED - K.Q. Weinberger JF - Advances in Neural Information Processing Systems 25 UR - http://books.nips.cc/papers/files/nips25/NIPS2012_1436.pdf ER - TY - CHAP T1 - A Proposed Solution to the Archiving and Curation of Confidential Scientific Inputs T2 - Privacy in Statistical Databases Y1 - 2012 A1 - Abowd, John M. A1 - Vilhuber, Lars A1 - Block, William ED - Domingo-Ferrer, Josep ED - Tinnirello, Ilenia KW - Data Archive KW - Data Curation KW - Privacy-preserving Datamining KW - Statistical Disclosure Limitation JF - Privacy in Statistical Databases T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg VL - 7556 SN - 978-3-642-33626-3 UR - http://dx.doi.org/10.1007/978-3-642-33627-0_17 ER - TY - CONF T1 - Sleight of Privacy T2 - Conference on Web Privacy Measurement Y1 - 2012 A1 - Idris Adjerid A1 - Alessandro Acquisti A1 - Laura Brandimarte JF - Conference on Web Privacy Measurement ER - TY - CONF T1 - Troubles with time-use: Examining potential indicators of error in the ATUS T2 - Midwest Association for Public Opinion Research 2012 Annual Conference Y1 - 2012 A1 - Phillips, A. L. A1 - T. Al Baghal A1 - Belli, R. F.
JF - Midwest Association for Public Opinion Research 2012 Annual Conference CY - Chicago, IL UR - http://www.mapor.org/conferences.html N1 - Presented at the annual conference of the Midwest Association for Public Opinion Research, Chicago, Illinois ER - TY - RPRT T1 - A Proposed Solution to the Archiving and Curation of Confidential Scientific Inputs Y1 - 2011 A1 - Abowd, John M. A1 - Vilhuber, Lars A1 - Block, William AB - We develop the core of a method for solving the data archive and curation problem that confronts the custodians of restricted-access research data and the scientific users of such data. Our solution recognizes the dual protections afforded by physical security and access limitation protocols. It is based on extensible tools and can be easily incorporated into existing instructional materials. PB - Cornell University UR - http://hdl.handle.net/1813/30923 ER - TY - JOUR T1 - Parallel Associations and the Structure of Autobiographical Knowledge JF - Journal of Applied Research in Memory and Cognition Y1 - 2016 A1 - Belli, Robert F. A1 - Al Baghal, Tarek KW - Autobiographical knowledge KW - Autobiographical memory KW - Autobiographical periods KW - Episodic memory KW - Retrospective reports AB - The self-memory system (SMS) model of autobiographical knowledge conceives that memories are structured thematically, organized both hierarchically and temporally. This model has been challenged on several fronts, including the absence of parallel linkages across pathways. Calendar survey interviewing shows the frequent and varied use of parallel associations in autobiographical recall. Parallel associations in these data are commonplace, and are driven more by respondents’ generative retrieval than by interviewers’ probing.
Parallel associations represent a number of autobiographical knowledge themes that are interrelated across life domains. The content of parallel associations is nearly evenly split between general and transitional events, supporting the importance of transitions in autobiographical memory. Associations in respondents’ memories (both parallel and sequential) demonstrate complex interactions with interviewer verbal behaviors during generative retrieval. In addition to discussing the implications of these results for the SMS model, implications are also drawn for transition theory and the basic-systems model. VL - 5 SN - 2211-3681 UR - http://www.sciencedirect.com/science/article/pii/S2211368116300183 IS - 2 ER - TY - ABST T1 - The ATUS and SIPP-EHC: Recent developments Y1 - 0 A1 - Belli, R. F. ER - TY - CHAP T1 - Calendar and time diary methods: The tools to assess well-being in the 21st century T2 - Handbook of research methods in health and social sciences Y1 - 0 A1 - Córdova Cazar, Ana Lucía A1 - Belli, Robert F. ED - Liamputtong, P JF - Handbook of research methods in health and social sciences PB - Springer ER - TY - ABST T1 - Does relation of retrieval pathways to data quality differ by self or proxy response status? Y1 - 0 A1 - Lee, Jinyoung A1 - Belli, Robert F. ER - TY - ABST T1 - Evaluating Data quality in Time Diary Surveys Using Paradata Y1 - 0 A1 - Córdova Cazar, Ana Lucía A1 - Belli, Robert F. ER - TY - ABST T1 - An evaluation study of the use of paradata to enhance data quality in the American Time Use Survey (ATUS) Y1 - 0 A1 - Córdova Cazar, Ana Lucía A1 - Belli, Robert F. ER - TY - ABST T1 - Memory Gaps in the American Time Use Survey. Are Respondents Forgetful or is There More to it? Y1 - 0 A1 - Kirchner, Antje A1 - Belli, Robert F. A1 - Deal, Caitlin E. A1 - Córdova-Cazar, Ana Lucia ER - TY - ABST T1 - Respondent retrieval strategies inform the structure of autobiographical knowledge Y1 - 0 A1 - Belli, R. F.
ER - TY - ABST T1 - Using audit trails to evaluate an event history calendar survey instrument Y1 - 0 A1 - Lee, Jinyoung A1 - Seloske, Ben A1 - Belli, Robert F. ER - TY - ABST T1 - Using behavior coding to understand respondent retrieval strategies that inform the structure of autobiographical knowledge Y1 - 0 A1 - Belli, R. F. ER - TY - ABST T1 - Working with the SIPP-EHC audit trails: Parallel and sequential retrieval Y1 - 0 A1 - Lee, Jinyoung A1 - Seloske, Ben A1 - Córdova Cazar, Ana Lucía A1 - Eck, Adam A1 - Belli, Robert F. ER -