TY - JOUR
T1 - The Response of Consumer Spending to Changes in Gasoline Prices
Y1 - forthcoming
A1 - Gelman, Michael
A1 - Gorodnichenko, Yuriy
A1 - Kariv, Shachar
A1 - Koustas, Dmitri
A1 - Shapiro, Matthew D
A1 - Silverman, Daniel
A1 - Tadelis, Steven
AB - This paper estimates how overall consumer spending responds to changes in gasoline prices. It uses the differential impact across consumers of the sudden, large drop in gasoline prices in 2014 for identification. This estimation strategy is implemented using comprehensive, daily transaction-level data for a large panel of individuals. The estimated marginal propensity to consume (MPC) is approximately one, higher than estimates found in less comprehensive or less well-measured data. This estimate takes into account the elasticity of demand for gasoline and potential slow adjustment to changes in prices. The high MPC implies that changes in gasoline prices have large aggregate effects.
ER -
TY - RPRT
T1 - Recalculating - How Uncertainty in Local Labor Market Definitions Affects Empirical Findings
Y1 - 2017
A1 - Foote, Andrew
A1 - Kutzbach, Mark J.
A1 - Vilhuber, Lars
AB - This paper evaluates the use of commuting zones as a local labor market definition. We revisit Tolbert and Sizer (1996) and demonstrate the sensitivity of definitions to two features of the methodology. We show how these features impact empirical estimates using a well-known application of commuting zones. We conclude with advice to researchers using commuting zones on how to demonstrate the robustness of empirical findings to uncertainty in definitions. The analysis, conclusions, and opinions expressed herein are those of the author(s) alone and do not necessarily represent the views of the U.S. Census Bureau or the Federal Deposit Insurance Corporation. All results have been reviewed to ensure that no confidential information is disclosed, and no confidential data was used in this paper. This document is released to inform interested parties of ongoing research and to encourage discussion of work in progress. Much of the work developing this paper occurred while Mark Kutzbach was an employee of the U.S. Census Bureau.
PB - Cornell University
UR - http://hdl.handle.net/1813/52649
ER -
TY - JOUR
T1 - Regionalization of Multiscale Spatial Processes using a Criterion for Spatial Aggregation Error
JF - Journal of the Royal Statistical Society -- Series B
Y1 - 2017
A1 - Bradley, J.R.
A1 - Wikle, C.K.
A1 - Holan, S.H.
KW - American Community Survey
KW - empirical orthogonal functions
KW - MAUP
KW - Reduced rank
KW - Spatial basis functions
KW - Survey data
AB - The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can guide a regionalization over a spatial domain of interest. By "regionalization" we mean a specification of geographies that define the spatial support for areal data. This topic has been studied vigorously by geographers, but has been given less attention by spatial statisticians. Thus, we propose a criterion for spatial aggregation error (CAGE), which we minimize to obtain an optimal regionalization. To define CAGE we draw a connection between spatial aggregation error and a new multiscale representation of the Karhunen-Loeve (K-L) expansion.
This relationship between CAGE and the multiscale K-L expansion leads to illuminating theoretical developments, including connections between spatial aggregation error, squared prediction error, spatial variance, and a novel extension of Obled-Creutin eigenfunctions. The effectiveness of our approach is demonstrated through an analysis of two datasets, one using the American Community Survey and one related to environmental ocean winds.
UR - https://arxiv.org/abs/1502.01974
ER -
TY - RPRT
T1 - Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
Y1 - 2017
A1 - Abowd, John
A1 - Schmutte, Ian M.
AB - We consider the problem of determining the optimal accuracy of public statistics when increased accuracy requires a loss of privacy. To formalize this allocation problem, we use tools from statistics and computer science to model the publication technology used by a public statistical agency. We derive the demand for accurate statistics from first principles to generate interdependent preferences that account for the public-good nature of both data accuracy and privacy loss. We first show data accuracy is inefficiently under-supplied by a private provider. Solving the appropriate social planner’s problem produces an implementable publication strategy. We implement the socially optimal publication plan for statistics on income and health status using data from the American Community Survey, National Health Interview Survey, Federal Statistical System Public Opinion Survey and Cornell National Social Survey. Our analysis indicates that welfare losses from providing too much privacy protection and, therefore, too little accuracy can be substantial.
PB - NCRN Coordinating Office
UR - http://hdl.handle.net/1813/52612
ER -
TY - RPRT
T1 - Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
Y1 - 2017
A1 - Abowd, John M.
A1 - Schmutte, Ian M.
AB - We consider the problem of determining the optimal accuracy of public statistics when increased accuracy requires a loss of privacy. To formalize this allocation problem, we use tools from statistics and computer science to model the publication technology used by a public statistical agency. We derive the demand for accurate statistics from first principles to generate interdependent preferences that account for the public-good nature of both data accuracy and privacy loss. We first show data accuracy is inefficiently under-supplied by a private provider. Solving the appropriate social planner’s problem produces an implementable publication strategy. We implement the socially optimal publication plan for statistics on income and health status using data from the American Community Survey, National Health Interview Survey, Federal Statistical System Public Opinion Survey and Cornell National Social Survey. Our analysis indicates that welfare losses from providing too much privacy protection and, therefore, too little accuracy can be substantial.
JF - Labor Dynamics Institute Document
UR - http://digitalcommons.ilr.cornell.edu/ldi/37/
ER -
TY - RPRT
T1 - Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
Y1 - 2017
A1 - Abowd, John
A1 - Schmutte, Ian M.
AB - We consider the problem of the public release of statistical information about a population, explicitly accounting for the public-good properties of both data accuracy and privacy loss. We first consider the implications of adding the public-good component to recently published models of private data publication under differential privacy guarantees using a Vickrey-Clarke-Groves mechanism and a Lindahl mechanism. We show that data quality will be inefficiently under-supplied. Next, we develop a standard social planner’s problem using the technology set implied by (ε, δ)-differential privacy with (α, β)-accuracy for the Private Multiplicative Weights query release mechanism to study the properties of optimal provision of data accuracy and privacy loss when both are public goods. Using the production possibilities frontier implied by this technology, explicitly parameterized interdependent preferences, and the social welfare function, we display properties of the solution to the social planner’s problem. Our results directly quantify the optimal choice of data accuracy and privacy loss as functions of the technology and preference parameters. Some of these properties can be quantified using population statistics on marginal preferences and correlations between income, data accuracy preferences, and privacy loss preferences that are available from survey data. Our results show that government data custodians should publish more accurate statistics with weaker privacy guarantees than would occur with purely private data publishing. Our statistical results using the General Social Survey and the Cornell National Social Survey indicate that the welfare losses from under-providing data accuracy while over-providing privacy protection can be substantial. A complete archive of the data and programs used in this paper is available via http://doi.org/10.5281/zenodo.345385.
PB - Cornell University
UR - http://hdl.handle.net/1813/39081
ER -
TY - JOUR
T1 - The role of statistical disclosure limitation in total survey error
JF - Total Survey Error in Practice
Y1 - 2017
A1 - Karr, A. F.
KW - big data issues
KW - data quality
KW - data swapping
KW - decision quality
KW - risk-utility paradigms
KW - Statistical Disclosure Limitation
KW - total survey error
AB - This chapter presents the thesis that statistical disclosure limitation (SDL) ought to be viewed as an integral component of total survey error (TSE). TSE and SDL will move forward together by integrating multiple criteria: cost, risk, data quality, and decision quality. The chapter explores the value of unifying two key TSE procedures, editing and imputation, with SDL. It discusses "big data" issues and presents a mathematical formulation that, at least conceptually and at some point in the future, unifies TSE and SDL. Modern approaches to SDL are based explicitly or implicitly on tradeoffs between disclosure risk and data utility. There are three principal classes of SDL methods: reduction/coarsening techniques, perturbative methods, and synthetic data methods. Data swapping is among the most frequently applied SDL methods for categorical data. The chapter sketches how it can be informed by knowledge of TSE.
ER -
TY - RPRT
T1 - Regression Modeling and File Matching Using Possibly Erroneous Matching Variables
Y1 - 2016
A1 - Dalzell, N. M.
A1 - Reiter, J. P.
KW - Statistics - Applications
AB - Many analyses require linking records from two databases comprising overlapping sets of individuals. In the absence of unique identifiers, the linkage procedure often involves matching on a set of categorical variables, such as demographics, common to both files. Typically, however, the resulting matches are inexact: some cross-classifications of the matching variables do not generate unique links across files. Further, the matching variables can be subject to reporting errors, which introduce additional uncertainty in analyses. We present a Bayesian file matching methodology designed to estimate regression models and match records simultaneously when categorical matching variables are subject to reporting error. The method relies on a hierarchical model that includes (1) the regression of interest involving variables from the two files given a vector indicating the links, (2) a model for the linking vector given the true values of the matching variables, (3) a measurement error model for reported values of the matching variables given their true values, and (4) a model for the true values of the matching variables. We describe algorithms for sampling from the posterior distribution of the model. We illustrate the methodology using artificial data and data from education records in the state of North Carolina.
PB - ArXiv
UR - http://arxiv.org/abs/1608.06309
ER -
TY - JOUR
T1 - Releasing synthetic magnitude micro data constrained to fixed marginal totals
JF - Statistical Journal of the International Association for Official Statistics
Y1 - 2016
A1 - Wei, Lan
A1 - Reiter, Jerome P.
KW - Confidential
KW - Disclosure
KW - establishment
KW - mixture
KW - poisson
KW - risk
AB - We present approaches to generating synthetic microdata for multivariate data that take on non-negative integer values, such as magnitude data in economic surveys. The basic idea is to estimate a mixture of Poisson distributions to describe the multivariate distribution, and release draws from the posterior predictive distribution of the model. We develop approaches that guarantee the synthetic data sum to marginal totals computed from the original data, as well as approaches that do not enforce this equality. For both cases, we present methods for assessing disclosure risks inherent in releasing synthetic magnitude microdata. We illustrate the methodology using economic data from a survey of manufacturing establishments.
VL - 32
UR - http://content.iospress.com/download/statistical-journal-of-the-iaos/sji959
IS - 1
ER -
TY - THES
T1 - Ranking Firms Using Revealed Preference and Other Essays About Labor Markets
T2 - Department of Economics
Y1 - 2015
A1 - Sorkin, Isaac
KW - economics
KW - labor markets
AB - This dissertation contains essays on three questions about the labor market. Chapter 1 considers the question: why do some firms pay so much and some so little? Firms account for a substantial portion of earnings inequality. Although the standard explanation is that there are search frictions that support an equilibrium with rents, this chapter finds that compensating differentials for nonpecuniary characteristics are at least as important. To reach this finding, this chapter develops a structural search model and estimates it on U.S. administrative data. The model analyzes the revealed preference information in the labor market: specifically, how workers move between the 1.5 million firms in the data.
With on the order of 1.5 million parameters, standard estimation approaches are infeasible, so the chapter develops a new estimation approach that is feasible on such big data. Chapter 2 considers the question: why do men and women work at different firms? Men work for higher-paying firms than women. The chapter builds on Chapter 1 to consider two explanations for why men and women work in different firms. First, men and women might search from different offer distributions. Second, men and women might have different rankings of firms. Estimation finds that the main explanation for why men and women are sorted is that women search from a lower-paying offer distribution than men. Indeed, men and women are estimated to have quite similar rankings of firms. Chapter 3 considers the question: what are the long-run effects of the minimum wage? An empirical consensus suggests that there are small employment effects of minimum wage increases. This chapter argues that these are short-run elasticities. Long-run elasticities, which may differ from short-run elasticities, are more policy relevant. This chapter develops a dynamic industry equilibrium model of labor demand. The model makes two points. First, long-run regressions have been misinterpreted because even if the short- and long-run employment elasticities differ, standard methods would not detect a difference using U.S. variation. Second, the model offers a reconciliation of the small estimated short-run employment effects with the commonly found pass-through of minimum wage increases to product prices.
JF - Department of Economics
PB - University of Michigan
CY - Ann Arbor, MI
UR - http://hdl.handle.net/2027.42/116747
ER -
TY - JOUR
T1 - Record Linkage using STATA: Pre-processing, Linking and Reviewing Utilities
JF - The Stata Journal
Y1 - 2015
A1 - Wasi, Nada
A1 - Flaaen, Aaron
AB - In this article, we describe Stata utilities that facilitate probabilistic record linkage—the technique typically used for merging two datasets with no common record identifier. While the preprocessing tools are developed specifically for linking two company databases, the other tools can be used for many different types of linkage. Specifically, the stnd_compname and stnd_address commands parse and standardize company names and addresses to improve the match quality when linking. The reclink2 command is a generalized version of Blasnik's reclink (2010, Statistical Software Components S456876, Department of Economics, Boston College) that allows for many-to-one matching. Finally, clrevmatch is an interactive tool that allows the user to review matched results in an efficient and seamless manner. Rather than exporting results to another file format (for example, Excel), inputting clerical reviews, and importing back into Stata, one can use the clrevmatch tool to conduct all of these steps within Stata. This helps improve the speed and flexibility of matching, which often involves multiple runs.
VL - 15
UR - http://www.stata-journal.com/article.html?article=dm0082
IS - 3
ER -
TY - CONF
T1 - Recording What the Respondent Says: Does Question Format Matter?
T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR)
Y1 - 2015
A1 - Smyth, J.D.
A1 - Olson, K.
JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR)
CY - Hollywood, Florida
UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx
ER -
TY - JOUR
T1 - Reducing the Margins of Error in the American Community Survey Through Data-Driven Regionalization
JF - PlosOne
Y1 - 2015
A1 - Folch, D.
A1 - Spielman, S. E.
UR - http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115626
ER -
TY - JOUR
T1 - Regionalization of Multiscale Spatial Processes using a Criterion for Spatial Aggregation Error
JF - ArXiv
Y1 - 2015
A1 - Bradley, J. R.
A1 - Wikle, C.K.
A1 - Holan, S. H.
AB - The modifiable areal unit problem and the ecological fallacy are known problems that occur when modeling multiscale spatial processes. We investigate how these forms of spatial aggregation error can guide a regionalization over a spatial domain of interest. By "regionalization" we mean a specification of geographies that define the spatial support for areal data. This topic has been studied vigorously by geographers, but has been given less attention by spatial statisticians. Thus, we propose a criterion for spatial aggregation error (CAGE), which we minimize to obtain an optimal regionalization. To define CAGE we draw a connection between spatial aggregation error and a new multiscale representation of the Karhunen-Loeve (K-L) expansion. This relationship between CAGE and the multiscale K-L expansion leads to illuminating theoretical developments, including connections between spatial aggregation error, squared prediction error, spatial variance, and a novel extension of Obled-Creutin eigenfunctions. The effectiveness of our approach is demonstrated through an analysis of two datasets, one using the American Community Survey and one related to environmental ocean winds.
UR - http://arxiv.org/abs/1502.01974
IS - 1502.01974
ER -
TY - JOUR
T1 - Rejoinder on: Comparing and selecting spatial predictors using local criteria
JF - Test
Y1 - 2015
A1 - Bradley, J.R.
A1 - Cressie, N.
A1 - Shi, T.
VL - 24
UR - http://dx.doi.org/10.1007/s11749-014-0414-2
IS - 1
ER -
TY - THES
T1 - Relaxations of differential privacy and risk utility evaluations of synthetic data and fidelity measures
T2 - Statistics Department
Y1 - 2015
A1 - McClure, D.
AB - Many organizations collect data that would be useful to public researchers, but cannot be shared due to promises of confidentiality to those that participated in the study. This thesis evaluates the risks and utility of several existing release methods, as well as develops new ones with different risk/utility tradeoffs. In Chapter 2, I present a new risk metric, called model-specific probabilistic differential privacy (MPDP), which is a relaxed version of differential privacy that allows the risk of a release to be based on the worst case among plausible datasets instead of all possible datasets. In addition, I develop a generic algorithm called local sensitivity random sampling (LSRS) that, under certain assumptions, is guaranteed to give releases that meet MPDP for any query with computable local sensitivity. I demonstrate, using several well-known queries, that LSRS releases have much higher utility than the standard differentially private release mechanism, the Laplace Mechanism, at only marginally higher risk.
In Chapter 3, using two synthesis models, I empirically characterize the risks of releasing synthetic data under the standard “all but one” assumption on intruder background knowledge, as well as the effect that decreasing the number of observations the intruder knows beforehand has on that risk. I find in these examples that even in the “all but one” case, there is no risk except to extreme outliers, and even then the risk is mild. I find that the effect that removing observations from an intruder’s background knowledge has on risk depends heavily on how well that intruder can fill in those missing observations: the risk remains fairly constant if he/she can fill them in well, and the risk drops quickly if he/she cannot. In Chapter 4, I characterize the risk/utility tradeoffs for an augmentation of synthetic data called fidelity measures (see Section 1.2.3). Fidelity measures were proposed in Reiter et al. (2009) to quantify the degree to which the results of an analysis performed on a released synthetic dataset match the results of the same analysis performed on the confidential data. I compare the risk/utility of two different fidelity measures, the confidence interval overlap (Karr et al., 2006) and a new fidelity measure I call the mean predicted probability difference (MPPD). Simultaneously, I compare the risk/utility tradeoffs of two different private release mechanisms, LSRS and a heuristic release method called “safety zones”. I find that the confidence interval overlap can be applied to a wider variety of analyses and is more specific than MPPD, but MPPD is more robust to the influence of individual observations in the confidential data, which means it can be released with less noise than the confidence interval overlap at the same level of risk. I also find that while safety zones are much simpler to compute and generally have good utility (whereas the utility of LSRS depends on the value of ε), they are also much more vulnerable to context-specific attacks that, while not easy for an intruder to implement, are difficult to anticipate.
JF - Statistics Department
PB - Duke University
VL - PhD
UR - http://hdl.handle.net/10161/11365
ER -
TY - CONF
T1 - The Role of Device Type and Respondent Characteristics in Internet Panel Survey Breakoff
T2 - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR)
Y1 - 2015
A1 - McCutcheon, Allan L.
JF - 70th Annual Conference of the American Association for Public Opinion Research (AAPOR)
CY - Hollywood, Florida
UR - http://www.aapor.org/AAPORKentico/Conference/Recent-Conferences.aspx
ER -
TY - RPRT
T1 - Reducing Uncertainty in the American Community Survey through Data-Driven Regionalization
Y1 - 2014
A1 - Spielman, Seth
A1 - Folch, David
AB - The American Community Survey (ACS) is the largest US survey of households and is the principal source for neighborhood-scale information about the US population and economy. The ACS is used to allocate billions in federal spending and is a critical input to social scientific research in the US. However, estimates from the ACS can be highly unreliable. For example, in over 72% of census tracts, the estimated number of children under 5 in poverty has a margin of error greater than the estimate. Uncertainty of this magnitude complicates the use of social data in policy making, research, and governance.
This article develops a spatial optimization algorithm that is capable of reducing the margins of error in survey data via the creation of new composite geographies, a process called regionalization. Regionalization is a complex combinatorial problem. Here, rather than focusing on the technical aspects of regionalization, we demonstrate how to use a purpose-built open-source regionalization algorithm to post-process survey data in order to reduce the margins of error to some user-specified threshold.
PB - University of Colorado at Boulder / University of Tennessee
UR - http://hdl.handle.net/1813/38121
ER -
TY - CONF
T1 - Remembering where: A look at the American Time Use Survey
T2 - Paper presented at the annual conference of the Midwest Association for Public Opinion Research
Y1 - 2014
A1 - Deal, C.
A1 - Cordova-Cazar, A.L.
A1 - Countryman, A.
A1 - Kirchner, A.
A1 - Belli, R.F.
JF - Paper presented at the annual conference of the Midwest Association for Public Opinion Research
CY - Chicago, IL
UR - http://www.mapor.org/conferences.html
ER -
TY - JOUR
T1 - Reputation as a Sufficient Condition for Data Quality on Amazon Mechanical Turk
JF - Behavior Research Methods
Y1 - 2014
A1 - Peer, E.
A1 - Vosgerau, J.
A1 - Acquisti, A.
VL - 46
ER -
TY - CHAP
T1 - The Rise of Incarceration Among the Poor with Mental Illnesses: How Neoliberal Policies Contribute
T2 - The Routledge Handbook of Poverty in the United States
Y1 - 2014
A1 - Camp, J.
A1 - Haymes, S.
A1 - Haymes, M. V. d.
A1 - Miller, R.J.
JF - The Routledge Handbook of Poverty in the United States
PB - Routledge
ER -
TY - CONF
T1 - The Role of Device Type in Internet Panel Survey Breakoff
T2 - Midwest Association for Public Opinion Research Annual Conference
Y1 - 2014
A1 - McCutcheon, A.L.
JF - Midwest Association for Public Opinion Research Annual Conference
CY - Chicago, IL
UR - http://www.mapor.org/conferences.html
ER -
TY - ABST
T1 - Recent Advances in Spatial Methods for Federal Surveys
Y1 - 2013
A1 - Holan, S.H.
ER -
TY - RPRT
T1 - Reconsidering the Consequences of Worker Displacements: Survey versus Administrative Measurements
Y1 - 2013
A1 - Flaaen, Aaron
A1 - Shapiro, Matthew
A1 - Sorkin, Isaac
AB - Displaced workers suffer persistent earnings losses. This stark finding has been established by following workers in administrative data after mass layoffs under the presumption that these are involuntary job losses owing to economic distress. Using linked survey and administrative data, this paper examines this presumption by matching worker-supplied reasons for separations with what is happening at the firm. The paper documents substantially different earnings dynamics in mass layoffs depending on the reason the worker gives for the separation. Using a new methodology for accounting for the increase in the probability of separation among all types of survey response during a mass layoff, the paper finds earnings loss estimates that are surprisingly close to those using only administrative data. Finally, the survey-administrative link allows the decomposition of earnings losses due to subsequent nonemployment into non-participation and unemployment. Including the zero earnings of those identified as being unemployed substantially increases the estimate of earnings losses.
PB - University of Michigan
UR - http://www-personal.umich.edu/~shapiro/papers/ReconsideringDisplacements.pdf
ER -
TY - ABST
T1 - A Reduced Rank Model for Analyzing Multivariate Spatial Datasets
Y1 - 2013
A1 - Bradley, J.R.
PB - University of Missouri-Kansas City
ER -
TY - JOUR
T1 - Ringtail: a generalized nowcasting system
JF - WebDB
Y1 - 2013
A1 - Antenucci, Dolan
A1 - Li, Erdong
A1 - Liu, Shaobo
A1 - Zhang, Bochun
A1 - Cafarella, Michael J
A1 - Ré, Christopher
AB - Social media nowcasting—using online user activity to describe real-world phenomena—is an active area of research to supplement more traditional and costly data collection methods such as phone surveys. Given the potential impact of such research, we would expect general-purpose nowcasting systems to quickly become a standard tool among non-computer scientists, yet it has largely remained a research topic. We believe a major obstacle to widespread adoption is the nowcasting feature selection problem. Typical nowcasting systems require the user to choose a handful of social media objects from a pool of billions of potential candidates, which can be a time-consuming and error-prone process. We have built Ringtail, a nowcasting system that helps the user by automatically suggesting high-quality signals. We demonstrate that Ringtail can make nowcasting easier by suggesting relevant features for a range of topics. The user provides just a short topic query (e.g., unemployment) and a small conventional dataset in order for Ringtail to quickly return a usable predictive nowcasting model.
VL - 6
UR - http://cs.stanford.edu/people/chrismre/papers/Ringtail-VLDB-demo.pdf
ER -
TY - JOUR
T1 - Ringtail: Feature Selection for Easier Nowcasting
JF - WebDB
Y1 - 2013
A1 - Antenucci, Dolan
A1 - Cafarella, Michael J
A1 - Levenstein, Margaret C.
A1 - Ré, Christopher
A1 - Shapiro, Matthew
AB - In recent years, social media “nowcasting”—the use of online user activity to predict various ongoing real-world social phenomena—has become a popular research topic; yet, this popularity has not led to widespread actual practice. We believe a major obstacle to widespread adoption is the feature selection problem. Typical nowcasting systems require the user to choose a set of relevant social media objects, which is difficult, time-consuming, and can imply a statistical background that users may not have. We propose Ringtail, which helps the user choose relevant social media signals. It takes a single user input string (e.g., unemployment) and yields a number of relevant signals the user can use to build a nowcasting model. We evaluate Ringtail on six different topics using a corpus of almost 6 billion tweets, showing that features chosen by Ringtail in a wholly-automated way are better or as good as those from a human and substantially better if Ringtail receives some human assistance. In all cases, Ringtail reduces the burden on the user.
UR - http://www.cs.stanford.edu/people/chrismre/papers/webdb_ringtail.pdf
ER -
TY - JOUR
T1 - Rising extreme poverty in the United States and the response of means-tested transfers
JF - Social Service Review
Y1 - 2013
A1 - Shaefer, H. Luke
A1 - Edin, K.
AB - This study documents an increase in the prevalence of extreme poverty among US households with children between 1996 and 2011 and assesses the response of major federal means-tested transfer programs. Extreme poverty is defined using a World Bank metric of global poverty: $2 or less, per person, per day.
Using the 1996–2008 panels of the Survey of Income and Program Participation (SIPP), we estimate that in mid-2011, 1.65 million households with 3.55 million children were living in extreme poverty in a given month, based on cash income, constituting 4.3 percent of all nonelderly households with children. The prevalence of extreme poverty has risen sharply since 1996, particularly among those most affected by the 1996 welfare reform. Adding SNAP benefits to household income reduces the number of extremely poor households with children by 48.0 percent in mid-2011. Adding SNAP, refundable tax credits, and housing subsidies reduces it by 62.8 percent.
VL - 87
UR - http://www.jstor.org/stable/10.1086/671012
IS - 2
ER -
TY - JOUR
T1 - Rejoinder: An approach for identifying and predicting economic recessions in real time using time frequency functional models
JF - Applied Stochastic Models in Business and Industry
Y1 - 2012
A1 - Holan, S.
A1 - Yang, W.
A1 - Matteson, D.
A1 - Wikle, C.
VL - 28
UR - http://onlinelibrary.wiley.com/doi/10.1002/asmb.1955/full
ER -
TY - ABST
T1 - Relation of questionnaire navigation patterns and data quality: Keystroke data analysis
Y1 - 0
A1 - Lee, Jinyoung
ER -
TY - ABST
T1 - Respondent retrieval strategies inform the structure of autobiographical knowledge
Y1 - 0
A1 - Belli, R. F.
ER -
TY - ABST
T1 - Response Scales: Effects on Data Quality for Interviewer Administered Surveys
Y1 - 0
A1 - Sarwar, Mazen
A1 - Olson, Kristen
A1 - Smyth, Jolene
ER -