Award Abstract # 1238612
Doctoral Dissertation Research: Investigating the Bias of Alternative Statistical Inference Methods in Sequential Mixed-Mode Surveys

NSF Org: SES
Division of Social and Economic Sciences
Recipient: REGENTS OF THE UNIVERSITY OF MICHIGAN
Initial Amendment Date: August 22, 2012
Latest Amendment Date: August 22, 2012
Award Number: 1238612
Award Instrument: Standard Grant
Program Manager: Cheryl Eavey
  ceavey@nsf.gov
  (703)292-7269
  SES Division of Social and Economic Sciences
  SBE Directorate for Social, Behavioral and Economic Sciences
Start Date: September 1, 2012
End Date: August 31, 2014 (Estimated)
Total Intended Award Amount: $15,153.00
Total Awarded Amount to Date: $15,153.00
Funds Obligated to Date: FY 2012 = $15,153.00
History of Investigator:
  • Richard Valliant (Principal Investigator)
    rvalliant@survey.umd.edu
  • Zeynep Suzer Gurtekin (Co-Principal Investigator)
Recipient Sponsored Research Office: Regents of the University of Michigan - Ann Arbor
1109 GEDDES AVE STE 3300
ANN ARBOR
MI  US  48109-1015
(734)763-6438
Sponsor Congressional District: 06
Primary Place of Performance: The Regents of the University of Michigan
3003 S. State Street
Ann Arbor
MI  US  48109-1274
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): GNJ7BBP73WE9
Parent UEI:
NSF Program(s): Methodology, Measuremt & Stats
Primary Program Source: 01001213DB NSF RESEARCH & RELATED ACTIVIT
01001213RB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s): 133300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

Sequential mixed-mode surveys use a mix of modes, or data collection methods, such as mail, telephone, in-person, and web to increase the number of people who respond to a survey. In sequential designs, there is usually no control over the assignment of subgroups of respondents to modes. As a result, nonrandom assignment of modes is an inherent characteristic of sequential mixed-mode surveys. This design is important because funds to prompt people to respond are usually limited. While the goal of using mixed modes is clear, one compelling research question is how the nonrandom mix of modes affects survey data and how these effects should be handled in estimating survey population characteristics such as mean income and health insurance coverage. To date, because the nonrandom mix of modes makes mode effects difficult to evaluate, existing inference methods assume that mode effects can be ignored in sequential mixed-mode surveys, despite their unknown impact on the quality of the survey estimates. This research develops and evaluates statistical inference methods that account for nonrandom mode effects in order to test the comparability of the survey estimates from the different modes. In parallel, this project also develops statistical inference methods that account for both nonresponse and nonrandom mode effects in the presence of nonignorable mode effects. The public-use Current Population Survey (CPS), 1973, and Social Security Records Exact Match data and the nonpublic-use American Community Survey (ACS) data will be used to conduct empirical and simulation evaluations.

This research provides federal agencies, survey organizations, research centers, and other data producers with assessment and inferential methods that adjust for both nonresponse and nonrandom mode effects in the context of sequential mixed-mode surveys. Some large surveys have employed some variation of mixed-mode data collection in order to meet budget constraints. However, in the presence of nonignorable mode effects, the bias properties of estimators of survey population characteristics are not known, and the existing assessment and inferential methods do not control for the nonrandom mode effects. This research produces sequential mixed-mode assessment methods that test the ignorability of mode effects, which can threaten the quality of survey data. In parallel, this research also produces methods of inference that will yield higher-quality survey estimates in the presence of nonignorable mode effects.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Work addressed two specific research questions related to mixed-mode surveys: 1) Are the measurement error differences between modes ignorable? and 2) What are the properties of statistical inference methods that incorporate nonignorable measurement error differences under a mixed-mode survey design? Work was completed on three alternative mixed-mode survey mean estimators. The general steps in the estimation method were: 1) Impute counterfactual data for the alternative mode(s) under specified mode choice and response models, 2) Use both observed and imputed data for the complete set of respondents to compute mode-specific survey means, 3) Adjust for nonresponse, 4) Compare mode-specific mean estimates, and 5) Combine mode-specific means under three alternative combining rules. Step 1) created data that were counterfactual in the sense that imputations for cases that responded using one mode were made as if they had responded via another mode. For the purpose of this research, the nonresponse adjustments in step 3) were omitted, although future research should address them.
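Steps 1) and 2) can be illustrated with a minimal sketch. It is not the report's actual procedure: it assumes an ignorable mode choice given a single covariate, uses a simple linear regression as a stand-in for the response model, and omits the nonresponse adjustment. The names `fit_line` and `mode_specific_means` and the record layout `(mode, x, y)` are hypothetical.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x with one covariate."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def mode_specific_means(records):
    """records: list of (mode, x, y) tuples, one per respondent.

    For each target mode, fit the response model on respondents who used
    that mode, impute counterfactual answers for everyone else (step 1),
    and average observed plus imputed values over all respondents (step 2).
    Assumption: mode choice is ignorable given x.
    """
    modes = sorted({m for m, _, _ in records})
    means = {}
    for target in modes:
        xs = [x for m, x, _ in records if m == target]
        ys = [y for m, _, y in records if m == target]
        a, b = fit_line(xs, ys)
        # Observed value if the respondent used the target mode,
        # otherwise the model-based counterfactual value.
        completed = [y if m == target else a + b * x for m, x, y in records]
        means[target] = fmean(completed)
    return means
```

Each entry of the returned dictionary estimates the mean that would have been obtained had every respondent answered in that mode, so the entries can be compared (step 4) or combined (step 5). A nonignorable mode choice would require a different imputation model than the one sketched here.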

Using 1973 Current Population Survey (CPS)-Internal Revenue Service (IRS) Match Data and 2012 CPS data, comparisons were made between the naïve method and three alternative methods for Income and Health Insurance Coverage. The three alternative methods were: 1) a simple average estimator, in which mode-specific means are combined with equal weights, 2) a combined estimator in which estimated means are weighted inversely according to their variances, and 3) a combined estimator in which estimated means are weighted inversely according to their mean square errors (MSEs). A fourth choice was the naïve mean estimator, which ignores any mode effects: 4) combine the data in one sample and estimate the mean ignoring any differences between modes. An empirical study and a simulation study on the 1973 CPS Match Data were conducted to study the bias properties of the naïve estimator and the three alternative estimators. A sensitivity analysis was conducted using the 2012 CPS data.
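The four estimators can be sketched in a few lines. This is an illustration under stated assumptions rather than the report's implementation: the function names are hypothetical, survey weights are ignored, and the MSE of each mode-specific mean is taken to be its variance plus a squared bias measured against a reference mode that is assumed to be unbiased.

```python
def simple_average(means):
    """Method 1: equal-weight average of the mode-specific means."""
    return sum(means) / len(means)


def inverse_weighted(means, spreads):
    """Methods 2 and 3: weight each mean by 1/spread, where spread is
    the variance (method 2) or the MSE (method 3) of that mean."""
    if any(s <= 0 for s in spreads):
        raise ValueError("variances and MSEs must be positive")
    weights = [1.0 / s for s in spreads]
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, means)) / total


def mse(variance, bias):
    """MSE = variance + bias**2; the bias is an assumed input, measured
    against a reference mode treated as unbiased."""
    return variance + bias ** 2


def naive_pooled_mean(samples_by_mode):
    """Method 4: pool all responses and ignore mode differences."""
    pooled = [y for sample in samples_by_mode for y in sample]
    return sum(pooled) / len(pooled)


# Hypothetical inputs: two mode-specific means with their variances and
# their assumed biases relative to the reference mode (the first one).
means = [100.0, 110.0]
variances = [4.0, 1.0]
biases = [0.0, 3.0]

by_variance = inverse_weighted(means, variances)
by_mse = inverse_weighted(means, [mse(v, b) for v, b in zip(variances, biases)])
```

With these made-up numbers the variance-weighted estimate leans toward the more precise second mode, while the MSE-weighted estimate leans back toward the reference mode because the second mode's assumed bias inflates its MSE.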

The empirical study was conducted on a subset of the 1973 CPS-IRS Match Data using wage and salary income as reported in the CPS. CPS-reported wage and salary income for 1972 was compared against the 1972 person-level IRS wage and salary income data. The average relative differences did not differ significantly between the CPS in-person and telephone modes. On the other hand, direct comparisons of in-person respondents' earnings to telephone respondents' earnings suggest a possible nonignorable mode choice mechanism, that is, one in which mode choice depends on the survey variable itself. On average, in-person respondents reported earnings $1,369 lower per year than telephone respondents. The difference remained significant after controlling for personal characteristics (education, work experience, race [white vs. other], occupation type [professional, sales, craft, laborer], and industry [construction, manufacturing, transportation, trade, service]) and residential (household) characteristics (central city, suburb, region). The simulation study treated this subset as the population and drew 50 simulation samples at sample sizes of 400 and 800 persons. In addition to sample size, the simulations varied the imputation method (1) ignorable mode choice or 2) nonignorable mode choice) and the treatment of item missing data (1) item missing included or 2) item missing excluded). The results were in the expected direction. That is, the standard procedure of ignoring potential mode differences led to biased estimators. The estimator that combined means inversely according to MSEs was generally least biased. For this estimator to be feasible in practice, one of the modes must be selected as producing the most nearly unbiased estimators. The differences between item mi...

Please report errors in award information by writing to: awardsearch@nsf.gov.
