Volume 37 (2021): Issue 2 (June 2021)
Special Issue on New Techniques and Technologies for Statistics
Journal Details
Format: Journal
First Published: 01 Oct 2013
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Variance Estimation after Mass Imputation Based on Combined Administrative and Survey Data

Published Online: 22 Jun 2021
Page range: 433 - 459
Received: 01 May 2019
Accepted: 01 Oct 2020
Abstract

This article discusses methods for evaluating the variance of estimated frequency tables based on mass imputation. We consider a general set-up in which data may be available from both administrative sources and a sample survey. Mass imputation involves predicting the missing values of a target variable for the entire population. The motivating application for this article is the Dutch virtual population census, for which it has been proposed to use mass imputation to estimate tables involving educational attainment. We present a new analytical design-based variance estimator for a frequency table based on mass imputation. We also discuss a more general bootstrap method that can be used to estimate this variance. Both approaches are compared in a simulation study on artificial data and in an application to real data of the Dutch census of 2011.
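
As a rough illustration of the two ideas named in the abstract, the sketch below mass-imputes a target variable for every unit outside a survey sample and then estimates the variance of the resulting frequency table with a bootstrap. It is not the estimator proposed in the article. The function names `mass_impute_table` and `bootstrap_variance`, the toy population, and the imputation model (a random hot-deck draw within a register covariate class) are all assumptions made for illustration, and the naive with-replacement bootstrap used here ignores the sampling design and any finite-population correction.

```python
import random
from collections import Counter, defaultdict
from statistics import variance


def mass_impute_table(population_x, sample, rng):
    """Frequency table of y over the whole population.

    population_x: register covariate for every unit (list, index = unit id).
    sample: list of (unit id, observed y) pairs from the survey.
    Units outside the sample get a y drawn at random from sampled units
    with the same covariate value (assumed hot-deck imputation model).
    """
    donors = defaultdict(list)
    for i, y in sample:
        donors[population_x[i]].append(y)
    fallback = [y for _, y in sample]  # used if a class has no donors
    observed = dict(sample)

    table = Counter()
    for i, x in enumerate(population_x):
        if i in observed:
            table[observed[i]] += 1
        else:
            pool = donors.get(x) or fallback
            table[rng.choice(pool)] += 1
    return table


def bootstrap_variance(population_x, sample, n_boot, seed):
    """Naive bootstrap: resample the survey with replacement, redo the
    mass imputation, and take the variance of each cell count."""
    rng = random.Random(seed)
    categories = sorted({y for _, y in sample})
    replicates = {c: [] for c in categories}
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in sample]
        table = mass_impute_table(population_x, resample, rng)
        for c in categories:
            replicates[c].append(table[c])
    return {c: variance(counts) for c, counts in replicates.items()}


# Toy data (entirely artificial): 2000 units, 200 of them surveyed.
rng = random.Random(1)
population_x = [rng.choice(["young", "old"]) for _ in range(2000)]
true_y = [
    "high" if rng.random() < (0.6 if x == "young" else 0.3) else "low"
    for x in population_x
]
sample = [(i, true_y[i]) for i in rng.sample(range(2000), 200)]

table = mass_impute_table(population_x, sample, rng)
var_hat = bootstrap_variance(population_x, sample, n_boot=50, seed=2)
print(dict(table))
print(var_hat)
```

Re-running the imputation inside every replicate lets the bootstrap variance reflect both the sampling of survey units and the randomness of the imputed values. A design-based analysis would replace the naive resampling step with a scheme that respects the actual sampling design.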

