A Hybrid Technique for the Multiple Imputation of Survey Data


eISSN: 2001-7367
Language: English

Publication frequency: 4 issues per year
Journal subjects: Mathematics, Probability and Statistics


Published online: 22 June 2021

Page range: 505–531

Submitted: 01 March 2019

Accepted: 01 Dec. 2020

DOI: https://doi.org/10.1515/jos-2021-0022

Keywords: Complex dependencies, MICE, multiple indicator cluster surveys

© 2021 Humera Razzak et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.