Papers by Joost van Ginkel
Journal of Statistical Computation and Simulation, Jul 25, 2017
Three-mode analysis is a generalization of principal component analysis to three-mode data. While two-mode data consist of cases that are measured on several variables, three-mode data consist of cases that are measured on several variables at several occasions. As with any other statistical technique, the results of three-mode analysis may be influenced by missing data. Three-mode software packages generally use the expectation-maximization (EM) algorithm for dealing with missing data. However, there are situations in which the EM algorithm is expected to break down. Alternatively, multiple imputation may be used for dealing with missing data. In this study we investigated the influence of eight different multiple-imputation methods on the results of three-mode analysis, more specifically, a Tucker2 analysis, and compared the results with those of the EM algorithm. Results of the simulations show that multilevel imputation with the mode with the most levels nested within cases and the mode with the least levels represented as variables gives the best results for a Tucker2 analysis. Thus, this may be a good alternative to the EM algorithm in handling missing data in a Tucker2 analysis.
Data supporting the manuscript "Kavelaars, Van Buuren, and Van Ginkel (2020) Multiple imputation in data that grow over time".
Current Analytical Chemistry, 2012
In this paper we present a four-step procedure for performing a Tucker2 analysis on three-way data when a number of observations are missing. The procedure consists of (1) creating multiple complete data sets via multiple imputation of the missing data, (2) analysing these data sets with the Tucker2 model, (3) combining both the row and the column component matrices of these analyses to create centroid solutions using Generalised Procrustes analyses, and (4) using these centroid solutions to find the associated core slices. This procedure will produce both the basic parameters of the Tucker2 model and estimates of their variability due to the missing data. Chromatographic data are used as an illustration.
Applied Psychological Measurement, 2021
Supplemental material, sj-pdf-3-apm-10.1177_0146621621990757 for SPSS Syntax for Combining Results of Principal Component Analysis of Multiply Imputed Data Sets using Generalized Procrustes Analysis by Bart van Wingerde and Joost van Ginkel in Applied Psychological Measurement
Journal of Personality Assessment
Missing data is a problem that occurs frequently in many scientific areas. The most sophisticated method for dealing with this problem is multiple imputation. Contrary to other methods, like listwise deletion, this method does not throw away information, and partly repairs the problem of systematic dropout. Although from a theoretical point of view multiple imputation is considered to be the optimal method, many applied researchers are reluctant to use it because of persistent misconceptions about this method. Instead of providing an(other) overview of missing data methods, or extensively explaining how multiple imputation works, this article aims specifically at rebutting these misconceptions, and provides applied researchers with practical arguments supporting them in the use of multiple imputation.
International Journal of Behavioral Development, 2015
In this article, we test the hypothesis that beliefs about the ideal mother are convergent across cultures and that these beliefs overlap considerably with attachment theory’s notion of the sensitive mother. In a sample including 26 cultural groups from 15 countries around the globe, 751 mothers sorted the Maternal Behavior Q-Set to reflect their ideas about the ideal mother. The results show strong convergence between maternal beliefs about the ideal mother and attachment theory’s description of the sensitive mother across groups. Cultural group membership significantly predicted variations in maternal sensitivity belief scores, but this effect was substantially accounted for by group variations in socio-demographic factors. Mothers living in rural versus urban areas, with a low family income, and with more children, were less likely to describe the ideal mother as highly sensitive. Cultural group membership did remain a significant predictor of variations in maternal sensitivity b...
Multivariate Behavioral Research
Whenever multiple regression is applied to a multiply imputed data set, several methods for combining significance tests for $R^{2}$ and the change in $R^{2}$ across imputed data sets may be used: the combination rules by Rubin, the Fisher z-test for $R^{2}$ by Harel, and F-tests for the change in $R^{2}$ by Chaurasia and Harel. For pooling $R^{2}$ itself, Harel proposed a method based on a Fisher z transformation. In the current article, it is argued that the pooled $R^{2}$ based on the Fisher z transformation, the Fisher z-test for $R^{2}$, and the F-test for the change in $R^{2}$ have some theoretical flaws. An argument is made for using Rubin's method for pooling significance tests for $R^{2}$ instead, and alternative procedures for pooling $R^{2}$ are proposed: simple averaging and a pooled $R^{2}$ constructed from the pooled significance test by Rubin. Simulations show that the Fisher z-test and Chaurasia and Harel's F-tests generally give inflated type-I error rates, whereas the type-I error rates of Rubin's method are correct. Of the methods for pooling the point estimates of $R^{2}$, no method clearly performs best, but it is argued that the average of the $R^{2}$ values across imputed data sets is preferred.
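The simple-averaging proposal mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each imputed data set is available as a predictor matrix and an outcome vector; the function names `r_squared` and `pooled_r_squared` are hypothetical and are not taken from the article.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares regression of y on X (intercept added)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def pooled_r_squared(imputed_sets):
    """Average R^2 over imputed data sets, each given as an (X, y) pair."""
    return float(np.mean([r_squared(X, y) for X, y in imputed_sets]))
```

This sketch covers only the point estimate; the pooled significance test for $R^{2}$ requires Rubin's combination rules and is not shown here.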
Children and Youth Services Review
Empirical research has shown an elevated risk for externalizing behavior problems in international adoptees. To address the extent to which this risk exists for more serious externalizing problems we compared the rates of registered criminal offending of internationally adopted adolescents with those of non-adopted adolescents in the Netherlands. In a large population-based cohort study (N = 3,758,506 including n = 10,563 international adoptees) on Dutch youth with ages up to 19 years we examined registrations in the program on juvenile crime and in the national police system from 2005 to 2013. Controlling for time lapse and background variables we found that international adoptees had been in contact with the criminal justice system more frequently than non-adoptees. However, the findings differed across region of adoption: Adoptees from South America and from Africa had been in contact with the criminal justice system most frequently (and more often than non-adoptees), whereas adoptees from China (total n = 4569) had the least contacts (and less often than non-adoptees). The percentages of criminal offending of adoptees ranged between 1.16% and 15.83% across regions of adoption (versus 10.86% in non-adoptees). The large majority of adoptees, including those from South America and Africa, were not involved in criminal acts. We hypothesize that the higher and lower risks of criminal offending found for adoptees from certain countries are associated with the varying levels of pre-adoption adversity (e.g., neglect and abuse) that the adoptees have experienced.
Psychometrika
Whenever statistical analyses are applied to multiply imputed datasets, specific formulas are needed to combine the results into one overall analysis, also called combination rules. In the context of regression analysis, combination rules for the unstandardized regression coefficients, the t-tests of the regression coefficients, and the F-tests for testing $R^{2}$ for significance have long been established. However, there is still no general agreement on how to combine the point estimators of $R^{2}$ in multiple regression applied to multiply imputed datasets. Additionally, no combination rules for standardized regression coefficients and their confidence intervals seem to have been developed at all. In the current article, two sets of combination rules for the standardized regression coefficients and their confidence intervals are proposed, and their statistical properties are discussed. Additionally, two improved point estimators of $R^{2}$ in multiply imputed data ar...
Journal of Statistical Computation and Simulation, 2013
Principal component analysis (PCA) is a widely used statistical technique for determining subscales in questionnaire data. As with any other statistical technique, missing data may complicate both its execution and its interpretation. In this study, six methods for dealing with missing data in the context of PCA are reviewed and compared: listwise deletion (LD), pairwise deletion, the missing data passive approach, regularized PCA, the expectation-maximization algorithm, and multiple imputation. Simulations show that except for LD, all methods give about equally good results for realistic percentages of missing data. Therefore, the choice of a procedure can be based on the ease of application or purely the convenience of availability of a technique.
Multivariate Behavioral Research, 2014
As a procedure for handling missing data, multiple imputation consists of estimating the missing data multiple times to create several complete versions of an incomplete data set. All these data sets are analyzed by the same statistical procedure, and the results are pooled for interpretation. So far, no explicit rules for pooling F-tests of (repeated-measures) analysis of variance have been defined. In this paper we outline the appropriate procedure for the results of analysis of variance for multiply imputed data sets. It involves both reformulation of the ANOVA model as a regression model using effect coding of the predictors and applying already existing combination rules for regression models. The proposed procedure is illustrated using three example data sets. The pooled results of these three examples provide plausible F- and p-values.
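The "already existing combination rules" referred to here are Rubin's rules. As a minimal sketch, the Python function below pools a single scalar estimate (for example one regression coefficient) across imputed data sets. The name `pool_rubin` is hypothetical, and the sketch handles only the scalar case; pooling an F-test over several coefficients at once, as the article does, needs the multi-parameter version of these rules.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool one scalar estimate over m imputed data sets with Rubin's rules.

    estimates: the m point estimates, one per imputed data set
    variances: the m squared standard errors of those estimates
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()                    # pooled point estimate
    u_bar = u.mean()                    # within-imputation variance
    b = q.var(ddof=1)                   # between-imputation variance
    total = u_bar + (1.0 + 1.0 / m) * b
    return float(q_bar), float(total)
```

The square root of the returned total variance is the pooled standard error.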
Journal of developmental and behavioral pediatrics : JDBP, 2015
Journal of Statistical Computation and Simulation, Jul 8, 2014
Principal component analysis (PCA) is a widely used statistical technique for determining subscales in questionnaire data. As with any other statistical technique, missing data may complicate both its execution and its interpretation. In this study, six methods for dealing with missing data in the context of PCA are reviewed and compared: listwise deletion (LD), pairwise deletion, the missing data passive approach, regularized PCA, the expectation-maximization algorithm, and multiple imputation. Simulations show that except for LD, all methods give about equally good results for realistic percentages of missing data. Therefore, the choice of a procedure can be based on the ease of application or purely the convenience of availability of a technique.
Psychological Reports, May 1, 2002
In 1993 Albach investigated the long-term consequences of sexual abuse on psychological health. A group of abused women and a control group of nonabused women were asked to fill in a questionnaire assessing posttraumatic stress disorder (PTSD). For ethical reasons, the abused women were warned that filling in the questionnaire might be emotionally stressful. The control group did not receive this warning. The abused women scored higher on the questionnaire than the nonabused women. The warning they received may have influenced their reports. Our experiment investigated this. A total of 101 psychology students were divided into two groups, one that received a warning and a control group that did not. The hypothesis was that people who had been previously warned would score higher on a PTSD questionnaire than people who had not. There were, however, no significant differences in mean PTSD scores and no known initial differences between groups.
Doctoral dissertation, Universiteit van Tilburg. Includes bibliography. With a summary in Dutch.
Background: Infections due to carbapenem-resistant gram-negative bacteria (CRGNB) are increasing and are associated with a very high mortality. Synergistic effects of combination therapy with polymyxins, a carbapenem and rifampin (PCR) are observed in in vitro studies. Clinical data to support this are limited. We performed a prospective observational cohort study to study risk factors and outcomes. Methods: A prospective observational cohort study was performed from November 2009 to November 2010. All patients older than 18 years with a CRGNB infection were included. Patients were followed until 6 months after discharge or expiration. Results: 104 patients were studied. The mean age was 77 years, 60% were male, 73% received recent antibiotics, 67% were recently hospitalized and 47% lived in a nursing facility. The mean Charlson Index was 8.1, the mean APACHE IV score of the 38 ICU patients was 74. Infection with a CRGNB occurred after a median hospitalization of 16 days. The most ...
Journal of Classification, 2014
Multiple imputation is one of the most highly recommended procedures for dealing with missing data. However, to date little attention has been paid to methods for combining the results from principal component analyses applied to a multiply imputed data set. In this paper we propose Generalized Procrustes analysis for this purpose, whose centroid solution can be used as a final estimate for the component loadings. Convex hulls based on the loadings of the imputed data sets can be used to represent the uncertainty due to the missing data. In two simulation studies, the performance of the Generalized Procrustes approach is evaluated and compared with other methods. More specifically, it is studied how these methods behave when order changes of components and sign reversals of component loadings occur, such as in the case of near-equal eigenvalues, or data having almost as many counterindicative items as indicative items. The simulations show that other proposed methods either may run into serious problems or are not able to adequately assess the accuracy due to the presence of missing data. However, when the above situations do not occur, all methods will provide adequate estimates for the PCA loadings.
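To illustrate how a centroid solution can absorb order changes and sign reversals of components, here is a minimal Python sketch of a rotation-only Generalized Procrustes analysis. It is an assumption-laden simplification, not the authors' implementation: it uses orthogonal rotations only (no scaling or translation), starts from the first loading matrix, and the names `procrustes_rotation` and `gpa_centroid` are hypothetical.

```python
import numpy as np

def procrustes_rotation(a, target):
    """Orthogonal matrix R minimising the Frobenius norm of a @ R - target."""
    u, _, vt = np.linalg.svd(a.T @ target)
    return u @ vt

def gpa_centroid(loadings, n_iter=100, tol=1e-10):
    """Rotate each loading matrix towards a common centroid and average them.

    loadings: list of (items x components) arrays, one per imputed data set.
    Returns the centroid and the list of rotated loading matrices.
    """
    mats = [np.asarray(l, dtype=float) for l in loadings]
    centroid = mats[0].copy()           # assumption: first matrix as start
    rotated = mats
    for _ in range(n_iter):
        rotated = [a @ procrustes_rotation(a, centroid) for a in mats]
        new_centroid = np.mean(rotated, axis=0)
        converged = np.linalg.norm(new_centroid - centroid) < tol
        centroid = new_centroid
        if converged:
            break
    return centroid, rotated
```

Because a column swap or a sign flip is itself an orthogonal transformation, the rotation step undoes it before averaging. The spread of the rotated loadings around the centroid is what the convex hulls in the abstract would summarise.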
Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 2010
The focus of this study was the incidence of different kinds of missing-data problems in personality research and the handling of these problems. Missing-data problems were reported in approximately half of more than 800 articles published in three leading personality journals. In these articles, unit nonresponse, attrition, and planned missingness were distinguished, but missing item scores in trait measurement were reported most frequently. Listwise deletion was the most frequently used method for handling all missing-data problems. Listwise deletion is known to reduce the accuracy of parameter estimates and the power of statistical tests and often to produce biased statistical analysis results. This study proposes a simple alternative method for handling missing item scores, known as two-way imputation, which leaves the sample size intact and has been shown to produce almost unbiased results based on multi-item questionnaire data.
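The abstract does not spell out the two-way imputation formula. As commonly described in the literature, a missing item score is replaced by the person mean plus the item mean minus the overall mean, each computed from the observed scores. The Python sketch below assumes that definition and shows only the deterministic core: it leaves out the random error term and the rounding to valid response categories that published versions of the method may add. The function name `two_way_impute` is hypothetical.

```python
import numpy as np

def two_way_impute(scores):
    """Fill missing item scores (NaN) with person mean + item mean - overall mean.

    scores: (persons x items) array with NaN marking missing entries.
    Assumes every person and every item has at least one observed score.
    """
    x = np.asarray(scores, dtype=float)
    person_mean = np.nanmean(x, axis=1, keepdims=True)   # one mean per row
    item_mean = np.nanmean(x, axis=0, keepdims=True)     # one mean per column
    overall_mean = np.nanmean(x)
    estimate = person_mean + item_mean - overall_mean
    return np.where(np.isnan(x), estimate, x)
```

Observed scores are left unchanged and no respondent is dropped, which is how the method keeps the sample size intact.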
Sociological Methodology, 2008
We propose using latent class analysis as an alternative to loglinear analysis for the multiple imputation of incomplete categorical data. Similar to loglinear models, latent class models can be used to describe complex association structures between the variables used in the imputation model. However, unlike loglinear models, latent class models can be used to build large imputation models containing more than a few categorical variables. To obtain imputations reflecting uncertainty about the unknown model parameters, we use a nonparametric bootstrap procedure as an alternative to the more common full Bayesian approach. The proposed multiple imputation method, which is implemented in Latent GOLD software for latent class analysis, is illustrated with two examples. In a simulated data example, we compare the new method to well-established methods such as maximum likelihood