
Package ‘cSEM’

March 29, 2020


Title Composite-Based Structural Equation Modeling
Version 0.2.0
Date 2020-03-30
Maintainer Manuel E. Rademaker <[email protected]>
Depends R (>= 3.5.0)
Description Estimate, assess, test, and study linear, nonlinear, hierarchical
and multigroup structural equation models using composite-based approaches
and procedures, including estimation techniques such as partial least squares
path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc),
generalized structured component analysis (GSCA), generalized structured
component analysis with uniqueness terms (GSCAm), generalized canonical
correlation analysis (GCCA), principal component analysis (PCA),
factor score regression (FSR) using sum scores, regression or
Bartlett scores (including bias correction using Croon’s approach),
as well as several tests and typical postestimation procedures
(e.g., verify admissibility of the estimates, assess the model fit,
test the model fit etc.).

BugReports https://github.com/M-E-Rademaker/cSEM/issues

URL https://github.com/M-E-Rademaker/cSEM,
https://m-e-rademaker.github.io/cSEM/
License GPL-3
Encoding UTF-8
LazyData true
Imports abind, alabama, cli, crayon, expm, future.apply, future,
lavaan, magrittr, MASS, Matrix, matrixcalc, matrixStats,
polycor, psych, purrr, Rdpack, stats, symmoments, utils
RdMacros Rdpack
RoxygenNote 7.0.2
Suggests dplyr, tidyr, knitr, nnls, prettydoc, plotly, rmarkdown,
listviewer, testthat, ggplot2


VignetteBuilder knitr
NeedsCompilation no
Author Manuel E. Rademaker [aut, cre]
(<https://orcid.org/0000-0002-8902-3561>),
Florian Schuberth [aut] (<https://orcid.org/0000-0002-2110-9086>),
Tamara Schamberger [ctb] (<https://orcid.org/0000-0002-7845-784X>),
Michael Klesel [ctb],
Theo K. Dijkstra [ctb],
Jörg Henseler [ctb] (<https://orcid.org/0000-0002-9736-3048>)
Repository CRAN
Date/Publication 2020-03-29 11:00:20 UTC

R topics documented:
Anime
args_default
assess
BergamiBagozzi2000
calculateAVE
calculateDf
calculatef2
calculateGoF
calculateHTMT
calculateVIFModeB
calculateWeightsGSCA
calculateWeightsGSCAm
calculateWeightsKettenring
calculateWeightsPCA
calculateWeightsPLS
calculateWeightsUnit
csem
dgp_2ndorder_cf_of_c
distance_measures
doFloodlightAnalysis
doRedundancyAnalysis
doSurfaceAnalysis
fit
fit_measures
getConstructScores
ITFlex
LancelotMiltgenetal2016
parseModel
plot.cSEMFloodlight
plot.cSEMSurface
PoliticalDemocracy
predict
reliability
resamplecSEMResults
resampleData
Russett
satisfaction
satisfaction_gender
Sigma_Summers_composites
summarize
Switching
testHausman
testMGD
testMICOM
testOMF
threecommonfactors
verify
Yooetal2000

Index

Anime Data: Anime

Description

A data frame with 183 observations and 13 variables.

Usage

Anime

Format

An object of class data.frame with 183 rows and 13 columns.

Details

The data set for the example on github.com/ISS-Analytics/pls-predict with irrelevant variables removed.

Source

Original source: github.com/ISS-Analytics/pls-predict



args_default Show argument defaults or candidates

Description

Show all arguments used by package functions including default or candidate values. For argument
descriptions see: csem_arguments.

Usage

args_default(.choices = FALSE)

Arguments

.choices Logical. Should candidate values for the arguments be returned? Defaults to
FALSE.

Details

By default args_default() returns a list of default values by argument name. If the list of accepted
candidate values is required instead, use .choices = TRUE.

Value

A named list of argument names and defaults or accepted candidates.

See Also

handleArgs(), csem_arguments, csem(), foreman()
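
Examples

A minimal usage sketch; the commented values reflect the defaults and candidates documented for csem() below.

# Default values by argument name
defaults <- args_default()
defaults$.resample_method        # "none"

# Accepted candidate values instead
candidates <- args_default(.choices = TRUE)
candidates$.resample_method      # "none", "bootstrap", "jackknife"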

assess Assess model

Description

Assess a model using common quality criteria. See the Postestimation: Assessing a model article
on the cSEM website for details.

Usage
assess(
.object = NULL,
.quality_criterion = c("all", "ave", "rho_C", "rho_C_mm", "rho_C_weighted",
"rho_C_weighted_mm", "cronbachs_alpha",
"cronbachs_alpha_weighted", "dg", "dl", "dml", "df",
"effects", "f2", "chi_square", "chi_square_df",
"cfi", "gfi", "ifi", "nfi", "nnfi",
"reliability",
"rmsea", "rms_theta", "srmr",
"gof", "htmt", "r2", "r2_adj",
"rho_T", "rho_T_weighted", "vif",
"vifmodeB", "fl_criterion"),
.only_common_factors = TRUE,
...
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.quality_criterion
Character string. A single character string or a vector of character strings naming the quality criterion to compute. See the Details section for a list of possible candidates. Defaults to "all" in which case all possible quality criteria are computed.
.only_common_factors
Logical. Should only concepts modeled as common factors be included when calculating one of the following quality criteria: AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates. Defaults to TRUE.
... Further arguments passed to functions called by assess(). See args_assess_dotdotdot for a complete list of available arguments.

Details
The function is essentially a wrapper around a number of internal functions that perform an "assessment task" (called a quality criterion in cSEM parlance) like computing reliability estimates, the effect size (Cohen's f^2), the heterotrait-monotrait ratio of correlations (HTMT) etc.
By default every possible quality criterion is calculated (.quality_criterion = "all"). If only a subset of quality criteria is needed, a single character string or a vector of character strings naming the criteria to be computed may be supplied to assess() via the .quality_criterion argument.
Currently, the following quality criteria are implemented (in alphabetical order):

Average variance extracted (AVE); "ave" An estimate of the amount of variation in the indicators that is due to the underlying latent variable. Practically, it is calculated as the ratio of the (indicator) true score variances (i.e., the sum of the squared loadings) relative to the sum of the total indicator variances. The AVE is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret AVE results for constructs modeled as composites. It is possible to report the AVE for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateAVE().
Congeneric reliability; "rho_C", "rho_C_mm", "rho_C_weighted", "rho_C_weighted_mm" An estimate of the reliability assuming a congeneric measurement model (i.e., loadings are allowed to differ) and a test score (proxy) based on unit weights. There are four different versions implemented. See the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website for details. Alternative but synonymous names for "rho_C" are: composite reliability, construct reliability, reliability coefficient, Joereskog's rho, coefficient omega, or Dillon-Goldstein's rho. For "rho_C_weighted": (Dijkstra-Henseler's) rho_A. "rho_C_mm" and "rho_C_weighted_mm" have no corresponding names. The former uses unit weights scaled by (w'Sw)^(-1/2) and the latter weights scaled by (w'Sigma_hat w)^(-1/2), where Sigma_hat is the model-implied indicator correlation matrix. The congeneric reliability is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret congeneric reliability estimates for constructs modeled as composites. It is possible to report the congeneric reliability for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateRhoC().
Cronbach's alpha; "cronbachs_alpha" An estimate of the reliability assuming a tau-equivalent measurement model (i.e., a measurement model with equal loadings) and a test score (proxy) based on unit weights. To compute Cronbach's alpha based on a score that uses the weights of the weight approach used to obtain .object, use "cronbachs_alpha_weighted" instead. Cronbach's alpha is an alias for "rho_T", the tau-equivalent reliability, which is the preferred name for this kind of reliability in cSEM, as it clearly states what it actually estimates (the tau-equivalent reliability as opposed to the congeneric reliability). "rho_T" and "cronbachs_alpha" are therefore always identical. The tau-equivalent reliability (Cronbach's alpha) is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret tau-equivalent reliability estimates for constructs modeled as composites. It is possible to report tau-equivalent reliability estimates for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateRhoT().
Distance measures; "dg", "dl", "dml" Measures of the distance between the model-implied and the empirical indicator correlation matrix. Currently, the geodesic distance ("dg"), the squared Euclidean distance ("dl"), and the maximum likelihood-based distance function ("dml") are implemented. Calculation is done by calculateDL(), calculateDG(), and calculateDML().
Degrees of freedom; "df" Returns the degrees of freedom. Calculation is done by calculateDf().
Effects; "effects" Total and indirect effect estimates. Additionally, the variance accounted for (VAF) is computed. The VAF is defined as the ratio of a variable's indirect effect to its total effect. Calculation is done by calculateEffects().
Effect size; "f2" An index of the effect size of an independent variable in a structural regression equation. This measure is commonly known as Cohen's f^2. The effect size of the k'th independent variable in this case is defined as the ratio (R2_included - R2_excluded)/(1 - R2_included), where R2_included and R2_excluded are the R squares of the original structural model regression equation (R2_included) and the alternative specification with the k'th variable dropped (R2_excluded). Calculation is done by calculatef2().
Fit indices; "chi_square", "chi_square_df", "cfi", "gfi", "ifi", "nfi", "nnfi", "rmsea", "rms_theta", "srmr" Several absolute and incremental fit indices. Note that their suitability for models containing constructs modeled as composites is still an open research question. Also note that fit indices are not tests in a hypothesis testing sense and decisions based on common one-size-fits-all cutoffs proposed in the literature suffer from serious statistical drawbacks. Calculation is done by calculateChiSquare(), calculateChiSquareDf(), calculateCFI(), calculateGFI(), calculateIFI(), calculateNFI(), calculateNNFI(), calculateRMSEA(), calculateRMSTheta() and calculateSRMR().
Fornell-Larcker criterion; "fl_criterion" A rule suggested by Fornell and Larcker (1981) to assess discriminant validity. The Fornell-Larcker criterion is a decision rule based on a comparison between the squared construct correlations and the average variance extracted. FL returns a matrix with the squared construct correlations on the off-diagonal and the AVEs on the main diagonal. Calculation is done by assess().
Goodness of Fit (GoF); "gof" The GoF is defined as the square root of the mean of the R squares of the structural model times the mean of the variances in the indicators that are explained by their related constructs (i.e., the average over all lambda^2_k). For the latter, only constructs modeled as common factors are considered as they explain their indicator variance, in contrast to a composite where the indicators actually build the construct. Note that, contrary to what the name suggests, the GoF is not a measure of model fit in a Chi-square fit test sense. Calculation is done by calculateGoF().
Heterotrait-monotrait ratio of correlations (HTMT); "htmt" An estimate of the correlation between latent variables. The HTMT is used to assess convergent and/or discriminant validity of a construct. The HTMT is inherently tied to the common factor model. If the model contains fewer than two constructs modeled as common factors and .only_common_factors = TRUE, NA is returned. It is possible to report the HTMT for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateHTMT().
Reliability; "reliability" As described in the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website, there are many different estimators for the (internal consistency) reliability. Choosing .quality_criterion = "reliability" computes the three most common measures, namely: "Cronbach's alpha" (identical to "rho_T"), "Jöreskog's rho" (identical to "rho_C_mm"), and "Dijkstra-Henseler's rho A" (identical to "rho_C_weighted_mm"). Reliability is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret reliability estimates for constructs modeled as composites. It is possible to report the three common reliability estimates for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning.
R square and R square adjusted; "r2", "r2_adj" The R square and the adjusted R square for
each structural regression equation. Calculated when running csem().
Tau-equivalent reliability; "rho_T" An estimate of the reliability assuming a tau-equivalent measurement model (i.e., a measurement model with equal loadings) and a test score (proxy) based on unit weights. Tau-equivalent reliability is the preferred name for reliability estimates that assume a tau-equivalent measurement model such as Cronbach's alpha. The tau-equivalent reliability (Cronbach's alpha) is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret tau-equivalent reliability estimates for constructs modeled as composites. It is possible to report tau-equivalent reliability estimates for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateRhoT().
Variance inflation factors (VIF); "vif" An index for the amount of (multi-)collinearity between independent variables of a regression equation. Computed for each structural equation. Practically, VIF_k is defined as the ratio of 1 over (1 - R2_k), where R2_k is the R squared from a regression of the k'th independent variable on all remaining independent variables. Calculated when running csem().
Variance inflation factors for PLS-PM mode B (VIF-ModeB); "vifmodeB" An index for the amount of (multi-)collinearity between independent variables (indicators) in mode B regression equations. Computed only if .object was obtained using .approach_weights = "PLS-PM" and at least one mode was mode B. Practically, VIF-ModeB_k is defined as the ratio of 1 over (1 - R2_k), where R2_k is the R squared from a regression of the k'th indicator of block j on all remaining indicators of the same block. Calculation is done by calculateVIFModeB().

For details on the most important quality criteria see the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website.
Some of the quality criteria are inherently tied to the classical common factor model and are therefore only meaningfully interpreted within a common factor model (see the Postestimation: Assessing a model article for details). It is possible to force computation of all quality criteria for constructs modeled as composites by setting .only_common_factors = FALSE; however, we explicitly warn against interpreting quality criteria in analogy to the common factor model in this case, as the interpretation often does not carry over to composite models.

Resampling: To resample a given quality criterion supply the name of the function that calculates
the desired quality criterion to csem()’s .user_funs argument. See resamplecSEMResults()
for details.

Value
A named list of quality criteria. Note that if only a single quality criterion is computed the return value is still a list!

See Also
csem(), resamplecSEMResults()

Examples
# ===========================================================================
# Using the threecommonfactors dataset
# ===========================================================================
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# Each concept is measured by 3 indicators, i.e., modeled as latent variable


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

res <- csem(threecommonfactors, model)


a <- assess(res) # computes all quality criteria (.quality_criterion = "all")
a

## The return value is a named list


a$HTMT

# You may also just compute a subset of the quality criteria


assess(res, .quality_criterion = c("ave", "rho_C", "htmt"))

## Resampling ---------------------------------------------------------------
# To resample a given quality criterion use csem()'s .user_funs argument
# Note: The output of the quality criterion needs to be a vector or a matrix.
# Matrices will be vectorized columnwise.
res <- csem(threecommonfactors, model,
.resample_method = "bootstrap",
.R = 40,
.user_funs = cSEM:::calculateHTMT
)

## Look at the resamples


res$Estimates$Estimates_resample$Estimates1$User_fun$Resampled[1:4, ]

## Use infer() to compute e.g. the 95% percentile confidence interval


res_infer <- infer(res, .quantity = "CI_percentile")
res_infer$User_fun

## Several quality criteria can be resampled simultaneously


res <- csem(threecommonfactors, model,
.resample_method = "bootstrap",
.R = 40,
.user_funs = list(
"HTMT" = cSEM:::calculateHTMT,
"SRMR" = cSEM:::calculateSRMR,
"RMS_theta" = cSEM:::calculateRMSTheta
),
.tolerance = 1e-04
)
res$Estimates$Estimates_resample$Estimates1$HTMT$Resampled[1:4, ]
res$Estimates$Estimates_resample$Estimates1$RMS_theta$Resampled[1:4]

BergamiBagozzi2000 Data: BergamiBagozzi2000

Description

A data frame containing 22 variables with 305 observations.



Usage
BergamiBagozzi2000

Format
An object of class data.frame with 305 rows and 22 columns.

Details
The dataset contains 22 variables and originates from a larger survey among South Korean employees conducted and reported by Bergami and Bagozzi (2000). It is also used in Hwang and Takane (2014) and Henseler (2020) for demonstration purposes (Tutorial 6).

Source
Survey among South Korean employees conducted and reported by Bergami and Bagozzi (2000).

References
Bergami M, Bagozzi RP (2000). “Self-categorization, affective commitment and group self-esteem as distinct aspects of social identity in the organization.” British Journal of Social Psychology, 39(4), 555–577. doi: 10.1348/014466600164633.

Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial Least Squares & Co. Using ADANCO. Guilford Press.

Hwang H, Takane Y (2014). Generalized Structured Component Analysis: A Component-Based Approach to Structural Equation Modeling, Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences. Chapman and Hall/CRC.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_Bergami_Bagozzi="
# Measurement models
OrgPres =~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffLove =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffJoy =~ orgcmt5 + orgcmt8
Gender <~ gender

# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgPres + OrgIden + Gender
AffJoy ~ OrgPres + OrgIden + Gender
"

out <- csem(.data = BergamiBagozzi2000, .model = model_Bergami_Bagozzi,
            .PLS_weight_scheme_inner = 'factorial',
            .tolerance = 1e-06
)

calculateAVE Average variance extracted (AVE)

Description
Calculate the average variance extracted (AVE) as proposed by Fornell and Larcker (1981). For details see the cSEM website.

Usage
calculateAVE(
.object = NULL,
.only_common_factors = TRUE
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.only_common_factors
Logical. Should only concepts modeled as common factors be included when calculating one of the following quality criteria: AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates. Defaults to TRUE.

Details
The AVE is inherently tied to the common factor model. It is therefore unclear how to meaningfully
interpret the AVE in the context of a composite model. It is possible, however, to force computation
of the AVE for constructs modeled as composites by setting .only_common_factors = FALSE.

Value
A named vector of numeric values (the AVEs). If .object is a list of cSEMResults objects, a list
of AVEs is returned.

References
Fornell C, Larcker DF (1981). “Evaluating structural equation models with unobservable variables
and measurement error.” Journal of Marketing Research, XVIII, 39–50.

See Also
assess(), cSEMResults
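
Examples

A minimal usage sketch, reusing the threecommonfactors data set and the three-common-factor model from the assess() examples:

model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# Measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
res <- csem(threecommonfactors, model)

# AVEs via the postestimation interface
assess(res, .quality_criterion = "ave")

# Direct call (prefix with cSEM::: if the function is not exported):
# calculateAVE(res, .only_common_factors = FALSE)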

calculateDf Degrees of freedom

Description

Calculate the degrees of freedom for a given model from a cSEMResults object.

Usage

calculateDf(
.object = NULL,
.null_model = FALSE,
...
)

Arguments

.object An R object of class cSEMResults resulting from a call to csem().


.null_model Logical. Should the degrees of freedom for the null model be computed? Defaults to FALSE.
... Ignored.

Details

Although composite-based estimators always retrieve parameters of the postulated model via the estimation of a composite model, the computation of the degrees of freedom depends on the postulated model.
See the cSEM website for details on how the degrees of freedom are calculated.
To compute the degrees of freedom of the null model use .null_model = TRUE. The degrees of freedom of the null model are identical to the number of non-redundant off-diagonal elements of the empirical indicator correlation matrix. This implicitly assumes a null model with a model-implied indicator correlation matrix equal to the identity matrix.

Value

A single numeric value.

See Also

assess(), cSEMResults
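
Examples

A minimal usage sketch; 'model' is the three-common-factor model from the assess() examples:

res <- csem(threecommonfactors, model)

# Degrees of freedom of the postulated model
assess(res, .quality_criterion = "df")
# Direct calls (prefix with cSEM::: if the function is not exported):
# calculateDf(res)
# calculateDf(res, .null_model = TRUE)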

calculatef2 Calculate Cohen's f^2

Description
Calculate the effect size for regression analysis (Cohen 1992) known as Cohen’s f^2.

Usage
calculatef2(.object = NULL)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().

Value
A matrix with as many rows as there are structural equations. The number of columns is equal to
the total number of right-hand side variables of these equations.

See Also
assess(), csem, cSEMResults
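
Examples

A minimal usage sketch; 'model' is the three-common-factor model from the assess() examples:

res <- csem(threecommonfactors, model)

# Cohen's f^2 for every structural equation
assess(res, .quality_criterion = "f2")
# Direct call (prefix with cSEM::: if the function is not exported):
# calculatef2(res)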

calculateGoF Goodness of Fit (GoF)

Description
Calculate the Goodness of Fit (GoF) proposed by Tenenhaus et al. (2004). Note that, contrary
to what the name suggests, the GoF is not a measure of model fit in the sense of SEM. See e.g.
Henseler and Sarstedt (2012) for a discussion.

Usage
calculateGoF(
.object = NULL
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().

Details
The GoF is inherently tied to the common factor model. It is therefore unclear how to meaningfully
interpret the GoF in the context of a model that contains constructs modeled as composites.

Value
A single numeric value.

References
Henseler J, Sarstedt M (2012). “Goodness-of-fit Indices for Partial Least Squares Path Modeling.” Computational Statistics, 28(2), 565–580. doi: 10.1007/s00180-012-0317-1.

Tenenhaus M, Amato S, Vinzi VE (2004). “A Global Goodness-of-Fit Index for PLS Structural Equation Modelling.” In Proceedings of the XLII SIS Scientific Meeting, 739–742.

See Also
assess(), cSEMResults
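
Examples

A minimal usage sketch; 'model' is the three-common-factor model from the assess() examples:

res <- csem(threecommonfactors, model)

assess(res, .quality_criterion = "gof")
# Direct call (prefix with cSEM::: if the function is not exported):
# calculateGoF(res)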

calculateHTMT HTMT

Description
Compute the heterotrait-monotrait ratio of correlations (HTMT) based on Henseler et al. (2015). The HTMT is a consistent estimator for the construct correlations of a tau-equivalent measurement model. It is used to assess discriminant validity.

Usage
calculateHTMT(
.object = NULL,
.absolute = TRUE,
.alpha = 0.05,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.inference = FALSE,
.only_common_factors = TRUE,
.R = 499,
.seed = NULL
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.absolute Logical. Should the absolute HTMT values be returned? Defaults to TRUE.
.alpha A numeric value or a numeric vector of significance levels. Defaults to 0.05.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop", "ignore", or "replace". If "drop", all replications/resamples yielding an inadmissible result will be dropped (i.e., the number of results returned will potentially be less than .R). For "ignore" all results are returned even if all or some of the replications yielded inadmissible results (i.e., the number of results returned is equal to .R). For "replace" resampling continues until there are exactly .R admissible solutions. Depending on the frequency of inadmissible solutions this may significantly increase computing time. Defaults to "drop".
.inference Logical. Should critical values be computed? Defaults to FALSE.
.only_common_factors
Logical. Should only concepts modeled as common factors be included when calculating one of the following quality criteria: AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates. Defaults to TRUE.
.R Integer. The number of bootstrap replications. Defaults to 499.
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!

Details
Computation of the HTMT assumes that all intra-block and inter-block correlations between indicators are either all-positive or all-negative. A warning is given if this is not the case. If all intra-block or inter-block correlations are negative the absolute HTMT values are returned (.absolute = TRUE).
To obtain the alpha%-quantile of the bootstrap distribution for each HTMT value set .inference
= TRUE.
Since the HTMT is defined with respect to a classical true score measurement model only concepts
modeled as common factors are considered by default. For concepts modeled as composites the
HTMT may be computed by setting .only_common_factors = FALSE, however, it is unclear how
to interpret values in this case.

Value
A lower triangular matrix of HTMT values. If .inference = TRUE the upper triangular part is the .alpha%-quantile of the HTMT’s bootstrap distribution.

See Also
assess(), csem, cSEMResults
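
Examples

A minimal usage sketch; 'model' is the three-common-factor model from the assess() examples:

res <- csem(threecommonfactors, model)

assess(res, .quality_criterion = "htmt")
# Direct call with bootstrap-based quantiles in the upper triangular part
# (slower, as .R bootstrap runs are required):
# cSEM:::calculateHTMT(res, .inference = TRUE, .R = 199)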

calculateVIFModeB Calculate variance inflation factors (VIF) for weights obtained by PLS
Mode B

Description
Calculate the variance inflation factor (VIF) for weights obtained by PLS-PM’s Mode B.

Usage
calculateVIFModeB(.object = NULL)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().

Details
Weight estimates obtained by Mode B can suffer from multicollinearity. VIF values are commonly
used to assess the severity of multicollinearity.
The function is only applicable to objects of class cSEMResults_default. For other object classes
use assess().

Value
A named list of vectors containing the VIF values. Each list name is the name of a construct whose
weights were obtained by Mode B. The vectors contain the VIF values obtained from a regression
of each explanatory variable of a given construct on the remaining explanatory variables of that
construct.
If the weighting approach is not "PLS-PM", or Mode B is used for none of the constructs, the function silently returns NA.


See Also
assess(), cSEMResults
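
Examples

A minimal sketch, assuming .PLS_modes can be passed through csem()'s ... argument (as .PLS_weight_scheme_inner is in the BergamiBagozzi2000 example); 'model' is the three-common-factor model from the assess() examples:

# Force Mode B for all constructs so that Mode B weights exist
res_modeB <- csem(threecommonfactors, model, .PLS_modes = "modeB")
assess(res_modeB, .quality_criterion = "vifmodeB")
# Direct call (prefix with cSEM::: if the function is not exported):
# calculateVIFModeB(res_modeB)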

calculateWeightsGSCA Calculate composite weights using GSCA

Description
Calculate composite weights using generalized structured component analysis (GSCA). The first version of this approach was presented in Hwang and Takane (2004). Since then, several advancements have been proposed. The latest version of GSCA can be found in Hwang and Takane (2014). This is the version cSEM's implementation is based on.

Usage
calculateWeightsGSCA(
.X = args_default()$.X,
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)

Arguments

.X A matrix of processed data (scaled, cleaned and ordered).


.S The (K x K) empirical indicator correlation matrix.
.csem_model A (possibly incomplete) cSEMModel-list.
.conv_criterion
Character string. The criterion to use for the convergence check. One of:
"diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute".
.iter_max Integer. The maximum number of iterations allowed. If iter_max = 1 and
.approach_weights = "PLS-PM" one-step weights are returned. If the algo-
rithm exceeds the specified number, weights of iteration step .iter_max -1 will
be returned with a warning. Defaults to 100.
.starting_values
A named list of vectors where the list names are the construct names whose
indicator weights the user wishes to set. The vectors must be named vectors
of "indicator_name" = value pairs, where value is the (scaled or unscaled)
starting weight. Defaults to NULL.
.tolerance Double. The tolerance criterion for convergence. Defaults to 1e-05.

Value

A named list. J stands for the number of constructs and K for the number of indicators.

$W A (J x K) matrix of estimated weights.


$E NULL
$Modes A named vector of modes used for the outer estimation; for GSCA the mode is automatically set to "gsca".
$Conv_status The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
$Iterations The number of iterations required.

References

Hwang H, Takane Y (2004). “Generalized Structured Component Analysis.” Psychometrika, 69(1), 81–99.

Hwang H, Takane Y (2014). Generalized Structured Component Analysis: A Component-Based Approach to Structural Equation Modeling, Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences. Chapman and Hall/CRC.
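
Examples

A minimal sketch; calculateWeightsGSCA() is normally invoked through csem() and 'model' is the three-common-factor model from the assess() examples:

# .disattenuate = FALSE keeps plain GSCA; with the default TRUE a pure
# common factor model would trigger GSCAm instead (see calculateWeightsGSCAm())
res_gsca <- csem(threecommonfactors, model,
                 .approach_weights = "GSCA",
                 .disattenuate = FALSE)
summarize(res_gsca)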

calculateWeightsGSCAm Calculate weights using GSCAm

Description
Calculate composite weights using generalized structured component analysis with uniqueness
terms (GSCAm) proposed by Hwang et al. (2017).

Usage
calculateWeightsGSCAm(
.X = args_default()$.X,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)

Arguments
.X A matrix of processed data (scaled, cleaned and ordered).
.csem_model A (possibly incomplete) cSEMModel-list.
.conv_criterion
Character string. The criterion to use for the convergence check. One of:
"diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute".
.iter_max Integer. The maximum number of iterations allowed. If iter_max = 1 and
.approach_weights = "PLS-PM" one-step weights are returned. If the algo-
rithm exceeds the specified number, weights of iteration step .iter_max -1 will
be returned with a warning. Defaults to 100.
.starting_values
A named list of vectors where the list names are the construct names whose
indicator weights the user wishes to set. The vectors must be named vectors
of "indicator_name" = value pairs, where value is the (scaled or unscaled)
starting weight. Defaults to NULL.
.tolerance Double. The tolerance criterion for convergence. Defaults to 1e-05.

Details
If there are only constructs modeled as common factors, calling csem() with .approach_weights = "GSCA" will automatically call calculateWeightsGSCAm() unless .disattenuate = FALSE. GSCAm currently only works for pure common factor models. The reason is that the implementation in cSEM is based on (the appendix of) Hwang et al. (2017). Following the appendix, GSCAm fails if there is at least one construct modeled as a composite because calculating weight estimates with GSCAm leads to a product involving the measurement matrix. This matrix does not have full rank if a construct modeled as a composite is present. The reason is that the measurement matrix has a zero row for every construct which is a pure composite (i.e., all related loadings are zero) and, therefore, leads to a non-invertible matrix when multiplying it with its transpose.

Value
A list with the elements

$W A (J x K) matrix of estimated weights.


$C The (J x K) matrix of estimated loadings.
$B The (J x J) matrix of estimated path coefficients.
$E NULL
$Modes A named vector of modes used for the outer estimation; for GSCA the mode is automatically set to "gsca".
$Conv_status The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
$Iterations The number of iterations required.

References
Hwang H, Takane Y, Jung K (2017). “Generalized Structured Component Analysis with Uniqueness
Terms for Accommodating Measurement Error.” Frontiers in Psychology, 8(2137), 1–12.
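
Examples

A minimal sketch; 'model' is the pure common factor model from the assess() examples, so GSCAm is invoked internally:

# All constructs are common factors and .disattenuate defaults to TRUE,
# hence csem() calls calculateWeightsGSCAm() behind the scenes
res_gscam <- csem(threecommonfactors, model, .approach_weights = "GSCA")
summarize(res_gscam)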

calculateWeightsKettenring
Calculate composite weights using GCCA

Description
Calculates composite weights according to one of the five criteria "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", and "GENVAR" suggested by Kettenring (1971).

Usage
calculateWeightsKettenring(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.approach_gcca = args_default()$.approach_gcca
)

Arguments
.S The (K x K) empirical indicator correlation matrix.
.csem_model A (possibly incomplete) cSEMModel-list.
.approach_gcca Character string. The Kettenring approach to use for GCCA. One of "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR" or "GENVAR". Defaults to "SUMCORR".

Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W A (J x K) matrix of estimated weights.
$E NULL
$Modes The GCCA mode used for the estimation.
$Conv_status The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
For .approach_gcca = "MINVAR" or .approach_gcca = "MAXVAR" the convergence status is
NULL since both are closed-form estimators.
$Iterations The number of iterations required. 0 for .approach_gcca = "MINVAR" or .approach_gcca
= "MAXVAR"

References
Kettenring JR (1971). “Canonical Analysis of Several Sets of Variables.” Biometrika, 58(3), 433–
451.
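
Examples

A minimal sketch; the Kettenring criteria are normally requested through csem() and 'model' is the three-common-factor model from the assess() examples:

res_maxvar <- csem(threecommonfactors, model, .approach_weights = "MAXVAR")
summarize(res_maxvar)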

calculateWeightsPCA Calculate composite weights using principal component analysis (PCA)

Description
Calculate weights for each block by extracting the first principal component of the indicator correlation matrix S_jj for each block, i.e., the weights are simply the first eigenvector of S_jj.

Usage
calculateWeightsPCA(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model
)

Arguments
.S The (K x K) empirical indicator correlation matrix.
.csem_model A (possibly incomplete) cSEMModel-list.

Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W A (J x K) matrix of estimated weights.
$E NULL
$Modes The mode used. Always "PCA".
$Conv_status NULL as there are no iterations
$Iterations 0 as there are no iterations
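
Examples

A minimal sketch; PCA weights are normally requested through csem() and 'model' is the three-common-factor model from the assess() examples:

res_pca <- csem(threecommonfactors, model, .approach_weights = "PCA")
summarize(res_pca)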

calculateWeightsPLS Calculate composite weights using PLS-PM

Description
Calculate composite weights using the partial least squares path modeling (PLS-PM) algorithm
(Wold 1975).

Usage
calculateWeightsPLS(
.data = args_default()$.data,
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.PLS_ignore_structural_model = args_default()$.PLS_ignore_structural_model,
.PLS_modes = args_default()$.PLS_modes,
.PLS_weight_scheme_inner = args_default()$.PLS_weight_scheme_inner,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)

Arguments
.data A data.frame or a matrix of standardized or unstandardized data (indicators/items/manifest variables). Possible column types or classes of the data provided are: "logical", "numeric" ("double" or "integer"), "factor" ("ordered" and/or "unordered"), "character" (converted to factor), or a mix of several types.
.S The (K x K) empirical indicator correlation matrix.
.csem_model A (possibly incomplete) cSEMModel-list.
.conv_criterion
Character string. The criterion to use for the convergence check. One of:
"diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute".
.iter_max Integer. The maximum number of iterations allowed. If iter_max = 1 and
.approach_weights = "PLS-PM" one-step weights are returned. If the algo-
rithm exceeds the specified number, weights of iteration step .iter_max -1 will
be returned with a warning. Defaults to 100.
.PLS_ignore_structural_model
Logical. Should the structural model be ignored when calculating the inner
weights of the PLS-PM algorithm? Defaults to FALSE. Ignored if .approach_weights
is not PLS-PM.
.PLS_modes Either a named list specifying the mode that should be used for each construct in the form "construct_name" = mode, a single character string giving the mode that should be used for all constructs, or NULL. Possible choices for mode are: "modeA", "modeB", "modeBNNLS", "unit", "PCA", a single integer or a vector of fixed weights of the same length as there are indicators for the construct given by "construct_name". If only a single number is provided this is identical to using unit weights, as weights are rescaled such that the related composite has unit variance. Defaults to NULL. If NULL the appropriate mode according to the type of construct used is chosen. Ignored if .approach_weights is not PLS-PM.
.PLS_weight_scheme_inner
Character string. The inner weighting scheme used by PLS-PM. One of: "centroid", "factorial", or "path". Defaults to "path". Ignored if .approach_weights is not PLS-PM.
.starting_values
A named list of vectors where the list names are the construct names whose
indicator weights the user wishes to set. The vectors must be named vectors
of "indicator_name" = value pairs, where value is the (scaled or unscaled)
starting weight. Defaults to NULL.
.tolerance Double. The tolerance criterion for convergence. Defaults to 1e-05.

Value

A named list. J stands for the number of constructs and K for the number of indicators.

$W A (J x K) matrix of estimated weights.


$E A (J x J) matrix of inner weights.
$Modes A named vector of modes used for the outer estimation.
$Conv_status The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
If one-step weights are used via .iter_max = 1 or a non-iterative procedure was used, the
convergence status is set to NULL.
$Iterations The number of iterations required.

References

Wold H (1975). “Path models with latent variables: The NIPALS approach.” In Blalock H, Aganbegian A, Borodkin F, Boudon R, Capecchi V (eds.), Quantitative Sociology, International Perspectives on Mathematical and Statistical Modeling, 307–357. Academic Press, New York.
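
Examples

A minimal sketch; 'model' is the three-common-factor model from the assess() examples:

# PLS-PM is the default weighting approach
res_pls <- csem(threecommonfactors, model)

# Customize the inner weighting scheme (see .PLS_weight_scheme_inner above)
res_pls_factorial <- csem(threecommonfactors, model,
                          .PLS_weight_scheme_inner = "factorial")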

calculateWeightsUnit Calculate composite weights using unit weights

Description

Calculate unit weights for all blocks, i.e., each indicator of a block is equally weighted.

Usage
calculateWeightsUnit(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.starting_values = args_default()$.starting_values
)

Arguments
.S The (K x K) empirical indicator correlation matrix.
.csem_model A (possibly incomplete) cSEMModel-list.
.starting_values
A named list of vectors where the list names are the construct names whose
indicator weights the user wishes to set. The vectors must be named vectors
of "indicator_name" = value pairs, where value is the (scaled or unscaled)
starting weight. Defaults to NULL.

Value
A named list. J stands for the number of constructs and K for the number of indicators.

$W A (J x K) matrix of estimated weights.


$E NULL
$Modes The mode used. Always "unit".
$Conv_status NULL as there are no iterations
$Iterations 0 as there are no iterations
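
Examples

A minimal sketch; unit weights are normally requested through csem() and 'model' is the three-common-factor model from the assess() examples:

res_unit <- csem(threecommonfactors, model, .approach_weights = "unit")
summarize(res_unit)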

csem Composite-based SEM

Description
Estimate linear, nonlinear, hierarchical or multigroup structural equation models using a composite-based approach. In cSEM any method or approach that involves linear compounds (scores/proxies/composites) of observables (indicators/items/manifest variables) is defined as composite-based. See the Get started section of the cSEM website for a general introduction to composite-based SEM and cSEM.

Usage
csem(
.data = NULL,
.model = NULL,
.approach_2ndorder = c("2stage", "mixed"),
.approach_nl = c("sequential", "replace"),
.approach_paths = c("OLS", "2SLS"),
.approach_weights = c("PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR",
                      "MINVAR", "GENVAR", "GSCA", "PCA",
                      "unit", "bartlett", "regression"),
.disattenuate = TRUE,
.id = NULL,
.instruments = NULL,
.normality = FALSE,
.reliabilities = NULL,
.starting_values = NULL,
.resample_method = c("none", "bootstrap", "jackknife"),
.resample_method2 = c("none", "bootstrap", "jackknife"),
.R = 499,
.R2 = 199,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.user_funs = NULL,
.eval_plan = c("sequential", "multiprocess"),
.seed = NULL,
.sign_change_option = c("none", "individual", "individual_reestimate",
"construct_reestimate"),
...
)

Arguments
.data A data.frame or a matrix of standardized or unstandardized data (indicators/items/manifest variables). Additionally, a list of data sets (data frames or matrices) is accepted in which case estimation is repeated for each data set. Possible column types or classes of the data provided are: "logical", "numeric" ("double" or "integer"), "factor" ("ordered" and/or "unordered"), "character" (will be converted to factor), or a mix of several types.
.model A model in lavaan model syntax or a cSEMModel list.
.approach_2ndorder
Character string. Approach used for models containing second-order constructs.
One of: "2stage", or "mixed". Defaults to "2stage".
.approach_nl Character string. Approach used to estimate nonlinear structural relationships.
One of: "sequential" or "replace". Defaults to "sequential".
.approach_paths
Character string. Approach used to estimate the structural coefficients. One of:
"OLS" or "2SLS". If "2SLS", instruments need to be supplied to .instruments.
Defaults to "OLS".
.approach_weights
Character string. Approach used to obtain composite weights. One of: "PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR", "GSCA", "PCA", "unit", "bartlett", or "regression". Defaults to "PLS-PM".
.disattenuate Logical. Should composite/proxy correlations be disattenuated to yield consistent loadings and path estimates if at least one of the constructs is modeled as a common factor? Defaults to TRUE.

.id Character string or integer. A character string giving the name or an integer of
the position of the column of .data whose levels are used to split .data into
groups. Defaults to NULL.
.instruments A named list of vectors of instruments. The names of the list elements are
the names of the dependent (LHS) constructs of the structural equation whose
explanatory variables are endogenous. The vectors contain the names of the
instruments corresponding to each equation. Note that exogenous variables of
a given equation must be supplied as instruments for themselves. Defaults to
NULL.
.normality Logical. Should joint normality of [eta_(1:p); zeta; epsilon] be assumed in the nonlinear model? See Dijkstra and Schermelleh-Engel (2014) for details. Defaults to FALSE. Ignored if the model is not nonlinear.
.reliabilities A character vector of "name" = value pairs, where value is a number between
0 and 1 and "name" a character string of the corresponding construct name, or
NULL. Reliabilities may be given for a subset of the constructs. Defaults to NULL
in which case reliabilities are estimated by csem(). Currently, only supported
for .approach_weights = "PLS-PM".
.starting_values
A named list of vectors where the list names are the construct names whose
indicator weights the user wishes to set. The vectors must be named vectors
of "indicator_name" = value pairs, where value is the (scaled or unscaled)
starting weight. Defaults to NULL.
.resample_method
Character string. The resampling method to use. One of: "none", "bootstrap" or
"jackknife". Defaults to "none".
.resample_method2
Character string. The resampling method to use when resampling from a resample. One of: "none", "bootstrap" or "jackknife". For "bootstrap" the number of draws is provided via .R2. Currently, resampling from each resample is only required for the studentized confidence interval ("CI_t_interval") computed by the infer() function. Defaults to "none".
.R Integer. The number of bootstrap replications. Defaults to 499.
.R2 Integer. The number of bootstrap replications to use when resampling from a
resample. Defaults to 199.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop", "ignore", or "replace". If "drop", all replications/resamples yielding an inadmissible result will be dropped (i.e., the number of results returned will potentially be less than .R). For "ignore" all results are returned even if all or some of the replications yielded inadmissible results (i.e., the number of results returned is equal to .R). For "replace" resampling continues until there are exactly .R admissible solutions. Depending on the frequency of inadmissible solutions this may significantly increase computing time. Defaults to "drop".
.user_funs A function or a (named) list of functions to apply to every resample. The functions must take .object as their first argument (e.g., myFun <- function(.object, ...) {body-of-the-function}). Function output should preferably be a (named) vector but matrices are also accepted. However, the output will be vectorized (column-wise) in this case. See the examples section for details.
.eval_plan Character string. The evaluation plan to use. One of "sequential" or "multiprocess". In the latter case all available cores will be used. Defaults to "sequential".
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!
.sign_change_option
Character string. Which sign change option should be used to handle flipping signs when resampling? One of "none", "individual", "individual_reestimate", or "construct_reestimate". Defaults to "none".
... Further arguments to be passed down to lower level functions of csem(). See
args_csem_dotdotdot for a complete list of available arguments.

Details
csem() estimates linear, nonlinear, hierarchical or multigroup structural equation models using a
composite-based approach.

Data and model:: The .data and .model arguments are required. .data must be given a matrix or a data.frame with column names matching the indicator names used in the model description. Alternatively, a list of data sets (matrices or data frames) may be provided in which case estimation is repeated for each data set. Possible column types/classes of the data provided are: "logical", "numeric" ("double" or "integer"), "factor" ("ordered" and/or "unordered"), "character", or a mix of several types. Character columns will be treated as (unordered) factors. Depending on the type/class of the indicator data provided, cSEM computes the indicator correlation matrix in different ways. See calculateIndicatorCor() for details.
In the current version .data must not contain missing values. Future versions are likely to handle missing values as well.
To provide a model use the lavaan model syntax. Note, however, that cSEM currently only supports the "standard" lavaan model syntax (Types 1, 2, 3, and 7 as described on the help page). Therefore, specifying e.g. a threshold or scaling factors is ignored. Alternatively, a standardized (possibly incomplete) cSEMModel-list may be supplied. See parseModel() for details.

Weights and path coefficients:: By default weights are estimated using the partial least squares path modeling algorithm ("PLS-PM"). A range of alternative weighting algorithms may be supplied to .approach_weights. Currently, the following approaches are implemented:
1. (Default) Partial least squares path modeling ("PLS-PM"). The algorithm can be customized. See calculateWeightsPLS() for details.
2. Generalized structured component analysis ("GSCA") and generalized structured component analysis with uniqueness terms (GSCAm). The algorithms can be customized. See calculateWeightsGSCA() and calculateWeightsGSCAm() for details. Note that GSCAm is called indirectly when the model contains constructs modeled as common factors only and .disattenuate = TRUE. See below.
3. Generalized canonical correlation analysis (GCCA), including "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", and "GENVAR".
4. Principal component analysis ("PCA").
5. Factor score regression using sum scores ("unit"), regression ("regression") or Bartlett scores ("bartlett").
It is possible to supply starting values for the weighting algorithm via .starting_values. The argument accepts a named list of vectors where the list names are the construct names whose indicator weights the user wishes to set. The vectors must be named vectors of "indicator_name" = value pairs, where value is the starting weight. See the examples section below for details.
Composite-indicator and composite-composite correlations are properly disattenuated by default to yield consistent loadings, construct correlations, and path coefficients if any of the concepts are modeled as a common factor.
For PLS-PM disattenuation is done using PLSc (Dijkstra and Henseler 2015). For GSCA disattenuation is done implicitly by using GSCAm (Hwang et al. 2017). Weights obtained by GCCA, unit, regression, Bartlett or PCA are disattenuated using Croon's approach (Croon 2002). Disattenuation may be suppressed by setting .disattenuate = FALSE. Note, however, that quantities in this case are inconsistent estimates of their construct level counterparts if any of the constructs in the structural model are modeled as a common factor!
By default path coefficients are estimated using ordinary least squares (.approach_paths = "OLS"). For linear models, two-stage least squares ("2SLS") is available, however, only if instruments are internal, i.e., part of the structural model. Future versions will add support for external instruments if possible. Instruments must be supplied to .instruments as a named list where the names of the list elements are the names of the dependent constructs of the structural equations whose explanatory variables are believed to be endogenous. The list consists of vectors of names of instruments corresponding to each equation. Note that exogenous variables of a given equation must be supplied as instruments for themselves.
If reliabilities are known they can be supplied as "name" = value pairs to .reliabilities, where value is a numeric value between 0 and 1. Currently, this is only supported for "PLS-PM".
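
To illustrate the .starting_values format, a short sketch (the starting weights shown are arbitrary; 'model' is the three-common-factor model from the assess() examples):
res_start <- csem(threecommonfactors, model,
                  .starting_values = list(eta1 = c(y11 = 0.8, y12 = 1, y13 = 1)))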

Nonlinear models:: If the model contains nonlinear terms csem() estimates a polynomial structural equation model using a non-iterative method of moments approach described in Dijkstra and Schermelleh-Engel (2014). Nonlinear terms include interactions and exponential terms. The latter is described in model syntax as an "interaction with itself", e.g., xi^3 = xi.xi.xi. Currently only exponential terms up to a power of three (e.g., three-way interactions or cubic terms) are allowed.
The current version of the package allows two kinds of estimation: estimation of the reduced form equation (.approach_nl = "replace") and sequential estimation (.approach_nl = "sequential", the default). The latter does not allow for multivariate normality of all exogenous variables, i.e., the latent variables and the error terms.
Distributional assumptions are kept to a minimum (an i.i.d. sample from a population with finite moments for the relevant order); for higher order models that go beyond interaction, we work in this version with the assumption that, as far as the relevant moments are concerned, certain combinations of measurement errors behave as if they were Gaussian. For details see Dijkstra and Schermelleh-Engel (2014).

Second-order model: Second-order models are specified using the operators =~ and <~. These
operators are usually used with indicators on their right-hand side. For second-order models the
right-hand side variables are constructs instead. If c1, and c2 are constructs forming or measuring
a higher order construct, a model would look like this:
my_model <- "

# Structural model
SAT ~ QUAL
VAL ~ SAT

# Measurement/composite model
QUAL =~ qual1 + qual2
SAT =~ sat1 + sat2

c1 =~ x11 + x12
c2 =~ x21 + x22

# Second-order term (in this case a second-order composite built by common factors)
VAL <~ c1 + c2
"
Currently, two approaches are explicitly implemented:
• (Default) "2stage". The (disjoint) two stage approach as proposed by Agarwal and Kara-
hanna (2000).
• "mixed". The mixed repeated indicators/two-stage approach as proposed by Ringle et al.
(2012).
The repeated indicators approach as proposed by Joereskog and Wold (1982) and the extension
proposed by Becker et al. (2012) are not directly implemented as they simply require a respec-
ification of the model. In the above example the repeated indicators approach would require changing
the model and appending the repeated indicators to the data supplied to .data. Note that
the indicators need to be renamed in this case as csem() does not allow for one indicator to be
attached to multiple constructs.
my_model <- "
# Structural model
SAT ~ QUAL
VAL ~ SAT

VAL ~ c1 + c2

# Measurement/composite model
QUAL =~ qual1 + qual2
SAT =~ sat1 + sat2
VAL =~ x11_temp + x12_temp + x21_temp + x22_temp

c1 =~ x11 + x12
c2 =~ x21 + x22
"
According to the extended approach indirect effects of QUAL on VAL via c1 and c2 would have to
be specified as well.

Multigroup analysis: To perform multigroup analysis provide either a list of data sets or one data
set containing a group-identifier-column whose column name must be provided to .id. Values of
this column are taken as levels of a factor and are interpreted as group identifiers. csem() will

split the data by levels of that column and run the estimation for each level separately. Note that
the more levels the group-identifier-column has, the more estimation runs are required. This can
considerably slow down estimation, especially if resampling is requested. For the latter it will
generally be faster to use .eval_plan = "multiprocess".
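
For illustration, a minimal sketch is given below; the grouping column added to the
threecommonfactors data is artificial and carries no substantive meaning.

model_mga <- "
eta2 ~ eta1
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
"

# Add an artificial group identifier column
dat_grouped <- data.frame(threecommonfactors,
                          group = rep(c("A", "B"), length.out = nrow(threecommonfactors)))

res_mga <- csem(.data = dat_grouped, .model = model_mga, .id = "group")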

Inference: Inference is done via resampling. See resamplecSEMResults() and infer() for
details.

Value
An object of class cSEMResults with methods for all postestimation generics. Technically, a
call to csem() results in an object with at least two class attributes. The first class attribute
is always cSEMResults. The second is one of cSEMResults_default, cSEMResults_multi, or
cSEMResults_2ndorder and depends on the estimated model and/or the type of data provided to
the .model and .data arguments. The third class attribute cSEMResults_resampled is only added
if resampling was conducted. For details, see the cSEMResults helpfile.

Postestimation
assess() Assess results using common quality criteria, e.g., reliability, fit measures, HTMT, R2
etc.
infer() Calculate common inferential quantities, e.g., standard errors, confidence intervals.
predict() Predict endogenous indicator scores and compute common prediction metrics.
summarize() Summarize the results. Mainly called for its side-effect the print method.
verify() Verify/Check admissibility of the estimates.
Tests are performed using the test-family of functions. Currently the following tests are imple-
mented:
testOMF() Bootstrap-based test for overall model fit based on Beran and Srivastava (1985)
testMICOM() Permutation-based test for measurement invariance of composites proposed by Henseler
et al. (2016)
testMGD() Several (mainly) permutation-based tests for multi-group comparisons.
testHausman() Regression-based Hausman test to test for endogeneity.
Other miscellaneous postestimation functions belong to the do-family of functions. Currently two
do functions are implemented:
doFloodlightAnalysis() Perform a floodlight analysis as described in Spiller et al. (2013)
doRedundancyAnalysis() Perform a redundancy analysis (RA) as proposed by Hair et al. (2016)
with reference to Chin (1998)

References
Agarwal R, Karahanna E (2000). “Time Flies When You’re Having Fun: Cognitive Absorption and
Beliefs about Information Technology Usage.” MIS Quarterly, 24(4), 665.

Becker J, Klein K, Wetzels M (2012). “Hierarchical Latent Variable Models in PLS-SEM: Guide-
lines for Using Reflective-Formative Type Models.” Long Range Planning, 45(5-6), 359–394.
doi: 10.1016/j.lrp.2012.10.001.

Beran R, Srivastava MS (1985). “Bootstrap Tests and Confidence Regions for Functions of a Co-
variance Matrix.” The Annals of Statistics, 13(1), 95–115. doi: 10.1214/aos/1176346579.

Chin WW (1998). “Modern Methods for Business Research.” In Marcoulides GA (ed.), chap-
ter The Partial Least Squares Approach to Structural Equation Modeling, 295–358. Mahwah, NJ:
Lawrence Erlbaum.

Croon M (2002). “Using predicted latent scores in general latent structure models.” In Marcoulides
G, Moustaki I (eds.), Latent Variable and Latent Structure Models, chapter 10, 195–224. Lawrence
Erlbaum. ISBN 080584046X, Pagination: 288.

Dijkstra TK, Henseler J (2015). “Consistent and Asymptotically Normal PLS Estimators for Linear
Structural Equations.” Computational Statistics & Data Analysis, 81, 10–23.

Dijkstra TK, Schermelleh-Engel K (2014). “Consistent Partial Least Squares For Nonlinear Struc-
tural Equation Models.” Psychometrika, 79(4), 585–604.

Hair JF, Hult GTM, Ringle C, Sarstedt M (2016). A Primer on Partial Least Squares Structural
Equation Modeling (PLS-SEM). Sage publications.

Henseler J, Ringle CM, Sarstedt M (2016). “Testing Measurement Invariance of Composites Using
Partial Least Squares.” International Marketing Review, 33(3), 405–431.
doi: 10.1108/IMR-09-2014-0304.

Hwang H, Takane Y, Jung K (2017). “Generalized Structured Component Analysis with Unique-
ness Terms for Accommodating Measurement Error.” Frontiers in Psychology, 8(2137), 1–12.

Joereskog KG, Wold HO (1982). Systems under Indirect Observation: Causality, Structure, Pre-
diction - Part II, volume 139. North Holland.

Ringle CM, Sarstedt M, Straub D (2012). “A Critical Look at the Use of PLS-SEM in MIS Quar-
terly.” MIS Quarterly, 36(1), iii–xiv.

Spiller SA, Fitzsimons GJ, Lynch JG, Mcclelland GH (2013). “Spotlights, Floodlights, and the
Magic Number Zero: Simple Effects Tests in Moderated Regression.” Journal of Marketing Re-
search, 50(2), 277–288. doi: 10.1509/jmr.12.0420.

See Also
args_default(), cSEMArguments, cSEMResults, foreman(), resamplecSEMResults(), assess(),
infer(), predict(), summarize(), verify(), testOMF(), testMGD(), testMICOM(), testHausman()

Examples
# ===========================================================================
# Basic usage
# ===========================================================================

### Linear model ------------------------------------------------------------


# Most basic usage requires a dataset and a model. We use the
# `threecommonfactors` dataset.

## Take a look at the dataset


#?threecommonfactors

## Specify the (correct) model


model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

## Estimate
res <- csem(threecommonfactors, model)

## Postestimation
verify(res)
summarize(res)
assess(res)

# Notes:
# 1. By default no inferential quantities (e.g. Std. errors, p-values, or
# confidence intervals) are calculated. Use resampling to obtain
# inferential quantities. See "Resampling" in the "Extended usage"
# section below.
# 2. `summarize()` prints the full output by default. For a more condensed
# output use:
print(summarize(res), .full_output = FALSE)

## Dealing with endogeneity -------------------------------------------------

# See: ?testHausman()

### Models containing second-order constructs ------------------------------


## Take a look at the dataset
#?dgp_2ndorder_cf_of_c

model <- "


# Path model / Regressions
c4 ~ eta1
eta2 ~ eta1 + c4

# Measurement/composite model


c1 <~ y11 + y12
c2 <~ y21 + y22 + y23 + y24
c3 <~ y31 + y32 + y33 + y34 + y35 + y36 + y37 + y38

eta1 =~ y41 + y42 + y43


eta2 =~ y51 + y52 + y53

# Second-order common factor (built by the composites c1, c2, c3)


c4 =~ c1 + c2 + c3
"

res_2stage <- csem(dgp_2ndorder_cf_of_c, model, .approach_2ndorder = "2stage")


res_mixed <- csem(dgp_2ndorder_cf_of_c, model, .approach_2ndorder = "mixed")

# The standard repeated indicators approach is done by 1.) respecifying the model
# and 2.) adding the repeated indicators to the data set

# 1.) Respecify the model


model_RI <- "
# Path model / Regressions
c4 ~ eta1
eta2 ~ eta1 + c4
c4 ~ c1 + c2 + c3

# Measurement/composite model


c1 <~ y11 + y12
c2 <~ y21 + y22 + y23 + y24
c3 <~ y31 + y32 + y33 + y34 + y35 + y36 + y37 + y38
eta1 =~ y41 + y42 + y43
eta2 =~ y51 + y52 + y53

# c4 is a common factor measured by composites


c4 =~ y11_temp + y12_temp + y21_temp + y22_temp + y23_temp + y24_temp +
y31_temp + y32_temp + y33_temp + y34_temp + y35_temp + y36_temp +
y37_temp + y38_temp
"

# 2.) Update data set


data_RI <- dgp_2ndorder_cf_of_c
coln <- c(colnames(data_RI), paste0(colnames(data_RI), "_temp"))
data_RI <- data_RI[, c(1:ncol(data_RI), 1:ncol(data_RI))]
colnames(data_RI) <- coln

# Estimate
res_RI <- csem(data_RI, model_RI)
summarize(res_RI)

### Multigroup analysis -----------------------------------------------------

# See: ?testMGD()

# ===========================================================================
# Extended usage
# ===========================================================================
# `csem()` provides defaults for all arguments except `.data` and `.model`.
# Below some common options/tasks that users are likely to be interested in.
# We use the threecommonfactors data set again:

model <- "


# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

### PLS vs PLSc and disattenuation


# In the model all concepts are modeled as common factors. If
# .approach_weights = "PLS-PM", csem() uses PLSc to disattenuate composite-indicator
# and composite-composite correlations.
res_plsc <- csem(threecommonfactors, model, .approach_weights = "PLS-PM")
res_plsc$Information$Model$construct_type # all common factors

# To obtain "original" (inconsistent) PLS estimates use `.disattenuate = FALSE`


res_pls <- csem(threecommonfactors, model,
.approach_weights = "PLS-PM",
.disattenuate = FALSE
)

s_plsc <- summarize(res_plsc)


s_pls <- summarize(res_pls)

# Compare
data.frame(
"Path" = s_plsc$Estimates$Path_estimates$Name,
"Pop_value" = c(0.6, 0.4, 0.35), # see ?threecommonfactors
"PLSc" = s_plsc$Estimates$Path_estimates$Estimate,
"PLS" = s_pls$Estimates$Path_estimates$Estimate
)

### Resampling --------------------------------------------------------------


## Not run:
## Basic resampling
res_boot <- csem(threecommonfactors, model, .resample_method = "bootstrap")
res_jack <- csem(threecommonfactors, model, .resample_method = "jackknife")

# See ?resamplecSEMResults for more examples

### Choosing a different weighting scheme ----------------------------------

res_gscam <- csem(threecommonfactors, model, .approach_weights = "GSCA")


res_gsca <- csem(threecommonfactors, model,
.approach_weights = "GSCA",
.disattenuate = FALSE
)

s_gscam <- summarize(res_gscam)



s_gsca <- summarize(res_gsca)

# Compare
data.frame(
"Path" = s_gscam$Estimates$Path_estimates$Name,
"Pop_value" = c(0.6, 0.4, 0.35), # see ?threecommonfactors
"GSCAm" = s_gscam$Estimates$Path_estimates$Estimate,
"GSCA" = s_gsca$Estimates$Path_estimates$Estimate
)
## End(Not run)
### Fine-tuning a weighting scheme ------------------------------------------
## Setting starting values

sv <- list("eta1" = c("y12" = 10, "y13" = 4, "y11" = 1))


res <- csem(threecommonfactors, model, .starting_values = sv)

## Choosing a different inner weighting scheme


#?args_csem_dotdotdot

res <- csem(threecommonfactors, model, .PLS_weight_scheme_inner = "factorial",


.PLS_ignore_structural_model = TRUE)

## Choosing different modes for PLS


# By default, concepts modeled as common factors use PLS Mode A weights.
modes <- list("eta1" = "unit", "eta2" = "modeB", "eta3" = "unit")
res <- csem(threecommonfactors, model, .PLS_modes = modes)
summarize(res)

dgp_2ndorder_cf_of_c Data: Second order common factor of composites

Description
A dataset containing 500 standardized observations on 19 indicators generated from a population
model with 6 concepts, three of which (c1-c3) are composites forming a second-order common
factor (c4). The remaining two (eta1, eta2) are concepts modeled as common factors.

Usage
dgp_2ndorder_cf_of_c

Format
A matrix with 500 rows and 19 variables:
y11-y12 Indicators attached to c1. Population weights are: 0.8; 0.4. Population loadings are:
0.925; 0.65
y21-y24 Indicators attached to c2. Population weights are: 0.5; 0.3; 0.2; 0.4. Population loadings
are: 0.804; 0.68; 0.554; 0.708

y31-y38 Indicators attached to c3. Population weights are: 0.3; 0.3; 0.1; 0.1; 0.2; 0.3; 0.4; 0.2.
Population loadings are: 0.496; 0.61; 0.535; 0.391; 0.391; 0.6; 0.5285; 0.53
y41-y43 Indicators attached to eta1. Population loadings are: 0.8; 0.7; 0.7
y51-y53 Indicators attached to eta2. Population loadings are: 0.8; 0.8; 0.7
The model is:

  c4   = gamma1 * eta1 + zeta1
  eta2 = gamma2 * eta1 + beta * c4 + zeta2

with population values gamma1 = 0.6, gamma2 = 0.4 and beta = 0.35. The second-order common
factor is

  c4 = lambda_c1 * c1 + lambda_c2 * c2 + lambda_c3 * c3 + epsilon

distance_measures Calculate difference between S and Sigma_hat

Description
Calculate the difference between the empirical (S) and the model-implied indicator variance-covariance
matrix (Sigma_hat) using different distance measures.

Usage
calculateDG(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)

calculateDL(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)

calculateDML(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.matrix1 A matrix to compare.
.matrix2 A matrix to compare.
.saturated Logical. Should a saturated structural model be used? Defaults to FALSE.
... Ignored.

Details
The distances may also be computed for any two matrices A and B by supplying A and B directly
via the .matrix1 and .matrix2 arguments. If A and B are supplied, .object is ignored.

Value
A single numeric value giving the distance between two matrices.

Functions
• calculateDG: The geodesic distance (dG).
• calculateDL: The squared Euclidean distance.
• calculateDML: The distance measure (fit function) used by ML.
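
For illustration, a minimal sketch using two toy correlation matrices (the matrices are arbitrary
and purely illustrative):

A <- diag(3)
B <- diag(3); B[1, 2] <- B[2, 1] <- 0.3

calculateDL(.matrix1 = A, .matrix2 = B)   # squared Euclidean distance
calculateDG(.matrix1 = A, .matrix2 = B)   # geodesic distance
calculateDML(.matrix1 = A, .matrix2 = B)  # ML fit function
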

doFloodlightAnalysis Do a floodlight analysis

Description
Calculate the effect of an independent variable on a dependent variable conditional on the values of
a (continuous) moderator variable to perform a floodlight analysis (Spiller et al. 2013). Moreover,
the Johnson-Neyman points are calculated, i.e., the value(s) of the moderator for which the lower or
upper boundary of the confidence interval of the effect estimate of the independent variable on the
dependent variable switches signs.

Usage
doFloodlightAnalysis(
.object = NULL,
.alpha = 0.05,
.dependent = NULL,
.moderator = NULL,
.independent = NULL,
.n_steps = 100
)

Arguments

.object An R object of class cSEMResults resulting from a call to csem().


.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.dependent Character string. The name of the dependent variable. Defaults to NULL.
.moderator Character string. The name of the moderator variable. Defaults to NULL.
.independent Character string. The name of the independent variable. Defaults to NULL.
.n_steps Integer. A numeric value giving the number of steps, e.g., in surface analysis or
floodlight analysis the spotlights (= values of .moderator) between min(.moderator)
and max(.moderator) to use. Defaults to 100.

Value

A list of class cSEMFloodlight with a corresponding method for plot(). See: plot.cSEMFloodlight().

References

Spiller SA, Fitzsimons GJ, Lynch JG, Mcclelland GH (2013). “Spotlights, Floodlights, and the
Magic Number Zero: Simple Effects Tests in Moderated Regression.” Journal of Marketing Re-
search, 50(2), 277–288. doi: 10.1509/jmr.12.0420.

See Also

csem(), cSEMResults, plot.cSEMFloodlight()
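
For illustration, a minimal sketch is given below. It assumes a model containing the interaction
eta1.eta2, estimated on the threecommonfactors data with bootstrap resamples (needed for the
confidence intervals); the construct roles are purely illustrative.

model_nl <- "
eta2 ~ eta1
eta3 ~ eta1 + eta2 + eta1.eta2
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

res_nl <- csem(threecommonfactors, model_nl, .resample_method = "bootstrap")

out <- doFloodlightAnalysis(res_nl,
                            .dependent   = "eta3",
                            .independent = "eta2",
                            .moderator   = "eta1")
plot(out)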

doRedundancyAnalysis Do a redundancy analysis

Description

Perform a redundancy analysis (RA) as proposed by Hair et al. (2016) with reference to Chin
(1998).

Usage

doRedundancyAnalysis(.object = NULL)

Arguments

.object An R object of class cSEMResults resulting from a call to csem().



Details
RA is confined to PLS-PM, specifically PLS-PM with at least one construct whose weights are
obtained by mode B. In cSEM this is the case if the construct is modeled as a composite or if
argument .PLS_modes was explicitly set to mode B for at least one construct. Hence RA is only
conducted if .approach_weights = "PLS-PM" and if at least one construct’s mode is mode B.
The principal idea of RA is to take two different measures of the same construct and regress the
scores obtained for each measure on each other. If they are similar they are likely to measure the
same "thing" which is then taken as evidence that both measurement models actually measure what
they are supposed to measure (validity).
There are several issues with the terminology and the reasoning behind this logic. RA is there-
fore only implemented since reviewers are likely to demand its computation; however, its actual
application for validity assessment is discouraged.
Currently, the function is not applicable to models containing second-order constructs.

Value
A named numeric vector of correlations. If the weighting approach used to obtain .object is not
"PLS-PM" or non of the PLS outer modes was mode B, the function silently returns NA.

References
Chin WW (1998). “Modern Methods for Business Research.” In Marcoulides GA (ed.), chap-
ter The Partial Least Squares Approach to Structural Equation Modeling, 295–358. Mahwah, NJ:
Lawrence Erlbaum.

Hair JF, Hult GTM, Ringle C, Sarstedt M (2016). A Primer on Partial Least Squares Structural
Equation Modeling (PLS-SEM). Sage publications.

See Also
cSEMResults
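
For illustration, a minimal sketch is given below; modeling the threecommonfactors concepts as
composites (and hence mode B) is purely illustrative.

model_c <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# Composite (mode B) measurement model
eta1 <~ y11 + y12 + y13
eta2 <~ y21 + y22 + y23
eta3 <~ y31 + y32 + y33
"

res_c <- csem(threecommonfactors, model_c, .approach_weights = "PLS-PM")
doRedundancyAnalysis(res_c)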

doSurfaceAnalysis Do a surface analysis

Description
Based on a nonlinear model, the dependent variable of a certain equation is predicted by two inde-
pendent variables, i.e., .independent_1 and .independent_2, including their higher-order terms.

Usage
doSurfaceAnalysis(
.object = NULL,
.alpha = 0.05,
.dependent = NULL,

.independent_1 = NULL,
.independent_2 = NULL,
.n_steps = 100
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.dependent Character string. The name of the dependent variable. Defaults to NULL.
.independent_1 Character string. The name of the first independent variable. Defaults to NULL.
.independent_2 Character string. The name of the second independent variable. Defaults to
NULL.
.n_steps Integer. A numeric value giving the number of steps, e.g., in surface analysis or
floodlight analysis the spotlights (= values of .moderator) between min(.moderator)
and max(.moderator) to use. Defaults to 100.

Value
A list of class cSEMSurface with a corresponding method for plot(). See: plot.cSEMSurface().

See Also
csem(), cSEMResults, plot.cSEMSurface()

fit Model-implied indicator or construct variance-covariance matrix

Description
Calculate the model-implied indicator or construct variance-covariance (VCV) matrix. Currently
only the model-implied VCV for recursive linear models is implemented (including models con-
taining second order constructs).

Usage
fit(
.object = NULL,
.saturated = args_default()$.saturated,
.type_vcv = args_default()$.type_vcv
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.saturated Logical. Should a saturated structural model be used? Defaults to FALSE.
.type_vcv Character string. Which model-implied correlation matrix is calculated? One of
"indicator" or "construct". Defaults to "indicator".

Details
Notation is taken from Bollen (1989). If .saturated = TRUE the model-implied variance-covariance
matrix is calculated for a saturated structural model (i.e., the VCV of the constructs is replaced by
their correlation matrix). Hence: V(eta) = WSW’ (possibly disattenuated).

Value
Either a (K x K) matrix or a (J x J) matrix depending on .type_vcv.

References
Bollen KA (1989). Structural Equations with Latent Variables. Wiley-Interscience. ISBN 978-
0471011712.

See Also
csem(), foreman(), cSEMResults
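
For illustration, a minimal sketch using the threecommonfactors example model:

model <- "
eta2 ~ eta1
eta3 ~ eta1 + eta2
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

res <- csem(threecommonfactors, model)

fit(res)                           # model-implied indicator VCV
fit(res, .type_vcv = "construct")  # model-implied construct VCV
fit(res, .saturated = TRUE)        # saturated structural model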

fit_measures Model fit measures

Description
Calculate fit measures.

Usage
calculateChiSquare(.object)

calculateChiSquareDf(.object)

calculateCFI(.object)

calculateGFI(.object, .type = c("ML", "ULS"))

calculateIFI(.object)

calculateNFI(.object)

calculateNNFI(.object)

calculateRMSEA(.object)

calculateRMSTheta(.object)

calculateSRMR(
.object = NULL,

.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)

Arguments

.object An R object of class cSEMResults resulting from a call to csem().


.type Character string. Which fitting function should the GFI be based on? One of
"ML" for the maximum likelihood fitting function or "ULS" for the unweighted
least squares fitting function (same as the squared Euclidean distance). Defaults
to "ML".
.matrix1 A matrix to compare.
.matrix2 A matrix to compare.
.saturated Logical. Should a saturated structural model be used? Defaults to FALSE.
... Ignored.

Details

See the Fit indices section of the cSEM website for details on the implementation.

Value

A single numeric value.

Functions

• calculateChiSquare: The chi square statistic.


• calculateChiSquareDf: The ChiSquare statistic divided by its degrees of freedom.
• calculateCFI: The comparative fit index (CFI).
• calculateGFI: The goodness of fit index (GFI).
• calculateIFI: The incremental fit index (IFI).
• calculateNFI: The normed fit index (NFI).
• calculateNNFI: The non-normed fit index (NNFI). Also called the Tucker-Lewis index (TLI).
• calculateRMSEA: The root mean square error of approximation (RMSEA).
• calculateRMSTheta: The root mean squared residual covariance matrix of the outer model
residuals (RMS theta).
• calculateSRMR: The standardized root mean square residual (SRMR).
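
For illustration, a minimal sketch based on the threecommonfactors example model:

res <- csem(threecommonfactors, "
  eta2 ~ eta1
  eta3 ~ eta1 + eta2
  eta1 =~ y11 + y12 + y13
  eta2 =~ y21 + y22 + y23
  eta3 =~ y31 + y32 + y33
")

calculateSRMR(res)
calculateGFI(res, .type = "ULS")
calculateRMSEA(res)
calculateChiSquare(res)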

getConstructScores Get construct scores

Description

Get the standardized or unstandardized construct scores.

Usage

getConstructScores(
.object = NULL,
.standardized = TRUE
)

Arguments

.object An R object of class cSEMResults resulting from a call to csem().


.standardized Logical. Should standardized scores be returned? Defaults to TRUE.

Value

A matrix of construct scores.

See Also

csem(), cSEMResults
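
For illustration, a minimal sketch based on the threecommonfactors example model:

res <- csem(threecommonfactors, "
  eta2 ~ eta1
  eta3 ~ eta1 + eta2
  eta1 =~ y11 + y12 + y13
  eta2 =~ y21 + y22 + y23
  eta3 =~ y31 + y32 + y33
")

getConstructScores(res)                         # standardized scores
getConstructScores(res, .standardized = FALSE)  # unstandardized scores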

ITFlex Data: ITFlex

Description

A data frame containing 16 variables with 100 observations.

Usage

ITFlex

Format
A data frame containing the following variables:

ITCOMP1 Software applications can be easily transported and used across multiple platforms.
ITCOMP2 Our firm provides multiple interfaces or entry points (e.g., web access) for external end
users.
ITCOMP3 Our firm establishes corporate rules and standards for hardware and operating systems to
ensure platform compatibility.
ITCOMP4 Data captured in one part of our organization are immediately available to everyone in the
firm.
ITCONN1 Our organization has electronic links and connections throughout the entire firm.
ITCONN2 Our firm is linked to business partners through electronic channels (e.g., websites, e-mail,
wireless devices, electronic data interchange).
ITCONN3 All remote, branch, and mobile offices are connected to the central office.
ITCONN4 There are very few identifiable communications bottlenecks within our firm.
MOD1 Our firm possesses a great speed in developing new business applications or modifying existing
applications.
MOD2 Our corporate database is able to communicate in several different protocols.
MOD3 Reusable software modules are widely used in new systems development.
MOD4 IT personnel use object-oriented and prepackaged modular tools to create software applica-
tions.
ITPSF1 Our IT personnel have the ability to work effectively in cross-functional teams.
ITPSF2 Our IT personnel are able to interpret business problems and develop appropriate technical
solutions.
ITPSF3 Our IT personnel are self-directed and proactive.
ITPSF4 Our IT personnel are knowledgeable about the key success factors in our firm.

Details
The dataset was studied by Benitez et al. (2018) and is used in Henseler (2020) for demonstration
purposes (Tutorial 8). All questionnaire items are measured on a 5-point scale.

Source
The data was collected through a survey by Benitez et al. (2018).

References
Benitez J, Ray G, Henseler J (2018). “Impact of Information Technology Infrastructure Flexibility
on Mergers and Acquisitions.” MIS Quarterly, 42(1), 25–43.

Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial
Least Squares & Co. Using ADANCO. Guilford Press.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_IT_Fex="
# Composite models
ITComp <~ ITCOMP1 + ITCOMP2 + ITCOMP3 + ITCOMP4
Modul <~ MOD1 + MOD2 + MOD3 + MOD4
ITConn <~ ITCONN1 + ITCONN2 + ITCONN3 + ITCONN4
ITPers <~ ITPSF1 + ITPSF2 + ITPSF3 + ITPSF4

# Saturated structural model


ITPers ~ ITComp + Modul + ITConn
Modul ~ ITComp + ITConn
ITConn ~ ITComp
"

out <- csem(.data = ITFlex, .model = model_IT_Fex,


.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06,
.PLS_ignore_structural_model = TRUE)

LancelotMiltgenetal2016
Data: LancelotMiltgenetal2016

Description

A data frame containing 11 variables with 1090 observations.

Usage

LancelotMiltgenetal2016

Format

An object of class data.frame with 1090 rows and 11 columns.

Details

The data was analysed by Lancelot-Miltgen et al. (2016) to study young consumers’ adoption
intentions of a location tracker technology in the light of privacy concerns. It is also used in Henseler
(2020) for demonstration purposes (Tutorial 9).

Source

This data has been collected through a cooperation with the European Commission Joint Research
Center Institute for Prospective Technological Studies, contract “Young People and Emerging Dig-
ital Services: An Exploratory Survey on Motivations, Perceptions, and Acceptance of Risk” (EC
JRC Contract IPTS No: 150876-2007 F1ED-FR).

References

Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial
Least Squares & Co. Using ADANCO. Guilford Press.

Lancelot-Miltgen C, Henseler J, Gelhard C, Popovic A (2016). “Introducing new products that
affect consumer privacy: A mediation model.” Journal of Business Research, 69(10), 4659–4666.
doi: 10.1016/j.jbusres.2016.04.015.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_Med <- "
# Reflective measurement model
Trust =~ trust1 + trust2
PrCon =~ privcon1 + privcon2 + privcon3 + privcon4
Risk =~ risk1 + risk2 + risk3
Int =~ intent1 + intent2

# Structural model
Int ~ Trust + PrCon + Risk
Risk ~ Trust + PrCon
Trust ~ PrCon
"

out <- csem(.data = LancelotMiltgenetal2016, .model = model_Med,


.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06
)

parseModel Parse lavaan model

Description

Turns a model written in lavaan model syntax into a cSEMModel list.



Usage
parseModel(
.model = NULL,
.instruments = NULL,
.check_errors = TRUE
)

Arguments
.model A model in lavaan model syntax or a cSEMModel list.
.instruments A named list of vectors of instruments. The names of the list elements are
the names of the dependent (LHS) constructs of the structural equation whose
explanatory variables are endogenous. The vectors contain the names of the
instruments corresponding to each equation. Note that exogenous variables of
a given equation must be supplied as instruments for themselves. Defaults to
NULL.
.check_errors Logical. Should the model to parse be checked for correctness in a sense that all
necessary components to estimate the model are given? Defaults to TRUE.

Details
Instruments must be supplied separately as a named list of vectors of instruments. The names of the
list elements are the names of the dependent constructs of the structural equation whose explanatory
variables are endogenous. The vectors contain the names of the instruments corresponding to each
equation. Note that exogenous variables of a given equation must be supplied as instruments for
themselves.
By default parseModel() attempts to check if the model provided is correct in a sense that all
necessary components required to estimate the model are specified (e.g., a construct of the structural
model has at least 1 item). To prevent checking for errors use .check_errors = FALSE.

Value
An object of class cSEMModel is a standardized list containing the following components. J stands
for the number of constructs and K for the number of indicators.

$structural A matrix mimicking the structural relationship between constructs. If constructs are
only linearly related, structural is of dimension (J x J) with row- and column names equal
to the construct names. If the structural model contains nonlinear relationships structural is
(J x (J + J*)) where J* is the number of nonlinear terms. Rows are ordered such that exogenous
constructs are always first, followed by constructs that only depend on exogenous constructs
and/or previously ordered constructs.
$measurement A (J x K) matrix mimicking the measurement/composite relationship between con-
structs and their related indicators. Rows are in the same order as the matrix $structural with
row names equal to the construct names. The order of the columns is such that $measurement
forms a block diagonal matrix.
$error_cor A (K x K) matrix mimicking the measurement error correlation relationship. The row
and column order is identical to the column order of $measurement.

$cor_specified A matrix indicating the correlation relationships between any variables of the model
as specified by the user. Mainly for internal purposes. Note that $cor_specified may also
contain inadmissible correlations such as a correlation between measurement errors, indicators,
and constructs.
$construct_type A named vector containing the names of each construct and their respective type
("Common factor" or "Composite").
$construct_order A named vector containing the names of each construct and their respective order
("First order" or "Second order").
$model_type The type of model ("Linear" or "Nonlinear").
$instruments Only if instruments are supplied: a list of structural equations relating endogenous
RHS variables to instruments.
$indicators The names of the indicators (i.e., observed variables and/or first-order constructs)
$cons_exo The names of the exogenous constructs of the structural model (i.e., variables that do
not appear on the LHS of any structural equation)
$cons_endo The names of the endogenous constructs of the structural model (i.e., variables that
appear on the LHS of at least one structural equation)
$vars_2nd The names of the constructs modeled as second orders.
$vars_attached_to_2nd The names of the constructs forming or building a second order construct.
$vars_not_attached_to_2nd The names of the constructs not forming or building a second order
construct.
It is possible to supply an incomplete list to parseModel(), resulting in an incomplete cSEMModel
list which can be passed to all functions that require .csem_model as a mandatory argument.
Currently, only the structural and the measurement matrix are required. However, specifying an
incomplete cSEMModel list may lead to unexpected behavior and errors. Use with care.

Examples
# ===========================================================================
# Providing a model in lavaan syntax
# ===========================================================================
model <- "
# Structural model
y1 ~ y2 + y3

# Measurement model
y1 =~ x1 + x2 + x3
y2 =~ x4 + x5
y3 =~ x6 + x7

# Error correlation
x1 ~~ x2
"

m <- parseModel(model)
m

# ===========================================================================

# Providing a complete model in cSEM format (class cSEMModel)


# ===========================================================================
# If the model is already a cSEMModel object, the model is returned as is:

identical(m, parseModel(m)) # TRUE

# ===========================================================================
# Providing a list
# ===========================================================================
# It is possible to provide a list that contains at least the
# elements "structural" and "measurement". This is generally discouraged
# as this may cause unexpected errors.

m_incomplete <- m[c("structural", "measurement", "construct_type")]


parseModel(m_incomplete)

# Providing a list containing list names that are not part of a `cSEMModel`
# causes an error:

## Not run:
m_incomplete[c("name_a", "name_b")] <- c("hello world", "hello universe")
parseModel(m_incomplete)

## End(Not run)

# Failing to provide "structural" or "measurement" also causes an error:

## Not run:
m_incomplete <- m[c("structural", "construct_type")]
parseModel(m_incomplete)

## End(Not run)

plot.cSEMFloodlight cSEMFloodlight method for plot()

Description

Plot the direct effect of an independent variable (z) on a dependent variable (y) conditional on
the values of a moderator variable (x), including the confidence interval and the Johnson-Neyman
points.

Usage

## S3 method for class 'cSEMFloodlight'


plot(x, ...)

Arguments
x An R object of class cSEMFloodlight.
... Currently ignored.

plot.cSEMSurface cSEMSurface method for plot()

Description
Plot the predicted values of the dependent variable (z). The values are predicted based on two
independent variables, including all their higher-order terms.

Usage
## S3 method for class 'cSEMSurface'
plot(x, .plot_type = c("plotly"), ...)

Arguments
x An R object of class cSEMSurface.
.plot_type A character vector indicating the plot package used. Options are "plotly", "rsm",
and "persp". Defaults to "plotly".
... Additional parameters that can be passed to graphics::persp, e.g., to rotate
the plot.

See Also
doSurfaceAnalysis()

PoliticalDemocracy Data: political democracy

Description
The Industrialization and Political Democracy dataset. This dataset is used throughout Bollen’s
1989 book (see pages 12, 17, 36 in chapter 2, pages 228 and following in chapter 7, pages 321
and following in chapter 8; Bollen (1989)). The dataset contains various measures of political
democracy and industrialization in developing countries.

Usage
PoliticalDemocracy

Format
A data frame of 75 observations of 11 variables.

y1 Expert ratings of the freedom of the press in 1960


y2 The freedom of political opposition in 1960
y3 The fairness of elections in 1960
y4 The effectiveness of the elected legislature in 1960
y5 Expert ratings of the freedom of the press in 1965
y6 The freedom of political opposition in 1965
y7 The fairness of elections in 1965
y8 The effectiveness of the elected legislature in 1965
x1 The gross national product (GNP) per capita in 1960
x2 The inanimate energy consumption per capita in 1960
x3 The percentage of the labor force in industry in 1960

Source
The lavaan package (version 0.6-3).

References
Bollen KA (1989). Structural Equations with Latent Variables. Wiley-Interscience. ISBN 978-
0471011712.

Examples
#============================================================================
# Example is taken from the lavaan website
#============================================================================
# Note: example is modified. Across-block correlations are removed
model <- "
# Measurement model
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8

# Regressions / Path model


dem60 ~ ind60
dem65 ~ ind60 + dem60

# residual correlations
y2 ~~ y4
y6 ~~ y8
"

aa <- csem(PoliticalDemocracy, model)



predict Predict indicator scores

Description
Predict the indicator scores of endogenous constructs.

Usage
predict(
.object = NULL,
.benchmark = c("lm", "unit", "PLS-PM", "GSCA", "PCA", "MAXVAR"),
.cv_folds = 10,
.handle_inadmissibles = c("stop", "ignore", "set_NA"),
.r = 10,
.test_data = NULL
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.benchmark Character string. The procedure to obtain benchmark predictions. One of "lm",
"unit", "PLS-PM", "GSCA", "PCA", or "MAXVAR". Default to "lm".
.cv_folds Integer. The number of cross-validation folds to use. Setting .cv_folds to N
(the number of observations) produces leave-one-out cross-validation samples.
Defaults to 10.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "stop",
"ignore", or "set_NA". If "stop", predict() will stop immediatly if estimation
yields an inadmissible result. For "ignore" all results are returned even if all
or some of the estimates yielded inadmissible results. For "set_NA" predictions
based on inadmissible parameter estimates are set to NA. Defaults to "stop"
.r Integer. The number of repetitions to use. Defaults to 10.
.test_data A matrix of test data with the same column names as the training data.

Details
Predict uses the procedure introduced by Shmueli et al. (2016) in the context of PLS (commonly
called: "PLSPredict" (Shmueli et al. 2019)). Predict uses k-fold cross-validation to randomly split
the data into training and test data and subsequently predicts the relevant values in the test data
based on the model parameter estimates obtained using the training data. The number of cross-
validation folds is 10 by default but may be changed using the .cv_folds argument. By default,
the procedure is repeated .r = 10 times to avoid irregularities due to a particular split. See Shmueli
et al. (2019) for details.
Alternatively, users may supply a matrix of .test_data with the same column names as those in
the data used to obtain .object (the training data). In this case, arguments .cv_folds and .r

are ignored and predict uses the estimated coefficients from .object to predict the values in the
columns of .test_data.
In Shmueli et al. (2016) PLS-based predictions for indicator i are compared to the predictions
based on a multiple regression of indicator i on all available exogenous indicators (.benchmark
= "lm") and a simple mean-based prediction summarized in the Q2_predict metric. predict()
is more general in that it allows users to compare the predictions based on a so-called target
model/specification to predictions based on an alternative benchmark. Available benchmarks in-
clude predictions based on a linear model, PLS-PM weights, unit weights (i.e. sum scores), GSCA
weights, PCA weights, and MAXVAR weights.
Each estimation run is checked for admissibility using verify(). If the estimation yields inadmis-
sible results, predict() stops with an error ("stop"). Users may choose to "ignore" inadmissible
results or to simply set predictions to NA ("set_NA") for the particular run that failed.

Value
An object of class cSEMPredict with print and plot methods. Technically, cSEMPredict is a named
list containing the following list elements:

$Actual A matrix of the actual values/indicator scores of the endogenous constructs.


$Prediction_target A matrix of the predicted indicator scores of the endogenous constructs based
on the target model. Target refers to procedure used to estimate the parameters in .object.
$Residuals_target A matrix of the residual indicator scores of the endogenous constructs based on
the target model.
$Residuals_benchmark A matrix of the residual indicator scores of the endogenous constructs
based on a model estimated by the procedure given to .benchmark.
$Prediction_metrics A data frame containing the predictions metrics MAE, RMSE, and Q2_predict.
$Information A list with elements Target, Benchmark, Number_of_observations_training,
Number_of_observations_test, Number_of_folds, Number_of_repetitions, and Handle_inadmissibles.

References
Shmueli G, Ray S, Estrada JMV, Chatla SB (2016). “The Elephant in the Room: Predictive
Performance of PLS Models.” Journal of Business Research, 69(10), 4552–4564. doi: 10.1016/
j.jbusres.2016.03.049.

Shmueli G, Sarstedt M, Hair JF, Cheah J, Ting H, Vaithilingam S, Ringle CM (2019). “Predictive
Model Assessment in PLS-SEM: Guidelines for Using PLSpredict.” European Journal of Market-
ing, 53(11), 2322–2347. doi: 10.1108/ejm0220190189.

See Also
csem, cSEMResults

Examples
### Anime example taken from https://github.com/ISS-Analytics/pls-predict

# Load data

data(Anime) # data is similar to the Anime.csv found on


# https://github.com/ISS-Analytics/pls-predict but with irrelevant
# columns removed

# Split into training and data the same way as it is done on


# https://github.com/ISS-Analytics/pls-predict
set.seed(123)

index <- sample.int(dim(Anime)[1], 83, replace = FALSE)


dat_train <- Anime[-index, ]
dat_test <- Anime[index, ]

# Specify model
model <- "
# Structural model

ApproachAvoidance ~ PerceivedVisualComplexity + Arousal

# Measurement/composite model

ApproachAvoidance =~ AA0 + AA1 + AA2 + AA3


PerceivedVisualComplexity <~ VX0 + VX1 + VX2 + VX3 + VX4
Arousal <~ Aro1 + Aro2 + Aro3 + Aro4
"

# Estimate (replicating the results of the `simplePLS()` function)


res <- csem(dat_train,
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)

# Predict using a user-supplied training data set


pp <- predict(res, .test_data = dat_test)
pp$Predictions_target[1:6, ]
pp

### Compute prediction metrics ------------------------------------------------


res2 <- csem(Anime, # whole data set
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)

# Predict using 10-fold cross-validation with 5 repetitions


## Not run:
pp2 <- predict(res, .benchmark = "lm")
pp2
## There is a plot method available

plot(pp2)
## End(Not run)

reliability Reliability

Description
Compute several reliability estimates. See the Reliability section of the cSEM website for details.

Usage
calculateRhoC(
.object = NULL,
.model_implied = TRUE,
.only_common_factors = TRUE,
.weighted = FALSE
)

calculateRhoT(
.object = NULL,
.alpha = 0.05,
.closed_form_ci = FALSE,
.only_common_factors = TRUE,
.output_type = c("vector", "data.frame"),
.weighted = FALSE,
...
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.model_implied Logical. Should weights be scaled using the model-implied indicator correlation
matrix? Defaults to TRUE.
.only_common_factors
Logical. Should only concepts modeled as common factors be included when
calculating one of the following quality criteria: AVE, the Fornell-Larcker crite-
rion, HTMT, and all reliability estimates. Defaults to TRUE.
.weighted Logical. Should estimation be based on a score that uses the weights of the
weight approach used to obtain .object? Defaults to FALSE.
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.closed_form_ci
Logical. Should a closed-form confidence interval be computed? Defaults to
FALSE.
.output_type Character string. The type of output. One of "vector" or "data.frame". Defaults
to "vector".
... Ignored.

Details
Since reliability is defined with respect to a classical true score measurement model only concepts
modeled as common factors are considered by default. For concepts modeled as composites relia-
bility may be estimated by setting .only_common_factors = FALSE, however, it is unclear how to
interpret reliability in this case.
Reliability is traditionally based on a test score (proxy) computed using unit weights. To compute
congeneric and tau-equivalent reliability based on a score that uses the weights of the weight
approach used to obtain .object, use .weighted = TRUE instead.
For the tau-equivalent reliability ("rho_T" or "cronbachs_alpha") a closed-form confidence in-
terval may be computed (Trinchera et al. 2018) by setting .closed_form_ci = TRUE (default is
FALSE). If .alpha is a vector several CIs are returned.

Value
For calculateRhoC() and calculateRhoT() (if .output_type = "vector") a named numeric
vector containing the reliability estimates. If .output_type = "data.frame" calculateRhoT()
returns a data.frame with as many rows as there are constructs modeled as common factors in
the model (unless .only_common_factors = FALSE in which case the number of rows equals the
total number of constructs in the model). The first column contains the name of the construct. The
second column the reliability estimate. If .closed_form_ci = TRUE the remaining columns contain
lower and upper bounds for the (1 - .alpha) confidence interval(s).

Functions
• calculateRhoC: Calculate the congeneric reliability
• calculateRhoT: Calculate the tau-equivalent reliability

References
Trinchera L, Marie N, Marcoulides GA (2018). “A Distribution Free Interval Estimate for Co-
efficient Alpha.” Structural Equation Modeling: A Multidisciplinary Journal, 25(6), 876–887.
doi: 10.1080/10705511.2018.1431544.

See Also
assess(), cSEMResults
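
For illustration, a minimal sketch based on the threecommonfactors example model (all constructs
are modeled as common factors):

res <- csem(threecommonfactors, "
  eta2 ~ eta1
  eta3 ~ eta1 + eta2
  eta1 =~ y11 + y12 + y13
  eta2 =~ y21 + y22 + y23
  eta3 =~ y31 + y32 + y33
")

calculateRhoC(res)   # congeneric reliability
calculateRhoT(res)   # tau-equivalent reliability

# Closed-form confidence intervals for two significance levels
calculateRhoT(res, .closed_form_ci = TRUE, .alpha = c(0.05, 0.10),
              .output_type = "data.frame")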

resamplecSEMResults Resample cSEMResults

Description
Resample a cSEMResults object using bootstrap or jackknife resampling. The function is called by
csem() if the user sets csem(...,.resample_method = "bootstrap") or csem(...,.resample_method
= "jackknife") but may also be called directly.

Usage
resamplecSEMResults(
.object = NULL,
.resample_method = c("bootstrap", "jackknife"),
.resample_method2 = c("none", "bootstrap", "jackknife"),
.R = 499,
.R2 = 199,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.user_funs = NULL,
.eval_plan = c("sequential", "multiprocess"),
.force = FALSE,
.seed = NULL,
.sign_change_option = c("none","individual","individual_reestimate",
"construct_reestimate"),
...
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.resample_method
Character string. The resampling method to use. One of: "bootstrap" or "jack-
knife". Defaults to "bootstrap".
.resample_method2
Character string. The resampling method to use when resampling from a resam-
ple. One of: "none", "bootstrap" or "jackknife". For "bootstrap" the number of
draws is provided via .R2. Currently, resampling from each resample is only
required for the studentized confidence interval ("CI_t_interval") computed by
the infer() function. Defaults to "none".
.R Integer. The number of bootstrap replications. Defaults to 499.
.R2 Integer. The number of bootstrap replications to use when resampling from a
resample. Defaults to 199.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop",
"ignore", or "replace". If "drop", all replications/resamples yielding an inadmis-
sible result will be dropped (i.e. the number of results returned will potentially
be less than .R). For "ignore" all results are returned even if all or some of the
replications yielded inadmissible results (i.e. number of results returned is equal
to .R). For "replace" resampling continues until there are exactly .R admissi-
ble solutions. Depending on the frequency of inadmissible solutions this may
significantly increase computing time. Defaults to "drop".
.user_funs A function or a (named) list of functions to apply to every resample. The func-
tions must take .object as its first argument (e.g., myFun <- function(.object, ...) {body-
of-the-function}). Function output should preferably be a (named) vector but
matrices are also accepted. However, the output will be vectorized (column-
wise) in this case. See the examples section for details.

.eval_plan Character string. The evaluation plan to use. One of "sequential" or "multipro-
cess". In the latter case all available cores will be used. Defaults to "sequential".
.force Logical. Should .object be resampled even if it contains resamples already?
Defaults to FALSE.
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!
.sign_change_option
Character string. Which sign change option should be used to handle flipping
signs when resampling? One of "none","individual", "individual_reestimate",
"construct_reestimate". Defaults to "none".
... Further arguments passed to functions supplied to .user_funs.

Details
Given M resamples (for bootstrap M = .R and for jackknife M = N, where N is the number of ob-
servations) based on the data used to compute the cSEMResults object provided via .object,
resamplecSEMResults() essentially calls csem() on each resample using the arguments of the
original call (ignoring any arguments related to resampling) and returns estimates for each of a
subset of practically useful resampled parameters/statistics computed by csem(). Currently, the
following estimates are computed and returned by default based on each resample: Path estimates,
Loading estimates, Weight estimates.
In practical applications users may need to resample a specific statistic (e.g., the heterotrait-monotrait
ratio of correlations (HTMT) or differences between path coefficients such as beta_1 - beta_2).
Such statistics may be provided by a function fun(.object,...) or a list of such functions via the
.user_funs argument. The first argument of these functions must always be .object. Internally,
the function will be applied on each resample to produce the desired statistic. Hence, arbitrary
complicated statistics may be resampled as long as the body of the function draws on elements
contained in the cSEMResults object only. Output of fun(.object,...) should preferably be a
(named) vector but matrices are also accepted. However, the output will be vectorized (columnwise)
in this case. See the examples section for details.
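
For illustration, a minimal sketch of such a user-defined function is given below (the statistic
computed is purely illustrative; the object a refers to the cSEMResults object created in the
examples):

myFun <- function(.object, ...) {
  est <- summarize(.object)$Estimates$Path_estimates
  # Named vector: difference between the first two path coefficients
  c("path_diff" = est$Estimate[1] - est$Estimate[2])
}

res_boot_user <- resamplecSEMResults(a, .user_funs = list("Path_diff" = myFun))
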
Both resampling the original cSEMResults object (call it "first resample") and resampling based
on a resampled cSEMResults object (call it "second resample") are supported. Choices for the
former are "bootstrap" and "jackknife". Resampling based on a resample is turned off by default
(.resample_method2 = "none") as this significantly increases computation time (there are now M *
M2 resamples to compute, where M2 is .R2 or N). Resamples of a resample are required, e.g., for the
studentized confidence interval computed by the infer() function. Typically, bootstrap resamples
are used in this case (Davison and Hinkley 1997).
As csem() accepts a single data set, a list of data sets as well as data sets that contain a column
name used to split the data into groups, the cSEMResults object may contain multiple data sets.
In this case, resampling is done by data set or group. Note that depending on the number of data
sets/groups, the computation may be considerably slower as resampling will be repeated for each
data set/group. However, apart from speed considerations users do not need to worry about the
type of input used to compute the cSEMResults object as resamplecSEMResults() is able to deal
with each case.
The number of bootstrap runs for the first and second run are given by .R and .R2. The default is
499 for the first and 199 for the second run but should be increased in real applications. See e.g.,

Hesterberg (2015), p.380, Davison and Hinkley (1997), and Efron and Hastie (2016) for recom-
mendations. For jackknife .R and .R2 are ignored.
Resampling may produce inadmissible results (as checked by verify()). By default these results
are dropped; however, users may choose to "ignore" or "replace" inadmissible results, in which case
resampling continues until the necessary number of admissible results is reached.
The cSEM package supports (multi)processing via the future framework (Bengtsson 2018). Users
may simply choose an evaluation plan via .eval_plan and the package takes care of all the com-
plicated backend issues. Currently, users may choose between standard single-core/single-session
evaluation ("sequential") and multiprocessing ("multiprocess"). The future package provides
other options (e.g., "cluster" or "remote"), however, they probably will not be needed in the
context of the cSEM package as simulations usually do not require high-performance clusters. De-
peding on the operating system, the future package will manage to distribute tasks to multiple R
sessions (Windows) or multiple cores. Note that multiprocessing is not necessary always faster
when only a "small" number of replications is required as the overhead of initializing new sessions
or distributing tasks to different cores will not immediatley be compensated by the avaiability of
multiple sessions/cores.
Random number generation (RNG) uses the L'Ecuyer-CMRG RNG stream as implemented in the
future.apply package (Bengtsson 2018). It is independent of the evaluation plan. Hence, setting
e.g., .seed = 123 will generate the same random numbers and replicates for both .eval_plan =
"sequential" and .eval_plan = "multiprocess". See ?future_lapply for details.

Value
The core structure is the same structure as that of .object with the following elements added:
• $Estimates_resamples: A list containing the .R resamples and the original estimates for each
of the resampled quantities (Path_estimates, Loading_estimates, Weight_estimates, user de-
fined functions). Each list element is a list containing elements $Resamples and $Original.
$Resamples is a (.R x K) matrix with each row representing one resample for each of the K
parameters/statistics. $Original contains the original estimates (vectorized by column if the
output of the user-provided function is a matrix).
• $Information_resamples: A list containing additional information.
Use str(<.object>, list.len = 3) on the resulting object for an overview.

References
Bengtsson H (2018). future: Unified Parallel and Distributed Processing in R for Everyone. R
package version 1.10.0, https://CRAN.R-project.org/package=future.

Bengtsson H (2018). future.apply: Apply Function to Elements in Parallel using Futures. R pack-
age version 1.0.1, https://CRAN.R-project.org/package=future.apply.

Davison AC, Hinkley DV (1997). Bootstrap Methods and their Application. Cambridge University
Press. doi: 10.1017/cbo9780511802843.

Efron B, Hastie T (2016). Computer Age Statistical Inference. Cambridge University Pr. ISBN
1107149894.

Hesterberg TC (2015). “What Teachers Should Know About the Bootstrap: Resampling in the
Undergraduate Statistics Curriculum.” The American Statistician, 69(4), 371–386. doi: 10.1080/
00031305.2015.1089789.

See Also
csem, summarize(), infer(), cSEMResults

Examples
## Not run:
# Note: example not run as resampling is time consuming
# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL

# Measurement model
EXPE =~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG =~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT =~ sat1 + sat2 + sat3 + sat4
VAL =~ val1 + val2 + val3 + val4
"

## Estimate the model without resampling


a <- csem(satisfaction, model)

## Bootstrap and jackknife estimation


boot <- resamplecSEMResults(a)
jack <- resamplecSEMResults(a, .resample_method = "jackknife")

## Alternatively use .resample_method in csem()


boot_csem <- csem(satisfaction, model, .resample_method = "bootstrap")
jack_csem <- csem(satisfaction, model, .resample_method = "jackknife")

# ===========================================================================
# Extended usage
# ===========================================================================
### Double resampling ------------------------------------------------------
# The confidence intervals (e.g., the bias-corrected and accelerated CI)
# require double resampling. Use .resample_method2 for this.

boot1 <- resamplecSEMResults(
.object = a,

.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.seed = 1303
)

## Again, this is identical to using csem


boot1_csem <- csem(
.data = satisfaction,
.model = model,
.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.seed = 1303
)

identical(boot1, boot1_csem) # only true if .seed was set

### Inference ---------------------------------------------------------------


# To get inferential quantities such as the estimated standard error or
# the percentile confidence interval for each resampled quantity use the
# postestimation function infer()

inference <- infer(boot1)


inference$Path_estimates$sd
inference$Path_estimates$CI_percentile

# As usual summarize() can be called directly


summarize(boot1)

# In the example above .R x .R2 = 50 x 20 = 1000. Multiprocessing will be
# faster on most systems here and is therefore recommended. Note that
# multiprocessing does not affect the random number generation.

boot2 <- resamplecSEMResults(


.object = a,
.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.eval_plan = "multiprocess",
.seed = 1303
)

identical(boot1, boot2)
## End(Not run)

resampleData Resample data



Description
Resample data from a data set using common resampling methods. For bootstrap or jackknife re-
sampling, package users usually do not need to call this function but directly use resamplecSEMResults()
instead.

Usage
resampleData(
.object = NULL,
.data = NULL,
.resample_method = c("bootstrap", "jackknife", "permutation",
"cross-validation"),
.cv_folds = 10,
.id = NULL,
.R = 499,
.seed = NULL
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.data A data.frame, a matrix or a list of data of either type. Possible column
types or classes of the data provided are: "logical", "numeric" ("double" or
"integer"), "factor" (ordered and unordered) or a mix of several types. The
data may also include one character column whose column name must be given
to .id. This column is assumed to contain group identifiers used to split the data
into groups. If .data is provided, .object is ignored. Defaults to NULL.
.resample_method
Character string. The resampling method to use. One of: "bootstrap", "jack-
knife", "permutation", or "cross-validation". Defaults to "bootstrap".
.cv_folds Integer. The number of cross-validation folds to use. Setting .cv_folds to N
(the number of observations) produces leave-one-out cross-validation samples.
Defaults to 10.
.id Character string or integer. A character string giving the name or an integer of
the position of the column of .data whose levels are used to split .data into
groups. Defaults to NULL.
.R Integer. The number of bootstrap runs, permutation runs or cross-validation
repetitions to use. Defaults to 499.
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!

Details
The function resampleData() is general purpose. It simply resamples data from a data set according
to the resampling method provided via the .resample_method argument and returns a list of
resamples. Currently, bootstrap, jackknife, permutation, and cross-validation (both leave-one-out
(LOOCV) and k-fold cross-validation) are implemented.
The user may provide the data set to resample either explicitly via the .data argument or implicitly
by providing a cSEMResults object to .object, in which case the original data used in the call that
created the cSEMResults object is used for resampling. If both a cSEMResults object and a data
set via .data are provided, the former is ignored.
As csem() accepts a single data set, a list of data sets as well as data sets that contain a column
name used to split the data into groups, the cSEMResults object may contain multiple data sets.
In this case, resampling is done by data set or group. Note that depending on the number of data
sets/groups provided this computation may be slower as resampling will be repeated for each data
set/group.
To split data provided via the .data argument into groups, the column name or the column index of
the column containing the group levels to split the data must be given to .id. If data that contains
grouping is taken from a cSEMResults object, .id is taken from the object information. Hence,
providing .id is redundant in this case and therefore ignored.
The number of bootstrap or permutation runs as well as the number of cross-validation repetitions
is given by .R. The default is 499 but should be increased in real applications. See e.g., Hesterberg
(2015), p.380 for recommendations concerning the bootstrap. For jackknife .R is ignored as it is
based on the N leave-one-out data sets.
Choosing .resample_method = "permutation" for ungrouped data causes an error, as permutation
would simply reorder the observations, which is usually not meaningful. If a list of data is provided,
each list element is assumed to represent the observations belonging to one group. In this case, the
data are pooled and group membership is permuted.
For cross-validation the number of folds (k) defaults to 10. It may be changed via the .cv_folds
argument. Setting k = 2 (not 1!) splits the data into a single training and test data set. Setting k =
N (where N is the number of observations) produces leave-one-out cross-validation samples. Note:
1.) At least 2 folds are required (k > 1); 2.) k cannot be larger than N; 3.) If N/k is not an integer,
the last fold will have fewer observations.
Random number generation (RNG) uses the L'Ecuyer-CMRG RNG stream as implemented in the
future.apply package (Bengtsson 2018). See ?future_lapply for details. By default a random seed
is chosen.
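A small sketch illustrating the points above (using the satisfaction dataset shipped with the
package; the argument values are purely illustrative):

## k = 2 splits the data into a single training and test data set per repetition
cv_2 <- resampleData(.data = satisfaction, .resample_method = "cross-validation",
                     .cv_folds = 2, .R = 1)

## For the jackknife, .R is ignored: one sample per observation
jack <- resampleData(.data = satisfaction, .resample_method = "jackknife")
length(jack) == nrow(satisfaction) # TRUE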

Value
The structure of the output depends on the type of input and the resampling method:
Bootstrap If a matrix or data.frame without grouping variable is provided (i.e., .id = NULL), the
result is a list of length .R (default 499). Each element of that list is a bootstrap (re)sample. If
a grouping variable is specified or a list of data is provided (where each list element is assumed
to contain data for one group), resampling is done by group. Hence, the result is a list of length
equal to the number of groups with each list element containing .R bootstrap samples based
on the N_g observations of group g.
Jackknife If a matrix or data.frame without grouping variable is provided (.id = NULL), the
result is a list of length equal to the number of observations/rows (N) of the data set provided.
Each element of that list is a jackknife (re)sample. If a grouping variable is specified or a
list of data is provided (where each list element is assumed to contain data for one group),
resampling is done by group. Hence, the result is a list of length equal to the number of group

levels with each list element containing N jackknife samples based on the N_g observations of
group g.
Permutation If a matrix or data.frame without grouping variable is provided an error is returned
as permutation will simply reorder the observations. If a grouping variable is specified or a list
of data is provided (where each list element is assumed to contain data of one group), group
membership is permutated. Hence, the result is a list of length .R where each element of that
list is a permutation (re)sample.
Cross-validation If a matrix or data.frame without grouping variable is provided a list of length
.R is returned. Each list element contains a list containing the k splits/folds subsequently used
as test and training data sets. If a grouping variable is specified or a list of data is provided
(where each list element is assumed to contain data for one group), cross-validation is repeated
.R times for each group. Hence, the result is a list of length equal to the number of groups,
each containing .R list elements (the repetitions) which in turn contain the k splits/folds.
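A brief sketch checking the structures described above (the group split of the satisfaction data is
purely illustrative):

## ungrouped bootstrap: a list of .R resamples
res_b <- resampleData(.data = satisfaction, .R = 10)
length(res_b)       # 10

## grouped input: a list with one data set per group
dat_list <- list(g1 = satisfaction[1:125, ], g2 = satisfaction[126:250, ])
res_bg <- resampleData(.data = dat_list, .R = 10)
length(res_bg)      # 2 (number of groups)
length(res_bg[[1]]) # 10 (.R bootstrap samples for the first group)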

References
Bengtsson H (2018). future.apply: Apply Function to Elements in Parallel using Futures. R pack-
age version 1.0.1, https://CRAN.R-project.org/package=future.apply.

Hesterberg TC (2015). “What Teachers Should Know About the Bootstrap: Resampling in the
Undergraduate Statistics Curriculum.” The American Statistician, 69(4), 371–386. doi: 10.1080/
00031305.2015.1089789.

See Also
csem(), cSEMResults, resamplecSEMResults()

Examples
# ===========================================================================
# Using the raw data
# ===========================================================================
### Bootstrap (default) -----------------------------------------------------

res_boot1 <- resampleData(.data = satisfaction)


str(res_boot1, max.level = 3, list.len = 3)

## To replicate a bootstrap draw use .seed:


res_boot1a <- resampleData(.data = satisfaction, .seed = 2364)
res_boot1b <- resampleData(.data = satisfaction, .seed = 2364)

identical(res_boot1a, res_boot1b) # TRUE

### Jackknife ---------------------------------------------------------------

res_jack <- resampleData(.data = satisfaction, .resample_method = "jackknife")


str(res_jack, max.level = 3, list.len = 3)

### Cross-validation --------------------------------------------------------


## Create dataset for illustration:

dat <- data.frame(


"x1" = rnorm(100),
"x2" = rnorm(100),
"group" = sample(c("male", "female"), size = 100, replace = TRUE),
stringsAsFactors = FALSE)

## 10-fold cross-validation (repeated 100 times)


cv_10a <- resampleData(.data = dat, .resample_method = "cross-validation",
.R = 100)
str(cv_10a, max.level = 3, list.len = 3)

# Cross-validation can be done by group if a group identifier is provided:


cv_10 <- resampleData(.data = dat, .resample_method = "cross-validation",
.id = "group", .R = 100)

## Leave-one-out cross-validation (repeated 50 times)


cv_loocv <- resampleData(.data = dat[, -3],
.resample_method = "cross-validation",
.cv_folds = nrow(dat),
.R = 50)
str(cv_loocv, max.level = 2, list.len = 3)

### Permutation ---------------------------------------------------------------

res_perm <- resampleData(.data = dat, .resample_method = "permutation",


.id = "group")
str(res_perm, max.level = 2, list.len = 3)

# Forgetting to set .id causes an error


## Not run:
res_perm <- resampleData(.data = dat, .resample_method = "permutation")

## End(Not run)

# ===========================================================================
# Using a cSEMResults object
# ===========================================================================

model <- "


# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL

# Measurement model
EXPE =~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG =~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT =~ sat1 + sat2 + sat3 + sat4
VAL =~ val1 + val2 + val3 + val4

"
a <- csem(satisfaction, model)

# Create bootstrap and jackknife samples


res_boot <- resampleData(a, .resample_method = "bootstrap", .R = 499)
res_jack <- resampleData(a, .resample_method = "jackknife")

# Since `satisfaction` is the dataset used, the following approaches yield
# identical results.
res_boot_data <- resampleData(.data = satisfaction, .seed = 2364)
res_boot_object <- resampleData(a, .seed = 2364)

identical(res_boot_data, res_boot_object) # TRUE

Russett Data: Russett

Description
A data frame containing 10 variables with 47 observations.

Usage
Russett

Format
A data frame containing the following variables for 47 countries:

gini The Gini index of concentration


farm The percentage of landholders who collectively occupy one-half of all the agricultural land
(starting with the farmers with the smallest plots of land and working toward the largest)
rent The percentage of the total number of farms that rent all their land. Transformation: ln (x +
1)
gnpr The 1955 gross national product per capita in U.S. dollars. Transformation: ln (x)
labo The percentage of the labor force employed in agriculture. Transformation: ln (x)
inst Instability of personnel based on the term of office of the chief executive. Transformation:
exp (x - 16.3)
ecks The total number of politically motivated violent incidents, from plots to protracted guerrilla
warfare. Transformation: ln (x + 1)
deat The number of people killed as a result of internal group violence per 1,000,000 people.
Transformation: ln (x + 1)
stab One if the country has a stable democracy, and zero otherwise
dict One if the country experiences a dictatorship, and zero otherwise

Details

The dataset was initially compiled by Russett (1964), discussed and reprinted by Gifi (1990), and
partially transformed by Tenenhaus and Tenenhaus (2011). It is also used in Henseler (2020) for
demonstration purposes.

Source

From: Henseler (2020)

References

Gifi A (1990). Nonlinear multivariate analysis. Wiley.

Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial
Least Squares & Co. Using ADANCO. Guilford Press.

Russett BM (1964). “Inequality and Instability: The Relation of Land Tenure to Politics.” World
Politics, 16(3), 442–454. doi: 10.2307/2009581.

Tenenhaus A, Tenenhaus M (2011). “Regularized generalized canonical correlation analysis.”
Psychometrika, 76(2), 257–284.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_Russett="
# Composite model
AgrIneq <~ gini + farm + rent
IndDev <~ gnpr + labo
PolInst <~ inst + ecks + deat + stab + dict

# Structural model
PolInst ~ AgrIneq + IndDev
"

out <- csem(.data = Russett, .model = model_Russett,


.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06
)

satisfaction Data: satisfaction



Description

A data frame with 250 observations and 27 variables. Variables from 1 to 27 refer to six la-
tent concepts: IMAG=Image, EXPE=Expectations, QUAL=Quality, VAL=Value, SAT=Satisfaction, and
LOY=Loyalty.

imag1-imag5 Indicators attached to concept IMAG which is supposed to capture aspects such as the
institution's reputation, trustworthiness, seriousness, solidness, and caring about the customer.
expe1-expe5 Indicators attached to concept EXPE which is supposed to capture aspects concerning
products and services provided, customer service, providing solutions, and expectations for
the overall quality.
qual1-qual5 Indicators attached to concept QUAL which is supposed to capture aspects concerning
reliability of products and services, the range of products and services, personal advice, and
overall perceived quality.
val1-val4 Indicators attached to concept VAL which is supposed to capture aspects related to bene-
ficial services and products, valuable investments, quality relative to price, and price relative
to quality.
sat1-sat4 Indicators attached to concept SAT which is supposed to capture aspects concerning over-
all rating of satisfaction, fulfillment of expectations, satisfaction relative to other banks, and
performance relative to customer’s ideal bank.
loy1-loy4 Indicators attached to concept LOY which is supposed to capture aspects concerning
propensity to choose the same bank again, propensity to switch to another bank, intention to
recommend the bank to friends, and the sense of loyalty.

Usage

satisfaction

Format

An object of class data.frame with 250 rows and 27 columns.

Details

This dataset contains the variables from a customer satisfaction study of a Spanish credit institution
on 250 customers. The data is identical to the dataset provided by the plspm package but with
the last column (gender) removed. If you are looking for the original dataset, use the
satisfaction_gender dataset.

Source

The plspm package (version 0.4.9). Original source according to plspm: "Laboratory of Informa-
tion Analysis and Modeling (LIAM). Facultat d’Informatica de Barcelona, Universitat Politecnica
de Catalunya".

satisfaction_gender Data: satisfaction including gender

Description
A data frame with 250 observations and 28 variables. Variables from 1 to 27 refer to six la-
tent concepts: IMAG=Image, EXPE=Expectations, QUAL=Quality, VAL=Value, SAT=Satisfaction, and
LOY=Loyalty.
imag1-imag5 Indicators attached to concept IMAG which is supposed to capture aspects such as the
institution's reputation, trustworthiness, seriousness, solidness, and caring about the customer.
expe1-expe5 Indicators attached to concept EXPE which is supposed to capture aspects concerning
products and services provided, customer service, providing solutions, and expectations for
the overall quality.
qual1-qual5 Indicators attached to concept QUAL which is supposed to capture aspects concerning
reliability of products and services, the range of products and services, personal advice, and
overall perceived quality.
val1-val4 Indicators attached to concept VAL which is supposed to capture aspects related to bene-
ficial services and products, valuable investments, quality relative to price, and price relative
to quality.
sat1-sat4 Indicators attached to concept SAT which is supposed to capture aspects concerning over-
all rating of satisfaction, fulfillment of expectations, satisfaction relative to other banks, and
performance relative to customer’s ideal bank.
loy1-loy4 Indicators attached to concept LOY which is supposed to capture aspects concerning
propensity to choose the same bank again, propensity to switch to another bank, intention to
recommend the bank to friends, and the sense of loyalty.
gender The sex of the respondent.

Usage
satisfaction_gender

Format
An object of class data.frame with 250 rows and 28 columns.

Details
This data set contains the variables from a customer satisfaction study of a Spanish credit institution
on 250 customers. The data is taken from the plspm package. For convenience, there is a version
of the dataset with the last column (gender) removed: satisfaction.

Source
The plspm package (version 0.4.9). Original source according to plspm: "Laboratory of Informa-
tion Analysis and Modeling (LIAM). Facultat d’Informatica de Barcelona, Universitat Politecnica
de Catalunya".

Sigma_Summers_composites
Data: Summers

Description
A (18 x 18) indicator correlation matrix.

Usage
Sigma_Summers_composites

Format
An object of class matrix with 18 rows and 18 columns.

Details
The indicator correlation matrix for a modified version of the Summers (1965) model. All constructs
are modeled as composites.

Source
Own calculation based on Dijkstra and Henseler (2015).

References
Dijkstra TK, Henseler J (2015). “Consistent and Asymptotically Normal PLS Estimators for Linear
Structural Equations.” Computational Statistics & Data Analysis, 81, 10–23.

Summers R (1965). “A Capital Intensive Approach to the Small Sample Properties of Various
Simultaneous Equation Estimators.” Econometrica, 33(1), 1–41.

Examples

require(cSEM)

model <- "


ETA1 ~ ETA2 + XI1 + XI2
ETA2 ~ ETA1 + XI3 + XI4

ETA1 ~~ ETA2

XI1 <~ x1 + x2 + x3
XI2 <~ x4 + x5 + x6
XI3 <~ x7 + x8 + x9
XI4 <~ x10 + x11 + x12
ETA1 <~ y1 + y2 + y3

ETA2 <~ y4 + y5 + y6
"

## Generate data
summers_dat <- MASS::mvrnorm(n = 300, mu = rep(0, 18),
Sigma = Sigma_Summers_composites, empirical = TRUE)

## Estimate
res <- csem(.data = summers_dat, .model = model) # inconsistent

##
# 2SLS
res_2SLS <- csem(.data = summers_dat, .model = model, .approach_paths = "2SLS",
.instruments = list(ETA1 = c('XI1', 'XI2', 'XI3', 'XI4'),
ETA2 = c('XI1', 'XI2', 'XI3', 'XI4'))
)

summarize Summarize model

Description
The summary is mainly focused on estimated parameters. For quality criteria such as the average
variance extracted (AVE), reliability estimates, effect size estimates etc., use assess().

Usage
summarize(
.object = NULL,
.alpha = 0.05,
.ci = NULL,
...
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.ci A vector of character strings naming the confidence interval to compute. For
possible choices see infer().
... Further arguments to summarize(). Currently ignored.

Details
If .object contains resamples, standard errors, t-values and p-values (assuming estimates are stan-
dard normally distributed) are printed as well. By default the percentile confidence interval is given
as well. For other confidence intervals use the .ci argument. See infer() for possible choices and
a description.

Value
An object of class cSEMSummarize. A cSEMSummarize object has the same structure as the
cSEMResults object with a couple of differences:
1. Elements $Path_estimates, $Loadings_estimates, $Weight_estimates, and
$Residual_correlation are standardized data frames instead of matrices.
2. Data frames $Effect_estimates, $Indicator_correlation, and $Exo_construct_correlation are
added to $Estimates.
The data frame format is usually much more convenient if users intend to present the results in e.g.,
a paper or a presentation.
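For instance, the data frames can be handed directly to standard table functions. A minimal sketch
(assuming res_summarize as created in the Examples section below; knitr::kable() is just one
possible choice and is not required by cSEM):

df_path <- res_summarize$Estimates$Path_estimates # a plain data.frame
knitr::kable(df_path, digits = 3)                 # e.g., a simple table for a paper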

See Also
csem, assess(), cSEMResults

Examples
## Take a look at the dataset
#?threecommonfactors

## Specify the (correct) model


model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

## Estimate
res <- csem(threecommonfactors, model, .resample_method = "bootstrap", .R = 40)

## Postestimation
res_summarize <- summarize(res)
res_summarize

# Extract e.g. the loadings


res_summarize$Estimates$Loading_estimates

## By default only the 95% percentile confidence interval is printed. Users
## can have several confidence intervals computed; however, only the first
## will be printed.

res_summarize <- summarize(res, .ci = c("CI_standard_t", "CI_percentile"),


.alpha = c(0.05, 0.01))
res_summarize

# Extract e.g. the path estimates including both confidence intervals



res_summarize$Estimates$Path_estimates

Switching Data: Switching

Description
A data frame containing 26 variables with 767 observations.

Usage
Switching

Format
An object of class data.frame with 767 rows and 26 columns.

Details
The data contains variables about the consumers’ intention to switch a service provider. It is also
used in Henseler (2020) for demonstration purposes (Tutorial 12).

Source
The dataset is provided by Joerg Henseler.

References
Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial
Least Squares & Co. Using ADANCO. Guilford Press.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_Int <-"
# Measurement models
INV =~ INV1 + INV2 + INV3 + INV4
SAT =~ SAT1 + SAT2 + SAT3
INT =~ INT1 + INT2

# Structural model containing an interaction term.


INT ~ INV + SAT + INV.SAT
"

out <- csem(.data = Switching, .model = model_Int,


.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06)

testHausman Regression-based Hausman test

Description
Calculates the regression-based Hausman test used to compare OLS to 2SLS estimates or
2SLS to 3SLS estimates. See, e.g., Wooldridge (2010, pages 131 f.) for details.

Usage
testHausman(
.object = NULL,
.eval_plan = c("sequential", "multiprocess"),
.handle_inadmissibles = c("drop", "ignore", "replace"),
.R = 499,
.resample_method = c("bootstrap", "jackknife"),
.seed = NULL
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.eval_plan Character string. The evaluation plan to use. One of "sequential" or "multipro-
cess". In the latter case all available cores will be used. Defaults to "sequential".
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop",
"ignore", or "replace". If "drop", all replications/resamples yielding an inadmis-
sible result will be dropped (i.e. the number of results returned will potentially
be less than .R). For "ignore" all results are returned even if all or some of the
replications yielded inadmissible results (i.e. number of results returned is equal
to .R). For "replace" resampling continues until there are exactly .R admissi-
ble solutions. Depending on the frequency of inadmissible solutions this may
significantly increase computing time. Defaults to "drop".
.R Integer. The number of bootstrap replications. Defaults to 499.
.resample_method
Character string. The resampling method to use. One of: "bootstrap" or
"jackknife". Defaults to "bootstrap".
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!

Details
The function is somewhat experimental. Only use if you know what you are doing.

References
Wooldridge JM (2010). Econometric Analysis of Cross Section and Panel Data, 2 edition. MIT
Press.

See Also
csem(), cSEMResults

Examples
### Example from Dijkstra & Henseler (2015)
## Preparation (values are from pp. 15-16 of the paper)
Lambda <- t(kronecker(diag(6), c(0.7, 0.7, 0.7)))
Phi <- matrix(c(1.0000, 0.5000, 0.5000, 0.5000, 0.0500, 0.4000,
0.5000, 1.0000, 0.5000, 0.5000, 0.5071, 0.6286,
0.5000, 0.5000, 1.0000, 0.5000, 0.2929, 0.7714,
0.5000, 0.5000, 0.5000, 1.0000, 0.2571, 0.6286,
0.0500, 0.5071, 0.2929, 0.2571, 1.0000, sqrt(0.5),
0.4000, 0.6286, 0.7714, 0.6286, sqrt(0.5), 1.0000),
ncol = 6)

## Create population indicator covariance matrix


Sigma <- t(Lambda) %*% Phi %*% Lambda
diag(Sigma) <- 1
dimnames(Sigma) <- list(paste0("x", rep(1:6, each = 3), 1:3),
paste0("x", rep(1:6, each = 3), 1:3))

## Generate data
dat <- MASS::mvrnorm(n = 500, mu = rep(0, 18), Sigma = Sigma, empirical = TRUE)
# empirical = TRUE to show that 2SLS is in fact able to recover the true population
# parameters.

## Model to estimate
model <- "
## Structural model (non-recursive)
eta5 ~ eta6 + eta1 + eta2
eta6 ~ eta5 + eta3 + eta4

## Measurement model
eta1 =~ x11 + x12 + x13
eta2 =~ x21 + x22 + x23
eta3 =~ x31 + x32 + x33
eta4 =~ x41 + x42 + x43

eta5 =~ x51 + x52 + x53


eta6 =~ x61 + x62 + x63
"

library(cSEM)

## Estimate
res_ols <- csem(dat, .model = model, .approach_paths = "OLS")

sum_res_ols <- summarize(res_ols)

# Note: For this example the model-implied indicator correlation is irrelevant;
# the warnings can be ignored.

res_2sls <- csem(dat, .model = model, .approach_paths = "2SLS",


.instruments = list("eta5" = c('eta1','eta2','eta3','eta4'),
"eta6" = c('eta1','eta2','eta3','eta4')))
sum_res_2sls <- summarize(res_2sls)
# Note that exogenous constructs are supplied as instruments for themselves!

## Test for endogeneity


test_ha <- testHausman(res_2sls, .R = 200)
test_ha

testMGD Tests for multi-group comparisons

Description
This function performs various tests proposed in the context of multigroup analysis.
The following tests are implemented:

.approach_mgd = "Klesel": Approach suggested by Klesel et al. (2019) The model-implied variance-
covariance matrix (either indicator (.type_vcv = "indicator") or construct (.type_vcv =
"construct")) is compared across groups. If the model-implied indicator or construct cor-
relation matrix based on a saturated structural model should be compared, set .saturated =
TRUE. To measure the distance between the model-implied variance-covariance matrices, the
geodesic distance (dG) and the squared Euclidean distance (dL) are used. If more than two
groups are compared, the average distance over all groups is used.
.approach_mgd = "Sarstedt": Approach suggested by Sarstedt et al. (2011) Groups are com-
pared in terms of parameter differences across groups. Sarstedt et al. (2011) tests if parameter
k is equal across all groups. If several parameters are tested simultaneously it is recommended
to adjust the significance level or the p-values (in cSEM correction is done by p-value). By
default no multiple testing correction is done, however, several common adjustments are avail-
able via .approach_p_adjust. See stats::p.adjust() for details. Note: the test has some
severe shortcomings. Use with caution.
.approach_mgd = "Chin": Approach suggested by Chin and Dibbern (2010) Groups are com-
pared in terms of parameter differences across groups. Chin and Dibbern (2010) tests if
parameter k is equal between two groups. If more than two groups are tested for equality,
parameter k is compared between all pairs of groups. In this case, it is recommended to
adjust the significance level or the p-values (in cSEM correction is done by p-value) since
this is essentially a multiple testing setup. If several parameters are tested simultaneously,
correction is by group and number of parameters. By default no multiple testing correction
is done, however, several common adjustments are available via .approach_p_adjust. See
stats::p.adjust() for details.

.approach_mgd = "Keil": Approach suggested by Keil et al. (2000) Groups are compared in terms
of parameter differences across groups. Keil et al. (2000) tests if parameter k is equal between
two groups. It is assumed, that the standard errors of the coefficients are equal across groups.
The calculation of the standard error of the parameter difference is adjusted as proposed by
Henseler et al. (2009). If more than two groups are tested for equality, parameter k is com-
pared between all pairs of groups. In this case, it is recommended to adjust the significance
level or the p-values (in cSEM correction is done by p-value) since this is essentially a mul-
tiple testing setup. If several parameters are tested simultaneously, correction is by group and
number of parameters. By default no multiple testing correction is done, however, several
common adjustments are available via .approach_p_adjust. See stats::p.adjust() for
details.
.approach_mgd = "Nitzl": Approach suggested by Nitzl (2010) Groups are compared in terms
of parameter differences across groups. Similarly to Keil et al. (2000), a single parameter k is
tested for equality between two groups. In contrast to Keil et al. (2000), it is assumed, that the
standard errors of the coefficients are unequal across groups (Sarstedt et al. 2011). If more than
two groups are tested for equality, parameter k is compared between all pairs of groups. In this
case, it is recommended to adjust the significance level or the p-values (in cSEM correction
is done by p-value) since this is essentially a multiple testing setup. If several parameters
are tested simultaneously, correction is by group and number of parameters. By default no
multiple testing correction is done, however, several common adjustments are available via
.approach_p_adjust. See stats::p.adjust() for details.
.approach_mgd = "CI_param": Approach mentioned in Sarstedt et al. (2011) This approach is
based on the confidence intervals constructed around the parameter estimates of the two
groups. If the parameter of one group falls within the confidence interval of the other group
and/or vice versa, it can be concluded that there is no group difference. Since it is based on
the confidence intervals .approach_p_adjust is ignored.
.approach_mgd = "CI_overlap": Approach mentioned in Sarstedt et al. (2011) This approach
is based on the confidence intervals (CIs) constructed around the parameter estimates of the
two groups. If the two CIs overlap, it can be concluded that there is no group difference. Since
it is based on the confidence intervals .approach_p_adjust is ignored.

Usage
testMGD(
.object = NULL,
.alpha = 0.05,
.approach_p_adjust = "none",
.approach_mgd = c("all", "Klesel", "Chin", "Sarstedt",
"Keil", "Nitzl", "Henseler", "CI_para","CI_overlap"),
.parameters_to_compare = NULL,
.handle_inadmissibles = c("replace", "drop", "ignore"),
.R_permutation = 499,
.R_bootstrap = 499,
.saturated = FALSE,
.seed = NULL,
.type_ci = "CI_percentile",
.type_vcv = c("indicator", "construct"),
.verbose = TRUE
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.approach_p_adjust
Character string or a vector of character strings. Approach used to adjust the
p-value for multiple testing. See the methods argument of stats::p.adjust()
for a list of choices and their description. Defaults to "none".
.approach_mgd Character string or a vector of character strings. Approach used for the multi-
group comparison. One of: "all", "Klesel", "Chin", "Sarstedt", "Keil", "Nitzl",
"Henseler", "CI_para", or "CI_overlap". Defaults to "all", in which case all ap-
proaches are computed (if possible).
.parameters_to_compare
A model in lavaan model syntax indicating which parameters (i.e., path (~), load-
ings (=~), weights (<~), or correlations (~~)) should be compared across groups.
Defaults to NULL in which case all weights, loadings and path coefficients of the
originally specified model are compared.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop",
"ignore", or "replace". If "drop", all replications/resamples yielding an inadmis-
sible result will be dropped (i.e. the number of results returned will potentially
be less than .R). For "ignore" all results are returned even if all or some of the
replications yielded inadmissible results (i.e. number of results returned is equal
to .R). For "replace" resampling continues until there are exactly .R admissible
solutions. Defaults to "replace" to accommodate all approaches.
.R_permutation Integer. The number of permutations. Defaults to 499.
.R_bootstrap Integer. The number of bootstrap runs. Ignored if .object contains resamples.
Defaults to 499.
.saturated Logical. Should a saturated structural model be used? Defaults to FALSE.
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!
.type_ci Character string. The confidence interval to calculate. For possible choices, see
the .quantity argument of the infer() function. Defaults to "CI_percentile".
.type_vcv Character string. Which model-implied correlation matrix should be calculated? One of
"indicator" or "construct". Defaults to "indicator".
.verbose Logical. Should information (e.g., progress bar) be printed to the console? De-
faults to TRUE.

Details
Use .approach_mgd to choose the approach. By default all approaches are computed (.approach_mgd
= "all").

By default, approaches based on parameter differences across groups compare all parameters (.parameters_to_compare
= NULL). To compare only a subset of parameters provide the parameters in lavaan model syntax just
like the model to estimate. Take the simple model:

model_to_estimate <- "


Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# Each concept os measured by 3 indicators, i.e., modeled as latent variable


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

If only the path from eta1 to eta3 and the loadings of eta1 are to be compared across groups, write:

to_compare <- "


Structural parameters to compare
eta3 ~ eta1

# Loadings to compare
eta1 =~ y11 + y12 + y13
"

Note that the "model" provided to .parameters_to_compare does not need to be an estimable
model!
Note also that compared to all other functions in cSEM using the argument, .handle_inadmissibles
defaults to "replace" to accomdate the Sarstedt et al. (2011) approach.
Argument .R_permuation is ignored for the "Nitzl" and the "Keil" approach. .R_bootstrap is
ignored if .object already contains resamples, i.e. has class cSEMResults_resampled and if only
the "Klesel" or the "Chin" approach are used.
The argument .saturated is used by "Klesel" only. If .saturated = TRUE the original structural
model is ignored and replaced by a saturated model, i.e. a model in which all constructs are allowed
to correlate freely. This is useful to test differences in the measurement models between groups in
isolation.
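To make these points concrete, a rough sketch of a call combining them (assuming out is a
cSEMResults object estimated on grouped data as in the Examples section below, and to_compare
is the comparison model shown above; the chosen approaches and p-value adjustment are purely
illustrative):

testMGD(
  .object                = out,
  .approach_mgd          = c("Klesel", "Chin"),  # compute only these two approaches
  .parameters_to_compare = to_compare,           # subset defined above
  .approach_p_adjust     = "bonferroni",         # adjust p-values for multiple testing
  .R_permutation         = 499,
  .saturated             = FALSE
)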

Value
A list of class cSEMTestMGD. Technically, cSEMTestMGD is a named list containing the following list
elements:

$Information Additional information.


$Klesel A list with elements, Test_statistic, P_value, and Decision
$Chin A list with elements, Test_statistic, P_value, Decision, and Decision_overall
$Sarstedt A list with elements, Test_statistic, P_value, Decision, and Decision_overall
$Keil A list with elements, Test_statistic, P_value, Decision, and Decision_overall

$Nitzl A list with elements, Test_statistic, P_value, Decision, and Decision_overall


$Henseler A list with elements, Test_statistic, P_value, Decision, and Decision_overall
$CI_para A list with elements, Decision, and Decision_overall
$CI_overlap A list with elements, Decision, and Decision_overall

References
Chin WW, Dibbern J (2010). “An Introduction to a Permutation Based Procedure for Multi-Group
PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis
of the Sourcing of Information System Services Between Germany and the USA.” In Handbook of
Partial Least Squares, 171–193. Springer Berlin Heidelberg. doi: 10.1007/978-3-540-32827-8_8.

Henseler J, Ringle CM, Sinkovics RR (2009). “The use of partial least squares path modeling in
international marketing.” Advances in International Marketing, 20, 277–320. doi: 10.1108/S1474-
7979(2009)0000020014.

Keil M, Tan BC, Wei K, Saarinen T, Tuunainen V, Wassenaar A (2000). “A cross-cultural study
on escalation of commitment behavior in software projects.” MIS Quarterly, 24(2), 299–325.

Klesel M, Schuberth F, Henseler J, Niehaves B (2019). “A Test for Multigroup Comparison Using
Partial Least Squares Path Modeling.” Internet Research, 29(3), 464–477. doi: 10.1108/IntR-11-2017-0418.

Nitzl C (2010). “Eine anwenderorientierte Einfuehrung in die Partial Least Square (PLS)-Methode.”
In Arbeitspapier, number 21. Universitaet Hamburg, Institut fuer Industrielles Management, Ham-
burg.

Sarstedt M, Henseler J, Ringle CM (2011). “Multigroup Analysis in Partial Least Squares (PLS)
Path Modeling: Alternative Methods and Empirical Results.” In Advances in International Market-
ing, 195–218. Emerald Group Publishing Limited. doi: 10.1108/S1474-7979(2011)0000022012.

See Also
csem(), cSEMResults, testMICOM(), testOMF()

Examples
## Not run:
# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL

# Measurement model

EXPE <~ expe1 + expe2 + expe3 + expe4 + expe5


IMAG <~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT <~ sat1 + sat2 + sat3 + sat4
VAL <~ val1 + val2 + val3 + val4
"

## Create list of virtually identical data sets


dat <- list(satisfaction[-3,], satisfaction[-5, ], satisfaction[-10, ])
out <- csem(dat, model, .resample_method = "bootstrap", .R = 40)

## Test
testMGD(out, .R_permutation = 40,.verbose = FALSE)

# Notes:
# 1. .R_permutation (and .R in the call to csem) is small to make examples run quicker;
# should be higher in real applications.
# 2. The tests will not reject their respective H0s since the groups are virtually
# identical.
# 3. The only exception is the approach suggested by Sarstedt et al. (2011), a
# sign that this test is unreliable.
# 4. As opposed to other functions involving the '.handle_inadmissibles'
# argument, the default here is "replace" as this is required by
# Sarstedt et al. (2011)'s approach.

# ===========================================================================
# Extended usage
# ===========================================================================
### Test only a subset ------------------------------------------------------
# By default all parameters are compared. Select a subset by providing a
# model in lavaan model syntax:

to_compare <- "


# Path coefficients
QUAL ~ EXPE

# Loadings
EXPE <~ expe1 + expe2 + expe3 + expe4 + expe5
"

## Test
testMGD(out, .parameters_to_compare = to_compare, .R_permutation = 20,
.R_bootstrap = 20, .verbose = FALSE)

### Different p_adjustments --------------------------------------------------


# To adjust p-values to accommodate multiple testing use .approach_p_adjust.
# The number of tests to use for adjusting depends on the approach chosen. For
# the Chin approach for example it is the number of parameters to test times the
# number of possible group comparisons. To compare the results for different
# adjustments, a vector of p-adjustments may be chosen.

## Test
testMGD(out, .parameters_to_compare = to_compare,
.approach_p_adjust = c("none", "bonferroni"),
.R_permutation = 20, .R_bootstrap = 20, .verbose = FALSE)

## End(Not run)

testMICOM Test measurement invariance of composites

Description
This function performs the test for measurement invariance of composites proposed by Henseler
et al. (2016).

Usage
testMICOM(
.object = NULL,
.alpha = 0.05,
.approach_p_adjust = "none",
.handle_inadmissibles = c("drop", "ignore", "replace"),
.R = 499,
.seed = NULL,
.verbose = TRUE
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.approach_p_adjust
Character string or a vector of character strings. Approach used to adjust the
p-value for multiple testing. See the methods argument of stats::p.adjust()
for a list of choices and their description. Defaults to "none".
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop",
"ignore", or "replace". If "drop", all replications/resamples yielding an inadmis-
sible result will be dropped (i.e. the number of results returned will potentially
be less than .R). For "ignore" all results are returned even if all or some of the
replications yielded inadmissible results (i.e. number of results returned is equal
to .R). For "replace" resampling continues until there are exactly .R admissi-
ble solutions. Depending on the frequency of inadmissible solutions this may
significantly increase computing time. Defaults to "drop".
.R Integer. The number of bootstrap replications. Defaults to 499.

.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!
.verbose Logical. Should information (e.g., progress bar) be printed to the console? De-
faults to TRUE.

Details
The test is only meaningful for concepts modeled as composites.
If more than two groups are to be compared issues related to multiple testing should be taken into
account.
Models containing second-order constructs are not supported yet.
The number of permutation runs defaults to args_default()$.R for performance reasons. Ac-
cording to Henseler et al. (2016) the number of permutations should be at least 5000 for assessment
to be sufficiently reliable.
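A sketch of a call following that recommendation (assuming csem_results is a cSEMResults object
based on at least two groups, as in the Examples section below; the seed is purely illustrative):

## at least 5000 permutation runs as recommended by Henseler et al. (2016)
testMICOM(csem_results, .R = 5000, .alpha = 0.05, .seed = 1987)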

Value
A named list of class cSEMTestMICOM containing the following list elements:

$Step2 A list containing the results of the test for compositional invariance (Step 2).
$Step3 A list containing the results of the test for mean and variance equality (Step 3).
$Information A list of additional information on the test.

References
Henseler J, Ringle CM, Sarstedt M (2016). “Testing Measurement Invariance of Composites Using
Partial Least Squares.” International Marketing Review, 33(3), 405–431. doi: 10.1108/IMR-09-2014-0304.

See Also
csem(), cSEMResults, testOMF(), testMGD()

Examples
## Not run:
# NOTE: to run the example, download and load the newest version of cSEM.DGP
# from GitHub using devtools::install_github("M-E-Rademaker/cSEM.DGP").

# Create two data generating processes (DGPs) that only differ in how the composite
# X is built. Hence, the two groups are not compositionally invariant.
dgp1 <- "
# Structural model
Y ~ 0.6*X

# Measurement model
Y =~ 1*y1
X <~ 0.4*x1 + 0.8*x2

x1 ~~ 0.3125*x2
"

dgp2 <- "


# Structural model
Y ~ 0.6*X

# Measurement model
Y =~ 1*y1
X <~ 0.8*x1 + 0.4*x2

x1 ~~ 0.3125*x2
"

g1 <- generateData(dgp1, .N = 399, .empirical = TRUE) # requires cSEM.DGP


g2 <- generateData(dgp2, .N = 200, .empirical = TRUE) # requires cSEM.DGP

# Model is the same for both DGPs


model <- "
# Structural model
Y ~ X

# Measurement model
Y =~ y1
X <~ x1 + x2
"

# Estimate
csem_results <- csem(.data = list("group1" = g1, "group2" = g2), model)

# Test
testMICOM(csem_results, .R = 50, .alpha = c(0.01, 0.05), .seed = 1987)

## End(Not run)

testOMF Test for overall model fit

Description
Bootstrap-based test for overall model fit originally proposed by Beran and Srivastava (1985). See
also Dijkstra and Henseler (2015) who first suggested the test in the context of PLS-PM.

Usage
testOMF(
.object = NULL,
.alpha = 0.05,
.fit_measures = FALSE,

.handle_inadmissibles = c("drop", "ignore", "replace"),


.R = 499,
.saturated = FALSE,
.seed = NULL,
.verbose = TRUE
)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().
.alpha An integer or a numeric vector of significance levels. Defaults to 0.05.
.fit_measures Logical. (EXPERIMENTAL) Should additional fit measures be included? De-
faults to FALSE.
.handle_inadmissibles
Character string. How should inadmissible results be treated? One of "drop",
"ignore", or "replace". If "drop", all replications/resamples yielding an inadmis-
sible result will be dropped (i.e. the number of results returned will potentially
be less than .R). For "ignore" all results are returned even if all or some of the
replications yielded inadmissible results (i.e. number of results returned is equal
to .R). For "replace" resampling continues until there are exactly .R admissi-
ble solutions. Depending on the frequency of inadmissible solutions this may
significantly increase computing time. Defaults to "drop".
.R Integer. The number of bootstrap replications. Defaults to 499.
.saturated Logical. Should a saturated structural model be used? Defaults to FALSE.
.seed Integer or NULL. The random seed to use. Defaults to NULL in which case an
arbitrary seed is chosen. Note that the scope of the seed is limited to the body of
the function it is used in. Hence, the global seed will not be altered!
.verbose Logical. Should information (e.g., progress bar) be printed to the console? De-
faults to TRUE.

Details
By default, testOMF() tests the null hypothesis that the population indicator correlation matrix
equals the population model-implied indicator correlation matrix. Several discrepancy measures
may be used. By default, testOMF() uses four distance measures to assess the distance between the
sample indicator correlation matrix and the estimated model-implied indicator correlation matrix,
namely the geodesic distance, the squared Euclidean distance, the standardized root mean square
residual (SRMR), and the distance based on the maximum likelihood fit function. The reference
distribution for each test statistic is obtained by the bootstrap as proposed by Beran and Srivastava
(1985).
It is possible to perform the bootstrap-based test using fit measures such as the CFI, RMSEA or the
GFI if .fit_measures = TRUE. This is experimental. To the best of our knowledge the applicability
and usefulness of the fit measures for model fit assessment have not been formally (statistically)
assessed yet. Theoretically, the logic of the test applies to these fit indices as well. Hence, their
applicability is theoretically justified. Only use if you know what you are doing.
If .saturated = TRUE the original structural model is ignored and replaced by a saturated model,
i.e., a model in which all constructs are allowed to correlate freely. This is useful to test misspecifi-
cation of the measurement model in isolation.
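A brief sketch of the two variants discussed above (assuming out is the cSEMResults object from
the Examples section below; .R and .seed are purely illustrative):

## test the complete model (structural and measurement model)
testOMF(out, .R = 499, .seed = 320)

## test the measurement model in isolation (saturated structural model)
testOMF(out, .R = 499, .saturated = TRUE, .seed = 320)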

Value

A list of class cSEMTestOMF containing the following list elements:

$Test_statistic The value of the test statistics.


$Critical_value The corresponding critical values obtained by the bootstrap.
$Decision The test decision. One of: FALSE (Reject) or TRUE (Do not reject).
$Information The .R bootstrap values; The number of admissible results; The seed used and the
number of total runs.

References

Beran R, Srivastava MS (1985). “Bootstrap Tests and Confidence Regions for Functions of a Co-
variance Matrix.” The Annals of Statistics, 13(1), 95–115. doi: 10.1214/aos/1176346579.

Dijkstra TK, Henseler J (2015). “Consistent and Asymptotically Normal PLS Estimators for Linear
Structural Equations.” Computational Statistics & Data Analysis, 81, 10–23.

See Also

csem(), calculateSRMR(), calculateDG(), calculateDL(), cSEMResults, testMICOM(), testMGD()

Examples

# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

## Estimate
out <- csem(threecommonfactors, model, .approach_weights = "PLS-PM")

## Test
testOMF(out, .R = 50, .verbose = FALSE, .seed = 320)

threecommonfactors Data: threecommonfactors

Description
A dataset containing 500 standardized observations on 9 indicators generated from a population
model with three concepts modeled as common factors.

Usage
threecommonfactors

Format
A matrix with 500 rows and 9 variables:
y11-y13 Indicators attached to the first common factor (eta1). Population loadings are: 0.7; 0.7;
0.7
y21-y23 Indicators attached to the second common factor (eta2). Population loadings are: 0.5;
0.7; 0.8
y31-y33 Indicators attached to the third common factor (eta3). Population loadings are: 0.8; 0.75;
0.7
The model is:
eta2 = gamma1 * eta1 + zeta1
eta3 = gamma2 * eta1 + beta * eta2 + zeta2
with population values gamma1 = 0.6, gamma2 = 0.4, and beta = 0.35.

Examples
#============================================================================
# Correct model (the model used to generate the data)
#============================================================================
model_correct <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# Measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

a <- csem(threecommonfactors, model_correct)

## The overall model fit is evidently almost perfect:


testOMF(a, .R = 30, .verbose = FALSE) # .R = 30 to speed up the example

verify Verify admissibility

Description
Verify admissibility of the results obtained using csem().

Usage
verify(.object)

Arguments
.object An R object of class cSEMResults resulting from a call to csem().

Details
Results exhibiting one of the following defects are deemed inadmissible: non-convergence of the
algorithm used to obtain weights; loadings and/or (congeneric) reliabilities larger than 1; a construct
variance-covariance (VCV) and/or model-implied VCV matrix that is not positive semi-definite.
If .object is of class cSEMResults_2ndorder (i.e., estimates are based on a model containing
second-order constructs) both the first and the second stage are checked separately.
Currently, a model-implied indicator VCV matrix for nonlinear models is not available. verify()
therefore skips the check for positive definiteness of the model-implied indicator VCV matrix for
nonlinear models and returns "ok".

Value
A logical vector indicating which (if any) problem occurred. A FALSE indicates that the specific
problem has not occurred. For models containing second-order constructs estimated by a two-stage
approach, a list of two such vectors (one for the first and one for the second stage) is returned. Status
codes are:

• 1: The algorithm has converged.


• 2: All absolute standardized loading estimates are smaller than or equal to 1. A violation
implies either a negative variance of the measurement error or
• 3: The construct VCV is positive semi-definite.
• 4: All reliability estimates are smaller than or equal to 1.
• 5: The model-implied indicator VCV is positive semi-definite. This is only checked for linear
models (including models containing second-order constructs).
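A minimal sketch of how the returned vector is typically used (assuming out is a cSEMResults object
without second-order constructs, as in the Examples section below):

ver <- verify(out)
ver            # logical vector; TRUE marks a violated status code (see above)
sum(ver) == 0  # TRUE if no problem occurred, i.e., the results are admissible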

See Also
csem(), summarize(), cSEMResults

Examples

### Without higher order constructs --------------------------------------------


model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model


eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

# Estimate
out <- csem(threecommonfactors, model)

# Check admissibility
verify(out) # ok!

## Examine the structure of a cSEMVerify object


str(verify(out))

### With higher order constructs -----------------------------------------------


# If the model contains higher-order constructs, both the first- and the second-
# stage estimates are checked for admissibility

## Not run:
require(cSEM.DGP) # download from https://m-e-rademaker.github.io/cSEM.DGP

# Create DGP with 2nd order construct. Loading for indicator y51 is set to 1.1
# to produce a failing first stage model

dgp_2ndorder <- "


## Path model / Regressions
eta2 ~ 0.5*eta1
eta3 ~ 0.35*eta1 + 0.4*eta2

## Composite model
eta1 =~ 0.8*y41 + 0.6*y42 + 0.6*y43
eta2 =~ 1.1*y51 + 0.7*y52 + 0.7*y53
c1 =~ 0.8*y11 + 0.4*y12
c2 =~ 0.5*y21 + 0.3*y22

## Higher order composite


eta3 =~ 0.4*c1 + 0.4*c2
"

dat <- generateData(dgp_2ndorder) # requires the cSEM.DGP package


out <- csem(dat, .model = dgp_2ndorder)

verify(out) # not ok

## End(Not run)

Yooetal2000 Data: Yooetal2000

Description
A data frame containing 34 variables with 569 observations.

Usage
Yooetal2000

Format
An object of class data.frame with 569 rows and 34 columns.

Details
The data is simulated and has the same correlation matrix as the data analysed by Yoo et al.
(2000) to examine how five elements of the marketing mix, namely price, store image,
distribution intensity, advertising spending, and price deals, are related to the so-called dimensions
of brand equity, i.e., perceived brand quality, brand loyalty, and brand awareness/associations. It is
also used in Henseler (2017) and Henseler (2020) for demonstration purposes (Tutorial 10).

Source
Simulated data with the same correlation matrix as the data studied by Yoo et al. (2000).

References
Henseler J (2017). “Bridging Design and Behavioral Research With Variance-Based Structural
Equation Modeling.” Journal of Advertising, 46(1), 178–192. doi: 10.1080/00913367.2017.1281780.

Henseler J (2020). Composite-Based Structural Equation Modeling: An Introduction to Partial
Least Squares & Co. Using ADANCO. Guilford Press.

Yoo B, Donthu N, Lee S (2000). “An Examination of Selected Marketing Mix Elements and
Brand Equity.” Journal of the Academy of Marketing Science, 28(2), 195–211. doi: 10.1177/
0092070300282002.

Examples
#============================================================================
# Example is taken from Henseler (2020)
#============================================================================
model_HOC="
# Measurement models FOC
PR =~ PR1 + PR2 + PR3

IM =~ IM1 + IM2 + IM3


DI =~ DI1 + DI2 + DI3
AD =~ AD1 + AD2 + AD3
DL =~ DL1 + DL2 + DL3
AA =~ AA1 + AA2 + AA3 + AA4 + AA5 + AA6
LO =~ LO1 + LO3
QL =~ QL1 + QL2 + QL3 + QL4 + QL5 + QL6

# Composite model for SOC


BR <~ QL + LO + AA

# Structural model
BR ~ PR + IM + DI + AD + DL
"

out <- csem(.data = Yooetal2000, .model = model_HOC,


.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06)