SEM Overview


STRUCTURAL EQUATION MODELING: AN OVERVIEW

Dr. Dwi Ratmono, M.Si., Ak.


Bi-Variate Regression Analysis
• Bi-variate regression analysis extends correlation
and attempts to measure the extent to which a
predictor variable (X) can be used to make a
prediction about a criterion measure (Y).

[Diagram: X points to Y, with error term e]

➢Bi-variate regression uses a linear model to predict the criterion measure.
✓The formula for the predicted score is:
❖Y' = a + bX
Bi-Variate Regression Analysis
• Y = a + bX + e
a = intercept/constant
= average value of Y when X = 0
b = slope
= average change in Y
for a 1-unit change in X
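
As a minimal sketch of how a and b can be estimated, the following Python computes the least-squares intercept and slope from a small sample. The data values and the function name fit_line are illustrative assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical sample (assumed values, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]

a, b = fit_line(xs, ys)
predicted = [a + b * x for x in xs]                # Y'
residuals = [y - p for y, p in zip(ys, predicted)]  # e = Y - Y'
print(f"a = {a:.2f}, b = {b:.2f}")
```

The residuals are the sample counterpart of e; with an intercept in the model they sum to zero.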
Multiple Regression Analysis
• Multiple regression analysis is an extension of bi-variate
regression, in which several predictor variables are used
to predict one criterion measure (Y).

[Diagram: X1, X2, and X3 each point to Y]

Y = a + b1X1 + b2X2 + b3X3 + e
Predicted score: Y' = a + b1X1 + b2X2 + b3X3
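
The coefficients a, b1, b2, and b3 can be obtained by ordinary least squares. The sketch below solves the normal equations with plain Python so that it has no dependencies; the data and the helper names solve and ols are assumptions made for illustration, and in practice a statistics package would normally be used.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(rows, ys):
    """Return [a, b1, b2, ...] minimizing the sum of squared errors."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data generated from Y = 1 + 2*X1 + 0.5*X2 - 1*X3 (no error).
rows = [(1, 2, 0), (2, 1, 1), (3, 4, 2), (4, 3, 5), (5, 6, 1), (6, 5, 3)]
ys = [1 + 2 * x1 + 0.5 * x2 - x3 for x1, x2, x3 in rows]

coefs = ols(rows, ys)
print([round(c, 4) for c in coefs])
```

Because the illustrative data contain no error term, the estimates recover the generating coefficients up to rounding.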


Path Analysis
• Path Analysis is an extension of regression.
• In Path analysis the researcher is examining the
ability of more than one predictor variable to explain
or predict multiple dependent variables.
• As we enter the first of our modeling procedures,
we must clarify some key terms.

[Path diagram: measured variables X1, X2, Y1, and Y2, with error terms e1 and e2]
Path Analysis

• Measured variables
• Exogenous variables
• Endogenous variables
• Direct effects
• Indirect effects
• Errors in prediction
Definition of terms
• Measured Variables
– Variables that the researcher has observed and
measured.
– In all diagrams, measured variables are depicted by
squares or rectangles
– In path analysis, all variables are measured (X1, X2, Y1,
Y2)
Definition of terms
• Endogenous Variables
– Endogenous variables are those which the model
attempts to explain.
– In this path analysis, two endogenous variables exist:
Y1 and Y2.

Definition of terms
• Direct Effects
– Direct effects are those parameters that estimate the
"direct" effect one variable has on another.
– These are indicated by the arrows that are drawn
from one variable to another.
– In this model, four direct effects are measured
Definition of terms
• Indirect Effects
– Indirect effects are those influences that one variable
may have on another that is mediated through a third
variable.
– In this model, X1 and X2 have a direct effect on Y1
and an indirect effect on Y2 through Y1.
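
An indirect effect is usually computed as the product of the path coefficients along the mediated route. The sketch below assumes standardized coefficients with made-up values; the dictionary layout and function names are hypothetical.

```python
# Hypothetical standardized path coefficients (illustrative values only).
paths = {
    ("X1", "Y1"): 0.50,
    ("X2", "Y1"): 0.30,
    ("Y1", "Y2"): 0.40,
}


def indirect_effect(paths, source, mediator, target):
    """Product of the two direct paths that make up the mediated route."""
    return paths[(source, mediator)] * paths[(mediator, target)]


def total_effect(paths, source, mediator, target):
    """Direct effect (0 if no such path is in the model) plus indirect effect."""
    direct = paths.get((source, target), 0.0)
    return direct + indirect_effect(paths, source, mediator, target)


ind_x1 = indirect_effect(paths, "X1", "Y1", "Y2")
ind_x2 = indirect_effect(paths, "X2", "Y1", "Y2")
print(f"X1 -> Y1 -> Y2: {ind_x1:.2f}")
print(f"X2 -> Y1 -> Y2: {ind_x2:.2f}")
```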

Definition of terms
• Errors in Prediction:
– As in any prediction model, errors in
prediction always exist.
– Thus, Y1 and Y2 will have errors (e1 and e2) in
prediction.
STRUCTURAL EQUATION MODELING

• COVARIANCE STRUCTURE ANALYSIS

• ANALYSIS OF MOMENT STRUCTURE

• CONFIRMATORY FACTOR ANALYSIS

• LISREL ANALYSIS

• LATENT VARIABLE ANALYSIS


CHARACTERISTICS OF STRUCTURAL EQUATION MODELING

• ESTIMATION OF INTERRELATED DEPENDENCE RELATIONSHIPS

• THE ABILITY TO REPRESENT UNOBSERVED CONCEPTS IN THESE RELATIONSHIPS

• ACCOUNT FOR MEASUREMENT ERROR IN THE ESTIMATION PROCESS

• REPRESENTING THEORETICAL CONCEPTS


Structural Equation Modeling
• To test whether theoretical hypotheses
about causal relationships fit the empirical
data.
• It has a confirmatory character (i.e.,
researcher determines the relationships
between the variables)
• It tests relationships between observed as
well as unobserved, latent variables
• It combines regression and factor analysis.
Why SEM?
• Limitations of regression-based models in terms
of:
– reducing measurement and structural errors
– ability to test overall models and individual
parameters (does the model fit the data
(population)?)
– ability to test models with multiple DVs
– ability to model mediator variables (processes)
– ability to model error terms
– ability to model relations across groups and across
time
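[Figure: three ways of modeling the same kind of question.
Multiple Regression: IQ, EQ, and Std Ec. Status predict CGPA, with error e1.
Path Analysis: IQ, Std Ec. Status, and GPA, with errors e1 and e2.
Structural Equation Modeling: latent variables Self-Esteem, Students' Motivation, and Students' Performance, with indicators x1 to x3 (errors δ1 to δ3) and y1 to y5 (errors shown: ε3 to ε5), and disturbances ζ1 and ζ2.]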
LATENT VARIABLES

• VARIABLES THAT WE DO NOT MEASURE DIRECTLY

• A HYPOTHESIZED AND UNOBSERVED CONCEPT THAT CAN ONLY BE
APPROXIMATED BY OBSERVABLE OR MEASURABLE VARIABLES

• THE OBSERVED VARIABLES ARE GATHERED FROM RESPONDENTS
THROUGH VARIOUS DATA COLLECTION METHODS AND ARE KNOWN AS
MANIFEST VARIABLES
RELATIONSHIP BETWEEN LATENT VARIABLES AND
THEIR INDICATORS

Latent variables (theoretical): X, Y
Indicators (empirical): X1, Y1
Hypothesis
EXPLORATORY FACTOR ANALYSIS

FACTOR A FACTOR B

A1 A2 A3 B1 B2 B3
CONFIRMATORY FACTOR ANALYSIS

FACTOR A FACTOR B

A1 A2 A3 B1 B2 B3
Structural Equation Modeling

CFA Regression CFA


Factor A Factor B

X1 X2 X3 X4 X5 X6
SEM Diagram

• Ovals or circles: latent variables, factors, constructs
• Rectangles or squares: observed variables, measures, indicators, manifest variables
• Single-headed arrows: direction of influence, relationship from one variable to another
• Double-headed curved arrows: association not explained within the model


SEM = Factor Analysis + Regression
• Variables
– observed = measured, manifest, indicators
• can be items, subscales, or scales
– latent = theoretical constructs
• variables that are defined by the observed variables
– goal is to model the commonality in the observed
variables
• sound like factor analysis?
– and then look at relations between latent variables
• sound like multiple regression?
Factor Analysis
• Factor analysis is a fundamental
component of structural equation
modeling.
• Factor analysis explores the
inter-relationships among variables to discover
if those variables can be grouped into a
smaller set of underlying factors.
Applications of Factor Analysis
• Three primary applications of Factor Analysis include:
– Explore data for patterns. Often a researcher is unclear whether
items or variables have discernible patterns. Factor
analysis can be done in an exploratory fashion to reveal
patterns among the inter-relationships of the items.
– Data reduction. Factor analysis can be used to reduce a
large number of variables into a smaller and more
manageable number of factors. Factor analysis can create
factor scores for each subject that represent these
higher-order variables.
– Confirm hypothesis of factor structure. In measurement
research when a researcher wishes to validate a scale with a
given or hypothesized factor structure, Confirmatory Factor
Analysis is used.
Exploratory Factor Analysis
• In exploratory factor analysis, the researcher
is attempting to explore the relationships among
items to determine if the items can be grouped
into a smaller number of underlying factors.
• In this analysis, all items are assumed to
be related to all factors.
• In this analysis, we have now introduced
more terms.

[Diagram: items I1 to I4, each with an error term E, connected to Factor I and Factor II]
Confirmatory Factor Analysis
• Confirmatory factor analysis meets the third
application of factor analysis:
– To confirm a hypothesized factor structure.
– Used as a validity procedure in measurement
research.
• Confirmatory factor analysis differs from exploratory
factor analysis in that a specific relationship between
the items and the factors is hypothesized in advance
and then tested.
– Certain items are hypothesized to load on given
factors
– Not all items load on all factors
Comparison EFA versus CFA

[Figure: two panels, Exploratory Factor Analysis and Confirmatory Factor Analysis, each showing items I1 to I4 with error terms E and the factors Factor I and Factor II]
Confirmatory Factor Analysis
• Note:
– In confirmatory factor analysis, only certain
items are proposed to be indicators of each factor.
– The curved line indicates the relationship that
could exist between the factors.
– Again, the errors in measurement are shown
by the circles with the E.

[Diagram: items I1 to I4, each with an error term E; Factor I and Factor II joined by a curved line]
Basic Steps for CFA
• There are six basic steps to performing a CFA:
1) Define the factor model. The first thing you need to
do is to precisely define the model you wish to test.
This involves selecting the number of factors, and
determining the nature of the loadings between the
factors and the measures. These loadings can be
fixed at zero, fixed at another constant value,
allowed to vary freely, or be allowed to vary under
specified constraints (such as being equal to another
loading in the model).
✓ Here you use the knowledge derived from your
EFA to specify the CFA model.
2) Collect measurements. You need to
measure your variables on the same (or
matched) participants. Remember not to
reuse the data from the EFA in the CFA.
3) Obtain the correlation matrix. You need to
obtain the correlations (or covariances)
between each of your variables.
4) Fit the model to the data. You will need to
choose a method to obtain the estimates of
factor loadings that were free to vary.
✓ The most common model-fitting procedure is
Maximum likelihood estimation, which should
probably be used unless your measures
seriously lack multivariate normality.
✓ In that case you might try
asymptotically distribution-free (ADF) estimation.
5) Evaluate model adequacy.
✓ When the factor model is fit to the data, the factor
loadings are chosen to minimize the discrepancy
between the correlation matrix implied by the model
and the actual observed matrix.
✓ The amount of discrepancy after the best
parameters are chosen can be used as a measure of
how consistent the model is with the data.
✓ This is where we use our fit statistics!
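
To make the idea of a discrepancy concrete, the sketch below assumes a one-factor model with standardized loadings and uncorrelated errors, in which case the implied correlation between two indicators is the product of their loadings. The loadings and the observed matrix are invented, and the root-mean-square residual is a simple unweighted summary rather than the maximum likelihood discrepancy function that SEM software minimizes.

```python
import math


def implied_corr(loadings):
    """Model-implied correlation matrix for a standardized one-factor model."""
    n = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


def rms_residual(observed, implied):
    """Root mean square of the off-diagonal (observed - implied) residuals."""
    n = len(observed)
    squared = [
        (observed[i][j] - implied[i][j]) ** 2
        for i in range(n)
        for j in range(i)  # lower triangle only
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical loadings and observed correlations (illustrative values).
loadings = [0.8, 0.7, 0.6]
observed = [
    [1.00, 0.60, 0.45],
    [0.60, 1.00, 0.42],
    [0.45, 0.42, 1.00],
]

implied = implied_corr(loadings)
rms = rms_residual(observed, implied)
print(f"RMS residual = {rms:.4f}")
```

Smaller residuals mean the model reproduces the observed matrix more closely.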
Construct Validity
• Convergent validity
– In CFA, the indicators of a latent variable
should converge, i.e., share a high proportion
of variance in common.
• Significant loadings
• Standardized loadings >0.5; preferably >0.7
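• Variance extracted (AVE), for n indicators with standardized loadings λi:

$$\mathrm{AVE} = \frac{\sum_{i=1}^{n} \lambda_i^2}{\sum_{i=1}^{n} \lambda_i^2 + \sum_{i=1}^{n} \left(1 - \lambda_i^2\right)}$$

– 1 − λi² = var(εi)
– AVE > .5 indicates good convergent validity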
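
A small sketch of the AVE calculation, assuming standardized loadings so that each error variance is 1 − λ². The loadings are hypothetical and the function name is an assumption.

```python
def average_variance_extracted(loadings):
    """AVE from standardized loadings: explained / (explained + error)."""
    explained = sum(l ** 2 for l in loadings)
    error = sum(1 - l ** 2 for l in loadings)  # var(e_i) = 1 - lambda_i^2
    return explained / (explained + error)


# Hypothetical standardized loadings for one construct.
loadings = [0.80, 0.75, 0.70]
ave = average_variance_extracted(loadings)
print(f"AVE = {ave:.3f}")
```

With standardized loadings the denominator equals the number of indicators, so AVE is simply the mean squared loading.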


Reliability
• Measurement reliability: R²
• Construct reliability
– Some prefer Cronbach's alpha, although
it tends to underestimate reliability
– CR, for n indicators with standardized loadings λi:

$$\mathrm{CR} = \frac{\left(\sum_{i=1}^{n} \lambda_i\right)^2}{\left(\sum_{i=1}^{n} \lambda_i\right)^2 + \sum_{i=1}^{n} \operatorname{var}(\varepsilon_i)}$$

– CR of 0.7 or more indicates good reliability,
while 0.6 to 0.7 is acceptable
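
Construct reliability can be sketched the same way, assuming the common composite reliability form: squared sum of standardized loadings over that quantity plus the summed error variances, with each error variance taken as 1 − λ². The loadings and the function name are illustrative assumptions.

```python
def construct_reliability(loadings):
    """Composite reliability from standardized loadings."""
    squared_sum = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)  # var(e_i) = 1 - lambda_i^2
    return squared_sum / (squared_sum + error)


# Hypothetical standardized loadings for one construct.
loadings = [0.80, 0.75, 0.70]
cr = construct_reliability(loadings)
print(f"CR = {cr:.3f}")
```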
Structural Model
• CFA plus a priori structural model is tested
– two step process (two-step modeling)
• establish the measurement model
• test the structural model
– direct relations among latent variables are modeled
– i.e., regression with latent variables
Structural Model

[Diagram: Motivation (ξ1) points to Performance (η2) through the structural coefficient γ21; Performance has a disturbance (ζ2); each latent variable has four indicators (S1 to S4), each with its own error term]


• Types of latent variables (LV)
– exogenous: LVs that only “cause” other LVs
– endogenous: LVs that are “caused” by other LVs
• pure DVs: are only “caused”
• mediator (intervening): are a “cause” and “caused”

[Diagram: latent variables Confidence, Motivation, and Performance]
THE ROLE OF THEORY IN SEM
• THEORY
– A SYSTEMATIC SET OF RELATIONSHIPS PROVIDING CONSISTENT AND
COMPREHENSIVE EXPLANATION OF A PHENOMENON

• A THEORETICAL MODEL
– TO GUIDE THE ESTIMATION PROCESS AND MODEL MODIFICATION

• SEM IS GUIDED MORE BY THEORY THAN BY EMPIRICAL RESULTS
DEVELOPING A MODELING STRATEGY
• CONFIRMATORY MODELING STRATEGY
– THE RESEARCHER SPECIFIES A SINGLE MODEL, AND SEM IS USED
TO ASSESS ITS STATISTICAL SIGNIFICANCE

• COMPETING MODELS STRATEGY
– TO TEST COMPETING THEORIES

• EQUIVALENT MODELS
– AT LEAST ONE OTHER MODEL WITH THE SAME NUMBER OF
PARAMETERS AND THE SAME LEVEL OF MODEL FIT THAT VARIES
IN THE RELATIONSHIPS PORTRAYED
STAGES IN SEM

1. DEVELOP A THEORETICALLY BASED MODEL
– CONFIRMATORY
– COMPETING MODELS
– MODEL DEVELOPMENT
– SPECIFY CAUSAL RELATIONSHIPS
– AVOID SPECIFICATION ERROR

2. CONSTRUCT A PATH DIAGRAM
– DEFINE EXOGENOUS AND ENDOGENOUS CONSTRUCTS
– LINK RELATIONSHIPS IN PATH DIAGRAM

3. CONVERT THE PATH DIAGRAM
– TRANSLATE THE STRUCTURAL EQUATIONS
– SPECIFY THE MEASUREMENT MODELS
– DETERMINE THE NUMBER OF INDICATORS
– ACCOUNT FOR CONSTRUCT RELIABILITY

4. CHOOSE THE INPUT MATRIX TYPE

5. ASSESS THE IDENTIFICATION OF THE MODEL

6. EVALUATE MODEL ESTIMATES AND GOODNESS OF FIT

7. MODEL INTERPRETATIONS
– CONSIDER MODIFICATION INDICES
– IDENTIFY POTENTIAL MODEL CHANGES

8. MODEL MODIFICATION
– FIND THEORETICAL JUSTIFICATION FOR THE PROPOSED
MODEL CHANGES

9. FINAL MODEL
Assumptions
• Theory-driven Model
• Large sample size: more than 100 or 200 cases
– A more complex model requires a larger data set:
at least 10 cases per observed variable
• Linearity: the relationships between latent and
observed variables (loadings) must be linear
• Multivariate normality (c.r. < 5.00; Byrne, 2010)
• No multicollinearity among observed variables
• No outliers (Mahalanobis distance below the χ² critical
value, pattern inspection; Byrne, 2010)
Assumptions
• Uncorrelated measurement errors
• Data scale:
– Strictly continuous: Jöreskog (1993, 2002)
VS
– Both continuous and ordinal/interval/Likert-type
scales: (Schumacker & Lomax, 2006)
Identification Problem
• Check whether the model can be solved,
i.e., whether there is enough information
from the empirical data to determine the
unknown parameters
a + 4 = 6; a = 2 ➔ model is identified!!
• But what if a × b = 60???
a × b = (2 × 30; 3 × 20; 5 × 12; 10 × 6, etc.)
Model is unidentified!!
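
The two cases can be checked in a few lines of Python. This sketch only enumerates positive whole-number solutions, which is an assumption made to keep the output short; with real numbers, a × b = 60 has infinitely many solutions.

```python
def factor_pairs(product):
    """All positive integer pairs (a, b) with a * b == product."""
    return [(a, product // a) for a in range(1, product + 1) if product % a == 0]


# One equation, one unknown: a unique solution exists.
a = 6 - 4

# One equation, two unknowns: many solutions fit equally well.
pairs = factor_pairs(60)
print(f"a = {a}")
print(f"{len(pairs)} integer solutions for a x b = 60, e.g. {pairs[1]}, {pairs[2]}")
```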
Identification Problem
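• An identified model must satisfy the counting rule
(a necessary but not sufficient condition):
• t ≤ (p + q)(p + q + 1)/2, where:
• t = number of parameters to be estimated
• p and q = number of observed indicators
of the endogenous and exogenous
latent variables, respectively
• Equality gives a just-identified model; strict
inequality gives an over-identified model
• The software will report whether the model is
identified or not, though ☺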
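
The counting rule is easy to check by hand or in code. The sketch below treats equality (zero degrees of freedom) as passing. The example is a hypothetical model with one exogenous and one endogenous latent variable and three indicators each, assuming one loading per latent variable is fixed to 1 for scaling, which leaves t = 13 free parameters (4 loadings, 6 error variances, 1 factor variance, 1 disturbance variance, 1 structural coefficient).

```python
def unique_moments(p, q):
    """Distinct variances and covariances among p + q observed variables."""
    k = p + q
    return k * (k + 1) // 2


def t_rule(t, p, q):
    """Necessary (not sufficient) counting condition for identification."""
    moments = unique_moments(p, q)
    return {"moments": moments, "df": moments - t, "passes": t <= moments}


# Hypothetical model: p = 3 endogenous indicators, q = 3 exogenous indicators.
result = t_rule(t=13, p=3, q=3)
print(result)
```

A model can pass this check and still be unidentified, so the software's own diagnostics remain the final word.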
Evaluation of the results
• Again, the goal in SEM is to find a
theoretically-sound model that fits well with
the data
• How do we figure it out?
• SEM provides the goodness of fit
measures in the “modification indices
section”
Summary of Fit Indexes
Each entry lists the fit index, the level accepted as model fit, and a description.

• Probability of χ² (chi-square)
– Acceptance: p > 0.05
– Description: only valid with reasonable sample size (75 to 200 cases) and complexity
• χ²/df
– Acceptance: < 5 (Wheaton 1977); < 2 (Carmines & McIver 1981)
– Description: overcomes the issues of the χ² probability; robust to model complexity
• Tucker Lewis Index (TLI)
– Acceptance: close to 1 indicates model fit
– Description: overcomes the issues of NFI. A TLI value of more than 1 is possible and should be set to 1
• Comparative Fit Index (CFI)
– Acceptance: close to 1 indicates model fit
– Description: if CFI < 1, CFI > TLI. A CFI value of more than 1 is possible and should be set to 1
• Root Mean Square Error of Approximation (RMSEA)
– Acceptance: < 0.05 = good model; > 0.1 = poor model
– Description: should also be interpreted with PCLOSE, which should be greater than 0.05
• Akaike's Information Criterion (AIC)
– Description: when comparing two models, the model with the lower AIC is preferred
• Goodness of Fit Index (GFI) and Adjusted GFI (AGFI)
– Acceptance: GFI and AGFI should be more than 0.9
– Description: the GFI and AGFI are very sensitive to sample size. The consensus is that these indexes should not be used
• Normed Fit Index (NFI)
– Acceptance: more than 0.9
– Description: a model with NFI below 0.9 can usually be improved substantially. However, NFI cannot decrease when parameters are added
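
Several of these indexes are simple functions of the χ² statistics of the tested model and of the baseline (independence) model. The sketch below uses the commonly cited formulas; the input values are invented, and the RMSEA here divides by N − 1, whereas some programs use N, so results may differ slightly from a given package.

```python
import math


def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Fit indexes from model (m) and baseline (b) chi-square values."""
    d_m = max(chi2_m - df_m, 0.0)  # model noncentrality estimate
    d_b = max(chi2_b - df_b, 0.0)  # baseline noncentrality estimate
    worst = max(d_m, d_b)
    ratio_m = chi2_m / df_m
    ratio_b = chi2_b / df_b
    return {
        "chi2_df": ratio_m,
        "rmsea": math.sqrt(d_m / (df_m * (n - 1))),
        "nfi": (chi2_b - chi2_m) / chi2_b,
        "tli": (ratio_b - ratio_m) / (ratio_b - 1.0),
        "cfi": 1.0 - d_m / worst if worst > 0 else 1.0,
    }


# Hypothetical results: 6 observed variables, N = 201 cases.
fit = fit_indices(chi2_m=16.0, df_m=8, chi2_b=400.0, df_b=15, n=201)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example CFI is below 1 and larger than TLI, in line with the description in the summary.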
Evaluation of the results

• Plausibility of parameter estimates

• t-value for the estimated parameters,
showing whether they differ from 0:
|t| > 1.96, p < .05
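
The t-value (critical ratio) is the estimate divided by its standard error. The sketch below uses a normal approximation for the two-sided p-value, which is an assumption that is reasonable at the large sample sizes SEM requires; the estimate and standard error are invented.

```python
import math


def critical_ratio(estimate, std_error):
    """Estimate divided by its standard error."""
    return estimate / std_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical parameter estimate and standard error.
cr_value = critical_ratio(0.42, 0.15)
p_value = two_sided_p(cr_value)
print(f"t = {cr_value:.2f}, p = {p_value:.4f}")
print("significant at .05" if abs(cr_value) > 1.96 else "not significant at .05")
```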
Modification of the model
• What if the model does not fit? Modify it!
(based on Modification Index)
• Based on theory
• How?:
• Simplify the model (delete non-significant
parameters or parameters with large
standard errors)
• Expand the model (new paths: either
structural or measurement parts)
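SEM Moderation Model (Ping, 1995)
Figure 15.4
Stage II: Model with a Moderating Variable

[Path diagram: Partisipasi Anggaran (Budget Participation; indicators x1 to x3, errors e1 to e3), Struktur Organisasi (Organizational Structure; indicators x4 to x6, errors e4 to e6), Interaksi (Interaction; single indicator "interak" with an error term), and Kinerja Manajerial (Managerial Performance; indicators x7 to x9, errors e7 to e9), with disturbance z1. Values shown on the interaction part: 47.536 and 5.834.]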
A Two-Step Approach to SEM
Many researchers have proposed and employed a two-step
approach to structural equation modeling (e.g., Crosby, Evans, and
Cowles, 1990; Settoon, Bennett, and Liden, 1996; Brown and
Peterson, 1994; Ganesan, 1994; Hartline and Ferrell, 1996; Howell,
1987; and Anderson and Gerbing, 1988).

In a two-step process, the measurement model is estimated first
and then fixed in the second stage, when the structural model is
estimated (Anderson and Gerbing, 1988).

The measurement model, in conjunction with the structural model,
enables a comprehensive, confirmatory assessment of construct
validity. A two-step approach allows tests of significance for
all pattern coefficients. Convergent validity can be assessed
from the measurement model by determining whether each
indicator's estimated pattern coefficient on its posited underlying
construct factor is significant, that is, greater than twice its
standard error.
