Am (121-130) Analisis Multinivel

Graphing Data in Multilevel Contexts 105
ordinate (y axis). The value to the right is treated as an independent vari-
able and plotted on the abscissa (x axis). Alternatively, we can rearrange the
terms so that the independent variable comes first, with a comma separating
it from the dependent variable:
plot(anscombe$x1, anscombe$y1)
Both approaches lead to the same plot.

Several options within the plotting framework can be utilized. The plot
function has six base parameters, with the option of calling other graphi-
cal parameters including, among others, a par function. This function has
more than 70 graphical parameters that can be used to modify a basic plot.
Discussing all of the available plotting parameters is beyond the scope of
this chapter. Rather, we will discuss some of the most important parameters
to consider when plotting in R.
The parameters ylim and xlim modify the starting and ending points for
the y and x axes, respectively. For example, ylim = c(2, 12) will produce
a plot with the y axis scaled from 2 to 12. R typically automates this process,
but it can be useful for the researcher to tweak this setting, for example, by
setting the same axis across multiple plots. The ylab and xlab param-
eters are used to create labels for the y and x axes, respectively. For exam-
ple, ylab = "Dependent Variable" will produce a plot with the y axis
labeled “Dependent Variable.” The main parameter is used for the main title
of a plot. Thus, main = "Plot of Observed Values" would produce a
plot with the title “Plot of Observed Values” above the graph.
A sub-parameter provides a subtitle that appears at the bottom of a plot,
centered and below the xlab label. For example, sub = "Data from
Study 1" would produce such a subtitle. In some situations it is useful to
include text, equations, or a combination in a plot. Text is easy to include
by using the text function. For example, text(2, 5, "Include This
Text") would place “Include This Text” in the plot centered at x = 2 and y = 5.
Equations can also be included in graphs. Doing so requires the use of
expression within the text function call. The function expression
allows the inclusion of an unevaluated expression (i.e., it displays what is
written). The particular syntax for a mathematical expression is available by
calling help for the plotmath function (i.e., ?plotmath). R provides a
demonstration of the plotmath functionality via demo(plotmath), which
shows the various mathematical expressions that may be displayed in a fig-
ure. As an example, to add the R 2 information to the figure in the top left
sub-figure in Figure 6.1, the following command in R was used:
text(5.9, 9.35, expression(italic(R)^2 = =.67))
The values of 5.9 (on the x axis) and 9.35 (on the y axis) are simply where
we thought the text looked best, and can be easily adjusted to suit a user’s
preference.
106 Multilevel Modeling Using R
Combining text and mathematical expressions requires using the paste

function in conjunction with the expression function. For example, if we
wanted to add “The Value of R 2 = 0.67,” we would replace the previous text
syntax with
te
xt(5.9, 9.35, expression(paste("The value of ", italic(R)^2,
" is.67", sep = "")))
Here, paste is used to bind together the text contained within the quotes
and the mathematical expression. Note that sep = "" is used so that there
are no spaces added between the parts pasted together. Although we did not
include it in our figure, the implied regression line can be easily included in a
scatterplot. One way to do this is through the abline function, and to include
the intercept (denoted a) and the slope (denoted b). Thus, abline(a = 3,
b =.5) would add a regression line to the plot that has an intercept at 3 and
a slope of 0.5. Alternatively, to automate the operation, using abline(lm.
object) will extract the intercept and slope and include the regression line
in a scatterplot.
Finally, notice in Figure 6.2 that we have broken the y axis to make very
clear that the plot does not start at the origin. Whether this is needed may
be debatable; note that base R does not include this option, but we prefer to
Anscombe’s Data Set 1

12
10 ^
Y = 3 + X0.5
The value of R2 is 0.67
8
Y
2
4 6 8 10 12 14
X
FIGURE 6.2
Plot of Anscombe’s Data Set 1 with the regression line of best fit, axis breaks, and text
denoting the regression line and values of the squared multiple correlation coefficient.
(Source: Anscombe, F.J. (1973). Graphs in Statistical Analysis. American Statistician, 27, 17–21.
With permission.)
include it in many situations. A broken axis can be identified with the axis.
break function from the plotrix package. The zigzag break is requested
via the style = "zigzag" option in axis.break and the particular axis
(1 for y and 2 for x). By default, R will set the axis to a point that is generally
appropriate. However, when the origin is not shown, there is no break in the
axis by default as some have argued is important.
Now, let us combine the various pieces of information that we have dis-
cussed to produce Figure 6.2, which was generated with the following syntax.
data(anscombe)
# Fit the regression model.

data.1 <- lm(y1~x1, data = anscombe)
# Scatterplot of the data.

plot(anscombe$y1 ~ anscombe$x1,
ylab = expression(italic(Y)),
ylim = c(2, 12),
xlab = expression(italic(X)),
main = "Anscombe's Data Set 1")
# Add the fitted regression line.

abline(data.1)
# Add the text and expressions within the figure.

text(5.9, 9.35,
ex
pression(paste("The value of ", italic(R)^2, " is.67",
sep = "")))
text(5.9, 10.15,
expression(italic(hat(Y)) = =3+italic(X)*.5))
# Break the axis by adding a zigzag.

require(plotrix)
axis.break(axis = 1, style = "zigzag")
axis.break(axis = 2, style = "zigzag")
6.1 Plots for Linear Models

To further demonstrate graphing in R, let us recall the Cassidy GPA data
from Chapter 1, in which GPA was modeled by CTA.tot and BStotal. We
will now discuss some plots that are useful with single-level data and may
be extended easily to the multilevel case with some caveats. See Figure 6.3.
First, let us consider the pairs function that plots all pairs of variables in a
data set. The resulting graph is sometimes called a scatterplot matrix because
20 30 40 50 60 70
4.0
3.5
GPA 3.0
2.5
2.0
70
60
50
CTA_Total
40
30
20
40
35
30
BS_Total 25
20
15
10
2.0 2.5 3.0 3.5 4.0 10 15 20 25 30 35 40
FIGURE 6.3
Pairs plot of Cassidy GPA data showing bivariate scatterplots for all three variables.
it is in fact a matrix of scatterplots. In the context of a multiple regression

model, understanding how the variables all relate to one another can pro-
vide useful insights as we conduct our analysis. As an example, consider the
following call to pairs:
pairs(
cb
ind(GPA = Cassidy$GPA, CTA_Total = Cassidy$CTA.tot,
BS_Total = Cassidy$BStotal))
Because our data contains p variables, we will obtain p · (p – 1)/2 unique

scatterplots (three in this case). The plots below the principal diagonal are
the same as those above it; the only difference is the reversal of the x and
y axes. Such a pairs plot allows multiple bivariate relationships to be
visualized simultaneously. Of course, we can quantify the degree of linear
relation with a correlation. Code to do this can be given as follows, using

listwise deletion:
cor(na.omit(
cb
ind(GPA = Cassidy$GPA, CTA_Total = Cassidy$CTA.tot,
BS_Total = Cassidy$BStotal)))
Other options are available for dealing with missing data (see ?cor). We used
the na.omit function wrapped around the cbind function to obtain a list-
wise deletion data set in which the following correlation matrix is computed.
GPA CTA_Total BS_Total

GPA 1.000 -0.300 -0.115
CTA_Total -0.300 1.000 0.708
BS_Total -0.115 0.708 1.000
Of course, when using multiple regression we must make some assumptions

about the distribution of the model residuals, as discussed in Chapter 1.
In particular, in order for the p values and confidence intervals to be exact,
we must assume that the distribution of residuals is normal. We can obtain
the residuals from a model applying the resid function to a fitted lm object
(e.g., GPAmodel.1 <- lm(GPA ~ CTA.tot + BStotal, data = Cass).
Then, resid(GPAmodel.1)) returns the model’s residuals that may be plot-
ted in a variety of ways. One useful such plot is a histogram with an overlaid
normal density curve (Figure 6.4) that can be obtained using the following
R command:
hist(resid.1,
fr
eq = FALSE, main = "Density of Residuals for Model 1",
xlab = "Residuals")
lines(density(resid.1))
This code first requests that a histogram be produced for the residuals. Note
that freq = FALSE is used. It instructs R to make the y axis scaled in
terms of probability, not the default (frequency). The solid line represents
the density estimate of the residuals, corresponding closely to the bars in the
histogram.
Rather than a histogram and overlaid density, a more straightforward way to
evaluate the distribution of the errors to the normal distribution is a quantile–
quantile (QQ) plot that includes a straight line reflecting the expected distribu-
tion of the data if in fact it is normal. In addition, the individual data points are
represented in the figure as dots. In a normal data distribution, the dots will
fall along or very close to the straight line. The following code produced the
QQ plot in Figure 6.5, based on the residuals from the GPA model.
qqnorm(scale(resid.1), main = "Normal Quantile-Quantile Plot")

qqline(scale(resid.1))
Density of Residuals for Model 1

1.0
0.8
0.6
Density
0.4
0.2
0.0
−1.5 −1.0 −0.5 0.0 0.5 1.0
Residuals
FIGURE 6.4
Histogram with overlaid density curve of residuals from GPA model in Chapter 1.
Normal Quantile−Quantile Plot

2
1
Sample Quantiles
−1
−2
−3
−3 −2 −1 0 1 2 3
Theoretical Quantiles
FIGURE 6.5
Plot comparing observed quantiles of residuals to theoretical quantiles from standard normal
distribution. When points fall along the line that has slope 1 and goes through the origin,
sample quantiles follow normal distribution.
Notice that above we use the scale function that standardizes the residuals
to have a mean of zero (already done due to the regression model) and a
standard deviation of 1.
We can see in Figure 6.5 that points in the QQ plot diverge from the line
for the higher end of the distribution. This is consistent with the histogram
in Figure 6.4 that shows a shorter tail on the high end as compared to the low
end of the distribution. This plot and the histogram reveal that the model
residuals diverge from a perfectly normal distribution. Of course, the degree
of non-normality can be quantified (e.g., with skewness and kurtosis mea-
sures). However, in this chapter we are most interested in visualizing the
data, and especially looking for gross violations of assumptions. We should
note here the grey area between satisfaction of an assumption and a gross
violation. At this point, we leave the interpretation to the reader and provide
information on how such visualizations can be made.
6.2 Plotting Nested Data

Earlier, we illustrated some basic plotting capabilities of R for linear models
with only single levels. These tools are also potentially useful in multilevel
contexts even though they are not specific to that use. Next, we move on to
graphical tools more specifically useful with multilevel data.
Multilevel models are often applied to relatively large, indeed sometimes
huge, data sets. Such data sets provide a richness that cannot be realized
in studies of small samples. However, a complication that often arises from
this vastness of multilevel data is the difficulty of creating plots that can
summarize large amounts of information and thereby clearly portray the
nature of relationships among the variables. For example, the Prime Time
school data set contains more than 10,000 third grade students. A single plot
of all 10,000 would be overwhelming and fairly uninformative. Including
a nesting structure (e.g., school corporation) can lead to many corporation-
specific plots because the data contain 60 corporations. Thus the plotting
of nested data necessarily carries more nuances (difficulties) than typically
appear with single-level data.
Graphing such data and ignoring the nesting structure can lead to aggre-
gation bias and misinterpretation of results. For example, two variables
may be negatively related within nested structures (e.g., classrooms) but
positively related overall (e.g., when ignoring classrooms). This is sometimes
known as Simpson’s paradox. In addition, sometimes a nested structure
of data does not change the sign of the relationship between variables, but
rather the estimate of the relationship can be strengthened or suppressed
by the nested data structure. This result has been called the reversal para-
dox (Tu, Gunnell, & Gilthorpe, 2008). Thus, when plotting multilevel data,
the nested structure should be explicitly considered. If not, at minimum, a

researcher must realize that the relationships in the unstructured data may
differ when the nesting structure is considered. Although dealing with the
complexities in plotting nested data can be at times vexing, the richness that
nested data provide far outweighs any of the complications that may arise
from applying the technique.
6.3 Using the lattice Package

The lattice package in R provides several powerful tools for plotting
nested data (Sarkar, 2008).
6.3.1 dotplot
One very useful function in this package is dotplot. One way to use this
function is to plot a variable of interest on the x axis with a grouping variable
(e.g., classrooms) on the y axis. In the Prime Time data, identifiers for each
corporation, each school within corporation, and each individual student are
included. Classrooms within a school are simply numbered 1 – nj, where nj
is the number of classrooms in the school. This type of structure is typical
of large multilevel data sets. Suppose we want to see how reading achieve-
ment (geread) is distributed within and between classrooms. To begin, we
take a single school (767) in a single corporation (940). This is the first school
represented in the data set and we are using it solely for illustrative pur-
poses. Within this school are four classrooms. To compare the distributions
of geread scores between classrooms within the school, we can use the
dotplot function from the lattice package as follows:
dotplot(
class ~ geread,
data = Achieve.940.767, jitter.y = TRUE, ylab = "Classroom",
ma
in = "Dotplot of \'geread\' for Classrooms in School 767,
Which is Within Corporation 940")
Several programming points should be noted here. First, we created a new

data set containing data for just this individual school, using the following
code:
Ac
hieve.940.767 <- Achieve[Achieve$corp = =940 &
Achieve$school = =767,]
This R command literally identifies the rows in which corp is equal to

940 (equality checks require two equal signs) and school is equal to 767.
The class ~ geread component of the code instructs the function to plot
geread for each classroom. The jitter.y parameter is used to jitter or
slightly shift overlapping data points in the graph. For example, if multiple
students in the same classroom have the same scores for geread, using
the jitter option will shift those points on the y axis to indicate clearly the
multiple values at the same x value. Finally, labels can be specified. Note
that the use of \'geread\' in the main title puts geread in a single quote
in the title. Calling the dotplot function using these R commands yields
Figure 6.6.
The figure shows the dispersion of geread for the classrooms in school
767. From this plot, we can see that students in each of the four classrooms
had generally similar reading achievement scores. However, it is also clear
that classrooms 2 and 4 have multiple students with outlying scores that are
higher than those of other individuals within the school. We hope we have
shown clearly how a researcher may make use of this type of plot when
examining the distributions of scores for individuals at a lower level of data
(e.g., students) nested within a higher level such as classrooms or schools.
Because the classrooms within a school are arbitrarily numbered, we
can alter the order in which they appear in the graph to make a display
more meaningful. Note that if the order of the classrooms had not been
arbitrary (e.g., honors classes numbered 1 and 2), we would need to be
Dotplot of ‘geread’ for Classrooms in School 767
3
Classroom
0 2 4 6 8 10 12
geread
FIGURE 6.6
The dotplot of classrooms in school 767 (within corporation 940) for geread.
very careful about changing the order. However, in this case, no such con-
cerns are necessary. In particular, the use of the reorder function on the
left side of the ~ symbol will reorder the classrooms in ascending order of
the variable of interest (geread in this case) in terms of the mean. Thus,
we can modify Figure 6.6 to place the classes in descending order by the
mean of geread.
dotplot(
reorder(class, geread) ~ geread,
data = Achieve.940.767, jitter.y = TRUE, ylab = "Classroom",
ma
in = "Dotplot of \'geread\' for Classrooms in School 767,
Which is Within Corporation 940")
From Figure 6.7, it is easier to see the within-class and between-class vari-

ability for school 767. Visually, at least, it is clear that classroom 3 is more
homogeneous (smaller variance) and lower performing (smaller mean) than
classrooms 2 and 4.
Although plots such as those in Figures 6.6 and 6.7 are useful, creating one
for each school would yield so much visual information that it would be dif-
ficult to draw any meaningful conclusions from the data. Therefore, suppose
that we ignored the classrooms and schools and instead focused on the high-
est level of data, corporation. Using what we already learned, it is possible
Dotplot of ‘geread’ for Classrooms in School 767
4
Classroom
0 2 4 6 8 10 12
geread
FIGURE 6.7
The dotplot of classrooms in school 767 (within corporation 940) for geread with classrooms
ordered by means (lowest to highest).

Am (121-130) Analisis Multinivel

Uploaded by

Copyright:

Available Formats

Am (121-130) Analisis Multinivel

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Am (121-130) Analisis Multinivel

Uploaded by

Copyright:

Available Formats

Graphing Data in Multilevel Contexts 105

Both approaches lead to the same plot.

text(5.9, 9.35, expression(italic(R)^2 = =.67))

Combining text and mathematical expressions requires using the paste

Anscombe’s Data Set 1

# Fit the regression model.

# Scatterplot of the data.

# Add the fitted regression line.

# Add the text and expressions within the figure.

# Break the axis by adding a zigzag.

6.1 Plots for Linear Models

it is in fact a matrix of scatterplots. In the context of a multiple regression

Because our data contains p variables, we will obtain p · (p – 1)/2 unique

relation with a correlation. Code to do this can be given as follows, using

GPA CTA_Total BS_Total

Of course, when using multiple regression we must make some assumptions

qqnorm(scale(resid.1), main = "Normal Quantile-Quantile Plot")

Density of Residuals for Model 1

Normal Quantile−Quantile Plot

6.2 Plotting Nested Data

the nested structure should be explicitly considered. If not, at minimum, a

6.3 Using the lattice Package

Several programming points should be noted here. First, we created a new

This R command literally identifies the rows in which corp is equal to

Dotplot of ‘geread’ for Classrooms in School 767

From Figure 6.7, it is easier to see the within-class and between-class vari-

Dotplot of ‘geread’ for Classrooms in School 767

You might also like