Learning Unit 12
t-tests
We have already discussed hypothesis testing in learning unit 11. In this learning unit we will
test a hypothesis empirically in order to determine scientifically whether there is a significant
difference between the means of two sets of scores. Determining the difference between two
sets of data is something that we often have to do in the work situation. Remember the
example of the emotional intelligence of New Stars applicants and contestants given in
learning unit 11?
There are different kinds of test statistics that we can use to test hypotheses,
depending on the type of data. In this learning unit we look at the different t-tests: the t-test for
independent groups and the t-test for two related groups, as covered by Tredoux and
Durrheim (2013) in Tutorial 9.
12.1
To familiarise yourself with the concepts and some of the assumptions that the data have to
comply with, study Tredoux and Durrheim (2013) from pages 142 to 147. This will also help
you understand the principles underlying hypothesis testing.
IOP2601/MO001/4/2016
12.2
Follow the reasoning about sampling distributions of differences between means. (Do you
remember what a sampling distribution is? If not, refresh your memory by going through the
relevant section in learning unit 11 again. As you will see when you read the introductory
part of learning unit 11, we are now dealing with the sampling distribution of the difference
between means.)
1.
2.
3.
4.
False
1. The distribution of the differences between means over repeated sampling from the
   same population(s).
2. True
3. True
4. True
12.3
Work through the example which follows and make sure that you know how this test statistic
(in this case the t-test for independent groups) is computed. The t-test is done according to
the nine steps set out in learning unit 11. Did you remember to write down the steps on a
piece of cardboard and stick it above your desk? If not, do so now before you work through
the example. You need to be comfortable with the logical flow of the steps and must be able
to follow the steps in the exam so that you can use the relevant test statistic and solve the
problem statement statistically. These nine steps can be used in the same way for other test
statistics (like the t-test for related groups and the F-test), which is another good reason for
familiarising yourself with the different steps now.
A researcher wants to determine whether there is a significant difference between the overall
entertainment rating allocated to New Stars by male and female viewers. She gets the scores
of 20 randomly selected viewers and decides to test significance at the 5% and 1% levels.
Data
Male viewers
Female viewers
X²
Y²
16
25
25
36
16
16
49
25
36
25
Male viewers:    ΣX = 37    ΣX² = 181    X̄ = 3,7    s²x = 4,88    N = 10
Female viewers:  ΣY = 33    ΣY² = 131    Ȳ = 3,3    s²y = 2,47    N = 10
df = N1 + N2 - 2 = 18
t0.05 (18) = 2,1009
t0.01 (18) = 2,8784
For α = 0,05: do not reject H0
For α = 0,01: do not reject H0
(Remember: It is unscientific and wrong to say that we accept the null hypothesis.)
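The computation in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name pooled_t is hypothetical, and the sketch assumes the pooled-variance form of the t-test for independent groups (with equal group sizes, as here, it gives the same result as the unpooled form). The summary values and critical values are the ones given in the example.

```python
import math

def pooled_t(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """t statistic for two independent groups, using the pooled variance."""
    df = n_x + n_y - 2
    pooled_var = ((n_x - 1) * var_x + (n_y - 1) * var_y) / df
    std_error = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (mean_x - mean_y) / std_error, df

# Summary values from the worked example (male viewers = X, female viewers = Y).
t_value, df = pooled_t(3.7, 4.88, 10, 3.3, 2.47, 10)

# Two-tailed critical values for df = 18, as read from the t-table.
critical = {0.05: 2.1009, 0.01: 2.8784}
for alpha, t_crit in critical.items():
    decision = "reject H0" if abs(t_value) > t_crit else "do not reject H0"
    print(f"alpha = {alpha}: t({df}) = {t_value:.2f}, critical = {t_crit} -> {decision}")
```

The computed t-value is smaller than both critical values, which agrees with the decision not to reject H0 at either level.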
12.4
To familiarise yourself with the various facets of the use of the t-test, it will help you to do an
exercise on your own. Do Exercise 1 on page 158 in Tredoux and Durrheim (2013). Follow
the steps of hypothesis testing, which you should know quite well by now.
• Follow all the steps in the process, including the formulation of null and alternative
  hypotheses and the decision on a one-tailed or two-tailed test.
• Use your pocket calculator to complete the steps in the computation of the test statistic
  and determine the degrees of freedom.
• Look up the critical values in the table and apply the decision-making rules. Use a
  significance level of 0,05.
• Write down your interpretation concerning the rejection or non-rejection of the null
  hypothesis.
You have learnt quite a few new skills and at this stage you should be able to
x
x
Go through the example below and make sure that you can
identify and interpret the t-value on the printout.
Exercise 2
Data Set 8
For the independent samples t-test, the scores for both groups should be entered in a single
variable. A dummy variable should then be entered as a means of identifying each group. I
called the dependent variable laugh, and the independent variable parent. Two-parent
families were labelled 1 and one-parent families were labelled 2.
Independent samples t-test
Select the Independent Samples T-Test option under Compare means on the
Analyse menu. In the dialog box, select the test variable (DV) and the grouping variable (IV).
Define the group variable by specifying the lowest and highest value of the dummy variable;
in our case the lowest value is 1 (two-parent families) and the highest is 2 (one-parent
families). The outputs are given below:
As you know by now, equal variances play a role in a t-test. In the above printout, the first
test (Levene's test) tests the assumption of equal variances. If it is significant (its significance
value is less than 0,05 or 0,01), we read the second row of data. In this case it is not, so we
read the first row. In the column titled sig (two-tailed) we again look to see whether the value
is less than 0,05 or 0,01. In this case it is less than 0,05, so the t-test is significant, showing a
statistically significant difference.
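The reading rule described above can be written out as a small decision sketch. The function names and the two significance values below are hypothetical and are not taken from the printout; the row labels are the ones SPSS normally prints for the first and second rows.

```python
def row_to_read(levene_sig, alpha=0.05):
    """Pick the printout row from the significance of Levene's test."""
    if levene_sig < alpha:
        return "equal variances not assumed"  # second row of the printout
    return "equal variances assumed"          # first row of the printout

def is_significant(sig_two_tailed, alpha=0.05):
    """The t-test is significant when sig (two-tailed) is below alpha."""
    return sig_two_tailed < alpha

# Hypothetical values for illustration only.
levene_sig = 0.40
sig_two_tailed = 0.03

print(row_to_read(levene_sig))          # which row to read
print(is_significant(sig_two_tailed))   # significant at the 0,05 level?
```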
12.5
There is not a great deal to learn in this section. Study the first paragraph on page 391 in
Tredoux and Durrheim (2013).
Now that you have completed this section you should be able to name the nonparametric
equivalent of the t-test for two independent samples. Remember, you need not be able to
describe it or know the formula.
That is the end of the discussion on t-tests for two independent samples. Now we are going to
talk about hypothesis tests when dealing with two related samples.
12.6
Do you still remember when we classify samples as independent? The exact opposite
applies to related samples.
First make a list of the characteristics of independent samples. Then complete the table for
related samples. (Read the introduction on page 152 in Tredoux and Durrheim (2013) to help
you with this activity.)
1.
Independent samples
x
x
Related samples
2.
3.
4.
5.
1.
2.
3.
4.
5.
False
Independent samples are samples that are included in only one of the two samples in
an experiment and related samples are samples that are dependent.
true
true
false
false
Let's explain in greater detail what is meant by related samples, for you have to be able to tell
when a research problem or a research hypothesis involves related or independent samples.
We say that two groups of people or participants are matched according to a variable if for
each member allocated to one group, the other group is assigned a member who
corresponds with the member of group one in respect of the particular variable. Thus
participant pairs are selected in such a way that the two members of each pair are as similar
as possible in respect of the relevant variable(s). Variables like intelligence, age, height, eye
colour and gender can be used to match persons or participants.
To sum up: the only ways of obtaining related sample scores are
• by matching
• when one participant contributes two scores
12.7
A practical example always provides a useful learning experience, so it would be a good idea
first to go through "Creating the variable D" on page 152 in Tredoux and Durrheim (2013) as
an example of how to compute the t-test. If you are not sure how to compute the various
steps of the formula, revise the formulas for the mean and the variance in learning unit 5 and learning unit 6
respectively. Tredoux and Durrheim (2013) provide the computed values of D in Table 9.2 on
page 153.
Complete the following:
1.
A difference score is
.....
.....
.....
2.
Check the computation of the difference scores in the example. Can you see when the
scores acquire positive or negative values?
.....
.....
.....
3.
Check the formula of the t-test for related samples in the example.
.....
.....
.....
6.
The answers are on page 153 of Tredoux and Durrheim (2013). Now that you have carefully
read the detailed steps, let us see how well you understand them.
12.8
Before you do an exercise on your own, work through the complete example which follows so
that you can see what we expect you to do in the different steps.
What follows is a fully worked-out practical example of how to do the t-test for related groups.
Work through the example and check all the calculations. Make sure that you understand all
the steps in the process.
Data
Before   After
4        5
3        8
7        6
6        6
5        6
4        4
5        7
4        7
3        6
2        4
For the t-test computation, we first have to compute the difference scores (D), the mean for
D, variance and standard deviation for D scores. We do this by completing the table as
follows:
Before   After   D      D²
4        5       -1     1
3        8       -5     25
7        6       1      1
6        6       0      0
5        6       -1     1
4        4       0      0
5        7       -2     4
4        7       -3     9
3        6       -3     9
2        4       -2     4
                 ΣD = -16   ΣD² = 54
The data in column D (difference scores) is a set of X-scores. Therefore, use the formulas for
mean, variance and standard deviation you already know; just substitute D for X.
The mean (refer to learning unit 5) = -1,6.
The variance (refer to learning unit 6) = 3,16, therefore the standard deviation = 1,78.
Once you have all the values, substitute the t-test formula like this:
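As a cross-check on the hand calculation, the substitution can be sketched in Python. This is a minimal sketch that assumes the usual formula for the t-test for related groups, t = (mean of D) / (standard deviation of D / √N) with df = N - 1; the variable names are illustrative only. Because the sketch does not round the standard deviation to 1,78 first, its t-value may differ slightly in the second decimal from a hand calculation.

```python
import math
import statistics

before = [4, 3, 7, 6, 5, 4, 5, 4, 3, 2]
after = [5, 8, 6, 6, 6, 4, 7, 7, 6, 4]

# Difference scores, D = Before - After, as in the table.
d = [b - a for b, a in zip(before, after)]

n = len(d)
mean_d = statistics.mean(d)      # mean of the D scores
var_d = statistics.variance(d)   # sample variance (N - 1 in the denominator)
sd_d = statistics.stdev(d)       # sample standard deviation

# Assumed formula: t = mean of D divided by the standard error of D.
t_value = mean_d / (sd_d / math.sqrt(n))
df = n - 1

print(f"mean = {mean_d:.2f}, variance = {var_d:.2f}, sd = {sd_d:.2f}")
print(f"t({df}) = {t_value:.2f}")
```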
For α = 0,05
For α = 0,01
12.9
Check the decision-making rules explained in step 8 in this learning unit and follow the
argument for rejecting or not rejecting the null hypothesis.
Once you have done this, do the following exercises.
1.
Test your ability to find the following critical values in the relevant table:
1.1
1.2
1.3
1.4
1.5
2.
1.1  2,4620
1.2  1,7459
1.3  2,8965
1.4  2,4620
1.5  1,7139
2.1  no
2.2  no
2.3  yes
You have now gone carefully through detailed examples of hypothesis tests with two
related samples. You should be able to
x
x
x
x
Go through the example below and make sure that you can
identify and interpret the t-value on the printout.
Data Set 7
Enter the data as illustrated alongside.
Related samples t-test
Under Compare means on the Analyse menu, select the Paired Samples T-Test
option. The dialog box is shown below. You should select both sets of observations, and
enter them into the Paired Variables window. Click OK to run the procedure.
12.10
For additional exercise, do Exercises 3 and 4 on pages 158 and 159 in Tredoux and
Durrheim (2013).
Exercise 3
In this study, our design is to take two measurements from each subject; in other words,
repeated measures. We cannot get two measurements from each person, as some
individuals are unavailable at the time of the second measurement, so we cope with this
situation by using casewise deletion: we simply ignore the data from a person if we don't
have two measurements from them. Our table of data looks like this:
Subject   Time 1   Time 2
1         13       9
2         12       10
3         16       NA
4         14       NA
5         13       10
6         15       NA
7         17       11
8         13       10
9         14       NA
10        16       17
11        13       9
12        16       8
13        13       NA
14        19       16
15        12       NA
By excluding the subjects from whom we don't have two measurements (those are subjects 3,
4, 6, 9, 13 and 15), our table now looks like this:
Subject   Time 1   Time 2
1         13       9
2         12       10
5         13       10
7         17       11
8         13       10
10        16       17
11        13       9
12        16       8
14        19       16
We shall be working from the data provided in the above table, not the data in the original
table.
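Casewise deletion as described above can be sketched in Python as follows. The data are the fifteen rows of the original table; representing a missing measurement (NA) as None is an assumption made for this sketch.

```python
# (subject, time 1, time 2); None marks a missing measurement (NA in the table).
raw = [
    (1, 13, 9), (2, 12, 10), (3, 16, None), (4, 14, None), (5, 13, 10),
    (6, 15, None), (7, 17, 11), (8, 13, 10), (9, 14, None), (10, 16, 17),
    (11, 13, 9), (12, 16, 8), (13, 13, None), (14, 19, 16), (15, 12, None),
]

# Keep only subjects who have both measurements.
complete = [row for row in raw if row[1] is not None and row[2] is not None]

# Subjects that were dropped because a measurement is missing.
dropped = [row[0] for row in raw if row[1] is None or row[2] is None]

print("Dropped subjects:", dropped)
print("Remaining N:", len(complete))
```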
As this is a repeated measures design, we need to create variable D first of all. Do this by
subtracting time 1 from time 2 for each subject (i.e. D = Time 2 - Time 1). The data for D look
like this:
Subject   Time 1   Time 2   D
1         13       9        -4
2         12       10       -2
5         13       10       -3
7         17       11       -6
8         13       10       -3
10        16       17       1
11        13       9        -4
12        16       8        -8
14        19       16       -3
From now on, we shall not be using the data from Time 1 or Time 2; only the data from D.
We begin by working out a mean, standard deviation and N for D.
D̄ = -3,556
s = 2,506
N = 9
Now we can calculate t. The formula is:
t = D̄ / (s / √N)
Putting our values into the equation we get:
t = -3,556 / (2,506 / √9) = -4,257
Now we need to determine whether the difference is statistically significant. To do this we
need the degrees of freedom.
df = N - 1
Which are
df = 9 - 1
df = 8
Now we determine whether it is a one-tailed or two-tailed test. According to the research
hypothesis, we are only interested in seeing if the substantia nigra has become smaller, so
this is a one-tailed test. Our alternative hypothesis is thus:
H1: μTime2 < μTime1
We shall use the standard alpha value of 0,05.
We now use this information to look up a critical t-value on our t-table. The critical value for
df = 8, one-tailed, alpha = 0,05 is 1,860. Our t-value is -4,257 [we ignore the (-) sign for this
comparison].
Our value is greater than the critical value, so we reject the null hypothesis, and accept the
alternative. It seems that the average diameter of substantia nigra in psychotic patients
becomes smaller after a period of time.
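The one-tailed decision in this exercise can be checked with a short Python sketch. It assumes the complete cases that remain after casewise deletion and the critical value 1,860 quoted above; the variable names are illustrative. Instead of ignoring the sign, the sketch tests the direction explicitly: with D = Time 2 - Time 1 and H1: Time 2 < Time 1, only a sufficiently negative t supports H1.

```python
import math
import statistics

# Complete cases only (subjects 1, 2, 5, 7, 8, 10, 11, 12 and 14).
time1 = [13, 12, 13, 17, 13, 16, 13, 16, 19]
time2 = [9, 10, 10, 11, 10, 17, 9, 8, 16]

d = [t2 - t1 for t1, t2 in zip(time1, time2)]  # D = Time 2 - Time 1

n = len(d)
mean_d = statistics.mean(d)
sd_d = statistics.stdev(d)  # sample standard deviation (N - 1)
t_value = mean_d / (sd_d / math.sqrt(n))
df = n - 1

t_critical = 1.860  # one-tailed, alpha = 0,05, df = 8 (from the t-table)

# One-tailed test in the direction of H1: reject H0 only if t is below -t_critical.
reject_h0 = t_value < -t_critical
print(f"t({df}) = {t_value:.3f}, reject H0: {reject_h0}")
```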
Exercise 4
Here, we are using two measurements from the one basketball team, but this is not a
repeated measures design. The reason is that the conditions were not the same each time:
we are looking at the effect of the coach, and not of the team, so the two sets of data are
actually independent.
Although we have data missing (fewer games played with the first coach than with the
second), we do not need to worry about this (it is only a thing to worry about if you are using a
repeated measures design).
To begin, we must set up a null hypothesis. For an independent samples t-test, this is:
H0: μcoach2 = μcoach1
The first step is to determine the mean, N, and variance for each group. These are:
s²coach1 = 24,214
s²coach2 = 732,62
X̄coach1 = 86,25
X̄coach2 = 79,9167
Ncoach1 = 8
Ncoach2 = 12
Now we can insert these values into our independent samples t-test formula. The formula is:
We already have all those values, so we can insert them and calculate:
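The calculation can be reconstructed as the worked equation below. It is a sketch that assumes the pooled-variance form of the independent samples t-test and takes the difference as coach 1 minus coach 2; under these assumptions the result agrees with the t-value of 0,65 reported in this exercise. The subscripts 1 and 2 stand for coach 1 and coach 2.

```latex
\begin{align*}
s_p^2 &= \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}
       = \frac{7(24{,}214) + 11(732{,}62)}{18} \approx 457{,}13 \\[4pt]
t &= \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}
   = \frac{86{,}25 - 79{,}9167}{\sqrt{457{,}13\left(\dfrac{1}{8} + \dfrac{1}{12}\right)}}
   \approx \frac{6{,}33}{9{,}76} \approx 0{,}65
\end{align*}
```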
Now that we have a t-value, we must determine whether it is statistically significant. To do
this, we need the degrees of freedom: df = Ncoach1 + Ncoach2 - 2 = 8 + 12 - 2 = 18.
So we shall use 18 degrees of freedom.
Now, is this a one-tailed or a two-tailed test? According to the research hypothesis, we are
interested in whether the team performed differently under the new coach; in other words,
we are interested in either positive or negative change, so this is a two-tailed test. From this
we can derive our alternative hypothesis, namely:
H1: μcoach2 ≠ μcoach1
We shall use the standard alpha value of 0,05.
We now use this information to look up a critical t-value on our t-table. The critical value for df
= 18, two-tailed, alpha = 0,05 is 2,1009. Our t-value is 0,65.
Our value is less than the critical value, so we cannot reject the null hypothesis. The average
scores under the two coaches do not differ significantly, so it seems that the team is
performing the same under the new coach as it did under the old one.
12.11
Consider the following research question from the New Stars programme:
Do male voters give higher scores to female participants than to male participants?
Indicate which analysis technique would be the most appropriate to answer this research
question and explain/substantiate your answer.
In this case the gender of the voters is merely a selection variable (the scores of female
voters will not be considered in this research question). For the male voters, one would
consider the scores that they gave to the participants. Then the scores of the two gender
groups of the participants will be considered to evaluate whether there is a statistically
significant difference in the mean scores of the two groups. This implies a t-test (comparing
two groups) and it would be the t-test for independent groups (as the male and female
participants are unrelated/independent groups).