Paired T-Test: AKA Dependent Sample T-Test and Repeated Measures T-Test



Hypothesis testing

There are two types of statistical hypotheses.

Null hypothesis. The null hypothesis, denoted by H0, is usually the
hypothesis that sample observations result purely from chance.

Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is
the hypothesis that sample observations are influenced by some non-random
cause.

Can we accept the null hypothesis?

Acceptance vs. fail to reject

Decision

Two types of errors can result from a hypothesis test.

Type I error. A Type I error occurs when the researcher rejects a null
hypothesis when it is true. The probability of committing a Type I error is
called the significance level. This probability is also called alpha, and is
often denoted by α.

Type II error. A Type II error occurs when the researcher fails to reject a null
hypothesis that is false. The probability of committing a Type II error is
called beta, and is often denoted by β. The probability of not committing a
Type II error is called the power of the test.
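
The two error rates can be illustrated with a small simulation. The sketch below is not from the slides: it assumes normally distributed data with a known standard deviation of 1 and uses a two-sided z-test of H0: mean = 0, so that only the Python standard library is needed. The function name rejection_rate and all parameter values are hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=20, alpha=0.05, trials=5000, seed=0):
    """Fraction of simulated samples in which H0: mean = 0 is rejected.

    Assumption (illustration only): data are normal with known sigma = 1,
    so a two-sided z-test with a standard normal critical value is used.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # z = (sample mean - 0) / (sigma / sqrt(n)) with sigma = 1
        z = (sum(sample) / n) * math.sqrt(n)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate should be near alpha.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false: rejections are correct decisions (power); the rest are Type II errors.
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power

print(f"Type I rate: {type_i_rate:.3f}, power: {power:.3f}, Type II rate: {type_ii_rate:.3f}")
```

With H0 true, the simulated rejection rate should land close to the chosen alpha; with H0 false, the rejection rate estimates the power and its complement estimates beta.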

Decision (cont.)

                     H0 is true          H0 is false

Reject H0            Type I error        Correct rejection

Fail to reject H0    Correct decision    Type II error

Power of a test

The probability of not committing a Type II error is called the power of
a hypothesis test.

The power of a statistical test is the probability of rejecting the null
hypothesis when the null hypothesis is false (the right decision). Compare
this with the significance level (alpha) of a test, which is the
probability of rejecting the null hypothesis when the null hypothesis is
actually true (the wrong decision). Thus, power is the ability of a test to
correctly reject the null hypothesis.

Power of a test (cont.)


The power of a hypothesis test is affected by three factors.

Sample size (n). Other things being equal, the greater the sample size, the
greater the power of the test.

Significance level (α). The higher the significance level, the higher the
power of the test. If you increase the significance level, you reduce the region
of acceptance. As a result, you are more likely to reject the null hypothesis.
This means you are less likely to fail to reject the null hypothesis when it is
false; i.e., less likely to make a Type II error. Hence, the power of the test
is increased, at the cost of a higher probability of a Type I error.

Effect size. The difference between the "true" value of the parameter being
tested and the value specified in the null hypothesis. The greater this
difference, the greater the power of the test. That is, the greater the
effect size, the greater the power of the test.
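
The effect of each of the three factors can be checked numerically. The sketch below is an approximation rather than the slides' method: it assumes a one-sided z-test with known standard deviation, for which power has the closed form 1 - Φ(z_crit - d·sqrt(n)), where d is the standardized effect size. The function name z_test_power and the specific numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha):
    """Approximate power of a one-sided z-test (known sigma assumed).

    effect_size is the standardized difference:
    (true mean - mean under H0) / sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - effect_size * math.sqrt(n))

# Vary one factor at a time against a common baseline.
baseline = z_test_power(effect_size=0.5, n=20, alpha=0.05)
larger_sample = z_test_power(effect_size=0.5, n=40, alpha=0.05)
higher_alpha = z_test_power(effect_size=0.5, n=20, alpha=0.10)
larger_effect = z_test_power(effect_size=0.8, n=20, alpha=0.05)

for label, value in [("baseline", baseline), ("n = 40", larger_sample),
                     ("alpha = 0.10", higher_alpha), ("effect = 0.8", larger_effect)]:
    print(f"{label:>14}: power = {value:.3f}")
```

Each of the three variations yields a higher power than the baseline, matching the three statements above. A t-test would give slightly different numbers because its critical value depends on the degrees of freedom.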

When to use paired t-test?

In many research designs, it is helpful to measure the same people more
than once.

A common example is testing for performance improvements (or
decrements) over time.

However, in any circumstance where multiple measurements are made
on the same person (or experimental unit), it may be useful to
observe if there are mean differences between these measurements.

The paired t-test will show whether the differences observed in the
2 measures will be found reliably in repeated samples.

When to use? (cont.)

Hypothesis testing

Drawing conclusions about differences between groups

Are differences likely due to chance?

Independent vs. dependent t-test

Dependent sample t-test

The same people are tested on two different occasions.

For example, when you are interested in changes of scores after an
intervention for participants tested at Time 1 and then again at Time 2.

Non-parametric equivalent: Wilcoxon signed-rank test.

Independent t-test

Two different groups of people are tested on one occasion.

For example, you are interested in comparing the anxiety level between males
and females.

Non-parametric equivalent: Mann-Whitney U test.
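
To see why the choice between the two tests matters, the sketch below computes both t statistics on the same numbers. The scores are invented for illustration, and the function names paired_t and independent_t are hypothetical. The independent version uses an unpooled standard error, which is one common form and is assumed here for simplicity.

```python
import math
from statistics import mean, variance

# Hypothetical scores for five participants (invented for illustration).
time1 = [10, 20, 30, 40, 50]
time2 = [12, 23, 33, 44, 53]

def paired_t(x, y):
    """t statistic for dependent samples: based on per-person differences."""
    diffs = [b - a for a, b in zip(x, y)]
    se = math.sqrt(variance(diffs) / len(diffs))
    return mean(diffs) / se

def independent_t(x, y):
    """t statistic treating x and y as unrelated groups (unpooled standard error)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se

t_paired = paired_t(time1, time2)
t_independent = independent_t(time1, time2)

print(f"paired t = {t_paired:.3f}, independent t = {t_independent:.3f}")
```

Because every participant improves by a similar amount, the differences vary little and the paired statistic is large. Treating the same numbers as two unrelated groups lets the large between-person spread swamp the 3-point mean difference.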

Independent vs. dependent t-test (cont.)

Question 1: A market researcher randomly divided 500 households into
2 equal groups of 250 households. Group 1 was interviewed by phone
using the current manual procedure. Group 2 received the same
interview, but computer-assisted interviewing was used. The researcher
was interested in estimating the difference in the mean interview times
for the 2 procedures (Mean 1 and Mean 2).

Question 2: An educational psychologist studied whether two
mathematics achievement tests (Test 1, Test 2) lead to different
achievement scores. Eight subjects were randomly selected and each
was given the 2 tests. The order of the 2 tests was independently
randomized for each subject.

Independent vs. dependent t-test (cont.)

Question 3: Another educational psychologist reported the results of a
study in which 7 pairs of children reading below grade level were
obtained by matching so that within each pair the 2 children were
equally deficient in reading ability. One child from each pair received a
new experimental training, while the other received the standard
training (the assignment of treatments within each pair was random).
The psychologist was interested in determining if the new treatment was
superior to the old.

Question 4: A population of 5 banana bunches will be shipped from
Central America to the USA. The banana company wants to estimate the
weight loss of the bunches during shipment. The pre- and post-shipping
weights of each bunch (in kilograms) are obtained.

Independent vs. dependent t-test (cont.)

Question 5: In a small-scale experiment to compare the pain-relieving
effectiveness of 2 new medications for arthritis sufferers, 27 volunteers
were divided at random into 2 groups of size 14 and 13. Group 1 received
Medication 1 and Group 2 received Medication 2. The parameters are Mean
1 (the true mean hours of pain relief of those taking Medication 1) and
Mean 2 (the true mean hours of pain relief of those taking Medication 2).

Question 6: A researcher conducts a study in which participants are
asked to administer electric shocks to a participant in what they believe to
be a memory experiment, but which in fact is a study of obedience. In a
number of different conditions the proximity of the researcher to the
participant is varied at 0.5m, 1m, 2m and 5m to see if this has an impact
on participants' willingness to administer shocks of different intensity.

Independent vs. dependent t-test (cont.)

Question 7: An educational psychologist administered a number of
ability measures to 120 14-year-olds. The measures included an IQ test,
and tests of mathematics, spelling, and general verbal ability as well as
a test of the amount of time it took the students to learn material from
the new curriculum. The psychologist was interested in determining if
the IQ test is related to the amount of time it took the students to learn
material from the new curriculum.

Assumptions

Level of measurement

The dependent variable should be measured on a continuous scale (interval
or ratio level). The independent variable should consist of two categorical,
related groups.

Random sampling

The scores are randomly obtained from the parent population.

Independence of observations

The observations that make up your data must be independent of one another.
In a paired design this applies across pairs: one participant's pair of
scores should not influence another's.

Assumptions (cont.)

Outliers

There should be no significant outliers in the differences between the two
related groups as outliers drastically reduce the validity of the results.

Normal distribution

The differences in the dependent variable between the two related groups
should be approximately normally distributed. With a sample size of 30+,
violation of this assumption is unlikely to cause any serious problem.
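
One way to screen the differences for outliers before running the test is sketched below. The slides do not say how outliers should be identified. The 1.5 x IQR fence used here is a common boxplot convention, assumed for illustration. The data and the function name flag_outliers are hypothetical.

```python
from statistics import quantiles

def flag_outliers(diffs, k=1.5):
    """Return the differences lying outside the k * IQR fences.

    Assumption: the boxplot-style fence rule (k = 1.5) is used as the
    outlier criterion; other criteria are equally possible.
    """
    q1, _, q3 = quantiles(diffs, n=4)  # quartiles of the differences
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [d for d in diffs if d < lower or d > upper]

# Hypothetical after-minus-before differences for eight participants.
differences = [2, 3, 3, 4, 3, 2, 4, 15]
outliers = flag_outliers(differences)

print("flagged differences:", outliers)
```

Note that the check is applied to the differences, not to the two sets of raw scores, because the assumptions above concern the differences.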

Example
Suppose a sample of n students were given a diagnostic test before
studying a particular module and then again after completing the module.
We want to find out if, in general, our teaching leads to improvements in
students' knowledge/skills (i.e. test scores). We can use the results from
our sample of students to draw conclusions about the impact of this
module in general.
Let x = test score before the module, y = test score after the module.
We will need to test the null hypothesis that the true mean difference
(y - x) is zero.
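
The calculation for this example can be sketched as follows. The scores below are invented for illustration (n = 8) and are not data from the slides. The statistic is the usual paired t: the mean of the differences divided by its standard error, with n - 1 degrees of freedom. The critical value 2.365 is the standard two-sided table value for alpha = 0.05 with 7 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical test scores for n = 8 students (invented for illustration).
before = [18, 21, 16, 22, 19, 24, 17, 20]   # x: score before the module
after = [22, 25, 17, 24, 16, 29, 20, 23]    # y: score after the module

diffs = [y - x for x, y in zip(before, after)]  # d = y - x for each student
n = len(diffs)
d_bar = mean(diffs)                  # sample mean difference
s_d = stdev(diffs)                   # sample standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n))
df = n - 1

# Two-sided critical value for alpha = 0.05 and df = 7, taken from a t table.
T_CRIT = 2.365
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean difference = {d_bar:.3f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

For these made-up scores the t statistic is about 2.68, which exceeds the critical value, so the null hypothesis of zero mean difference would be rejected at the 5% level.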

Example (cont.)
