T Test
A t test is a statistical test that is used to compare the means of two groups. It is often used
in hypothesis testing to determine whether a process or treatment actually has an effect on the
population of interest, or whether two groups are different from one another.
t test example
You want to know whether the mean petal length of iris flowers differs according to their species.
You find two different species of irises growing in a garden and measure 25 petals of each species.
You can test the difference between these two groups using a t test and null and alternative
hypotheses.
The null hypothesis (H0) is that the true difference between these group means is zero.
The alternative hypothesis (Ha) is that the true difference is different from zero.
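In symbols, the two hypotheses can be sketched as follows. The names \mu_1 and \mu_2 are introduced here for illustration and stand for the true (population) mean petal lengths of the two species.

```latex
\[
H_0:\ \mu_1 - \mu_2 = 0
\qquad\qquad
H_a:\ \mu_1 - \mu_2 \neq 0
\]
```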
The t test is a parametric test of difference, meaning that it makes the same assumptions about your
data as other parametric tests. The t test assumes your data:
1. are independent
2. are (approximately) normally distributed
3. have a similar amount of variance within each group being compared (a.k.a. homogeneity of
variance)
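If your data do not fit these assumptions, you can try a nonparametric alternative to the t test, such
as the Wilcoxon rank-sum (Mann-Whitney U) test for two independent groups or the Wilcoxon signed-rank
test for paired data. If only the equal-variance assumption is violated, Welch's t test is the usual
choice.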
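One way to check these assumptions in R is sketched below. It is a minimal illustration, not a full diagnostic workflow. The data frame `flowers` is a stand-in built from two species in R's built-in `iris` data set, which is an assumption made so the example runs on its own. The final call uses `wilcox.test()` on two independent groups, which is the Wilcoxon rank-sum (Mann-Whitney) form of the test.

```r
# Stand-in data (an assumption): two species from R's built-in iris data set.
flowers <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# Normality within each group: Shapiro-Wilk test, one p-value per species.
normality_p <- tapply(flowers$Petal.Length, flowers$Species,
                      function(x) shapiro.test(x)$p.value)

# Homogeneity of variance: F test comparing the two group variances.
variance_check <- var.test(Petal.Length ~ Species, data = flowers)

# Rank-based alternative for two independent groups (Wilcoxon rank-sum).
# exact = FALSE uses the normal approximation, which tolerates tied values.
rank_test <- wilcox.test(Petal.Length ~ Species, data = flowers, exact = FALSE)

print(normality_p)
print(variance_check$p.value)
print(rank_test$p.value)
```

A small p-value from the Shapiro-Wilk or F test suggests that the corresponding assumption may not hold for that data set.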
If the groups come from a single population (e.g., measuring before and after an
experimental treatment), perform a paired t test. This is a within-subjects design.
If the groups come from two different populations (e.g., two different species, or people
from two separate cities), perform a two-sample t test (a.k.a. independent t test). This is
a between-subjects design.
If there is one group being compared against a standard value (e.g., comparing the acidity of
a liquid to a neutral pH of 7), perform a one-sample t test.
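In R, all three designs can be run with the same `t.test()` function by changing its arguments. The sketch below uses simulated, hypothetical measurements (the variable names and the numbers passed to `rnorm()` are made up for illustration) to show one call per design.

```r
set.seed(1)  # hypothetical data, for illustration only

# Paired t test: the same 10 subjects measured before and after a treatment.
before <- rnorm(10, mean = 50, sd = 5)
after  <- before + rnorm(10, mean = 2, sd = 1)
paired_result <- t.test(after, before, paired = TRUE)

# Two-sample t test: two unrelated groups (e.g., people from two cities).
city_a <- rnorm(12, mean = 170, sd = 8)
city_b <- rnorm(15, mean = 175, sd = 8)
two_sample_result <- t.test(city_a, city_b)

# One-sample t test: one group compared against a standard value (pH 7).
ph <- rnorm(8, mean = 6.8, sd = 0.2)
one_sample_result <- t.test(ph, mu = 7)

# Each result records which kind of test was run.
print(paired_result$method)
print(two_sample_result$method)
print(one_sample_result$method)
```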
One-tailed or two-tailed t test?
If you only care whether the two populations are different from one another, perform
a two-tailed t test.
If you want to know whether one population mean is greater than or less than the other,
perform a one-tailed t test.
In the iris example, your observations come from two separate populations (separate species), so you
perform a two-sample t test.
You don’t care about the direction of the difference, only whether there is a difference, so
you choose to use a two-tailed t test.
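In R's `t.test()`, the number of tails is chosen with the `alternative` argument. The sketch below assumes a stand-in data frame, `flowers`, built from two species in the built-in `iris` data set. With a formula, the difference is taken as the first factor level minus the second, here setosa minus virginica.

```r
# Stand-in data (an assumption): two species from R's built-in iris data set.
flowers <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# Two-tailed: is there any difference in mean petal length?
two_tailed <- t.test(Petal.Length ~ Species, data = flowers,
                     alternative = "two.sided")

# One-tailed: is the setosa mean less than the virginica mean?
one_tailed <- t.test(Petal.Length ~ Species, data = flowers,
                     alternative = "less")

print(two_tailed$p.value)
print(one_tailed$p.value)
```

When the observed difference lies in the hypothesized direction, the one-tailed p-value is half the two-tailed one, which is why the direction should be chosen before looking at the data.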
Performing a t test
The t test estimates the true difference between two group means using the ratio of the difference
in group means over the pooled standard error of both groups. You can calculate it manually using a
formula, or use statistical analysis software.
T test formula
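The formula for the two-sample t test (a.k.a. the Student’s t-test) is shown below.
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}
\]
In this formula, t is the t value, \(\bar{x}_1\) and \(\bar{x}_2\) are the means of the two groups being
compared, \(s^2\) is the pooled variance of the two groups (the whole denominator is the pooled standard
error), and \(n_1\) and \(n_2\) are the number of observations in each of the groups.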
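The pooled term s² is conventionally computed from the two group variances, weighted by their degrees of freedom. The symbols s_1^2 and s_2^2 are introduced here for the sample variances of the two groups. The resulting t value is compared against a t distribution with n_1 + n_2 − 2 degrees of freedom.

```latex
\[
s^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\mathrm{df} = n_1 + n_2 - 2
\]
```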
A larger absolute t value shows that the difference between group means is large relative to the
pooled standard error, which is stronger evidence of a difference between the groups.
You can compare your calculated t value against the values in a critical value chart
(e.g., Student’s t table) to determine whether your t value is greater than what would be expected
by chance. If so, you can reject the null hypothesis and conclude that the two groups are in fact
different.
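The manual calculation and the comparison against a critical value can be sketched in R as follows. The data are a stand-in taken from the built-in `iris` data set (an assumption). The variable names and the significance level of 0.05 are illustrative choices, and `qt()` plays the role of the printed t table.

```r
# Stand-in data (an assumption): two species from R's built-in iris data set.
flowers <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))
x1 <- flowers$Petal.Length[flowers$Species == "setosa"]
x2 <- flowers$Petal.Length[flowers$Species == "virginica"]
n1 <- length(x1)
n2 <- length(x2)

# Pooled variance: group variances weighted by their degrees of freedom.
pooled_var <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t value: difference in means divided by the pooled standard error.
t_value <- (mean(x1) - mean(x2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-tailed critical value at alpha = 0.05.
deg_free <- n1 + n2 - 2
t_critical <- qt(1 - 0.05 / 2, deg_free)

# Reject the null hypothesis if |t| exceeds the critical value.
reject_null <- abs(t_value) > t_critical

# Cross-check against the built-in pooled-variance (Student's) t test.
builtin <- t.test(x1, x2, var.equal = TRUE)

print(c(t = t_value, critical = t_critical))
print(reject_null)
```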
T Test Formula
The t test is any statistical hypothesis test in which the test statistic follows a Student’s
t-distribution under the null hypothesis. It can be used to determine if two sets of data are
significantly different from each other, and is most commonly applied when the test statistic would
follow a normal distribution if the value of a scaling term in the test statistic were known.
The t test uses the means and standard deviations of two samples to make a comparison. The formula
for the t test is given below:
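One common way to write a t statistic directly in terms of each sample's own mean and standard deviation is the unequal-variance (Welch) form. It is offered here as an illustrative assumption, since the text does not say which variant it means. The symbols s_1 and s_2 denote the two sample standard deviations, and n_1 and n_2 the sample sizes.

```latex
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\]
```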
T test function in statistical software
Most statistical software (R, SPSS, etc.) includes a t test function. This built-in function will take
your raw data and calculate the t value. It will then compare it to the critical value, and calculate
a p-value. This way you can quickly see whether your groups are statistically different.
In your comparison of flower petal lengths, you decide to perform your t test using R. The code
looks like this:
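A minimal sketch of such a call is shown here. It assumes the data frame has the columns `Petal.Length` and `Species`, as in the summary code later in this example. Because the original `flower.data` is not available, a stand-in is built from two species in R's built-in `iris` data set, so the numbers it prints will differ from the ones quoted in the text.

```r
# Stand-in for flower.data (an assumption): two species from the built-in
# iris data set, using the same column names.
flower.data <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# Two-sample, two-tailed t test of petal length by species.
# By default t.test() runs Welch's version, which does not assume
# equal variances; pass var.equal = TRUE for the pooled (Student's) test.
result <- t.test(Petal.Length ~ Species, data = flower.data)

# The printed table lists t, the degrees of freedom, the p-value,
# the 95% confidence interval, and the two group means.
print(result)
```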
From the output table, we can see that the difference in means for our sample data is
−4.084 (1.456 − 5.540), and the 95% confidence interval for the true difference in means runs
from −4.331 to −3.836. Because this interval does not include 0, the data are not consistent with a
true difference of zero. Our p value (p < 2.2e-16) is much smaller than 0.05, so we can reject the null
hypothesis of no difference and say with a high degree of confidence that the true difference in
means is not equal to zero.
You can also include the summary statistics for the groups being compared, namely the mean
and standard deviation. In R, the code for calculating the mean and the standard deviation from the
data looks like this:
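library(dplyr)  # provides %>%, group_by(), and summarize()

# Mean and standard deviation of petal length for each species
flower.data %>%
  group_by(Species) %>%
  summarize(mean_length = mean(Petal.Length),
            sd_length = sd(Petal.Length))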
In our example, you would report the results like this:
The difference in petal length between iris species 1 (M = 1.46; SD = 0.206) and iris species 2
(M = 5.54; SD = 0.569) was significant (t(30) = −33.7190; p < 2.2e-16).
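A report line of this kind can also be assembled from the test result in R. The sketch below is one possible approach, not a required format. It uses a stand-in `flower.data` built from the built-in `iris` data set (an assumption), so its numbers differ from those above. The wording of the sentence and the "p < .001" cutoff are illustrative choices.

```r
# Stand-in for flower.data (an assumption): two species from iris.
flower.data <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))
result <- t.test(Petal.Length ~ Species, data = flower.data)

# Group means and standard deviations for the report.
means <- tapply(flower.data$Petal.Length, flower.data$Species, mean)
sds   <- tapply(flower.data$Petal.Length, flower.data$Species, sd)

# Very small p-values are reported as an inequality rather than as 0.
p_text <- if (result$p.value < 0.001) {
  "p < .001"
} else {
  sprintf("p = %.3f", result$p.value)
}

report <- sprintf(
  "Species 1 (M = %.2f; SD = %.3f) vs. species 2 (M = %.2f; SD = %.3f): t(%.1f) = %.2f; %s",
  means[1], sds[1], means[2], sds[2],
  result$parameter, result$statistic, p_text
)
print(report)
```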