Skittle
Emily ------
8 January 2018
This project involves collecting Skittles and computing statistics on the proportion of each
color, as well as the number of candies in each bag. Everyone in the class was assigned to
purchase and bring to class a standard 2.17-ounce bag of Original Skittles. Each participant
then counted and recorded the total number of Skittles and the number of each color. We
then gathered into groups to form samples. The purpose of this assignment was to use real
data to practice creating confidence intervals and performing hypothesis tests.
[Pie chart and table of the sample's color counts and proportions; legend: Red, Orange, Yellow, Green, Purple]
Count    Proportion
55       0.155
71       0.201
65       0.184
79       0.223
84       0.237
These numbers are about what I expected. The counts differed from color to color, but
overall the proportions are similar. My single bag of Skittles had similar proportions to the
sample. My most frequent color was orange and my least frequent color was purple. The sample,
however, had yellow as the most frequent color and red as the least frequent color. Even so,
both my bag and the sample had proportions around 0.2 for each color.
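The proportions in the table can be reproduced from the counts. The following is a minimal sketch in Python rather than the method used in class; the variable names are illustrative, and the counts are left unlabeled because only red (55) and yellow (84) are tied to a color elsewhere in this report.

```python
# Color counts for the pooled sample, in the order they appear in the table.
# Only red (55) and yellow (84) are matched to a color elsewhere in the report,
# so the counts are deliberately left unlabeled here.
counts = [55, 71, 65, 79, 84]

total = sum(counts)  # total number of candies in the sample
proportions = [round(count / total, 3) for count in counts]

print("n =", total)
for count, proportion in zip(counts, proportions):
    print(f"{count:>3}  {proportion:.3f}")
```

Each proportion is simply a color's count divided by the total number of candies in the sample.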
Organizing and Displaying Quantitative Data: the Number of Candies per Bag
Next, the entire class recorded the total number of Skittles in each 2.17-ounce bag. The
Frequency Table
Total number of Skittles per bag    Frequency
63                                  1
62                                  1
61                                  3
60                                  6
59                                  5
58                                  3
57                                  5
56                                  1
55                                  2
54                                  1

5-number summary
Min    54
Q1     57
Med    59
Q3     60
Max    63

Mean            58.643
Standard Dev    2.181

[Box and Whisker Plot]
The distribution is fairly normal. It is slightly skewed to the left (the mean, 58.643, is
just below the median, 59), but the skew is negligible. I do not believe the numbers in the
frequency table are correct. The data were collected by one person who wrote the numbers on
the board as students called them out. I had four people in my sample group who had 58
Skittles in their bags. However, the table records only three bags of 58 Skittles for the
whole class. Unfortunately, I don’t know exactly what the true class data are. However, I
still believe that the distribution is normal, and these are the only numbers available.
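The summary statistics can be recomputed directly from the frequency table. This is a sketch under one assumption: quartiles are taken as the medians of the lower and upper halves of the sorted data, which is a common calculator convention but is not stated in this report.

```python
import statistics

# Frequency table from the report: candies per bag -> number of bags.
frequency = {63: 1, 62: 1, 61: 3, 60: 6, 59: 5, 58: 3, 57: 5, 56: 1, 55: 2, 54: 1}

# Expand the table into one sorted value per bag.
bags = sorted(value for value, freq in frequency.items() for _ in range(freq))

n = len(bags)
mean = statistics.mean(bags)
sample_sd = statistics.stdev(bags)  # sample standard deviation (divides by n - 1)

# Quartiles as medians of the lower and upper halves (assumed convention).
half = n // 2
lower_half = bags[:half]
upper_half = bags[half + (n % 2):]
five_number = (
    bags[0],
    statistics.median(lower_half),
    statistics.median(bags),
    statistics.median(upper_half),
    bags[-1],
)

print(f"n = {n}, mean = {mean:.3f}, sd = {sample_sd:.3f}")
print("min, Q1, med, Q3, max =", five_number)
```

The 28 tabulated bags reproduce the mean, standard deviation, and 5-number summary listed above, so the summary is at least consistent with the frequency table, whatever errors were made when the counts were called out.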
Reflection
Two types of data are categorical and quantitative data. Categorical data is made up
of categories, or groups, that cannot be numbered. Quantitative data is data that can be
numbered. In this assignment, color is the categorical data. For example, it makes no sense to
say that the average color is red-orange. Although the number of Skittles of each color can be
counted (how many reds, how many oranges, etc.), the colors themselves cannot be numbered. It
wouldn’t make sense to make a box plot for the proportions of each color. It makes much more
sense to use a pie chart, like we did above, because pie charts compare categories. The only
calculations that make sense for categorical data are counts and proportions. The quantitative
variable we worked with above was the number of candies in each bag. It makes more sense to put
this kind of data in a box and whisker plot because you can make a 5-number summary with
numbers. This is not possible with colors because you can’t average two colors together.
Although we could use a pie chart for our
In statistics, a confidence interval is the range within which we can be reasonably confident
the true value lies. Confidence intervals help show how close the estimate is to the actual number and
For our data, utilizing my calculator’s 1-Proportion Z-Interval (x: 84, n: 354,
C-Level: 0.99), a 99% confidence interval for the true proportion of yellow candies is
(0.179, 0.296). We can be 99% confident that the true proportion of yellow candies is between
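The interval for yellow candies can be checked by hand with the usual one-proportion z-interval formula, p̂ ± z*·sqrt(p̂(1 − p̂)/n). The function below is an illustrative sketch (its name and parameters are not from any calculator or library) that uses Python's standard-library normal distribution for z*.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence):
    """Return (low, high) for a one-proportion z-interval."""
    p_hat = x / n
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(x=84, n=354, confidence=0.99)
print(f"({low:.3f}, {high:.3f})")  # prints (0.179, 0.296)
```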
For our data, utilizing my calculator’s T-Interval (x̄: 58.643, sx: 2.181, n: 28, C-Level:
0.95), a 95% confidence interval for the true mean number of candies per bag is (57.797,
59.489). We can be 95% confident that the true mean number of candies per bag is between
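The t-interval follows the formula x̄ ± t*·sx/sqrt(n). The sketch below recomputes it from the summary statistics. Python's standard library has no t-distribution, so the critical value for 27 degrees of freedom is hard-coded from a t-table; that constant, like the variable names, is an assumption of this sketch.

```python
from math import sqrt

# Summary statistics reported for the 28 bags.
x_bar, s_x, n = 58.643, 2.181, 28

# Two-sided 95% critical value for df = n - 1 = 27, read from a t-table.
# Hard-coded because the standard library has no t-distribution.
T_STAR_95_DF27 = 2.052

standard_error = s_x / sqrt(n)
margin = T_STAR_95_DF27 * standard_error
t_low, t_high = x_bar - margin, x_bar + margin

print(f"margin of error = {margin:.3f}")
print(f"({t_low:.3f}, {t_high:.3f})")
```

Because the interval is built as the sample mean plus or minus one margin of error, its midpoint must equal the sample mean, which is a quick sanity check on any reported t-interval.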
By looking at these data, we can see that the interval for yellow candies is quite large. This
is because the percent confidence is high (99%). The interval for the average number of candies
per bag is smaller. This is because the percent confidence is lower (95%).
Hypothesis Tests
The purpose of a hypothesis test is to check a claimed or previously reported value against
sample data. For example, because there are 5 Skittle colors, it would be reasonable to believe that each color is
I used a .05 significance level to test the claim that 20% of all Skittles candies are red.
Utilizing my calculator’s 1-Proportion Z Test, (p0: .2, x: 55, n: 354, prop ≠ p0), I found that
p = .036. This means that if the true proportion of red candies is .2, the chance of getting a
sample at least as far from that proportion as 55 red candies out of 354 is .036. Remember my
significance level is 0.05. Because the p-value (.036) is less than .05, we reject the null
hypothesis. There is enough evidence that the true
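The test for red candies can be reproduced with the one-proportion z-test formula, z = (p̂ − p0)/sqrt(p0(1 − p0)/n), with a two-sided p-value. The function name and parameters below are illustrative, not taken from the calculator.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(x, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0."""
    p_hat = x / n
    # The standard error uses p0 because the null hypothesis is assumed true.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


z, p_value = one_prop_z_test(x=55, n=354, p0=0.2)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

The p-value is the probability, assuming the null hypothesis is true, of a sample proportion at least this far from 0.2 in either direction.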
Next, I used a .01 significance level to test the claim that the mean number of candies in a
bag of Skittles is 55. Utilizing my calculator’s T-Test (µ0: 55, x̄: 58.643, sx: 2.181, n: 28,
µ ≠ µ0), I found p ≈ 0.000 (rounded to three decimal places). Because p < .01, we reject the
null. There is enough evidence
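The t statistic behind this test is t = (x̄ − µ0)/(sx/sqrt(n)). The sketch below takes 55 as the hypothesized mean and uses the sample mean and standard deviation from the frequency table (58.643 and 2.181). Since Python's standard library has no t-distribution, it compares the statistic with a critical value hard-coded from a t-table instead of computing a p-value; that constant and the variable names are assumptions of this sketch.

```python
from math import sqrt

mu_0 = 55                             # hypothesized mean candies per bag
x_bar, s_x, n = 58.643, 2.181, 28     # sample statistics for the 28 bags

t_stat = (x_bar - mu_0) / (s_x / sqrt(n))

# Two-sided critical value for alpha = 0.01 with df = 27, read from a t-table.
T_CRIT_01_DF27 = 2.771
reject_null = abs(t_stat) > T_CRIT_01_DF27

print(f"t = {t_stat:.3f}, reject H0 at the 0.01 level: {reject_null}")
```

A t statistic this far beyond the critical value corresponds to a p-value that rounds to 0.000.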
I had expected the difference between the sample proportion of red Skittles and 0.2 to be less
statistically significant. I was very surprised at the low p-value. I think this reveals that
the proportions of the colors in individual bags can differ greatly from the proportions in
the whole population. I was also surprised at how statistically significant the difference
Reflection
The conditions required for accurate hypothesis tests and confidence intervals include the
samples being sufficiently random and either a normal distribution or a large sample size. Both
sets of data collected, the proportion of each color and the number of Skittles per bag, had small
sample sizes. I think that if we were to be more precise with our records and had two classes
perform this activity combined, we would get more accurate results. I also noticed many
classmates brought in larger boxes of Skittles and then counted out a similar number of Skittles
to their neighbor because they couldn’t find a 2.17-ounce bag. This totally ruins the experiment
because the number, and potentially the color mix, of Skittles is no longer accurate. To improve the
experiment, we could make sure everyone bought an actual 2.17-ounce bag of Skittles. Maybe
each classmate could bring in two bags, to increase the sample size. I do not feel it is appropriate
to draw any conclusions from the numbers calculated in this paper, because the experimental