CH 1 Notes
Bar Graphs:
Pie Charts:
(Area Principle)
Definitions:
Marginal Distribution/Relative Frequency:
The distribution of values of a categorical variable in a two-way table among all individuals described by the
table.
Conditional Distribution:
Describes the values of a variable among the individuals who have a specific value of another variable. There is a
separate conditional distribution for each value of the other variable.
Association:
An association exists if knowing the value of one variable helps predict the value of the other. If knowing the
value of one variable does not help you predict the value of the other, there is no association between the
variables.
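The three definitions above can be sketched in a few lines of Python. This is only an illustration: the counts in `table` are hypothetical (invented for this sketch, not class data), and the function names are our own. If the conditional distributions differ noticeably from group to group, that is evidence of an association.

```python
# Hypothetical counts, invented for illustration only (NOT real class data).
# Rows are groups (one variable); columns are categories of the other variable.
table = {
    "Female": {"Fly": 4, "Invisibility": 6},
    "Male": {"Fly": 8, "Invisibility": 2},
}


def marginal_distribution(table):
    """Percent of ALL individuals in the table that fall in each column category."""
    grand_total = sum(sum(row.values()) for row in table.values())
    totals = {}
    for row in table.values():
        for category, count in row.items():
            totals[category] = totals.get(category, 0) + count
    return {category: 100 * n / grand_total for category, n in totals.items()}


def conditional_distribution(table, group):
    """Percent in each column category among individuals in one row group only."""
    row = table[group]
    row_total = sum(row.values())
    return {category: 100 * n / row_total for category, n in row.items()}


print("Marginal:", marginal_distribution(table))
for group in table:
    print(group, "conditional:", conditional_distribution(table, group))
```

With these made-up counts, 40% of females but 80% of males choose Fly, so knowing the group would help predict the choice. That is what an association looks like in a two-way table.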
Examples:
We will collect data regarding our gender and ideal superpower in the table below:
Table columns: Fly, Time Travel, Invisibility, Super Strength, Telepathy, Total
1. Use the two-way table to calculate the marginal distribution (in percents) of superpower preferences.
2. Make a graph to display the marginal distribution. Describe what you see.
4. Is there an association between gender and preference of superpower? Give appropriate evidence to support your
answer.
1.2 – Displaying Quantitative Data
We are now switching to QUANTITATIVE variables from categorical variables.
Dotplots
How good was the 2012 U.S. women’s soccer team? With players like Abby Wambach,
Megan Rapinoe, and Hope Solo, the team put on an impressive showing en route to
winning the gold medal at the 2012 Olympics in London. Here are data on the number of goals scored by the team in the
12 months prior to the 2012 Olympics.
Stemplots
Also called Stem-and-Leaf plots
How many pairs of shoes does a typical teenager have? To find out, a group of AP Statistics students conducted a survey.
They selected a random sample of 20 female students from their school. Then they recorded the number of pairs of
shoes that each respondent reported having. Here are the data:
50 26 26 31 57 19 24 22 23 38
13 50 13 34 23 30 59 13 15 51
Create a stemplot of the data.
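One way to check a hand-drawn stemplot is a short Python sketch like the one below. It uses the 20 shoe counts listed above; the function name `stemplot` is our own, and the tens digit is assumed to be the stem with the ones digit as the leaf.

```python
# Pairs of shoes reported by the 20 students (data from the notes above).
shoes = [50, 26, 26, 31, 57, 19, 24, 22, 23, 38,
         13, 50, 13, 34, 23, 30, 59, 13, 15, 51]


def stemplot(values):
    """Return the rows of a stemplot: stem = tens digit, leaf = ones digit."""
    stems = {}
    for v in sorted(values):  # sorting first puts the leaves in increasing order
        stems.setdefault(v // 10, []).append(v % 10)
    rows = []
    # Include every stem between the smallest and largest, even empty ones,
    # so that gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows


for row in stemplot(shoes):
    print(row)
print("Key: 1 | 3 = 13 pairs of shoes")
```

Note that the stem 4 is printed with no leaves. Skipping empty stems would hide the gap between the 30s and the 50s.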
Back-to-Back Stemplots
Used when we want to compare the distribution of the same quantitative variable for two groups.
Who is taller? Males or females? A sample of 14-year-olds from the UK
was randomly selected. Here are the heights of the students (in cm):
Male: 154, 157, 187, 163, 167, 159, 169, 162, 176, 177, 151, 175, 174,
165, 165, 183, 180
Female: 160, 160, 152, 167, 164, 163, 160, 163, 169, 157, 158, 153, 161,
165, 165, 159, 168, 153, 166, 158, 158, 166
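A back-to-back stemplot of the heights above could be sketched as follows. The function name and the layout choices (males on the left, females on the right, stems in tens of cm) are our own assumptions for illustration.

```python
# Heights in cm of the sampled 14-year-olds (data from the notes above).
male = [154, 157, 187, 163, 167, 159, 169, 162, 176, 177, 151, 175, 174,
        165, 165, 183, 180]
female = [160, 160, 152, 167, 164, 163, 160, 163, 169, 157, 158, 153, 161,
          165, 165, 159, 168, 153, 166, 158, 158, 166]


def back_to_back_stemplot(left, right):
    """Rows of a back-to-back stemplot sharing one column of stems."""
    def leaves_by_stem(values):
        stems = {}
        for v in sorted(values):
            stems.setdefault(v // 10, []).append(str(v % 10))
        return stems

    left_stems, right_stems = leaves_by_stem(left), leaves_by_stem(right)
    lowest = min(min(left_stems), min(right_stems))
    highest = max(max(left_stems), max(right_stems))
    width = max(len(leaves) for leaves in left_stems.values())
    rows = []
    for stem in range(lowest, highest + 1):
        # Left-hand leaves are reversed so they increase moving away from the stem.
        left_leaves = "".join(reversed(left_stems.get(stem, [])))
        right_leaves = "".join(right_stems.get(stem, []))
        rows.append(f"{left_leaves:>{width}} | {stem} | {right_leaves}".rstrip())
    return rows


print("  Male |    | Female")
for row in back_to_back_stemplot(male, female):
    print(row)
print("Key: 15 | 2 = 152 cm")
```

In this sample no female height reaches the 170s or 180s, while seven male heights do, so the male distribution is more spread out and extends higher.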
Example:
What percent of your home state’s residents were born outside the United States? A few years ago, the country as a
whole had 12.5% foreign-born residents, but the states varied from 1.2% in West Virginia to 27.2% in California. The
following table presents the data for all 50 states.
a) Who/what is the individual?
Create histogram:
Describing graphs
Look at the OVERALL pattern and any departures from that pattern
SOCS
o SHAPE
o OUTLIERS
o CENTER
o SPREAD
1.3 – Describing Quantitative Data with Numbers
Measuring Center
TWO ways to measure center - Mean and Median
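As a quick illustration of the two measures, here is a minimal sketch that reuses the shoe data from the stemplot example in section 1.2 (the travel-time data for the examples below is not reproduced in these notes). The function names are our own.

```python
# Pairs of shoes for the 20 students (same data as the stemplot example in 1.2).
shoes = [50, 26, 26, 31, 57, 19, 24, 22, 23, 38,
         13, 50, 13, 34, 23, 30, 59, 13, 15, 51]


def mean(values):
    """Add up all the observations and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data; average the two middle values if n is even."""
    data = sorted(values)
    n = len(data)
    middle = n // 2
    if n % 2 == 1:
        return data[middle]
    return (data[middle - 1] + data[middle]) / 2


print("mean   =", mean(shoes))
print("median =", median(shoes))
```

For these data the mean (30.85) is larger than the median (26). The handful of students with 50 or more pairs pull the mean up, which is typical of a right-skewed distribution.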
Mean
Example:
Here is a stemplot of the travel times to work for the sample of 15 North Carolinians:
b) Calculate the mean again, this time excluding the person who reported a 60-minute travel time to work. What do you
notice?
Here is the stemplot of travel times to work for 20 randomly selected New Yorkers. Earlier, we found that the median
was 22.5 minutes.
1. Based only on the stemplot, would you expect the mean travel time to be less than, about the same as, or larger than
the median? Why?
2. Use your calculator to find the mean travel time. Was your answer to Question 1 correct?
3. Would the mean or the median be a more appropriate summary of the center of this distribution of drive times?
Justify your answer.
Measuring Spread
Range
Identifying Outliers
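A common rule for flagging outliers is the 1.5 x IQR rule: call an observation an outlier if it falls more than 1.5 x IQR below the first quartile or above the third quartile. The sketch below assumes that rule and assumes quartiles are found as the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd); calculators and software may use slightly different quartile methods. It reuses the shoe data from section 1.2, and the function names are our own.

```python
from statistics import median

# Pairs of shoes for the 20 students (same data as the stemplot example in 1.2).
shoes = [50, 26, 26, 31, 57, 19, 24, 22, 23, 38,
         13, 50, 13, 34, 23, 30, 59, 13, 15, 51]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot is drawn from."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + (n % 2):]  # skip the median itself when n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]


def find_outliers(values):
    """Observations beyond 1.5 * IQR from the quartiles (assumed rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


minimum, q1, med, q3, maximum = five_number_summary(shoes)
print("five-number summary:", minimum, q1, med, q3, maximum)
print("range =", maximum - minimum, " IQR =", q3 - q1)
print("outliers:", find_outliers(shoes))
```

For the shoe data the fences are at -14.75 and 79.25, so no observation is flagged, even though 59 pairs might look large at first glance.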
Creating Boxplots
Barry Bonds set the major league record by hitting 73 home runs in a single season in 2001. On August 7, 2007, Bonds hit
his 756th career home run, which broke Hank Aaron’s longstanding record of 755. By the end of the 2007 season when
Bonds retired, he had increased the total to 762. Here are data on the number of home runs that Bonds hit in each of his
21 complete seasons. Make a boxplot of these data.