g8m6 Study Guide Statistics
NAME:___________________________
Mr. Rogove
Date:__________
Instructions: This study guide covers the material that will be on our next test,
which is mostly statistics. Do your best and turn in the completed study guide when
you take your test. Thanks!
SCATTER PLOT
[Figures: example scatter plots. Recoverable axis labels: Weight (kg) and Age (years) on the vertical axes; Chest Girth (cm), Shell Length (mm), and x on the horizontal axes. Recoverable captions: "Positive trend in the data" and "Negative trend in the data".]
Independent Variable: This is the explanatory variable, or predictor
variable. It is the variable that is not changed by the other variables.
It is the x-value, on the horizontal axis.
Dependent Variable: This is the response variable, or predicted variable. It is
the variable that you are trying to make predictions about. It is the y-value, on the
vertical axis.
Independent v. Dependent Variable: We can use information about the
independent variable (x-axis) to make predictions about the values of the dependent
variable (y-axis).
Line of Best Fit: This is a straight line that represents the trend in the data. The line
of best fit should be drawn as close to as many points on the graph as possible. We
can write an equation for this line by identifying two points on the line, then finding
the slope and the y-intercept.
The slope of the line of best fit is the predicted change in the response variable
for each one-unit increase in the explanatory variable.
The y-intercept is the predicted value of the response variable when the explanatory
variable equals 0. In linear models, the y-intercept might not make sense in the context
of the real-world situation.
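The two-point method described above can be sketched in a few lines of Python. This is only an illustration: the function names `line_through` and `predict` are made up for this example, and the two points are hypothetical values read off an imagined hand-drawn line, not data from this study guide.

```python
def line_through(p1, p2):
    """Return (slope, y_intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope")
    slope = (y2 - y1) / (x2 - x1)    # rise over run
    y_intercept = y1 - slope * x1    # solve y = mx + b for b
    return slope, y_intercept


def predict(slope, y_intercept, x):
    """Use the line y = mx + b to predict y for a given x."""
    return slope * x + y_intercept


# Hypothetical points chosen from a line of best fit (assumed values).
slope, y_intercept = line_through((2, 5), (6, 13))
print(f"y = {slope}x + {y_intercept}")
print(predict(slope, y_intercept, 10))
```

With these assumed points the slope is 2, meaning the predicted y-value rises by 2 for every 1-unit increase in x, and the y-intercept is 1.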
                      Favorite Snack
Gender   Candy Bar   Baked Goods   Salty   Spicy   Healthy   Total
Male         9           10          15       5        8        47
Female       2           13          14       1       10        40
Total       11           23          29       6       18        87
Relative Frequency: A description of the frequency of the occurrences of each of
the pieces of categorical data in relation to the whole. This is a proportion
measured by the following fraction: $\frac{\text{frequency}}{\text{total}}$.
Example: The proportion of all students who are male AND preferred salty snacks is
$\frac{15}{87}$, or about 0.17.
Row Relative Frequency: A description of the frequency of the occurrences of
pieces of categorical data in relation to the total of a row. This is a proportion
measured by the following fraction: $\frac{\text{frequency}}{\text{row total}}$.
Example: The proportion of female students who like healthy food is $\frac{10}{40}$, or 0.25.
Column Relative Frequency: A description of the frequency of the occurrences of
pieces of categorical data in relation to the total of a column. This is a proportion
measured by the following fraction: $\frac{\text{frequency}}{\text{column total}}$.
Example: Of the students who like candy bars, the proportion of them who are boys
is $\frac{9}{11}$, or about 0.82.
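The three examples above all use the same calculation, a count divided by a total; only the total changes. Here is a minimal Python sketch of that idea. The helper name `relative_frequency` is made up for this illustration, and the counts are the ones used in the Favorite Snack examples.

```python
def relative_frequency(count, total):
    """Proportion of `total` made up by `count`, rounded to 2 decimal places."""
    return round(count / total, 2)


# Totals from the Favorite Snack table.
grand_total = 87        # all students surveyed
female_total = 40       # row total for females
candy_bar_total = 11    # column total for candy bars

# Relative frequency: males who prefer salty snacks, out of all students.
overall = relative_frequency(15, grand_total)
# Row relative frequency: females who prefer healthy snacks, out of all females.
by_row = relative_frequency(10, female_total)
# Column relative frequency: males who prefer candy bars, out of all candy bar fans.
by_column = relative_frequency(9, candy_bar_total)

print(overall, by_row, by_column)
```

The numerator is always a single cell of the table. Choosing the grand total, the row total, or the column total as the denominator decides which kind of relative frequency you get.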
Problem Set.
2. Below are data showing the minutes Nicole played in basketball games and the
number of points she scored.
Minutes Played   Points Scored   Minutes Played   Points Scored
      15               7               12               8
      18              12               23              16
       9               6               15              12
      21              14               10              14
a. Draw a scatterplot of the data
above in the space provided
below. Clearly label your graph.
b. What pattern(s) do you notice
in the data?
c. Draw a line of best fit on the graph above. Write the equation for the line below.
Show how you determined the equation using calculations.
d. Verbally describe the relationship between the number of minutes Nicole plays
and the number of points she scores. What does the slope mean in the context of the
situation?
3. The table shows the number of active woodpecker clusters in a part of the De Soto
National Forest in Mississippi.
Year              2005  2006  2007  2008  2009  2010  2011  2012  2013
Active Clusters     22    24    27    27    34    40    42    45    51
a. Create a scatterplot of the data. Let the x-axis represent the number of
years since 2005.
b. One reasonable line of best fit goes through the 2007 and 2011 data. Find the
equation of that line.
c. Predict the number of active clusters in 2020.
4. A survey of 400 people asked about their gender and their preferred footwear.
Some of the results are as follows:
240 people surveyed were female.
160 people surveyed preferred sneakers.
80 people surveyed preferred heels.
40 people surveyed preferred sandals.
60 females preferred sneakers.
78 females preferred heels.
32 males preferred sandals.
a. Complete the two-way frequency table that summarizes the data on footwear and
gender.
                     Footwear Preference
          Sneakers   Heels   Sandals   Flats/Dress Shoes   Total
Female
Male
Total
e. If you chose a survey participant at random, what kind of footwear would you
expect them to prefer? Explain.
f. If you know that the randomly selected participant was a female, would this
change the prediction from part (e)? Why or why not? What associations can you
make between the variables?
5. A survey of 58 7th grade students asked questions about gender and salsa
preference. Some results are as follows:
12 students total liked hot salsa
15 students don't like salsa
a. Complete the two-way frequency table that summarizes the data on salsa
preference and gender.
                     Salsa Preference
          No salsa   Mild   Medium   Hot   Total
Male
Female
Total
b. What proportion of the participants are females who do not like salsa at all?
c. If there were no association between gender and salsa preference, would you
expect to find that more girls do not like salsa or more boys do not like salsa?
Explain your answer.
d. Create a ROW relative frequency table of values for salsa preference for each
gender.
                     Salsa Preference
          No salsa   Mild   Medium   Hot   Total
Male
Female
Total
e. Are there any associations you can make between salsa preference and gender?
What are they?
6. In the same survey, we asked students about the amount of sleep they got and the
time they went to bed. Below are the results in a two-way table.
                                        BEDTIME
Sleep each night    Between 8PM - 9PM   Between 9 - 10PM   After 10PM   TOTAL
Less than 6
Between 6 and 8                                18                          27
More than 8                                    21                          30
TOTAL                      10                  39                          58
a. Make a conclusion (based on math) about the association between going to bed
after 10PM and getting less than 8 hours of sleep. Write a few sentences explaining
your thoughts.
b. Create a COLUMN relative frequency table below
                                        BEDTIME
Sleep each night    Between 8PM - 9PM   Between 9 - 10PM   After 10PM   TOTAL
Less than 6
Between 6 and 8
More than 8
TOTAL
c. Based on your column relative frequency table, for which bedtime category is
there the least association with the amount of sleep? Explain this.