SINGLE VARIABLE Notes 5.3 Year 10


SINGLE VARIABLE DATA ANALYSIS (1)

5.3
Textbook Chapters Covered: 4A, 4B, 4C, 4E

Syllabus points:
[these points are not covered in depth this lesson. This lesson is strictly focusing on
calculating data values accurately]
Why do we study single variable data analysis in 5.3?
- It allows students to explore the concept of standard deviation and identify different
measures of spread. It also allows students to identify different shapes of data
distribution (symmetrical, skewed etc.).
- This content is assessed again in the Year 12 Maths Advanced topic Statistical
Analysis.
- None of this content is assessed in Maths Extension 1 and 2.

Statistical Data: (revise questions from chapter 4A)


Statistical data is divided into 2 subgroups: Categorical Data and Numerical data.
• Categorical Data: includes nominal and ordinal data. This type of data can be
divided into groups, for example: race, age group, educational level etc.
o Nominal data: named categories with no order (colours, eye colours, skin tone).
o Ordinal data: categories that can be ordered (socio-economic status, education
level, income bracket, satisfaction rating).
• Numerical Data: information that is measurable (as the name suggests, it is data
that can be collected in number form). There are two types of numerical data:
discrete and continuous.
o Discrete data: information that can only take on certain values, usually counts.
For example: number of students in a class, number of students in a household,
number of dogs per household etc.
o Continuous data: data that can take on any value within a given range. For
example: height, time taken to complete a race, temperature change over an
hour.
• Statistical data is often collected through surveys of certain population groups.
• To keep a survey efficient and accurate, it must be designed to suit its aim. For
example: to check how many people in Australia have diabetes at 50 years old,
instead of surveying everyone in Australia who is 50, we can survey a sample, such
as 10 people from each state or the same % of people from each state.
Types of Data Displays:
• Dot plot: efficient for displaying both categorical data and discrete data.

• Column graph: represents categorical data or discrete data graphically.


• Stem and Leaf plot: represents numerical data (both discrete and continuous).

• Histograms: represent numerical data (both discrete and continuous).

• Data can be symmetrical, negatively skewed or positively skewed. However, we
will look into these later on.
Measures of centre:
• Measures of centre describe the typical or central value of a data set.
• There are 3 measures of centre:
o Mean ($\bar{x}$): the average value of the data.
- The mean can be calculated by this equation:
$\bar{x} = \dfrac{\text{sum of data values}}{\text{number of data values}}$

o Median: value at the centre of the data when the values are arranged in order.
- For a data set with an odd number of values, there is a single middle value,
which is the median.
- For a data set with an even number of values, there are 2 middle numbers. To
find the median, add both of those numbers and divide by two.
- The median is often the preferred measure of centre when a data set contains
outliers, because extreme values affect it much less than they affect the mean.
o Mode: value that occurs most frequently.
- There can be more than one mode in certain data sets.
- There can also be no mode in a data set.

Examples:
1) Consider the following data set:
𝑆 = { 1, 1, 4, 5, 7, 8, 9, 9 }

a) Find the mean value

b) Find the median

c) Find the mode
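
As a check on Example 1, the three measures of centre can be computed with Python's standard `statistics` module. This is a minimal sketch rather than the worked solution from class, and the variable names are only illustrative.

```python
from statistics import mean, median, multimode

# Data set from Example 1
S = [1, 1, 4, 5, 7, 8, 9, 9]

mean_value = mean(S)      # sum of data values / number of data values = 44 / 8
median_value = median(S)  # even number of values: average of the two middle values (5 and 7)
modes = multimode(S)      # every value tied for the highest frequency

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", modes)
```

Here 1 and 9 each occur twice, so this set has two modes.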


Five Figure Summary
This provides information on the data set and is useful when analysing large data sets.
• Minimum value (min): the lowest value in the set.
• Lower Quartile (𝑸𝟏 ): the median of the lower half of the data set. About 25% of
the ordered data lies below it.
• Median (𝑸𝟐 ): the value at the centre. About 50% of the ordered data lies below
it.
• Upper Quartile (𝑸𝟑 ): the median of the upper half of the data set. About 75% of
the ordered data lies below it.
• Maximum Value (max): the maximum value in the set (highest value).

Examples:
1) Consider the following data set, which has an even number of values:
𝑆 = { 2, 2, 4, 5, 6, 8, 10, 13, 16, 20 }

a) Find the min and max values.

b) Find the median (𝑸𝟐 )

c) Find the lower quartile (𝑸𝟏 ):


d) Find the upper quartile (𝑸𝟑 ):

2) Consider the following data set, which has an odd number of values:
S = { 2, 3, 5, 7, 9, 11, 13 }
Find the five-figure summary values.
3) Consider the following set: S = { 1, 2, 2, 3, 5, 6, 6, 7, 9 }
Find the five-figure summary for this set.
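
The five-figure summary for the three example sets can be sketched in Python as below. The function name `five_figure_summary` is hypothetical, and the code assumes the "median of each half" method described above, leaving the middle value out of both halves when the number of values is odd. Calculators and software sometimes use other quartile rules, so their answers can differ slightly.

```python
from statistics import median

def five_figure_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumes at least two data values.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]           # values below the middle position
    upper = values[half + n % 2:]   # skip the middle value when n is odd
    return (values[0], median(lower), median(values), median(upper), values[-1])

print(five_figure_summary([2, 2, 4, 5, 6, 8, 10, 13, 16, 20]))
print(five_figure_summary([2, 3, 5, 7, 9, 11, 13]))
print(five_figure_summary([1, 2, 2, 3, 5, 6, 6, 7, 9]))
```

For the first set the lower half is { 2, 2, 4, 5, 6 } and the upper half is { 8, 10, 13, 16, 20 }, so the quartiles are simply the middle values of those halves.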

Measures of Spread
Measures of spread describe how similar or varied the values of the data set are.
• Range: max value – min value
• Interquartile Range (IQR): upper quartile – lower quartile
- IQR measures how spread out the middle 50% of the data is.
- A higher IQR value indicates more spread of the data points.
- A lower IQR value indicates that the data is clustered closely around the median.
• Standard deviation (𝝈): measures how far values deviate from the mean. (you can
usually find standard deviation by just using your calculator).
- If the standard deviation is equal to zero, it means that all values in the data set
are identical.
- If the standard deviation is large, then it means that the data is more spread out.
- If the standard deviation is small, then it means that the data is more clustered
around the mean.
NOTE: Standard deviation is not the same as the mean. Standard deviation measures how
spread out data values are in a certain set, whereas the mean is a measure of centre and
describes the central point.
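
The difference between centre and spread can be seen in a small sketch. The three data sets below are made up for illustration (they are not from the textbook) and all have the same mean. `pstdev` from Python's `statistics` module is used on the assumption that the population standard deviation (dividing by n) is the one intended in these notes.

```python
from statistics import mean, pstdev

# Hypothetical data sets chosen so that each has a mean of 5
identical = [5, 5, 5, 5, 5]
clustered = [4, 5, 5, 5, 6]
spread_out = [1, 3, 5, 7, 9]

for name, data in [("identical", identical),
                   ("clustered", clustered),
                   ("spread out", spread_out)]:
    # Same mean each time, but the standard deviation grows with the spread
    print(name, "mean =", mean(data), "standard deviation =", round(pstdev(data), 2))
```

The set of identical values has a standard deviation of zero, and the more the values move away from 5, the larger the standard deviation becomes, even though the mean never changes.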
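Standard deviation equation:
$\sigma = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$
where $x_i$ is each data value, $\bar{x}$ is the mean and n is the number of values.
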
Steps to find standard deviation:


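1. Find the mean ($\bar{x}$) of the data set and the number of values in the set (n).
2. Find the deviation of each value from the mean (deviation = $x_i - \bar{x}$).
3. Square each deviation, add the squares together and divide the total by n.
4. Take the square root of the result. Alternatively, substitute everything into the
equation in one go.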

Find the standard deviation of the following sets:


1. S = { 1, 2, 2, 3, 5, 6, 6, 7, 9 }
2. S = {2, 19, 20, 51, 89, 101, 120}
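
The steps can be sketched in Python for the two sets above. This assumes the population form of the standard deviation (the sum of squared deviations is divided by n, then square-rooted); the function name `population_std_dev` is hypothetical.

```python
from math import sqrt

def population_std_dev(data):
    """Standard deviation found step by step, dividing by n (population form)."""
    n = len(data)                              # number of values
    mean = sum(data) / n                       # find the mean
    deviations = [x - mean for x in data]      # deviation of each value from the mean
    squared = [d ** 2 for d in deviations]     # square each deviation
    variance = sum(squared) / n                # average of the squared deviations
    return sqrt(variance)                      # square root gives the standard deviation

for S in ([1, 2, 2, 3, 5, 6, 6, 7, 9], [2, 19, 20, 51, 89, 101, 120]):
    print(S, "standard deviation =", round(population_std_dev(S), 2))
```

The first set gives about 2.54 and the second about 42.76, which is a useful check against the calculator method. A calculator's sample standard deviation (which divides by n − 1) will give slightly larger values.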

CALCULATOR SHORT CUT:


1. Press ‘Shift’ and ‘Mode Setup’ at the same time and press number ‘4: STAT’. Then
select Frequency ‘1:ON’.
2. Press ‘Mode Setup’, then ‘3:STAT’, then ‘1:1-VAR’.
3. Enter the values in the table. After the values are entered, press AC.
4. Press ‘Shift’ and number ‘1’ at the same time. Press option ‘4:Var’ if you want to
find the standard deviation or mean. Press option ‘6:MinMax’ if you want to find
min, Q1, Q2, Q3 and max.
