London School of Commerce
Quantitative Methods for Business Decisions (Lecture Notes 01)
_______________________________________________________________________
Instructor: Mohammad Moniruzzaman Bhuiya
E-mail:
[email protected]
Q. What are Statistics?
Procedures for organizing, summarizing, and interpreting information
Standardized techniques used by scientists
Vocabulary & symbols for communicating about data
Two main branches:
Descriptive statistics
Tools for summarising, organising, simplifying data
Tables & Graphs
Measures of Central Tendency
Measures of Variability
Examples:
Average rainfall in Manchester last year
Number of car thefts last year
Your test results
Percentage of males in our class
Inferential statistics
Inference is the process of drawing conclusions or making decisions about a
population based on sample results
Data from sample used to draw inferences about population
Generalising beyond actual observations
Generalise from a sample to a population
Statistical terms
Population
complete set of individuals, objects or measurements
Sample
a sub-set of a population
Variable
a characteristic which may take on different values
Data
numbers or measurements collected
A parameter is a characteristic of a population
e.g., the average height of all Britons.
A statistic is a characteristic of a sample
e.g., the average height of a sample of Britons.
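The distinction between a parameter and a statistic can be sketched in a few lines of Python. The heights below are made up for illustration, and the "population" is kept tiny so that both values can be computed directly.

```python
from statistics import mean

# Hypothetical population: heights (cm) of every member of a small club.
population = [160, 165, 170, 175, 180, 185]

# A sample is a sub-set of the population (chosen by hand here).
sample = [165, 175, 185]

parameter = mean(population)  # characteristic of the whole population
statistic = mean(sample)      # characteristic of the sample only

print(parameter)  # 172.5
print(statistic)  # 175
```

The statistic is close to the parameter but not equal to it, which is why inference from a sample always carries some uncertainty.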
Σ
This symbol (called sigma) means ‘add everything up’. So, if you see something
like Σxi it just means ‘add up all of the scores you’ve collected’.
Π
This symbol means ‘multiply everything’. So, if you see something like Π xi it just
means ‘multiply all of the scores you’ve collected’.
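As a small illustration of the two symbols, the following Python sketch adds up and multiplies a list of scores. The scores are hypothetical; `sum` and `math.prod` are standard Python (the latter needs Python 3.8 or later).

```python
import math

scores = [2, 3, 4]  # hypothetical scores x1, x2, x3

total = sum(scores)          # sigma: 2 + 3 + 4
product = math.prod(scores)  # pi:    2 * 3 * 4

print(total)    # 9
print(product)  # 24
```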
Data
There are two general types of data.
Quantitative data is information about quantities; that is, information that can be
measured and written down with numbers. Some examples of quantitative data are your
height, your shoe size, and the length of your fingernails.
Qualitative data is information about qualities; information that can't actually be
measured. Some examples of qualitative data are the softness of your skin, the grace with
which you run, and the color of your eyes.
Qualitative Data
Overview:
• Deals with descriptions.
• Data can be observed but not measured.
• Colors, textures, smells, tastes, appearance, beauty, etc.
• Qualitative → Quality
Quantitative Data
Overview:
• Deals with numbers.
• Data which can be measured.
• Length, height, area, volume, weight, speed, time, temperature, humidity, sound levels, cost, members, ages, etc.
• Quantitative → Quantity
Example: Oil Painting
Qualitative data:
• blue/green color, gold frame
• smells old and musty
• texture shows brush strokes of oil paint
• peaceful scene of the country
• masterful brush strokes
Quantitative data:
• picture is 10" by 14"
• with frame 14" by 18"
• weighs 8.5 pounds
• surface area of painting is 140 sq. in.
• cost $300
Nominal Data
Nominal data is categorical: the values are names or labels with no inherent order, such as the name of
your school, the type of car you drive or the title of a book. This one is easy to remember because
nominal sounds like name (they have the same Latin root).
Ordinal Data
Ordinal refers to data that have a natural ordering. Examples are a ranking of favorite sports, the order
of people in a line, the order of runners finishing a race or, more often, the choice on a rating scale
from 1 to 5.
Interval Data
Interval data is like ordinal data, except that the intervals between values are equally spaced. The
most common example is temperature in degrees Fahrenheit. The difference between 29 and 30 degrees
is the same magnitude as the difference between 78 and 79.
Ratio Data
Ratio data is interval data with a natural zero point. For example, time is ratio data since 0 time is
meaningful. Temperature in kelvins also has a true zero point (absolute zero), and on both of these
scales the steps are of equal size.
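One way to see why the zero point matters is to compare a ratio of two temperatures on the Fahrenheit scale (interval) with the same ratio on the Kelvin scale (ratio). This is only a sketch; the helper name `fahrenheit_to_kelvin` and the two temperatures are chosen for illustration.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvins."""
    return (f - 32) * 5 / 9 + 273.15

# "80 is twice 40" looks true in Fahrenheit...
ratio_f = 80 / 40

# ...but the same two temperatures in kelvins give a very different ratio,
# because the Fahrenheit zero is arbitrary and the Kelvin zero is not.
ratio_k = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(ratio_f)            # 2.0
print(round(ratio_k, 2))  # 1.08
```

Differences are safe to compare on an interval scale, but statements such as "twice as hot" only make sense on a ratio scale.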
Central value
Unfortunately, no single measure of central tendency works best in all circumstances
Nor will they necessarily give you the same answer
Mean
Gives information about the average or typical score in a set of scores
The Mean is a measure of central value
What most people mean by “average”
Sum of a set of numbers divided by the number of numbers in the set
$$\frac{1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10}{10} = \frac{55}{10} = 5.5$$
Arithmetic average: $\bar{X} = \frac{\sum X}{n}$
If $X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]$
Then $\frac{\sum X}{n} = \frac{1 + 2 + 3 + \dots + 10}{10} = \frac{55}{10} = 5.5$
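The same calculation can be checked in Python, once by following the formula directly and once with the standard library's `statistics.mean`. The variable names are illustrative only.

```python
from statistics import mean

X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Sum of the numbers divided by how many numbers there are.
by_formula = sum(X) / len(X)

# The standard library gives the same result.
by_library = mean(X)

print(by_formula)  # 5.5
print(by_library)  # 5.5
```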
Median
Middlemost or most central item in the set of ordered numbers; it separates the
distribution into two equal halves
If odd n, middle value of sequence
if X = [1,2,4,6,9,10,12,14,17]
then 9 is the median
If even n, average of 2 middle values
if X = [1,2,4,6,9,10,11,12,14,17]
then 9.5 is the median; i.e., (9+10)/2
Median is not affected by extreme values
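The two cases above, and the point about extreme values, can be reproduced with `statistics.median` from the Python standard library. The third list is a made-up variation in which the largest value is replaced by an extreme one.

```python
from statistics import median

odd_n = [1, 2, 4, 6, 9, 10, 12, 14, 17]
even_n = [1, 2, 4, 6, 9, 10, 11, 12, 14, 17]

median_odd = median(odd_n)    # middle value of the ordered data
median_even = median(even_n)  # average of the two middle values

# Replace the largest value with an extreme score: the median does not move.
with_extreme = odd_n[:-1] + [1700]
median_extreme = median(with_extreme)

print(median_odd)      # 9
print(median_even)     # 9.5
print(median_extreme)  # 9
```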
Quartiles
Split ordered data into 4 quarters
Q1 = first quartile
Q2 = second quartile= Median
Q3 = third quartile
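A minimal sketch of computing quartiles with `statistics.quantiles` (Python 3.8 or later). Several conventions exist for locating Q1 and Q3, and textbooks and software can differ slightly; the values below come from Python's default "exclusive" method, which is an assumption of this example and not necessarily the convention intended in these notes.

```python
from statistics import median, quantiles

data = [1, 2, 4, 6, 9, 10, 12, 14, 17]

# n=4 asks for the three cut points that split the data into 4 quarters.
q1, q2, q3 = quantiles(data, n=4)

print(q1, q2, q3)          # 3.0 9.0 13.0
print(q2 == median(data))  # True: the second quartile is the median
```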
Mode
The mode is the most frequently occurring number in a distribution
if X = [1,2,4,7,7,7,8,10,12,14,17]
then 7 is the mode
Easy to see in a simple frequency distribution
Possible to have no mode or more than one mode: bimodal (two modes) and multimodal (more than two)
The modes don't have to have exactly equal frequencies: major mode, minor mode
Mode is not affected by extreme values
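The mode can be found with the Python standard library. `statistics.mode` returns a single most frequent value, while `statistics.multimode` returns every value that ties for the highest frequency. The bimodal list is a made-up example.

```python
from statistics import mode, multimode

X = [1, 2, 4, 7, 7, 7, 8, 10, 12, 14, 17]
single_mode = mode(X)

# Hypothetical bimodal data: 2 and 8 both occur three times.
bimodal = [1, 2, 2, 2, 5, 8, 8, 8, 9]
all_modes = multimode(bimodal)

print(single_mode)  # 7
print(all_modes)    # [2, 8]
```

Note that `multimode` only reports modes with exactly equal frequency, so a minor mode would have to be read off a frequency table.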
When to Use What
Mean is a great measure. But, there are times when its usage is inappropriate or impossible.
Nominal data: Mode
The distribution is bimodal: Mode
You have ordinal data: Median or mode
There are a few extreme scores: Median
Mean, Median, Mode
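The advice about extreme scores can be illustrated with a short sketch. The scores are hypothetical; the second list differs from the first only in its last value.

```python
from statistics import mean, median

scores = [20, 22, 24, 26, 28]
with_extreme = [20, 22, 24, 26, 280]  # one extreme score

print(mean(scores), median(scores))              # 24 24
print(mean(with_extreme), median(with_extreme))  # 74.4 24
```

A single extreme score pulls the mean from 24 to 74.4, while the median stays at 24, so the median is the more typical value here.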
Dispersion
How tightly clustered or how variable the values are in a data set.
Example
Data set 1: [0,25,50,75,100]
Data set 2: [48,49,50,51,52]
Both have a mean of 50, but data set 1 clearly has greater Variability than data set 2.
The Range is one measure of dispersion
The range is based on the difference between the maximum and minimum values in a set; these notes use the inclusive form, which adds 1 to that difference
Example
Data set 1: [1,25,50,75,100]; R = (100 - 1) + 1 = 100
Data set 2: [48,49,50,51,52]; R = (52 - 48) + 1 = 5
The range ignores how data are distributed and only takes the extreme scores into
account
RANGE = (X largest – X smallest ) + 1
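The range formula used in these notes translates directly into Python. The function name `inclusive_range` is chosen for this sketch; the data sets are the two from the example.

```python
def inclusive_range(values):
    """Range as defined in these notes: (largest - smallest) + 1."""
    return (max(values) - min(values)) + 1

data_set_1 = [1, 25, 50, 75, 100]
data_set_2 = [48, 49, 50, 51, 52]

print(inclusive_range(data_set_1))  # 100
print(inclusive_range(data_set_2))  # 5
```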
Inter-quartile range: the difference between the third and first quartiles
Inter-quartile Range = Q3 - Q1
Spread in middle 50%
Not affected by extreme values
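A hedged sketch of the inter-quartile range in Python, using `statistics.quantiles` with its default "exclusive" method (other quartile conventions give slightly different numbers). The function name and the data are illustrative; the second list replaces the largest value with an extreme one.

```python
from statistics import quantiles

def interquartile_range(values):
    """Q3 - Q1, using Python's default quartile method."""
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1

data = [1, 2, 4, 6, 9, 10, 12, 14, 17]
with_extreme = [1, 2, 4, 6, 9, 10, 12, 14, 1700]

print(interquartile_range(data))          # 10.0
print(interquartile_range(with_extreme))  # 10.0
```

The extreme value changes the range dramatically but leaves the spread of the middle 50% unchanged.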
Variance and standard deviation
The standard deviation and the variance are measures of how the data is distributed about the
mean. The larger the SD and the variance, the more spread out the data is.
The SD and variance do not shrink just because the sample gets larger; with more data the sample values
settle towards those of the population. What does decrease as the sample size increases is the standard
error of the mean, SD/√n.
An estimate of the SD or variance is also used when working out how large a sample needs to be for a
test to give accurate statistical results.
Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
A measure of the spread of the recorded values on a variable. A measure of dispersion.
The larger the variance, the further the individual cases are from the mean.
The smaller the variance, the closer the individual scores are to the mean.
Standard Deviation of sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
let X = [3, 4, 5, 6, 7]
Mean = $\bar{X}$ = 5
$(X - \bar{X})$ = [-2, -1, 0, 1, 2]
subtract the mean, $\bar{X}$, from each number in X
$(X - \bar{X})^2$ = [4, 1, 0, 1, 4]
square each value obtained from $(X - \bar{X})$
$\sum (X - \bar{X})^2$ = 10
sum all the squared values $(X - \bar{X})^2$
$\sum (X - \bar{X})^2 / (n - 1)$ = 10 / (5 - 1) = 2.5 (this is called “variance”)
divide the sum by n - 1, where n is the number of values in the sample
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{2.5} = 1.58$
take the square root of the variance to get the standard deviation.
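The worked example can be followed step by step in Python and then checked against `statistics.variance` and `statistics.stdev`, which also divide by n - 1. The intermediate variable names are chosen for this sketch.

```python
import math
from statistics import stdev, variance

X = [3, 4, 5, 6, 7]
n = len(X)

mean_x = sum(X) / n                         # 5.0
deviations = [x - mean_x for x in X]        # [-2.0, -1.0, 0.0, 1.0, 2.0]
squared = [d ** 2 for d in deviations]      # [4.0, 1.0, 0.0, 1.0, 4.0]
sum_of_squares = sum(squared)               # 10.0
sample_variance = sum_of_squares / (n - 1)  # 2.5
sample_sd = math.sqrt(sample_variance)      # about 1.58

print(sample_variance)      # 2.5
print(round(sample_sd, 2))  # 1.58

# The standard library agrees with the hand calculation.
print(math.isclose(sample_variance, variance(X)))  # True
print(math.isclose(sample_sd, stdev(X)))           # True
```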
Symmetry
Skew - asymmetry
Kurtosis - peakedness or flatness
Symmetrical vs. Skewed Frequency Distributions
Symmetrical distribution
Approximately equal numbers of observations above and below the middle
Skewed distribution
One side is more spread out than the other, like a tail
Direction of the skew
Positive or negative (right or left)
Side with the fewer scores
Side that looks like a tail
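The direction of skew can also be checked numerically. The sketch below uses the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second), which is one common definition and is not named in these notes; the function name and data sets are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness: zero if symmetrical, sign gives the tail side."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetrical = [1, 2, 3, 4, 5]
positively_skewed = [1, 1, 1, 2, 2, 3, 10]               # tail on the right
negatively_skewed = [10 - x for x in positively_skewed]  # tail on the left

print(skewness(symmetrical))            # 0.0
print(skewness(positively_skewed) > 0)  # True
print(skewness(negatively_skewed) < 0)  # True
```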
[Figure: Symmetrical vs. Skewed. Histogram panels; the skewed panels are labelled "Positively skewed" and "Negatively skewed".]
Statistical graphs of data
A picture is worth a thousand words!
Graphs for numerical data:
Histograms
Frequency polygons
Pie charts
Graphs for categorical data:
Bar graphs
Pie charts
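As a rough sketch of what a bar graph for categorical data summarises, the following Python counts how often each category occurs and prints a text-only bar for each. The survey answers are hypothetical, and a plotting library would normally be used for a real graph.

```python
from collections import Counter

# Hypothetical answers to "How do you travel to class?"
answers = ["car", "bus", "car", "train", "bus", "car"]

counts = Counter(answers)  # frequency of each category

# One row per category, most frequent first; bar length equals the frequency.
for category, count in counts.most_common():
    print(f"{category:<6} {'#' * count}")
```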