Statistical Analysis of Data: Reported By: Kasandra Jane D. Comia


STATISTICAL ANALYSIS OF DATA
Reported by: Kasandra Jane D. Comia
Outline of Report
• Guidelines for Graphics
    • Line chart
    • Bar chart
    • Pie chart
• Measures of Central Tendency
    • Mean
    • Median
    • Mode
• Measures of Dispersion
    • Range
    • Variance
    • Standard deviation
    • Coefficient of variation
GUIDELINES FOR GRAPHICS

• BAR CHARTS
• Show the data in the form of bars that may be horizontally or vertically oriented. Bar charts are excellent tools for depicting absolute and relative magnitudes, differences, and change.
GUIDELINES FOR GRAPHICS

• LINE CHARTS
• Connect a series of data points with a continuous line.
• Frequently used to portray trends over several periods of time.
GUIDELINES FOR GRAPHICS

• PIE OR ROUND CHARTS
• Often used with business data.

When designing a pie chart, the following must be considered:

1. Show 100% of the subject being graphed.
2. Always label the slices with "call outs" or with the percentage or amount that is represented.
3. Put the largest slice at the 12 o'clock position and move clockwise in descending order.
4. Use light colors for large slices and darker colors for smaller slices.
5. In a pie chart of black and white slices, a single red slice will command the most attention and be the most memorable. Use it to communicate your most important message.
Measures of Central Tendency
• Any measure indicating the center of a set of data, arranged in increasing or decreasing order of magnitude, is called a measure of central location or a measure of central tendency.
• MEAN
• MEDIAN
• MODE
MEAN
• The mean, or average, is the most commonly used method of describing central tendency.
• The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. So, if we have n values in a data set and they have values x1, x2, ..., xn, the sample mean, usually denoted by $\bar{x}$ (pronounced "x bar"), is:
  $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
MEAN
• This formula is usually written in a slightly different manner using the Greek capital letter $\Sigma$, pronounced "sigma", which means "sum of...":
  $\bar{x} = \frac{\sum x}{n}$
Measures of Central Tendency: MEAN

        Group A           Group B
No.   Student   X1     Student   X2
1     A          5     A          2
2     B          6     B          4
3     C          7     C          6
4     D          7     D          7
5     E          8     E          7
6     F          8     F          8
7     G          8     G          8
8     H          8     H          8
9     I          9     I          9
10    J          9     J         11
11    K         10     K         12
12    L         11     L         14
Total           96               96

Mean (both groups): $\bar{X} = \frac{\sum X}{n} = \frac{96}{12} = 8$

Source: Subong and Beldia, 2005
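As a quick check of the arithmetic on this slide, the sketch below recomputes the mean of each group in Python. The variable and function names (`group_a`, `group_b`, `sample_mean`) are illustrative and not part of the original report; the scores are copied from the table.

```python
# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def sample_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


mean_a = sample_mean(group_a)  # 96 / 12
mean_b = sample_mean(group_b)  # 96 / 12
print(mean_a, mean_b)
```

Both groups have the same mean even though their scores are spread very differently, which is why the measures of dispersion later in the report are needed.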
When not to use the mean
• The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values that are unusual compared to the rest of the data set by being especially small or large in numerical value. For example, consider the wages of staff at a factory below:

Staff    1    2    3    4    5    6    7    8    9    10
Salary   15k  18k  16k  14k  15k  15k  12k  17k  90k  95k

Mean salary = 30.7k, even though eight of the ten staff earn between 12k and 18k.
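The effect of the two large salaries can be verified with a short Python sketch. The list name `salaries_k` and the helper functions are assumptions made for illustration; the figures are the ten salaries above, in thousands.

```python
# Salaries from the factory example, in thousands (name is illustrative).
salaries_k = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]


def mean_of(values):
    """Arithmetic mean: sum divided by count."""
    return sum(values) / len(values)


def median_of(values):
    """Middle value, or the average of the two middle values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


mean_all = mean_of(salaries_k)                 # pulled up by 90k and 95k
median_all = median_of(salaries_k)             # barely affected by them
mean_without_outliers = mean_of(salaries_k[:8])  # staff 1 to 8 only
print(mean_all, median_all, mean_without_outliers)
```

The mean of all ten salaries is about twice the mean of the first eight, while the median stays close to what a typical worker earns.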
MEDIAN
• The median of a set of observations arranged in increasing or decreasing order of magnitude is the:
    • middle value when the number of observations is odd, or the
    • arithmetic mean of the two middle values when the number of observations is even.
MEDIAN
• The median is less affected by outliers and skewed data. In order to calculate the median, suppose we have the data below:

65 55 89 56 35 14 56 55 87 45 92

• We first need to rearrange the data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89 92

• With 11 scores, the median is the middle (sixth) value, 56.
• What if you had only 10 scores? You simply take the middle two scores and average them:

14 35 45 55 55 56 56 65 87 89

• Median = (55 + 56) / 2 = 55.5
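The odd and even cases described above can be sketched in a few lines of Python. The function name `median_of` and the variable names are illustrative; the scores are the ones used on this slide.

```python
def median_of(values):
    """Median of a list: sort, then take the middle value (odd count)
    or the average of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


scores_11 = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
scores_10 = sorted(scores_11)[:10]  # same data without the largest score

median_11 = median_of(scores_11)  # odd count: the sixth sorted value
median_10 = median_of(scores_10)  # even count: average of 55 and 56
print(median_11, median_10)
```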
Measures of Central Tendency: MEDIAN

        Group A           Group B
No.   Student   X1     Student   X2
1     A          5     A          2
2     B          6     B          4
3     C          7     C          6
4     D          7     D          7
5     E          8     E          7
6     F          8     F          8
7     G          8     G          8
8     H          8     H          8
9     I          9     I          9
10    J          9     J         11
11    K         10     K         12
12    L         11     L         14
Total           96               96

Median (both groups): $Md = \frac{X_F + X_G}{2} = \frac{8 + 8}{2} = 8$
MODE

• The mode of a set of observations is that value which occurs most often or with the greatest frequency.
Measures of Central Tendency: MODE

        Group A           Group B
No.   Student   X1     Student   X2
1     A          5     A          2
2     B          6     B          4
3     C          7     C          6
4     D          7     D          7
5     E          8     E          7
6     F          8     F          8
7     G          8     G          8
8     H          8     H          8
9     I          9     I          9
10    J          9     J         11
11    K         10     K         12
12    L         11     L         14
Total           96               96

Mode (both groups): Mo = 8, the value that occurs with the greatest frequency.
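A minimal sketch of finding the mode by counting frequencies is shown below. The helper `modes` is hypothetical and returns a list, since a data set can have more than one most frequent value; the scores are copied from the table.

```python
from collections import Counter

# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def modes(values):
    """Return every value that occurs with the greatest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


modes_a = modes(group_a)  # 8 occurs four times in Group A
modes_b = modes(group_b)  # 8 occurs three times in Group B
print(modes_a, modes_b)
```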
• A problem with the mode is that it will not provide us with a very good measure of central tendency when the most common mark is far away from the rest of the data in the data set, as depicted in the diagram below:

In the diagram, the mode has a value of 2. We can clearly see, however, that the mode is not representative of the data, which is mostly concentrated around the 20 to 30 value range. To use the mode to describe the central tendency of this data set would be misleading.
Skewed Distributions and the Mean and Median
• An example of a normally distributed set of data is presented below:

When you have a normally distributed sample, you can legitimately use either the mean or the median as your measure of central tendency. In fact, in any symmetrical distribution with a single peak, the mean, median and mode are equal.
Skewed Distributions and the Mean and Median
• However, when our data is skewed, as with the right-skewed data set below, the mean is pulled in the direction of the skew:

In these situations, the median is generally considered to be the best representative of the central location of the data.

The more skewed the distribution, the greater the difference between the median and mean, and the greater the emphasis that should be placed on using the median as opposed to the mean.
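The contrast between symmetric and skewed data can be illustrated with two small data sets. Both lists below are hypothetical values invented for this sketch, not data from the report; the second differs from the first only in its last value.

```python
def mean_of(values):
    """Arithmetic mean: sum divided by count."""
    return sum(values) / len(values)


def median_of(values):
    """Middle value, or the average of the two middle values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data: symmetric around 3, with a single peak at 3.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
# Hypothetical data: same values, but the largest is replaced by 50.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]

print(mean_of(symmetric), median_of(symmetric))        # mean equals median
print(mean_of(right_skewed), median_of(right_skewed))  # mean pulled to the right
```

In the skewed list a single large value moves the mean well above most of the data, while the median does not change.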
Measures of Dispersion
• Dispersion describes how the observations spread out from the average.
• It refers to the spread of the values around the central tendency.
• Measures of dispersion simply serve as an index of the spread of X-values away from the central value.
Measures of Dispersion
Dispersion in statistics is a way of describing how spread out a set of data is. When the dispersion of a data set is large, the values in the set are widely scattered; when it is small, the items in the set are tightly clustered. Very basically, this set of data has a small dispersion:
1, 2, 2, 3, 3, 4
…and this set has a larger one:
0, 1, 20, 30, 40, 100
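To put numbers on the two sets above, the following sketch compares their range and sample standard deviation using Python's standard `statistics` module. The variable names are illustrative, and both measures are defined later in the report.

```python
from statistics import stdev

tight = [1, 2, 2, 3, 3, 4]        # tightly clustered set from the slide
wide = [0, 1, 20, 30, 40, 100]    # widely scattered set from the slide

range_tight = max(tight) - min(tight)
range_wide = max(wide) - min(wide)

# stdev() is the sample standard deviation (divides by n - 1).
sd_tight = stdev(tight)
sd_wide = stdev(wide)

print(range_tight, range_wide)
print(sd_tight < sd_wide)
```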
RANGE
• The range of a set of data is the difference between the largest and smallest values in the set.
• The formula for calculating the range is:
  Range = L - S
  where L = largest value
        S = smallest value
• Coefficient of range = (L - S) / (L + S)
Measures of Dispersion: RANGE

        Group A           Group B
No.   Student   X1     Student   X2
1     A          5     A          2
2     B          6     B          4
3     C          7     C          6
4     D          7     D          7
5     E          8     E          7
6     F          8     F          8
7     G          8     G          8
8     H          8     H          8
9     I          9     I          9
10    J          9     J         11
11    K         10     K         12
12    L         11     L         14
Total           96               96

Group A: Range = 11 - 5 = 6
Group B: Range = 14 - 2 = 12
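The range and the coefficient of range, (L - S) / (L + S), can be computed for both groups as sketched below. The function names are illustrative; the coefficient values are not stated on the slide and follow from applying that formula to the table.

```python
# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def value_range(values):
    """Range = L - S (largest value minus smallest value)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


range_a = value_range(group_a)          # 11 - 5
range_b = value_range(group_b)          # 14 - 2
coef_a = coefficient_of_range(group_a)  # 6 / 16
coef_b = coefficient_of_range(group_b)  # 12 / 16
print(range_a, range_b, coef_a, coef_b)
```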
MERITS OF RANGE
It is simple to understand.
It is easy to compute.
It is rigidly defined.

LIMITATIONS OF RANGE
It is based only on two items and does not cover all the items in a distribution.
It is subject to wide fluctuations from sample to sample based on the same population.
It fails to give any idea about the pattern of distribution.
Finally, in the case of open-ended distributions, it is not possible to compute the range.
VARIANCE

• A second characteristic of a distribution is the width of its spread around the center.
• Two distributions with the same mean might differ very much in how closely the measurements are concentrated around the mean.
• The most common measure of variation around the center is the variance, which is defined as the average squared deviation of the observations from the mean. For a sample of n observations, the sum of squared deviations is divided by n - 1:
  $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
VARIANCE
• The variance considers the position of each observation relative to the mean of the set. This is accomplished by examining the deviations from the mean.
• The deviation of an observation is found by subtracting the mean of the data set from the given observation.
Measures of Dispersion: VARIANCE

               Group A                            Group B
Student   X1   Dx = X1 - X̄   (Dx)²     Student   X2   Dx = X2 - X̄   (Dx)²
A          5       -3           9      A          2       -6          36
B          6       -2           4      B          4       -4          16
C          7       -1           1      C          6       -2           4
D          7       -1           1      D          7       -1           1
E          8        0           0      E          7       -1           1
F          8        0           0      F          8        0           0
G          8        0           0      G          8        0           0
H          8        0           0      H          8        0           0
I          9        1           1      I          9        1           1
J          9        1           1      J         11        3           9
K         10        2           4      K         12        4          16
L         11        3           9      L         14        6          36
Total     96        0          30      Total     96        0         120

Group A: $s^2 = \frac{\sum (Dx)^2}{n - 1} = \frac{30}{12 - 1} = 2.73$
Group B: $s^2 = \frac{\sum (Dx)^2}{n - 1} = \frac{120}{12 - 1} = 10.91$
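The deviation columns and the two variances can be reproduced step by step in Python. This is a sketch of the sample variance (dividing by n - 1, as the slide does); the function and variable names are illustrative.

```python
# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def sum_of_squared_deviations(values):
    """Sum of (X - mean)^2 over all observations."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """Sum of squared deviations divided by n - 1."""
    return sum_of_squared_deviations(values) / (len(values) - 1)


ss_a = sum_of_squared_deviations(group_a)  # total of the (Dx)^2 column, Group A
ss_b = sum_of_squared_deviations(group_b)  # total of the (Dx)^2 column, Group B
var_a = sample_variance(group_a)
var_b = sample_variance(group_b)
print(round(var_a, 2), round(var_b, 2))
```

The standard library function `statistics.variance` computes the same sample variance and can be used as a cross-check.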
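STANDARD DEVIATION
• The standard deviation is the square root of the variance. It gives a picture of the homogeneity or heterogeneity of the set of data being analyzed: the smaller the standard deviation, the more homogeneous the data.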
Measures of Dispersion: STANDARD DEVIATION

               Group A                            Group B
Student   X1   Dx = X1 - X̄   (Dx)²     Student   X2   Dx = X2 - X̄   (Dx)²
A          5       -3           9      A          2       -6          36
B          6       -2           4      B          4       -4          16
C          7       -1           1      C          6       -2           4
D          7       -1           1      D          7       -1           1
E          8        0           0      E          7       -1           1
F          8        0           0      F          8        0           0
G          8        0           0      G          8        0           0
H          8        0           0      H          8        0           0
I          9        1           1      I          9        1           1
J          9        1           1      J         11        3           9
K         10        2           4      K         12        4          16
L         11        3           9      L         14        6          36
Total     96        0          30      Total     96        0         120

Group A: $s = \sqrt{\frac{30}{12 - 1}} = 1.65$
Group B: $s = \sqrt{\frac{120}{12 - 1}} = 3.30$
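A short sketch of the same calculation in Python follows, taking the square root of the sample variance. The function name `sample_std_dev` is illustrative; the scores are copied from the table.

```python
import math

# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def sample_std_dev(values):
    """Square root of the sample variance (divisor n - 1)."""
    mean = sum(values) / len(values)
    squared = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared / (len(values) - 1))


sd_a = sample_std_dev(group_a)  # sqrt(30 / 11)
sd_b = sample_std_dev(group_b)  # sqrt(120 / 11)
print(round(sd_a, 2), round(sd_b, 2))
```

Group B's standard deviation is twice that of Group A, so its scores are the more heterogeneous of the two even though both groups share a mean of 8.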
COEFFICIENT OF VARIATION
• The coefficient of variation expresses the standard deviation as a percentage of the mean:
  $CV = \frac{s}{\bar{X}} \times 100\%$
Measures of Dispersion: COEFFICIENT OF VARIATION

               Group A                            Group B
Student   X1   Dx = X1 - X̄   (Dx)²     Student   X2   Dx = X2 - X̄   (Dx)²
A          5       -3           9      A          2       -6          36
B          6       -2           4      B          4       -4          16
C          7       -1           1      C          6       -2           4
D          7       -1           1      D          7       -1           1
E          8        0           0      E          7       -1           1
F          8        0           0      F          8        0           0
G          8        0           0      G          8        0           0
H          8        0           0      H          8        0           0
I          9        1           1      I          9        1           1
J          9        1           1      J         11        3           9
K         10        2           4      K         12        4          16
L         11        3           9      L         14        6          36
Total     96        0          30      Total     96        0         120

Group A: $CV = \frac{1.65}{8} \times 100\% = 20.63\%$
Group B: $CV = \frac{3.30}{8} \times 100\% = 41.25\%$
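The coefficient of variation can be sketched in Python as below; the function name is illustrative. Note that the slide first rounds the standard deviation to two decimals (1.65 and 3.30), so its percentages (20.63% and 41.25%) differ slightly from the unrounded results this code produces.

```python
import math

# Scores copied from the Group A / Group B table (names are illustrative).
group_a = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11]
group_b = [2, 4, 6, 7, 7, 8, 8, 8, 9, 11, 12, 14]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = sum(values) / len(values)
    squared = sum((x - mean) ** 2 for x in values)
    std_dev = math.sqrt(squared / (len(values) - 1))
    return std_dev / mean * 100


cv_a = coefficient_of_variation(group_a)  # about 20.64 without intermediate rounding
cv_b = coefficient_of_variation(group_b)  # about 41.29 without intermediate rounding
print(round(cv_a, 2), round(cv_b, 2))
```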
References:

• https://www.slideshare.net/rajkumarteotia/measures-of-dispersion-or-variation
• https://docplayer.net/22747567-Mcq-s-of-measures-of-central-tendency.html
• https://www.fusioncharts.com/charts/column-bar-charts/
• https://statistics.laerd.com/statistical-guides/measures-central-tendency-mean-mode-median.php
• https://statisticsbyjim.com/basics/variability-range-interquartile-variance-standard-deviation/
• Research Method in Business Education by Prof. Honorata M. Pagaduan