Skewness and Kurtosis


Skewness quantifies how symmetrical the distribution is.
•A symmetrical distribution has a skewness of zero.
•An asymmetrical distribution with a long tail to the right (higher values) has a positive skew.
•An asymmetrical distribution with a long tail to the left (lower values) has a negative skew.
•The skewness is unitless.
•Any threshold or rule of thumb is arbitrary, but here is one: if the skewness is greater than 1.0 (or less than -1.0), the skewness is substantial and the distribution is far from symmetrical.

Positive or right skewed: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. It has relatively few high values. The distribution is said to be right-skewed or "skewed to the right". Example (observations): 1, 2, 3, 4, 100 (the mean is greater than the median).

Negative or left skewed: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. It has relatively few low values. The distribution is said to be left-skewed or "skewed to the left". Example (observations): 1, 1000, 1001, 1002 (the mean is less than the median).

If Mean = Median, then the distribution is symmetrical.

Positive skewness occurs when the mean is increased by some unusually high values. Negative skewness occurs when extreme low values occur.

In our example, Mean = 280.4 and Median = 358.6, so we have negative skewness: the 65 is pulling the mean down relative to the median.

Skewness = (Mean − Mode) / Standard Deviation, or

$$S_1 = \frac{M - M_o}{SD}$$

To avoid use of the mode, use the empirical formula Skewness = 3(Mean − Median) / Standard Deviation, or

$$S_2 = \frac{3(M - M_d)}{SD}$$

Skewness of Quartiles and Percentiles

Other measures of skewness defined in terms of quartiles and percentiles are as follows:

Quartile coefficient of skewness:

$$S_{QC} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1}$$

10–90 percentile coefficient of skewness:

$$S_{pc} = \frac{P_{90} - 2P_{50} + P_{10}}{P_{90} - P_{10}}$$

Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal distribution. A distribution having a relatively high peak is called leptokurtic, while a curve which is flat-topped is called platykurtic. The normal distribution, which is not very peaked or very flat-topped, is called mesokurtic.
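To illustrate the mean-versus-median rule and the empirical formula $S_2$, here is a minimal sketch using Python's standard `statistics` module on the two example data sets given for right- and left-skewed distributions. The helper name `pearson_s2` is hypothetical, and using the population standard deviation (`pstdev`) is an assumption, since the text does not say which form of SD is intended.

```python
from statistics import mean, median, pstdev

def pearson_s2(data):
    """Empirical skewness S2 = 3 * (mean - median) / SD (population SD assumed)."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

right_skewed = [1, 2, 3, 4, 100]      # a few unusually high values
left_skewed = [1, 1000, 1001, 1002]   # a few unusually low values

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "mean:", mean(data), "median:", median(data),
          "S2:", round(pearson_s2(data), 3))
```

The first data set has a mean of 22 against a median of 3, so $S_2$ is positive. The second has a mean of 751 against a median of 1000.5, so $S_2$ is negative.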
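Kurtosis quantifies whether the shape of the data distribution matches the Gaussian distribution.
•A Gaussian distribution has a kurtosis of 0.
•A flatter distribution has a negative kurtosis.
•A distribution more peaked than a Gaussian distribution has a positive kurtosis.
•Kurtosis has no units.
•The value that Prism reports is sometimes called the excess kurtosis, since the expected kurtosis for a Gaussian distribution is 0.0.
•An alternative definition of kurtosis is computed by adding 3 to the value reported by Prism. With this definition, a Gaussian distribution is expected to have a kurtosis of 3.0.

A measure of kurtosis based on both quartiles and percentiles is given by:

$$K = \frac{Q.D.}{P_{90} - P_{10}}$$

where $Q.D. = \frac{Q_3 - Q_1}{2}$ is the semi-interquartile range. K is referred to as the percentile coefficient of kurtosis. For a normal distribution it is equal to 0.263.

Formulas Needed:

In the formulas below, L is the lower boundary of the class containing the required value, F is the cumulative frequency below that class, f is the frequency of that class, c is the class width, and N (also written n) is the total frequency.

$$P_p = L + \left[\frac{pN - F}{f_p}\right]c$$

$$\text{Median} = L + \left[\frac{\frac{N}{2} - F_2}{f_2}\right]c$$

$$\text{Mean} = A.M. + \left(\frac{\sum fd}{N}\right)c$$

where A.M. is the assumed mean and d is the coded deviation of each class mark from it.

$$\text{Mode} = L + \left(\frac{d_1}{d_1 + d_2}\right)c$$

where $d_1$ is the modal class frequency minus the frequency of the preceding class and $d_2$ is the modal class frequency minus the frequency of the following class.

$$S = \sqrt{\frac{\sum fM^2}{n} - \left[\frac{\sum fM}{n}\right]^2}$$

$$Q_1 = L + \left[\frac{\frac{N}{4} - F_1}{f_1}\right]c$$

$$Q_2 = L + \left[\frac{\frac{N}{2} - F_2}{f_2}\right]c$$

$$Q_3 = L + \left[\frac{\frac{3N}{4} - F_3}{f_3}\right]c$$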
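The value 0.263 quoted for a normal distribution can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and applies the definition $K = Q.D. / (P_{90} - P_{10})$ to the standard normal distribution. The variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1); K is scale-free,
# so any normal distribution gives the same value.
z = NormalDist()

q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)    # first and third quartiles
p10, p90 = z.inv_cdf(0.10), z.inv_cdf(0.90)  # 10th and 90th percentiles

qd = (q3 - q1) / 2           # semi-interquartile range Q.D.
k_normal = qd / (p90 - p10)  # percentile coefficient of kurtosis

print(round(k_normal, 3))
```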
Solve for $S_1$, $S_2$, $S_{QC}$, $S_{pc}$, and $K$ for each of the problems below.

1. The table shows a frequency distribution of words per minute in a typing speed test given by Bank of the Philippine Islands to all of its applicants:
X (words per minute)   f (no. of applicants)   CF≤     d     fd    fd²   M(x)     fM     fM²
24 - 26                          9               9    -5    -45    225     25    225    5625
27 - 29                         11              20    -4    -44    176     28    308    8624
30 - 32                         15              35    -3    -45    135     31    465   14415
33 - 35                         19              54    -2    -38     76     34    646   21964
36 - 38                         24              78    -1    -24     24     37    888   32856
39 - 41                         38             116     0      0      0     40   1520   60800
42 - 44                         23             139     1     23     23     43    989   42527
45 - 47                         19             158     2     38     76     46    874   40204
48 - 50                         15             173     3     45    135     49    735   36015
51 - 53                         10             183     4     40    160     52    520   27040
54 - 56                          8             191     5     40    200     55    440   24200
57 - 59                          7             198     6     42    252     58    406   23548

Totals: N = 198, Ʃfd = 32, Ʃfd² = 1482, ƩfM = 8016, ƩfM² = 337818
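The column totals can be reproduced from the class marks and frequencies alone. The following is a small sketch in plain Python. The variable names are illustrative, and the assumed mean A = 40 and class width c = 3 are read off the table (the row with d = 0 and the spacing of the class marks).

```python
from itertools import accumulate

# Class marks M and frequencies f, copied from the table above.
marks = [25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58]
freqs = [9, 11, 15, 19, 24, 38, 23, 19, 15, 10, 8, 7]

A, c = 40, 3  # assumed mean (class mark of the d = 0 row) and class width

# Coded deviations d = (M - A) / c, i.e. -5, -4, ..., 6
d = [(m - A) // c for m in marks]

cum_freq = list(accumulate(freqs))  # the "CF<=" column

N = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di**2 for f, di in zip(freqs, d))
sum_fM = sum(f * m for f, m in zip(freqs, marks))
sum_fM2 = sum(f * m**2 for f, m in zip(freqs, marks))

print(N, sum_fd, sum_fd2, sum_fM, sum_fM2)  # 198 32 1482 8016 337818
```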

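a) $S_1 = \frac{M - M_o}{SD}$

Mean: $M = A.M. + \left(\frac{\sum fd}{N}\right)c$, with A.M. = 40, $\sum fd = 32$, N = 198, c = 3:

$$M = 40 + \left(\frac{32}{198}\right)(3) = 40 + 0.48 = 40.48$$

The same value follows directly from the class marks: $\frac{\sum fM}{N} = \frac{8016}{198} = 40.48$.

Mode: $M_o = L + \left(\frac{d_1}{d_1 + d_2}\right)c$, with L = 38.5, $d_1 = 38 - 24 = 14$ (modal frequency minus the preceding class frequency), $d_2 = 38 - 23 = 15$ (modal frequency minus the following class frequency), c = 3:

$$M_o = 38.5 + \left(\frac{14}{14 + 15}\right)(3) = 38.5 + 1.45 = 39.95$$

Standard deviation (n = N = 198):

$$S = \sqrt{\frac{\sum fM^2}{n} - \left[\frac{\sum fM}{n}\right]^2} = \sqrt{\frac{337818}{198} - \left[\frac{8016}{198}\right]^2} = \sqrt{1706.15 - 1639.02} = \sqrt{67.13} = 8.2$$

$$S_1 = \frac{40.48 - 39.95}{8.2} = \frac{0.53}{8.2} = 0.065$$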
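Part a) can be cross-checked by computing the mean, mode, and standard deviation straight from the frequency table. This is a minimal sketch, not a general-purpose routine. It assumes the modal class is not the first or last class, and it takes $d_1$ as the modal frequency minus the preceding class frequency and $d_2$ as the modal frequency minus the following class frequency (the usual textbook convention, stated here as an assumption).

```python
from math import sqrt

# Class marks M and frequencies f from the typing-speed table.
marks = [25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58]
freqs = [9, 11, 15, 19, 24, 38, 23, 19, 15, 10, 8, 7]
c = 3  # class width

N = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, marks)) / N
# SD from the formula S = sqrt(sum(f*M^2)/n - (sum(f*M)/n)^2)
sd = sqrt(sum(f * m**2 for f, m in zip(freqs, marks)) / N - mean**2)

# Modal class = class with the highest frequency (assumed interior).
i = freqs.index(max(freqs))
L = marks[i] - c / 2           # lower class boundary, 38.5 for class 39-41
d1 = freqs[i] - freqs[i - 1]   # excess over the preceding class (assumed convention)
d2 = freqs[i] - freqs[i + 1]   # excess over the following class (assumed convention)
mode = L + d1 / (d1 + d2) * c

s1 = (mean - mode) / sd
print(round(mean, 2), round(mode, 2), round(sd, 2), round(s1, 3))
```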

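b) $S_2 = \frac{3(M - M_d)}{SD}$

Median: $M_d = L + \left[\frac{\frac{N}{2} - F_2}{f_2}\right]c$, with $\frac{N}{2} = \frac{198}{2} = 99$, $F_2 = 78$, L = 38.5, $f_2 = 38$, c = 3:

$$M_d = 38.5 + \left[\frac{99 - 78}{38}\right](3) = 38.5 + 1.66 = 40.16$$

With the mean $M = \frac{\sum fM}{N} = \frac{8016}{198} = 40.48$ and SD = 8.2:

$$S_2 = \frac{3(M - M_d)}{SD} = \frac{3(40.48 - 40.16)}{8.2} = \frac{0.96}{8.2} = 0.12$$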
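The median interpolation and $S_2$ can be sketched in code as follows. The helper `grouped_percentile` is a hypothetical name. It applies $P_p = L + \left[\frac{pN - F}{f}\right]c$ with p = 0.5, and it assumes equal-width classes whose lower boundary is the class mark minus half the class width.

```python
from math import sqrt

marks = [25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58]
freqs = [9, 11, 15, 19, 24, 38, 23, 19, 15, 10, 8, 7]
c = 3  # class width

def grouped_percentile(p, freqs, marks, c):
    """P_p = L + ((p*N - F) / f) * c, interpolating inside the class that holds p*N."""
    target = p * sum(freqs)
    F = 0  # cumulative frequency below the current class
    for f, m in zip(freqs, marks):
        if F + f >= target:
            L = m - c / 2  # lower class boundary
            return L + (target - F) / f * c
        F += f
    raise ValueError("p must be between 0 and 1")

N = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, marks)) / N
sd = sqrt(sum(f * m**2 for f, m in zip(freqs, marks)) / N - mean**2)

md = grouped_percentile(0.5, freqs, marks, c)  # grouped median
s2 = 3 * (mean - md) / sd
print(round(md, 2), round(s2, 2))
```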

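c) $S_{QC} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1}$

First quartile: $Q_1 = L + \left[\frac{\frac{N}{4} - F_1}{f_1}\right]c$, with $\frac{N}{4} = \frac{198}{4} = 49.5$, L = 32.5, $F_1 = 35$, $f_1 = 19$, c = 3:

$$Q_1 = 32.5 + \left[\frac{49.5 - 35}{19}\right](3) = 32.5 + 2.3 = 34.8$$

Second quartile: $Q_2 = L + \left[\frac{\frac{N}{2} - F_2}{f_2}\right]c$, with $\frac{N}{2} = \frac{198}{2} = 99$, L = 38.5, $F_2 = 78$, $f_2 = 38$, c = 3:

$$Q_2 = 38.5 + \left[\frac{99 - 78}{38}\right](3) = 40.16$$

Third quartile: $Q_3 = L + \left[\frac{\frac{3N}{4} - F_3}{f_3}\right]c$, with $\frac{3N}{4} = \frac{3(198)}{4} = 148.5$, L = 44.5, $F_3 = 139$, $f_3 = 19$, c = 3:

$$Q_3 = 44.5 + \left[\frac{148.5 - 139}{19}\right](3) = 44.5 + 1.5 = 46$$

$$S_{QC} = \frac{(46 - 40.16) - (40.16 - 34.8)}{46 - 34.8} = \frac{5.84 - 5.36}{11.2} = \frac{0.48}{11.2} = 0.043$$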
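The three quartiles and the quartile coefficient of skewness can be reproduced with the same interpolation rule. In this sketch the helper `grouped_percentile` is a hypothetical name, and equal-width classes are assumed.

```python
marks = [25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58]
freqs = [9, 11, 15, 19, 24, 38, 23, 19, 15, 10, 8, 7]
c = 3  # class width

def grouped_percentile(p, freqs, marks, c):
    """P_p = L + ((p*N - F) / f) * c, interpolating inside the class that holds p*N."""
    target = p * sum(freqs)
    F = 0  # cumulative frequency below the current class
    for f, m in zip(freqs, marks):
        if F + f >= target:
            L = m - c / 2  # lower class boundary
            return L + (target - F) / f * c
        F += f
    raise ValueError("p must be between 0 and 1")

q1 = grouped_percentile(0.25, freqs, marks, c)
q2 = grouped_percentile(0.50, freqs, marks, c)
q3 = grouped_percentile(0.75, freqs, marks, c)

sqc = ((q3 - q2) - (q2 - q1)) / (q3 - q1)
print(round(q1, 2), round(q2, 2), round(q3, 2), round(sqc, 3))
```

Because the code keeps unrounded quartiles, its result (about 0.042) differs slightly in the third decimal from the hand calculation, which rounds $Q_1$ and $Q_2$ before subtracting.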

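d) $S_{pc} = \frac{P_{90} - 2P_{50} + P_{10}}{P_{90} - P_{10}}$

$$P_p = L + \left[\frac{pN - F}{f_p}\right]c$$

For $P_{90}$: pN = 90%(198) = 178.2; L = 50.5; F = 173; f = 10; c = 3

$$P_{90} = 50.5 + \left[\frac{178.2 - 173}{10}\right](3) = 52.06$$

For $P_{50}$: pN = 50%(198) = 99; L = 38.5; F = 78; f = 38; c = 3

$$P_{50} = 38.5 + \left[\frac{99 - 78}{38}\right](3) = 40.16$$

For $P_{10}$: pN = 10%(198) = 19.8; L = 26.5; F = 9; f = 11; c = 3

$$P_{10} = 26.5 + \left[\frac{19.8 - 9}{11}\right](3) = 29.44$$

Substituting:

$$S_{pc} = \frac{52.06 - 2(40.16) + 29.44}{52.06 - 29.44} = \frac{1.18}{22.62} = 0.052$$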
e) $K = \frac{Q.D.}{P_{90} - P_{10}}$

$$Q.D. = \frac{Q_3 - Q_1}{2} = \frac{46 - 34.8}{2} = 5.6$$

$$K = \frac{5.6}{52.06 - 29.44} = \frac{5.6}{22.62} = 0.248$$

For a normal distribution K is equal to 0.263. Since K = 0.248 here, the distribution is almost normal in peakedness.
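The percentile-based coefficients $S_{pc}$ and $K$ can be checked together from the frequency table. This is a sketch under the same assumptions as the hand calculation (equal-width classes, linear interpolation within a class). The helper name `grouped_percentile` is hypothetical.

```python
marks = [25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58]
freqs = [9, 11, 15, 19, 24, 38, 23, 19, 15, 10, 8, 7]
c = 3  # class width

def grouped_percentile(p, freqs, marks, c):
    """P_p = L + ((p*N - F) / f) * c, interpolating inside the class that holds p*N."""
    target = p * sum(freqs)
    F = 0  # cumulative frequency below the current class
    for f, m in zip(freqs, marks):
        if F + f >= target:
            L = m - c / 2  # lower class boundary
            return L + (target - F) / f * c
        F += f
    raise ValueError("p must be between 0 and 1")

p10 = grouped_percentile(0.10, freqs, marks, c)
p50 = grouped_percentile(0.50, freqs, marks, c)
p90 = grouped_percentile(0.90, freqs, marks, c)
q1 = grouped_percentile(0.25, freqs, marks, c)
q3 = grouped_percentile(0.75, freqs, marks, c)

spc = (p90 - 2 * p50 + p10) / (p90 - p10)  # 10-90 percentile coefficient of skewness
qd = (q3 - q1) / 2                         # semi-interquartile range Q.D.
k = qd / (p90 - p10)                       # percentile coefficient of kurtosis

print(round(spc, 3), round(k, 3))
```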
