QTT Project 2 2023


NAME: Gyanendra Shahi CLASS: B.COM(MAIF)

COURSE CODE: QTT201 Academic Task No.: CA-2

COURSE TITLE: Business mathematics & Statistics

DATE OF ALLOTMENT: 18/11/23 DATE OF SUBMISSION: 30/11/23

ACADEMIC TASK TYPE: ASSIGNMENT


Assignment title: “Analysing and Comparing Measures of Dispersion”

Part 1: Theory and Concepts

“MEASURES OF DISPERSION” AND THEIR TYPES.

Measures of dispersion, also known as measures of variability or spread, are
statistical metrics that describe the extent to which a set of data points or
values deviate or spread out from the central tendency, usually represented by
the mean, median, or mode. In other words, they provide information about
the degree of variability or scatter within a data set.

Significance of Measures of Dispersion:


1. Quantifying Variability: Measures of dispersion help quantify how much
individual data points vary from the central tendency. A small dispersion
indicates that most values are close to the mean, while a large dispersion
suggests greater variability.
2. Risk Assessment: In finance and economics, measures of dispersion are
crucial for assessing risk. For example, a higher standard deviation in
investment returns indicates greater volatility and risk.
3. Comparing Data Sets: Dispersion measures allow for the comparison of
variability between different data sets. This is important in various fields,
such as scientific research, where researchers may compare the
variability of experimental results.
4. Data Interpretation: Understanding the spread of data is essential for
proper interpretation. For instance, if the average income of two
populations is the same, knowing the dispersion helps determine if the
income distribution is similar or significantly different.
5. Quality Control: In manufacturing and quality control, measures of
dispersion help assess the consistency and reliability of processes. A
lower variability indicates more consistent production.
Measures of dispersion provide valuable insights into the spread of data points
within a dataset, offering a more comprehensive understanding of the
distribution and variability of the data.

Common measures of dispersion include:


1. Range:
• Definition: The range is the simplest measure of dispersion and is
calculated as the difference between the maximum and minimum
values in a dataset.
• Explanation: It provides a quick and easy way to see the spread of
the data. However, it is sensitive to outliers and may not provide a
robust measure of variability.
2. Variance:
• Definition: Variance is a measure that calculates the average of
the squared differences from the mean.
• Explanation: It gives a comprehensive view of the data spread, but
because it involves squaring the differences, it is sensitive to
outliers. Variance is expressed in squared units of the original
data.
3. Standard Deviation:
• Definition: Standard deviation is the square root of the variance.
• Explanation: It is a widely used and more interpretable measure of
dispersion. Since it is in the same units as the original data, it
provides a clearer understanding of the spread. It is also sensitive
to outliers but less so than variance.
4. Interquartile Range (IQR):
• Definition: IQR is the range covered by the middle 50% of the
data, specifically the difference between the third quartile (Q3)
and the first quartile (Q1).
• Explanation: IQR is less sensitive to extreme values or outliers
than the range, variance, or standard deviation. It is particularly
useful when dealing with skewed distributions or data with
outliers.
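5. Skewness:
• Definition: Skewness measures the asymmetry of a dataset. Strictly, it
describes the shape of a distribution rather than its spread, but it is
reported here alongside the dispersion measures.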
• Explanation: A positive skewness indicates a longer right tail,
meaning the data is skewed to the right. Conversely, a negative
skewness indicates a longer left tail, meaning the data is skewed to
the left. Skewness helps identify the direction and degree of
asymmetry in the data distribution.
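
The five definitions above can be written as short functions. This is a minimal sketch, assuming the population (divide by N) formulas, a median-of-halves rule for the quartiles, and the moment coefficient of skewness; the function names and the small sample list are illustrative and are not part of the assignment.

```python
import math
from statistics import median


def value_range(data):
    """Difference between the largest and smallest value."""
    return max(data) - min(data)


def population_variance(data):
    """Average squared deviation from the mean (divides by N)."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)


def population_sd(data):
    """Square root of the population variance."""
    return math.sqrt(population_variance(data))


def iqr_median_of_halves(data):
    """Q3 - Q1, where Q1 and Q3 are the medians of the lower and upper halves."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    # For an odd count the middle value belongs to neither half.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return median(upper) - median(lower)


def moment_skewness(data):
    """Third central moment divided by the cube of the population SD."""
    mu = sum(data) / len(data)
    sd = population_sd(data)
    return sum((x - mu) ** 3 for x in data) / (len(data) * sd ** 3)


# Illustrative sample only (not one of the assignment datasets).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(value_range(sample), population_variance(sample), population_sd(sample))
print(iqr_median_of_halves(sample), moment_skewness(sample))
```

Other quartile conventions exist and can give slightly different IQR values for the same data, so the rule in use should always be stated.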

Choosing a Measure:
• Range: Choose when you need a quick and simple measure of the spread
and outliers are not a significant concern. However, it's not
recommended when dealing with datasets containing outliers.
• Variance and Standard Deviation: Choose when you want a more
comprehensive measure of variability and are willing to accept sensitivity
to outliers. Standard deviation is often preferred over variance because it
is in the same units as the original data.
• Interquartile Range (IQR): Choose when you want a measure that is less
sensitive to outliers and provides a better description of the central 50%
of the data. IQR is particularly useful when the dataset has extreme
values.
• Skewness: Choose when you want to understand the asymmetry in the
data distribution. Skewness is helpful for identifying whether the data is
symmetric or skewed and in which direction.
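
The sensitivity to outliers that drives these choices can be seen in a small experiment. The sketch below takes Dataset 2 from Part 2 and appends one extreme value; the value 120 is a hypothetical addition made only for illustration, and the quartiles use the "inclusive" interpolation rule of Python's statistics.quantiles, which is one of several conventions.

```python
from statistics import pstdev, quantiles


def spread_summary(data):
    """Return (range, population SD, IQR) for a list of numbers."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return max(data) - min(data), pstdev(data), q3 - q1


dataset_2 = [25, 25, 26, 27, 27, 28, 28, 30, 30, 30,
             31, 32, 32, 33, 34, 35, 36, 38, 40, 42]
with_outlier = dataset_2 + [120]  # 120 is a hypothetical extreme value

base = spread_summary(dataset_2)
shifted = spread_summary(with_outlier)
for name, before, after in zip(("range", "sd", "iqr"), base, shifted):
    print(f"{name}: {before:.2f} -> {after:.2f}")
```

Under these assumptions the range and standard deviation grow several times over, while the IQR moves by only half a unit, which is why the IQR is preferred when extreme values are present.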
Part 2: Data Analysis

Dataset 1: [15, 18, 20, 22, 23, 25, 28, 30, 32, 33, 35, 36, 38, 40, 45, 47, 50, 52,
55, 60]

Dataset 1:
1. Range:
• Range = Max − Min = 60 − 15 = 45.

2. Variance:
• Calculate Mean (μ):
Mean = (15+18+20+22+23+25+28+30+32+33+35+36+38+40+45+47+50+52+55+60) / 20
= 704/20 = 35.2

• Calculate Variance (population formula):
Variance = ∑(xi − μ)² / N
= [(15−35.2)² + (18−35.2)² + (20−35.2)² + … + (60−35.2)²] / 20
= 3231.2/20
= 161.56.

3. Standard Deviation:
• Standard Deviation = √Variance
= √161.56 ≈ 12.71.

4. Interquartile Range (IQR):
• Calculate Quartiles (median of each half of the sorted data):
• Q1 = (23+25)/2 = 24
• Q3 = (45+47)/2 = 46
• IQR = Q3 − Q1 = 46 − 24 = 22.

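5. Skewness:
• Calculate Skewness (third central moment divided by σ³):
Skewness = ∑(xi − μ)³ / (N × σ³), with μ = 35.2 and σ = √(3231.2/20) ≈ 12.71
= [(15−35.2)³ + (18−35.2)³ + (20−35.2)³ + … + (60−35.2)³] / (20 × σ³)
≈ 11,139.12/41,070.6
≈ 0.27.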
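
The Dataset 1 arithmetic can be checked step by step with a few lines of Python. This is a sketch that assumes the population formulas (division by N) and the moment coefficient of skewness; the variable names are chosen for illustration.

```python
import math

dataset_1 = [15, 18, 20, 22, 23, 25, 28, 30, 32, 33,
             35, 36, 38, 40, 45, 47, 50, 52, 55, 60]

n = len(dataset_1)
mean = sum(dataset_1) / n

# Sum of squared deviations, then population variance and SD.
sum_sq = sum((x - mean) ** 2 for x in dataset_1)
variance = sum_sq / n
sd = math.sqrt(variance)

# Sum of cubed deviations, then the moment coefficient of skewness.
sum_cu = sum((x - mean) ** 3 for x in dataset_1)
skewness = sum_cu / (n * sd ** 3)

print(f"mean={mean:.2f} sum_sq={sum_sq:.2f} variance={variance:.2f}")
print(f"sd={sd:.2f} sum_cu={sum_cu:.2f} skewness={skewness:.2f}")
```

Keeping the intermediate sums (sum_sq and sum_cu) as separate variables makes it easy to compare each line of a hand calculation against the program.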
Dataset 2: [25, 25, 26, 27, 27, 28, 28, 30, 30, 30, 31, 32, 32, 33, 34, 35, 36, 38,
40, 42]

1. Range:
• Range = Max − Min = 42 − 25 = 17.

2. Variance:
• Calculate Mean (μ):
Mean = (25+25+26+27+27+28+28+30+30+30+31+32+32+33+34+35+36+38+40+42) / 20
= 629/20
= 31.45.

• Calculate Variance (population formula):
Variance = ∑(xi − μ)² / N
= [(25−31.45)² + (25−31.45)² + (26−31.45)² + … + (42−31.45)²] / 20
= 452.95/20
= 22.6475 ≈ 22.65.

3. Standard Deviation:
• Standard Deviation = √Variance
= √22.6475
≈ 4.76.
4. Interquartile Range (IQR):
• Calculate Quartiles (median of each half of the sorted data):
• Q1 = (27+28)/2 = 27.5
• Q3 = (34+35)/2 = 34.5
• IQR = Q3 − Q1 = 34.5 − 27.5 = 7.

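5. Skewness:
• Calculate Skewness (third central moment divided by σ³):
Skewness = ∑(xi − μ)³ / (N × σ³), with μ = 31.45 and σ = √(452.95/20) ≈ 4.76
= [(25−31.45)³ + (25−31.45)³ + … + (42−31.45)³] / (20 × σ³)
≈ 1,273.70/2,155.56
≈ 0.59.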
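
The Dataset 2 figures can be cross-checked with Python's standard statistics module. The sketch below also compares three quartile conventions (median of halves, and the "exclusive" and "inclusive" rules of statistics.quantiles) as an illustration of how the IQR depends on the rule chosen; which rule the course expects is an assumption that should be confirmed.

```python
from statistics import mean, median, pstdev, pvariance, quantiles

dataset_2 = [25, 25, 26, 27, 27, 28, 28, 30, 30, 30,
             31, 32, 32, 33, 34, 35, 36, 38, 40, 42]

mu = mean(dataset_2)
variance = pvariance(dataset_2)   # population variance (divides by N)
sd = pstdev(dataset_2)            # population standard deviation

# Moment coefficient of skewness (not provided by the statistics module).
n = len(dataset_2)
skewness = sum((x - mu) ** 3 for x in dataset_2) / (n * sd ** 3)

# Quartiles under three conventions; the list is already sorted.
half = n // 2
iqr_halves = median(dataset_2[half:]) - median(dataset_2[:half])
q1_ex, _, q3_ex = quantiles(dataset_2, n=4, method="exclusive")
q1_in, _, q3_in = quantiles(dataset_2, n=4, method="inclusive")

print(f"mean={mu:.2f} variance={variance:.4f} sd={sd:.2f} skew={skewness:.2f}")
print(f"IQR: halves={iqr_halves} exclusive={q3_ex - q1_ex} inclusive={q3_in - q1_in}")
```

The three IQR values differ by up to one unit on this dataset, so a report should name the quartile rule it uses.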

Dataset 1:
1. Range:
• The range of 45 indicates a substantial spread between the
minimum and maximum values in Dataset 1.
2. Variance and Standard Deviation:
• The variance (161.56) and standard deviation (12.71) indicate a
relatively large degree of variability among the data points: a
typical value lies roughly 12.7 units away from the mean of 35.2.
3. Interquartile Range (IQR):
• The IQR of 22 means that the middle 50% of the data spans 22
units, from the first quartile to the third. It measures the spread
of the central portion of the dataset and is not affected by the
extreme values.
4. Skewness:
• The positive skewness (0.27) indicates a slight skew to the
right. This means that a few larger values are pulling
the distribution in that direction.
Dataset 2:
1. Range:
• The range of 17 indicates a smaller spread between the minimum
and maximum values in Dataset 2 compared to Dataset 1.
2. Variance and Standard Deviation:
• The lower variance (22.65) and standard deviation (4.76) indicate a
smaller degree of variability among the data points in Dataset 2
compared to Dataset 1.
3. Interquartile Range (IQR):
• The IQR of 7 indicates that the middle 50% of the data is
concentrated within a smaller range compared to Dataset 1.
4. Skewness:
• The positive skewness (0.59) indicates a moderate skew to the right.
As in Dataset 1, some larger values are pulling the distribution in
that direction, and the asymmetry is stronger here.
Interpretation:
• Dataset 1: This dataset has a larger range, higher variance, and standard
deviation, indicating a wider spread of values and greater variability. The
positive skewness suggests that there are some higher values pulling the
distribution to the right.
• Dataset 2: This dataset has a smaller range, lower variance, and standard
deviation, indicating less variability compared to Dataset 1. The positive
skewness also suggests a slight skewness to the right, indicating some
higher values.
These measures collectively provide insights into the variability and distribution
of values in each dataset. Dataset 1 exhibits higher variability and a wider
spread, while Dataset 2 has lower variability and a more concentrated
distribution. The skewness values indicate a mild to moderate rightward skew in
both datasets, reflecting a tail of higher values.

PART 3: Interpretation and Comparison.

Dataset 1:
• Range: 45
• Variance: 161.56
• Standard Deviation: 12.71
• Interquartile Range (IQR): 22
• Skewness: 0.27 (positive)
Dataset 2:
• Range: 17
• Variance: 22.65
• Standard Deviation: 4.76
• Interquartile Range (IQR): 7
• Skewness: 0.59 (positive)
Comparison:
1. Range:
• Dataset 1 has a much larger range (45) compared to Dataset 2
(17), indicating a wider spread of values.
2. Variance and Standard Deviation:
• Dataset 1 has a significantly higher variance (161.56) and standard
deviation (12.71) compared to Dataset 2 (Variance: 22.65, Standard
Deviation: 4.76). This suggests that the values in Dataset 1 are
more spread out from the mean compared to Dataset 2.
3. Interquartile Range (IQR):
• Dataset 1 has a larger IQR (22) compared to Dataset 2 (7),
indicating that the middle 50% of the data is more spread out in
Dataset 1.
4. Skewness:
• Both datasets exhibit positive skewness, so both are skewed to the
right. The skew is stronger in Dataset 2 (0.59) than in Dataset 1
(0.27), even though Dataset 2 is the less dispersed of the two.
• Variability: Dataset 1 exhibits more variability than Dataset 2. This
conclusion is supported by the larger range, higher variance, standard
deviation, and IQR in Dataset 1 compared to Dataset 2.
• Why?: The larger spread in Dataset 1, as evidenced by the higher range,
variance, and standard deviation, indicates that the values are more
dispersed from the mean. The larger IQR also suggests that the middle
50% of the data in Dataset 1 covers a wider range compared to Dataset
2. Overall, these measures collectively point to Dataset 1 having a higher
degree of variability.

The relationship between measures of central tendency (mean and median)
and measures of dispersion (range, variance, standard deviation, and
interquartile range) provides insights into the distribution and variability of the
data.

Dataset 1:
• Mean: = 35.2
• Median: = 34
Relationship:
1. Mean and Median: In Dataset 1, the mean is slightly higher than the
median. This suggests that the distribution is slightly right-skewed, which
aligns with the positive skewness value (0.27).
2. Dispersion Measures:
• The large range (45) and high standard deviation (12.71) indicate a
wide spread of values from the mean.
• The IQR (22) is also relatively large, indicating a significant spread
within the middle 50% of the data.
• The positive skewness further confirms that there are some higher
values pulling the distribution to the right.
Dataset 2:
• Mean: = 31.45
• Median: = 30.5
Relationship:
1. Mean and Median: In Dataset 2, the mean is slightly higher than the
median, indicating a rightward skew. This aligns with the
positive skewness value (0.59).
2. Dispersion Measures:
• The smaller range (17) and lower standard deviation (4.76)
indicate a narrower spread of values from the mean compared to
Dataset 1.
• The IQR (7) is also relatively small, indicating a concentrated
distribution within the middle 50% of the data.
• The positive skewness suggests that there are some higher values
pulling the distribution to the right. Relative to the spread of the
data, this effect is stronger than in Dataset 1.
Differences Between Datasets:
1. Variability: Dataset 1 has higher variability, as indicated by a larger range,
higher variance, and higher standard deviation compared to Dataset 2.
2. Central Tendency vs. Dispersion: In both datasets, the means are slightly
higher than the medians, suggesting a rightward skew. Measured by the
skewness coefficient, the effect is more pronounced in Dataset 2.
3. Spread within the Middle 50% (IQR): Dataset 1 has a larger IQR,
indicating a wider spread within the middle 50% of the data compared to
Dataset 2.
Dataset 1 exhibits higher variability and a wider spread within the middle 50%
of the data compared to Dataset 2, while Dataset 2 shows the more pronounced
rightward skew. The relationships between measures of central tendency and
dispersion provide a comprehensive understanding of the distributional
characteristics of each dataset.
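
The link between the mean, the median and the spread can be put into one number. The sketch below uses Pearson's second skewness coefficient, 3 × (mean − median) / σ, which is a supplementary measure not used in the assignment's own calculations; it is shown only because it combines central tendency and dispersion directly.

```python
from statistics import mean, median, pstdev

datasets = {
    "Dataset 1": [15, 18, 20, 22, 23, 25, 28, 30, 32, 33,
                  35, 36, 38, 40, 45, 47, 50, 52, 55, 60],
    "Dataset 2": [25, 25, 26, 27, 27, 28, 28, 30, 30, 30,
                  31, 32, 32, 33, 34, 35, 36, 38, 40, 42],
}


def pearson_second_skewness(data):
    """3 * (mean - median) / population SD; positive means right skew."""
    return 3 * (mean(data) - median(data)) / pstdev(data)


summary = {}
for name, data in datasets.items():
    summary[name] = (mean(data), median(data), pearson_second_skewness(data))
    m, med, coef = summary[name]
    print(f"{name}: mean={m:.2f} median={med:.2f} coefficient={coef:.2f}")
```

A positive coefficient means the mean lies above the median. Dividing by the standard deviation puts the two datasets on a common scale, so the gap in each dataset is judged against its own spread.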

PART 4: Real-world Application.

One real-world situation where understanding dispersion metrics is crucial is in
the field of financial investments, particularly when evaluating the risk and
return of investment portfolios.
Background:
Imagine you are a financial analyst tasked with assessing the performance of
two investment portfolios—Portfolio A and Portfolio B. Each portfolio contains
a mix of stocks, bonds, and other financial instruments. As an investor or
portfolio manager, it is essential to not only consider the average returns
(measures of central tendency) but also understand the dispersion of returns
(measures of dispersion) to make informed decisions.
How Dispersion Metrics are Utilized:
1. Risk Assessment:
• Variance and Standard Deviation: These metrics are used to
quantify the volatility or risk associated with each portfolio. A
higher standard deviation indicates greater variability in returns,
suggesting higher risk. Investors often prefer portfolios with lower
volatility, especially if they have a lower risk tolerance.
2. Diversification Strategy:
• Covariance and Correlation: These metrics help assess the degree
to which individual assets within a portfolio move together.
Diversification involves selecting assets with low or negative
correlations to spread risk. For instance, if two assets in a portfolio
have a negative correlation, one may perform well when the other
performs poorly, reducing overall portfolio risk.
3. Tail Risk Analysis:
• Skewness: Skewness provides insights into the shape of the return
distribution. A negative skewness may indicate the possibility of
large negative returns, which could be crucial for investors
concerned about the potential for extreme losses.
4. Portfolio Optimization:
• Efficient Frontier Analysis: Investors aim to construct portfolios
that provide the maximum return for a given level of risk or the
minimum risk for a given level of return. Dispersion metrics play a
vital role in identifying the optimal trade-off between risk and
return. The efficient frontier is a graphical representation that
helps investors select portfolios that maximize returns for a given
level of risk.
Decision-Making:
• Investors can use dispersion metrics to align their investment strategies
with their risk preferences. For example, an investor with a low risk
tolerance may opt for a portfolio with lower standard deviation, even if it
means accepting a lower average return.
• When comparing two portfolios, understanding the dispersion of returns
allows investors to make informed decisions based on both the potential
for higher returns and the associated level of risk.
In the context of investment portfolios, understanding dispersion metrics is
crucial for assessing risk, optimizing portfolios, and making informed
decisions that align with an investor's risk tolerance and return objectives.
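
The comparison of Portfolio A and Portfolio B can be sketched in code. The monthly returns below are hypothetical numbers invented purely for illustration (the assignment gives no return data), and the correlation helper is written out by hand using population formulas.

```python
from statistics import mean, pstdev

# Hypothetical monthly returns (as fractions), for illustration only.
returns_a = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
returns_b = [0.01, 0.01, 0.00, 0.02, 0.01, 0.01]


def correlation(xs, ys):
    """Population covariance divided by the product of population SDs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))


risk_a, risk_b = pstdev(returns_a), pstdev(returns_b)
rho = correlation(returns_a, returns_b)

print(f"Portfolio A: mean={mean(returns_a):.4f} sd={risk_a:.4f}")
print(f"Portfolio B: mean={mean(returns_b):.4f} sd={risk_b:.4f}")
print(f"correlation(A, B)={rho:.2f}")
```

With these made-up figures Portfolio A has the higher average return and the higher standard deviation, which is the risk and return trade-off described above. The negative correlation illustrates the kind of pairing that supports diversification.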

PART 5: Conclusion.

Measures of dispersion are essential in statistical analysis for understanding the
spread or variability within a dataset. The key findings and significance of
measures of dispersion include:
1. Types of Measures:
• Range: Simplest measure, representing the difference between
the maximum and minimum values.
• Variance and Standard Deviation: Provide a comprehensive view
of the spread by considering the squared differences from the
mean.
• Interquartile Range (IQR): Measures the spread of the middle 50%
of the data, less sensitive to extreme values.
• Skewness: Indicates the asymmetry or lack of symmetry in the
data distribution.
2. Significance of Measures of Dispersion:
• Quantifying Variability: Measures of dispersion quantify the extent
to which data points deviate from the central tendency, providing
a comprehensive understanding of data spread.
• Risk Assessment: Crucial in fields like finance for evaluating
volatility and risk associated with investments.
• Comparing Data Sets: Facilitates comparison of variability between
different datasets, aiding in data interpretation and decision-
making.
• Quality Control: Important in manufacturing to assess the
consistency and reliability of processes.
• Data Interpretation: Helps in understanding distribution shape,
identifying outliers, and making informed decisions based on both
central tendency and data spread.
3. Example in Financial Investments:
• Understanding dispersion metrics is crucial in assessing the risk
and return of investment portfolios, with measures like standard
deviation guiding decisions aligned with risk tolerance.
4. Decision-Making in Statistics:
• Dispersion metrics guide decision-making by providing insights
into risk, diversification, and optimization in various fields.
• The choice of a specific measure depends on data characteristics
and analysis goals.
5. Summary:
Measures of dispersion, including range, variance, standard deviation, IQR, and
skewness, provide valuable information about the variability in a dataset.
They are essential for assessing risk, optimizing portfolios, and making
decisions that balance risk and return in statistical analysis.
The choice of a specific measure depends on the characteristics of the data and
the goals of the analysis.
In conclusion, measures of dispersion are fundamental tools in statistical
analysis, offering a more complete perspective on the distribution and
variability of data. Their significance extends across diverse fields, providing the
foundation for informed decision-making that considers both central tendency
and the spread of data points.
