Probability and Statistics Notes

STATISTICS
Statistics Definition:
Statistics is the branch of mathematics for collecting, analysing and interpreting data.
Statistics is a branch that deals with every aspect of the data. Statistical knowledge helps to
choose the proper method of collecting the data and employ those samples in the correct analysis
process in order to effectively produce the results. In short, statistics is a crucial process which
helps to make the decision based on the data.
Characteristics of Statistics
The important characteristics of Statistics are as follows:
 Statistics are numerically expressed.

 It has an aggregate of facts
 Data are collected in systematic order
 It should be comparable to each other
 Data are collected for a planned purpose
Importance of Statistics
The important functions of statistics are:
 Statistics helps in gathering information about the appropriate quantitative data

 It depicts the complex data in graphical form, tabular form and in diagrammatic
representation to understand it easily
 It provides the exact description and a better understanding
 It helps in designing the effective and proper planning of the statistical inquiry in any
field
 It gives valid inferences with the reliability measures about the population parameters
from the sample data
 It helps to understand the variability pattern through the quantitative observations
Statistics Example
An example of statistical analysis is when we have to determine the number of people in a town
who watch TV out of the total population in the town. Rather than asking everyone, we survey a
small group of people. This small group is called the sample, which is taken from the population.
Types of Statistics
The two main branches of statistics are:
 Descriptive Statistics
 Inferential Statistics
1. Descriptive Statistics – Through graphs or tables, or numerical calculations,
descriptive statistics uses the data to provide descriptions of the population.
2. Inferential Statistics – Based on the data sample taken from the population,
inferential statistics makes the predictions and inferences.
Descriptive statistics
Descriptive statistics summarize and organize the characteristics of a data set. A data set is a collection of responses or
observations from a sample or entire population.
In quantitative research, after collecting data, the first step of statistical analysis is to describe
characteristics of the responses, such as the average of one variable (e.g., age), or the relation
between two variables (e.g., age and creativity).
Types of descriptive statistics

There are 3 main types of descriptive statistics:
1. Frequency distribution
A data set is made up of a distribution of values, or scores. In tables or graphs, you can
summarize the frequency of every possible value of a variable in numbers or percentages. This is
called a frequency distribution.
For the variable of gender, you list all possible answers on the left hand column. You count the
number or percentage of responses for each answer and display it on the right hand column.
Gender    Number
Male      182
Female    235
Other     27
From this table, you can see that more women than men or people with another gender identity
took part in the study.
2. Measures of central tendency

Measures of central tendency estimate the center, or average, of a data set. The mean, median
and mode are 3 ways of finding the average.
Here we will demonstrate how to calculate the mean, median, and mode using the first 6
responses of our survey.
 Mean
 Median
 Mode
3. Measures of variability
Measures of variability give you a sense of how spread out the response values are. The range,
standard deviation and variance each reflect different aspects of spread.
 Range
The range gives you an idea of how far apart the most extreme response scores are. To find the
range, simply subtract the lowest value from the highest value.
Range of visits to the library in the past year
Ordered data set: 0, 3, 3, 12, 15, 24
Range: 24 – 0 = 24
 Standard deviation
The standard deviation (s or SD) is the average amount of variability in your dataset. It tells you,
on average, how far each score lies from the mean. The larger the standard deviation, the more
variable the data set is.
There are six steps for finding the standard deviation:
1. List each score and find their mean.

2. Subtract the mean from each score to get the deviation from the mean.
3. Square each of these deviations.
4. Add up all of the squared deviations.
5. Divide the sum of the squared deviations by N – 1.
6. Find the square root of the number you found.
Standard deviations of visits to the library in the past year
In the table below, you complete Steps 1 through 4.
Raw data    Deviation from mean    Squared deviation
15          15 – 9.5 = 5.5         30.25
3           3 – 9.5 = -6.5         42.25
12          12 – 9.5 = 2.5         6.25
0           0 – 9.5 = -9.5         90.25
24          24 – 9.5 = 14.5        210.25
3           3 – 9.5 = -6.5         42.25
M = 9.5     Sum = 0                Sum of squares = 421.5
Step 5: 421.5/5 = 84.3
Step 6: √84.3 = 9.18
From learning that s = 9.18, you can say that on average, each score deviates from the mean by
9.18 points.
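The six steps can also be followed in code. The sketch below is only an illustration using the library-visit scores from the example. The variable names are chosen here for readability and are not part of the notes.

```python
import math

# Library-visit scores from the worked example.
scores = [15, 3, 12, 0, 24, 3]

# Step 1: find the mean of the scores.
mean = sum(scores) / len(scores)
# Step 2: subtract the mean from each score to get the deviations.
deviations = [x - mean for x in scores]
# Step 3: square each deviation.
squared = [d ** 2 for d in deviations]
# Step 4: add up the squared deviations.
sum_of_squares = sum(squared)
# Step 5: divide by N - 1 (this gives the sample variance).
variance = sum_of_squares / (len(scores) - 1)
# Step 6: take the square root to get the standard deviation.
std_dev = math.sqrt(variance)

print(mean, sum_of_squares, variance, round(std_dev, 2))
```

Dividing by N – 1 rather than N in Step 5 treats the six scores as a sample rather than a whole population.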
Variance
The variance is the average of squared deviations from the mean. Variance reflects the degree of
spread in the data set. The more spread the data, the larger the variance is in relation to the mean.
To find the variance, simply square the standard deviation. The symbol for variance is s².
Variance of visits to the library in the past year
Data set: 15, 3, 12, 0, 24, 3
s = 9.18
s² = 84.3
 Univariate descriptive statistics
Univariate descriptive statistics focus on only one variable at a time. It's important to examine
data from each variable separately using multiple measures of distribution, central tendency and
spread. Programs like SPSS and Excel can be used to easily calculate these.
Visits to the library
N 6
Mean 9.5
Median 7.5
Mode 3
Standard deviation 9.18
Variance 84.3
Range 24
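Besides SPSS and Excel, the same summary can be reproduced with Python's standard `statistics` module. This is a minimal sketch and not the method used to build the table. The dictionary name `summary` is an assumption made for the example.

```python
import statistics

# Visits to the library in the past year (same six responses as above).
visits = [15, 3, 12, 0, 24, 3]

summary = {
    "N": len(visits),
    "Mean": statistics.mean(visits),
    "Median": statistics.median(visits),    # mean of the two middle values
    "Mode": statistics.mode(visits),
    "Standard deviation": statistics.stdev(visits),  # sample SD (N - 1)
    "Variance": statistics.variance(visits),         # sample variance (N - 1)
    "Range": max(visits) - min(visits),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```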
If you consider only the mean as a measure of central tendency, your impression of the
"middle" of the data set can be skewed by outliers. The median and mode are less affected by them.
 Bivariate descriptive statistics
If you've collected data on more than one variable, you can use bivariate or multivariate
descriptive statistics to explore whether there are relationships between them.
In bivariate analysis, you simultaneously study the frequency and variability of two variables to
see if they vary together. You can also compare the central tendency of the two variables before
performing further statistical tests.
 Multivariate analysis is the same as bivariate analysis but with more than two variables.
Q1 What is Statistics?
Statistics is the branch of mathematics for collecting, analysing and interpreting data. Statistics
can be used to predict the future, determine the probability that a specific event will happen, or
help answer questions about a survey. Statistics is used in many different fields such as business,
medicine, biology, psychology and social sciences.
Q2 What are the types of statistics?

There are two types of statistics. One type is called descriptive statistics, which focuses on
summarising data. Another type is called inferential statistics, which focuses on making
conclusions about populations based on samples.
Q3 Why are statistics vital?

Statistics is an important field because it helps us understand the general trends and patterns in a
given data set. Statistics can be used for analysing data and drawing conclusions from it. It can
also be used for making predictions about future events and behaviours. Statistics also help us
understand how things are changing over time.
Q4 What are the uses of statistics in real life?

Statistics is an integral part of our lives. It is used in the workplace and everyday life. In the
workplace, statistics are often used to analyse what works best for a company's marketing
strategy or how to distribute work among employees. In daily life, statistics can be used to
analyse what food you should buy at the grocery store or how much money you spend on
purchasing each week. Statistics are everywhere, and they help us make sense of the world
around us.
Q5 What are the most important concepts covered in statistics?

The most important concepts covered in Statistics include mean, median, mode, range, and
standard deviation.
Graphical Representation
A graphical representation is a visual display of data and statistical results. It is a way of
analysing numerical data. It exhibits the relation between data, ideas, information and concepts
in a diagram. It is easy to understand and it is one of the most important learning strategies. It
always depends on the type of information in a particular domain.
Types of Graphical Representation
There are different types of graphical representation. Some of them are as follows:
1. Bar Graph
2. Pie Chart
3. Line Graph
4. Pictograph
5. Histogram
6. Frequency Distribution
7. Stem and Leaf Plot
8. Scatter Plot
Stem and Leaf Plot:

A stem and leaf plot shows a large amount of data in a clear way by listing it in order of place
value. A stem and leaf plot is generally used when the data has multi-digit numbers.
Stem – largest place value(s) of a number,
Leaf – smallest place value of a number
A Stem and Leaf Plot is a special table where each data value is split into a "stem" (the first digit
or digits) and a "leaf" (usually the last digit). Like in this example:
Example:
"32" is split into "3" (stem) and "2" (leaf).

Uses of Graphical Representation of Data
1. It helps us to relate and compare data for various time periods.
2. It saves time as it covers most of the information in facts and figures.
3. It is used in statistical analysis to determine the mean, median and mode for different data
sets.
4. It is easy for us to understand the graphical data as the information portrayed are in facts
and figures.
5. Data displayed by graphical representation can be memorised for the long term.
6. The use of graphs in our daily life also helps in making and analysing.
Advantages and Disadvantages of Graphical Representation
Advantages of Graphical Representation:
Attractive and Impressive: Graphs are always more attractive and impressive than tables or
figures.
Simple and understandable presentation of data: Graphs help to present complex data in a simple
and understandable way. It saves time and energy for both the statistician and the observer.
Useful in comparison: Graphs provide an easy comparison of two or more phenomena.
Location of positional averages: Graphs provide a method of locating certain positional
averages like median, mode, quartiles, etc.
Universal utility: Graphs can be used in all fields such as trade, economics, government
departments, advertisements, etc.
Helpful in predictions: Through graphs, tendencies that could occur in the near future can be
predicted in a better way.
Disadvantages of Graphical Representation:

Costly: Graphical representation of data is costly because it includes images, colours, and paints.
A combination of human efforts and materials makes the graphical representation of data costly.
Lack of Secrecy: Graphical representation presents the information in full, which can defeat
the purpose when some of it is meant to be kept confidential.
Errors and Mistakes: There are more chances of errors in the graphical representation of data
because it is complex. Such errors make the data harder to understand correctly.
More Time: Graphical representation of data takes more time in comparison to normal reports.
What is a graphical representation?
What are the different types of graphical representations?
List two advantages to the graphical representation of data?
What are the different ways to represent the data?
What is the purpose of the graphical representation?
Box-Cox plot
The Box-Cox linearity plot is a plot of the correlation between Y and the transformed X for
given values of λ. That is, λ is the coordinate for the horizontal axis variable and
the value of the correlation between Y and the transformed X is the coordinate for the vertical
axis of the plot.
Measures of Central Tendency & Dispersion

Measures that indicate the approximate center of a distribution are called measures of central
tendency.
Measures that describe the spread of the data are called measures of dispersion. Together, these
measures include the mean, median and mode (central tendency) and the range, upper and lower
quartiles, variance, and standard deviation (dispersion).
A. Finding the Mean
The mean of a set of data is the sum of all values in a data set divided by the number of values in
the set.
It is also often referred to as the arithmetic average. The Greek letter μ ("mu") is used as the symbol
for the population mean, and the symbol x̄ is used to represent the mean of a sample. To determine the
mean of a data set:
1. Add together all of the data values.

2. Divide the sum from Step 1 by the number of data values in the set.
B. Finding the Median
The median of a set of data is the "middle element" when the data is arranged in ascending order.
To determine the median:
Example:
Consider the data set: 17, 10, 9, 14, 13, 17, 12, 20, 14
Step 1: Put the data in order from smallest to largest. 9, 10, 12, 13, 14, 14, 17, 17, 20
Step 2: Determine the absolute middle of the data. 9, 10, 12, 13, 14, 14, 17, 17, 20
Since the number of data points is odd, choose the one in the very middle. (When the number of data
points is even, the median is the mean of the two middle values.)
The median of this data set is 14
C. Finding the Mode

The mode is the most frequently occurring measurement in a data set. There may be one mode;
multiple modes, if more than one number occurs most frequently; or no mode at all, if every
number occurs only once.
Example:
Step 1: Put the data in order from smallest to largest. 9, 10, 12, 13, 14, 14, 17, 17, 20
Step 2: Look for any number that occurs more than once. 9, 10, 12, 13, 14, 14, 17, 17, 20
Step 3: Determine which of those occur most frequently. 14 and 17 both occur twice.
The modes of this data set are 14 and 17.
E. Finding the Range

The range is the difference between the lowest and highest values in a data set.
Example:
Step 1: Identify your maximum. 9, 10, 12, 13, 14, 14, 17, 17, 20
Step 2: Identify your minimum. 9, 10, 12, 13, 14, 14, 17, 17, 20
Step 3: Subtract the minimum from the maximum. 20 – 9 = 11
The range of this data set is 11.
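The median, mode and range examples above can be checked with a short script. This is a sketch using Python's standard `statistics` module. `multimode` is used because this data set has more than one mode.

```python
import statistics

# Data set used in the median, mode and range examples.
data = [17, 10, 9, 14, 13, 17, 12, 20, 14]

ordered = sorted(data)                       # smallest to largest
median = statistics.median(data)             # middle element of the ordered data
modes = sorted(statistics.multimode(data))   # every value tied for most frequent
data_range = max(data) - min(data)           # maximum minus minimum

print(ordered)
print("Median:", median)
print("Modes:", modes)
print("Range:", data_range)
```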
F. Finding the Variance and Standard Deviation

The variance and standard deviation are measures based on the distance of each data value from
the mean.
1. Find the mean of the data: μ if calculating for a population, or x̄ if using a sample.
2. Subtract the mean (μ or x̄) from each data value xi.
3. Square each calculation from Step 2.
4. Add the values of the squares from Step 3.
5. Find the number of data points in your set, called n.
6. Divide the sum from Step 4 by the number n (if calculating for a population) or n – 1(if using
a sample). This will give you the variance.
7. To find the standard deviation, square root this number.
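The difference between dividing by n and by n – 1 in Step 6 can be seen with a small sketch. It reuses the data set from the median example, which is a choice made here for illustration, and relies on Python's standard `statistics` module.

```python
import statistics

# Data set reused from the median example (an illustrative choice).
data = [17, 10, 9, 14, 13, 17, 12, 20, 14]

# Population formulas divide the sum of squares by n.
population_variance = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

# Sample formulas divide the sum of squares by n - 1.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print("Population:", population_variance, population_sd)
print("Sample:", sample_variance, sample_sd)
```

For this data set the mean is 14 and the sum of squared deviations is 100, so the population variance is 100/9 and the sample variance is 100/8.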

Probability
Probability defines the likelihood of occurrence of an event. There are many real-life
situations in which we may have to predict the outcome of an event. We may be sure or not
sure of the results of an event. In such cases, we say that there is a probability of this event to
occur or not occur. Probability generally has great applications in games, in business to make
predictions, and also it has extensive applications in this new area of artificial intelligence.
Definition
Probability can be defined as the ratio of the number of favorable outcomes to the total
number of outcomes of an event. For an experiment having 'n' number of outcomes, the
number of favorable outcomes can be denoted by x.
Formula for Probability

The probability formula states that the probability of an event is equal to the
ratio of the number of favourable outcomes to the total number of outcomes:
P(A) = n(A)/n(S)
where,
P(A) is the probability of an event 'A'.
n(A) is the number of favorable outcomes of the event 'A'.
n(S) is the total number of outcomes in the sample space.
Probability Tree
A tree diagram helps to organize and visualize the different possible outcomes. A tree has
two main parts: branches and ends. The probability of each branch is written on the
branch, whereas the ends contain the final outcomes. Tree diagrams are used to figure
out when to multiply probabilities (along a path) and when to add them (across paths).
You can see below a tree diagram for the coin:
Types of Probability
There are three major types of probabilities:
1. Theoretical Probability
2. Experimental Probability
3. Axiomatic Probability
1. Theoretical Probability
It is based on the possible chances of something to happen. The theoretical probability is
mainly based on the reasoning behind probability. For example, if a coin is tossed, the
theoretical probability of getting a head will be ½.
2. Experimental Probability
It is based on the observations of an experiment. The experimental probability
is calculated as the number of times an event occurs divided by the total number of trials. For
example, if a coin is tossed 10 times and heads is recorded 6 times, then the experimental
probability of heads is 6/10, or 3/5.
3. Axiomatic Probability
In axiomatic probability, a set of rules or axioms is laid down that applies to all types of
probability. These axioms were set out by Kolmogorov and are known as Kolmogorov's three axioms.
With the axiomatic approach to probability, the chances of occurrence or non-occurrence of events
can be quantified.
Probability Terms and Definition or
Terminology of Probability Theory

Some of the important probability terms are discussed here:
 Sample Space: All the possible outcomes of an experiment together constitute a sample space. For
example, the sample space of tossing a coin is {head, tail}.
 Event: A set of one or more outcomes of a random experiment, that is, a subset of the sample
space, is called an event.
 Equally Likely Events: Events that have the same chances or probability of occurring
are called equally likely events.
For example, when we toss a coin, there are equal chances of getting a head or a tail.
 Exhaustive Events: When a set of events together covers the entire sample
space, the events are called exhaustive events.
 Mutually Exclusive Events: Events that cannot happen simultaneously are called
mutually exclusive events. For example, the weather can be either hot or cold. We
cannot experience both at the same time.
Events in Probability
In probability theory, an event is a set of outcomes of an experiment or a subset of the sample
space. If P(E) represents the probability of an event E, then, we have,
P(E) = 0 if and only if E is an impossible event.
P(E) = 1 if and only if E is a certain event.
0 ≤ P(E) ≤ 1.
Suppose, we are given two events, "A" and "B", then the probability of event A, P(A) > P(B)
if and only if event "A" is more likely to occur than the event "B". Sample space(S) is the set
of all of the possible outcomes of an experiment and n(S) represents the number of outcomes
in the sample space.
P(E) = n(E)/n(S)
P(E') = (n(S) - n(E))/n(S) = 1 - (n(E)/n(S))
Calculating Probability
In an experiment, the probability of an event is the possibility of that event occurring. The
probability of any event is a value between (and including) "0" and "1". Follow the steps
below for calculating probability of an event A:
Step 1: Find the sample space of the experiment and count the elements. Denote it by n(S).
Step 2: Find the number of favorable outcomes and denote it by n(A).
Step 3: To find probability, divide n(A) by n(S). i.e., P(A) = n(A)/n(S).
 Here are some examples that well describe the process of finding probability.
Example 1: Find the probability of getting a number less than 5 when a dice is rolled by
using the probability formula.
Solution
To find:
Probability of getting a number less than 5
Given: Sample space, S = {1,2,3,4,5,6}
Therefore, n(S) = 6
Let A be the event of getting a number less than 5. Then A = {1,2,3,4}
So, n(A) = 4
Using the probability equation,
P(A) = n(A)/n(S)
P(A) = 4/6
P(A) = 2/3
Answer: The probability of getting a number less than 5 is 2/3.
Example 2: What is the probability of getting a sum of 9 when two dice are thrown?
Solution:
There is a total of 36 possibilities when we throw two dice.
To get the desired outcome i.e., 9, we can have the following favorable outcomes.
(4,5), (5,4), (6,3), (3,6). There are 4 favorable outcomes.
Probability of an event P(E) = (Number of favorable outcomes) ÷ (Total outcomes in a
sample space)
Probability of getting a sum of 9 = 4 ÷ 36 = 1/9
Answer: Therefore the probability of getting a sum of 9 is 1/9.
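The three steps for calculating probability (count the sample space, count the favorable outcomes, divide) can be sketched in code for the two-dice example. The variable names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Step 1: the sample space of two dice is every ordered pair (first, second).
sample_space = list(product(range(1, 7), repeat=2))

# Step 2: the favorable outcomes are the pairs whose sum is 9.
favorable = [roll for roll in sample_space if sum(roll) == 9]

# Step 3: P(A) = n(A) / n(S), kept as an exact fraction.
probability = Fraction(len(favorable), len(sample_space))

print(len(sample_space), favorable, probability)
```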
Probability Theorems
The following theorems of probability are helpful to understand the applications of
probability and also perform the numerous calculations involving probability.
Theorem 1: The sum of the probability of happening of an event and not happening of an
event is equal to 1. P(A) + P(A') = 1.
Theorem 2: The probability of an impossible event is always equal to 0. P(ϕ) = 0.
Theorem 3: The probability of a sure event is always equal to 1. P(S) = 1
Theorem 4: The probability of happening of any event always lies between 0 and 1, inclusive.
0 ≤ P(A) ≤ 1
Theorem 5: If there are two events A and B, we can apply the formula of
the union of two sets and we can derive the formula for the probability of happening of event
A or event B as follows.
P(A∪B) = P(A) + P(B) - P(A∩B)
Also for two mutually exclusive events A and B, we have P(A∪B) = P(A) + P(B)
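Theorem 5 can be checked on a small example. The sketch below assumes a single roll of a fair die. The events A ("even number") and B ("number greater than 3") are chosen here only for illustration.

```python
from fractions import Fraction

# Sample space of one fair die (an assumed example).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event: the number is even
B = {4, 5, 6}   # event: the number is greater than 3


def prob(event):
    """P(E) = n(E) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(S))


# Left side: probability of the union, counted directly.
p_union = prob(A | B)
# Right side: the addition rule, subtracting the overlap counted twice.
p_rule = prob(A) + prob(B) - prob(A & B)

print(p_union, p_rule)
```

The outcomes 4 and 6 belong to both events, which is why P(A∩B) has to be subtracted once.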
Bayes' Theorem on Conditional Probability

Bayes' theorem describes the probability of an event given that another event has
occurred. This kind of probability is called conditional probability. The theorem helps in
calculating the probability of one event from the known conditional probability of the other:
P(A|B) = P(B|A) P(A) / P(B), where P(B) > 0.
For example,
let us assume that there are three bags with each bag containing some blue, green, and
yellow balls. What is the probability of picking a yellow ball from the third bag? Since the
answer depends on which bag is chosen as well as on the colours it contains, the probability
is calculated under that condition. Such a probability is called conditional probability.
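A hedged numerical sketch of the three-bag example follows. The notes do not give the contents of the bags, so the ball counts below are hypothetical, and each bag is assumed equally likely to be chosen. The code applies the standard form of Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B), to find the probability that a yellow ball came from the third bag.

```python
from fractions import Fraction

# Hypothetical bag contents (not from the notes), for illustration only.
bags = {
    "bag 1": {"blue": 3, "green": 2, "yellow": 1},
    "bag 2": {"blue": 1, "green": 1, "yellow": 2},
    "bag 3": {"blue": 2, "green": 1, "yellow": 3},
}

# Assumption: each bag is equally likely to be picked.
prior = {name: Fraction(1, len(bags)) for name in bags}

# Conditional probability P(yellow | bag) for each bag.
p_yellow_given_bag = {
    name: Fraction(balls["yellow"], sum(balls.values()))
    for name, balls in bags.items()
}

# Overall probability of drawing a yellow ball, summed over the bags.
p_yellow = sum(prior[name] * p_yellow_given_bag[name] for name in bags)

# Bayes' theorem: P(bag 3 | yellow).
posterior_bag3 = prior["bag 3"] * p_yellow_given_bag["bag 3"] / p_yellow

print(p_yellow_given_bag["bag 3"], p_yellow, posterior_bag3)
```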
Law of Total Probability

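If A1, A2, …, An are mutually exclusive and exhaustive events of an experiment, then the sum of
the probabilities of those n events is always equal to 1.
P(A1) + P(A2) + P(A3) + … + P(An) = 1
For any other event B, the probability of B is then the sum of its joint probabilities with each Ai:
P(B) = P(B∩A1) + P(B∩A2) + … + P(B∩An)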
Important Notes on Probability:

 Probability is a measure of how likely an event is to happen.
 Probability is represented as a fraction and always lies between 0 and 1.
 An event can be defined as a subset of sample space.
 The sample space of tossing a coin is {head, tail} and the sample space of throwing a dice is
{1, 2, 3, 4, 5, 6}.
 In a random experiment, the exact outcome cannot be predicted in advance; only the possible
outcomes are known.
Theoretical Distribution
A random experiment is assumed as a model for a theoretical distribution, and the probabilities
are given by a function of the random variable called the probability function.
For example,
If we toss a fair coin, the probability of getting a head is 1/2. If we toss it 50 times, the
expected number of heads is 25. We call this the theoretical or expected frequency of
heads. But actually, by tossing a coin, we may get 25, 30 or 35 heads, which we call the
observed frequency.
Types of Theoretical Distribution

1. Binomial Distribution
2. Poisson distribution
3. Normal distribution or Expected Frequency distribution
Binomial Definition
The Latin prefix "bi-" means "two", the root "nom" means name, and the suffix "-ial" means
"of or relating to". The literal translation of the word binomial is "of or relating to two
names."
The algebraic expression which contains only two terms is called binomial.
It is a two-term polynomial. Also, it is called a sum or difference of two
monomials. It is the simplest kind of polynomial after the monomial.
 When expressed in a single indeterminate, a binomial can be written as;
ax^m + bx^n
Where a and b are numbers, and m and n are distinct non-negative integers. x is an
indeterminate or a variable.
 In Laurent polynomials, binomials are expressed in the same manner, but the only
difference is that the exponents m and n can be negative. Therefore, we can still write it as;
ax^m + bx^n, where m and n may be negative integers.
Examples of Binomial
Some of the binomial examples are;
4x² + 5y²
xy² + xy
0.75x + 10y²
x + y
x² + 3
Binomial distribution
The binomial distribution is the discrete probability distribution of the number of successes in a
fixed number of independent trials, where each trial gives only two possible results, either
Success or Failure.
For example,
A coin toss has only two possible outcomes: heads or tails and taking a test could have two
possible outcomes: pass or fail.
Properties of Binomial Distribution
The properties of the binomial distribution are:
 There are two possible outcomes: true or false, success or failure, yes or no.
 There is a fixed number 'n' of independent, repeated trials.
 The probability of success or failure remains the same for each trial.
 Only the number of successes is counted out of the n independent trials.
 Every trial is an independent trial, which means the outcome of one trial does not
affect the outcome of another trial.
Binomial Distribution Mean and Variance
For a binomial distribution, the mean, variance and standard deviation for the given number
of success are represented using the formulas
Mean, μ = np
Variance, σ² = npq
Standard Deviation, σ = √(npq)
Where p is the probability of success
q is the probability of failure, where q = 1-p
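The formulas μ = np and σ² = npq can be checked against the binomial probabilities themselves. The sketch below uses the standard binomial probability formula P(X = k) = C(n, k) p^k q^(n−k), which is not written out in these notes. The values n = 10 and p = 0.5 are assumptions chosen for illustration.

```python
from math import comb, sqrt

# Assumed example: 10 tosses of a fair coin.
n, p = 10, 0.5
q = 1 - p


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Formulas from the notes.
mean = n * p
variance = n * p * q
std_dev = sqrt(variance)

# The same quantities computed directly from the probabilities.
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
variance_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, variance, std_dev)
print(mean_from_pmf, variance_from_pmf)
```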
Negative Binomial Distribution

A negative binomial random variable is the number X of repeated trials to produce r
successes in a negative binomial experiment. The probability distribution of a negative
binomial random variable is called a negative binomial distribution. The negative binomial
distribution is also known as the Pascal distribution.
Examples of Negative Binomial Distribution
The following quick examples help in a better understanding of the concept of the negative
binomial distribution.
Suppose we flip a coin repeatedly and count the number of heads (successes). If we continue
flipping the coin until it has landed 2 times on heads, we are conducting a negative binomial
experiment. The negative binomial random variable is the number of coin flips required to
achieve 2 heads. In this example, the number of coin flips is a random variable that can take
on any integer value between 2 and plus infinity. The negative binomial probability
distribution for this example is presented below.
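As a hedged sketch, the probabilities for this example can be computed from the standard negative binomial formula P(X = n) = C(n − 1, r − 1) p^r q^(n − r), where n is the number of flips needed to reach r heads. A fair coin (p = 1/2) is assumed here, and only the first few values of n are listed because the distribution has no upper limit.

```python
from fractions import Fraction
from math import comb

r = 2                 # number of heads (successes) required
p = Fraction(1, 2)    # assumption: a fair coin


def neg_binomial_pmf(n, r, p):
    """Probability that the r-th success occurs on trial n."""
    return comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)


# Number of flips -> probability, for the first few possible values.
table = {n: neg_binomial_pmf(n, r, p) for n in range(r, r + 6)}

for flips, probability in table.items():
    print(flips, probability)
```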
Properties of Negative Binomial Distribution
A negative binomial distribution is a distribution that has the following properties.
 The negative binomial distribution has a total of n number of trials.

 Each trial has two outcomes, and one of them is referred to as success and the other as a
failure.
 The probability of success or failure is the same across each of these trials.
 The probability of success is denoted by p, and the probability of failure is defined as q,
and each of these is the same in every trial.
 The sum of the probability of success and failure is equal to 1. p + q = 1.
 Each of these trials is independent. The outcome of one trial does not affect the outcome
of other trials.
 The experiment is continued until r successes are obtained, and r is defined in advance.
Geometric Distribution Definition

A geometric distribution is defined as a discrete probability distribution of a random variable
"x" which satisfies certain conditions. The geometric distribution conditions are
 A phenomenon that has a series of trials

 Each trial has only two possible outcomes – either success or failure
 The probability of success is the same for each trial
Geometric Distribution Formula

In probability and statistics, the geometric distribution gives the probability that the first success
occurs on the k-th trial. If p is the probability of success on each trial, then
the probability that the first success occurs on the k-th trial is
P(X = k) = (1 − p)^(k − 1) p
Examples
 A person is seeking new employment that is both challenging and fulfilling. What is
the probability that he will quit zero times, one time, two times so on until he finds
his ideal job?
 A pharmaceutical company is designing a new drug to treat a certain disease that will
have minimal side effects. What is the probability that zero drugs fail the test, one
drug fails the test, two drugs fail the test and so on until they have designed the ideal
drug?
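A minimal sketch of the geometric probabilities, under the convention that k counts the trial on which the first success occurs, so that P(X = k) = (1 − p)^(k − 1) p. The success probability p = 1/5 is a hypothetical value chosen for illustration and does not come from the examples above.

```python
from fractions import Fraction


def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (1 - p) ** (k - 1) * p


p = Fraction(1, 5)  # hypothetical success probability

first_three = [geometric_pmf(k, p) for k in (1, 2, 3)]
# Probability that the first success arrives within three trials.
within_three = sum(first_three)

print(first_three, within_three)
```

Counting failures before the first success, as the examples above do with "zero times, one time, two times", shifts k down by one without changing the probabilities.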
Poisson distribution
A Poisson distribution is a discrete probability distribution. It gives the probability of an
event happening a certain number of times (k) within a given interval of time or space.
The Poisson distribution has only one parameter, λ (lambda), which is the mean number of
events.
Poisson distribution Formula

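The formula for the Poisson distribution function is given by:
P(X = x) = (e^(−λ) λ^x) / x!
Where,
e is the base of the natural logarithm (e ≈ 2.718)
x is the value taken by the Poisson random variable X (x = 0, 1, 2, …)
λ is the average rate, that is, the mean number of events per interval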
Example 1: In a cafe, the customer arrives at a mean rate of 2 per min. Find the
probability of arrival of 5 customers in 1 minute using the Poisson distribution formula.
Solution:
Given: λ = 2, and x = 5.
Using the Poisson distribution formula:
P(X = 5) = (e^(−2) · 2^5) / 5! ≈ 0.036
Answer: The probability of arrival of 5 customers per minute is 3.6%.
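The cafe example can be verified with a few lines of code. This sketch applies the standard Poisson formula P(X = x) = e^(−λ) λ^x / x! directly. The function name `poisson_pmf` is an assumption made for the example.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return exp(-lam) * lam ** x / factorial(x)


# Cafe example: mean rate of 2 customers per minute, 5 arrivals in one minute.
p_five = poisson_pmf(5, 2)

print(round(p_five, 3))
```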
Mean and variance of a Poisson distribution
The Poisson distribution has only one parameter, called λ.
 The mean of a Poisson distribution is λ.

 The variance of a Poisson distribution is also λ.
Properties of Poisson distribution

The Poisson distribution is applicable in events that have a large number of rare and independent
possible events. The following are the properties of the Poisson distribution. In the Poisson
distribution,
 The events are independent.

 Only the average number of successes in the given period of time needs to be known. No two
events can occur at exactly the same time.
 The Poisson distribution is the limiting case of the binomial distribution when the number of
trials n is indefinitely large.
 mean = variance = λ
 np = λ is finite, where λ is constant.
 The standard deviation is always equal to the square root of the mean μ.
 The probability that the random variable X with mean μ takes the value a is given by
P(X = a) = (μ^a / a!) e^(−μ)
 If the mean is large, then the Poisson distribution is approximately a normal distribution.
Applications of Poisson distribution

There are various applications of the Poisson distribution. The random variables that
follow a Poisson distribution are as follows:
 To count the number of defects of a finished product

 To count the number of deaths in a country by any disease or natural calamity
 To count the number of infected plants in the field
 To count the number of bacteria in the organisms or the radioactive decay in
atoms
 To calculate the waiting time between the events.
Normal distribution
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is
symmetric about the mean, showing that data near the mean are more frequent in occurrence than
data far from the mean. In graphical form, the normal distribution appears as a "bell curve".
The Formula for the Normal Distribution
The normal distribution follows the formula below. Note that only the values of the mean μ
and standard deviation σ are necessary.
f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²))
Example of a Normal Distribution
Many naturally-occurring phenomena appear to be normally-distributed. Take, for example, the
distribution of the heights of human beings. The average height is found to be roughly 175 cm (5'
9"), counting both males and females.
As the chart below shows, most people conform to that average. Meanwhile, taller and shorter
people exist, but with decreasing frequency in the population. According to the empirical rule,
99.7% of all people will fall within +/- three standard deviations of the mean, or between 154 cm
(5' 0") and 196 cm (6' 5"). Those taller and shorter than this would be quite rare (just 0.15% of
the population each).
Normal Distribution Problems and Solutions
Question 1: Calculate the probability density function of normal distribution using the following
data. x = 3, μ = 4 and σ = 2.
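Solution: Given the variable x = 3,
mean μ = 4, and
standard deviation σ = 2.
By the formula for the probability density of the normal distribution, we can write:
$f(3) = \frac{1}{2\sqrt{2\pi}} e^{-\frac{(3 - 4)^2}{2 \cdot 2^2}} = \frac{e^{-0.125}}{5.0133} \approx 0.1760$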
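The calculation for Question 1 can be checked with a short Python sketch. The function name `normal_pdf` is an illustrative choice; the formula is the standard normal density with mean μ and standard deviation σ.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Probability density of a normal distribution at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Values from Question 1: x = 3, mean = 4, standard deviation = 2.
density = normal_pdf(3, 4, 2)
print(round(density, 4))  # approximately 0.176
```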
Properties of Normal Distribution:

 Its shape is symmetric.
 The mean and median are the same and lie in the middle of the distribution
 Its standard deviation measures the distance on the distribution from the mean to the
inflection point (the place where the curve changes from an "upside-down-bowl" shape to
a "right-side-up-bowl" shape).
 Because of its unique bell shape, probabilities for the normal distribution follow the
Empirical Rule, which says the following:
 About 68 percent of its values lie within one standard deviation of the mean. To find this
range, take the value of the standard deviation, then find the mean plus this amount, and
the mean minus this amount.
 About 95 percent of its values lie within two standard deviations of the mean.
 About 99.7 percent (almost all) of its values lie within three standard deviations of the mean.
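The percentages in the Empirical Rule can be reproduced from the normal distribution itself. The sketch below uses the identity P(|X − μ| < kσ) = erf(k/√2); the helper name `prob_within` is illustrative, and σ = 7 cm for the heights example is an assumption inferred from the 154 cm to 196 cm range quoted earlier.

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| < k * sigma) for any normal distribution."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))

# Heights example: mean 175 cm; sigma = 7 cm is inferred, not stated in the notes.
mu, sigma = 175.0, 7.0
lower, upper = mu - 3 * sigma, mu + 3 * sigma
tail_each = (1.0 - prob_within(3)) / 2.0  # share above or below the 3-sigma range
print(lower, upper, tail_each)
```

The one-sided tail comes out slightly below the rounded 0.15% figure, because 99.7% is itself a rounded value.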
Applications
Many real-world quantities are approximately normally distributed, such as:
 Marks scored on a test
 Heights of different persons
 Sizes of objects produced by a machine
 Blood pressure, and so on.
Gamma distribution
The gamma distribution is a family of continuous probability distributions defined by two
parameters: a shape parameter and a scale parameter (or, equivalently, its reciprocal, the rate
parameter). It is related to the normal, exponential, chi-squared and Erlang distributions.
'Γ' denotes the gamma function.
Gamma distributions have two free parameters, named alpha (α) and beta (β), where:
α = Shape parameter
β = Scale parameter (the reciprocal of the rate parameter)
It is characterized by mean µ = αβ and variance σ² = αβ².
The scale parameter β is used only to scale the distribution. This can be understood by remarking
that wherever the random variable x appears in the probability density, it is divided by β.
Since the scale parameter provides the dimensional data, it is seldom useful to work with the
"standard" gamma distribution, i.e., with β = 1.
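The mean and variance formulas can be checked numerically. The sketch below assumes β acts as a scale parameter, so that x is divided by β in the density $f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}$. The values α = 2 and β = 3, the helper names, and the simple midpoint-rule integration are all illustrative assumptions.

```python
import math

def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma density with shape alpha and scale beta, for x > 0."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

def integrate(f, lower: float, upper: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

alpha, beta = 2.0, 3.0  # assumed parameter values
upper = 100.0           # the density is negligible beyond this for these values

total = integrate(lambda x: gamma_pdf(x, alpha, beta), 0.0, upper)
mean = integrate(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, upper)
second_moment = integrate(lambda x: x * x * gamma_pdf(x, alpha, beta), 0.0, upper)
variance = second_moment - mean ** 2

print(total, mean, variance)  # compare with 1, alpha * beta, alpha * beta ** 2
```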
Hypothesis Testing
Hypothesis testing is the process of using sample data to test a hypothesis about a population. A
hypothesis is a statement about a population parameter. For example, the statement that the
population mean equals 5 can serve as a null hypothesis. A test statistic is a number computed
from the sample that is used for testing the hypothesis.
 Hypothesis testing is a way to test a claim or assumption about a population
parameter.
 The process starts by stating two hypotheses: the null hypothesis (H0), which is the
status quo, and the alternative hypothesis (H1 or Ha), which represents what the
researcher believes to be true.
Types
The three major types of hypotheses are:
1. Null Hypothesis (H0): Represents the default assumption, stating that there is no
significant effect or relationship in the data.
2. Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific
effect or relationship that researchers want to investigate.
3. Non-directional Hypothesis: An alternative hypothesis that doesn't specify the direction
of the effect, leaving it open to both positive and negative possibilities.
Example
Let's consider a hypothesis test for the average height of women in the United States. Suppose
our null hypothesis is that the average height is 5'4". We gather a sample of 100 women and
determine that their average height is 5'5". The standard deviation of the population is 2".
To calculate the z-score, we use the following formula:
z = (x̄ – μ0) / (σ / √n)
z = (5'5" – 5'4") / (2" / √100)
z = 1 / 0.2
z = 5
We reject the null hypothesis, as a z-score of 5 is far beyond the usual critical values (for
example, 1.96 for a two-sided test at the 5% significance level), and conclude that there is
evidence to suggest that the average height of women in the US is greater than 5'4".
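The z-score for this example can be recomputed with a short Python sketch. With a difference of 1 inch and a standard error of 2/√100 = 0.2 inches, it gives z = 5. The function names and the 5% significance level are illustrative assumptions; the two-sided p-value uses the standard normal identity p = erfc(|z|/√2).

```python
import math

def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def two_sided_p_value(z: float) -> float:
    """Probability of a standard normal value at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Heights in inches: 5'5" = 65, 5'4" = 64, sigma = 2, n = 100.
z = z_statistic(65.0, 64.0, 2.0, 100)
p = two_sided_p_value(z)

alpha = 0.05  # assumed significance level, not stated in the notes
reject_null = p < alpha
print(z, p, reject_null)
```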
Estimation
Estimation is any of numerous procedures used to calculate the value of some property of a
population from observations of a sample drawn from the population.
In other words, estimation is the process by which one makes inferences about a population,
obtaining the values of unknown population parameters with the help of sample data.
Types of Estimation
Estimates are of two different types:
1. Point Estimation
2. Interval Estimation
Point Estimates
A point estimate is a sample statistic calculated using the sample data to estimate the most likely
value of the corresponding unknown population parameter. In other words, we compute a single
value from the sample and use it to estimate the population value.
For instance, we can use the sample mean x̄ to estimate the mean µ of a population:
x̄ = Σx / n
For example,
62 is the average mark x̄ achieved by a sample of 15 students randomly selected from a class
of 150 students, and it is taken as an estimate of the mean mark of the entire class. Since it is
a single numeric value, it is a point estimate.
The basic drawback of point estimates is that no information is available regarding their
reliability. In fact, it is very unlikely that a single sample statistic is exactly equal to the
population parameter.
Interval Estimates
 A confidence interval estimate is a range of values constructed from sample data so that
the population parameter will likely occur within the range at a specified probability.
The specified probability is the level of confidence.
 An interval estimate is broader than a point estimate and is more likely to contain the
true parameter value.
 It is used in inferential statistics to develop a confidence interval, a range in which we
believe with a certain degree of confidence that the population parameter lies.
 Any parameter estimate that is based on a sample statistic has some amount of sampling
error.
In statistics, interval estimation uses sample data to calculate an interval of possible values of an
unknown population parameter.
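To make the contrast between a point estimate and an interval estimate concrete, here is a minimal Python sketch of a confidence interval for a population mean, x̄ ± z·σ/√n. It assumes the population standard deviation σ is known. The marks, the value σ = 5 and the function name `confidence_interval` are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist, mean

def confidence_interval(sample, sigma: float, confidence: float = 0.95):
    """Interval x_bar +/- z * sigma / sqrt(n), assuming sigma is known."""
    n = len(sample)
    x_bar = mean(sample)
    # z is the standard normal quantile leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

marks = [58, 65, 61, 70, 56, 62]  # hypothetical sample of marks
sigma = 5.0                       # assumed known population standard deviation

point_estimate = mean(marks)                   # a single value
low, high = confidence_interval(marks, sigma)  # a range of plausible values
print(point_estimate, low, high)
```

Raising the confidence level or shrinking the sample widens the interval, which is the reliability information a point estimate alone does not provide.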
Properties of Estimation
We use sample measures to estimate the population measures; these statistics are the estimators.
The following are the properties of good estimators.
 An estimator should be consistent: its value
approaches the value of the parameter being estimated as the sample size increases.
 An estimator should be unbiased. In other words, the expected value of the estimator
is equal to the parameter being estimated. Otherwise, the estimator is biased.
 An estimator should be efficient. In other words, it should have the smallest variance
among the unbiased estimators of the same parameter.
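The idea of bias can be illustrated by enumerating every possible sample from a small population and averaging an estimator over all of them. The sketch below compares two estimators of the population variance, one dividing by n − 1 and one dividing by n. The population (the faces of a die) and the sample size of 2 are hypothetical choices made only for illustration.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 4, 5, 6]  # hypothetical population
sample_size = 2

# Every equally likely ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=sample_size))

true_variance = pvariance(population)

# Averaging an estimator over all possible samples gives its expected value.
expected_unbiased = mean(variance(s) for s in samples)  # divides by n - 1
expected_biased = mean(pvariance(s) for s in samples)   # divides by n

print(true_variance, expected_unbiased, expected_biased)
```

The estimator that divides by n − 1 averages out to the true variance, while the one that divides by n systematically underestimates it.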
Difference between Hypothesis Testing and Estimation
Difference between Correlation and Regression
Variable
A variable is any characteristic, number, or quantity that can be measured or counted. A variable
may also be called a data item.
Quantitative
A quantitative variable is a variable that reflects a notion of magnitude, that is, a variable
whose values are numbers. A quantitative variable thus represents a measure and is numerical.
 Quantitative variables are divided into two types: discrete and continuous. The difference
is explained in the following two sections.
1. Discrete
Quantitative discrete variables are variables whose possible values are countable and
have a finite number of possibilities. The values are often (but not always) integers. Here are
some examples of discrete variables:
 Number of children per family
 Number of students in a class
 Number of citizens of a country
Even if it would take a long time to count the citizens of a large country, it is still technically
doable. Moreover, for all examples, the number of possibilities is finite. Whatever the number of
children in a family, it will never be 3.58 or 7.912, so the number of possibilities is finite
and thus countable.
2. Continuous
On the other hand, quantitative continuous variables are variables for which the values are not
countable and have an infinite number of possibilities. For example:
 Age
 Weight
 Height
For simplicity, we usually refer to years, kilograms (or pounds) and centimeters (or feet and
inches) for age, weight and height respectively. However, a 28-year-old man could actually be
28 years, 7 months, 16 days, 3 hours, 4 minutes, 5 seconds, 31 milliseconds, 9 nanoseconds old.
For all measurements, we usually stop at a standard level of granularity, but nothing (except our
measurement tools) prevents us from going deeper, leading to an infinite number of potential
values. The fact that the variable can take an infinite number of values within any range makes
it uncountable.
Qualitative
In contrast to quantitative variables, qualitative variables (also referred to as categorical
variables, or factors in R) are variables that are not numerical and whose values fit into categories.
In other words, a qualitative variable takes as its values modalities, categories
or levels, in contrast to quantitative variables, which measure a quantity on each individual.
 Qualitative variables are divided into two types: nominal and ordinal.
Nominal
A qualitative nominal variable is a qualitative variable where no ordering is possible or implied
in the levels.
For example, the variable gender is nominal because there is no order in the levels (no matter
how many levels you consider for the gender—only two with female/male, or more than two
with female/male/ungendered/others, levels are unordered). Eye color is another example of a
nominal variable because there is no order among blue, brown or green eyes.
A nominal variable can have
two levels (e.g., do you smoke? Yes/No, or are you pregnant? Yes/No), or a large number of
levels (what is your college major? Each major is a level in that case).
Note that a qualitative variable with exactly 2 levels is also referred to as a binary or
dichotomous variable.
Ordinal
On the other hand, a qualitative ordinal variable is a qualitative variable with an order implied in
the levels. For instance, if the severity of road accidents has been measured on a scale such as
light, moderate and fatal accidents, this variable is a qualitative ordinal variable because there is
a clear order in the levels.
Another good example is health, which can take values such as poor, reasonable, good, or
excellent. Again, there is a clear order in these levels so health is in this case a qualitative ordinal
variable.
Variable transformations
There are two main variable transformations:
 From a continuous to a discrete variable
 From a quantitative to a qualitative variable
From continuous to discrete
Let's say we are interested in babies' ages. The data collected is the age of the babies, so a
quantitative continuous variable. However, we may work with only the number of weeks since
birth, thus transforming the age into a discrete variable. The variable age remains a
quantitative continuous variable, but the variable we are working on (i.e., the number of weeks
since birth) can be seen as a quantitative discrete variable.
From quantitative to qualitative
Let's say we are interested in the Body Mass Index (BMI). For this, a researcher collects data on
the height and weight of individuals and computes the BMI. The BMI is a quantitative continuous
variable, but the researcher may want to turn it into a qualitative variable by categorizing
individuals below a certain threshold as underweight, those above a certain threshold as
overweight and the rest as normal weight. The raw BMI is a quantitative continuous
variable, but the categorization of the BMI makes the transformed variable a qualitative (ordinal)
variable, where the levels are in this case underweight < normal < overweight.
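Both transformations can be sketched in a few lines of Python. The notes do not state the BMI thresholds, so the cut-offs 18.5 and 25.0 used below are assumptions (commonly quoted values), and all function names are illustrative.

```python
def age_in_weeks(age_in_days: float) -> int:
    """Continuous age turned into a discrete count of completed weeks since birth."""
    return int(age_in_days // 7)

def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def bmi_category(value: float, low: float = 18.5, high: float = 25.0) -> str:
    """Quantitative BMI turned into an ordinal level; thresholds are assumed."""
    if value < low:
        return "underweight"
    if value >= high:
        return "overweight"
    return "normal"

print(age_in_weeks(23.6))              # a baby aged 23.6 days has completed 3 weeks
print(bmi_category(bmi(70.0, 1.75)))   # hypothetical individual
```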

