ML Unit2-1
0.4 Properties of Probability Distributions
Probability distributions possess several key properties that are essential for
understanding and analyzing random phenomena.
3. Specific Probabilities: Each value of a discrete random variable has a
specific probability associated with it. These probabilities sum up to 1,
ensuring that one of the possible outcomes must occur.
4. Examples of Discrete Random Variables: Common examples in-
clude:
• The number of students in a classroom.
• The number of defects in a batch of products.
• The number of goals scored in a soccer match.
• The number of cars passing through a toll booth in an hour.
The probability mass function (PMF) of a discrete random variable X assigns a probability to each possible value x:
f(x) = P(X = x)
Where:
• x represents a specific value of the random variable X.
• f (x) is the probability mass function.
Example:
Consider a fair six-sided die. Let X represent the outcome of a single roll of the
die. The possible values of X are 1, 2, 3, 4, 5, and 6. Since the die is fair, each
outcome has an equal probability of 1/6. The PMF for X is:
f(x) = 1/6   if x = 1, 2, 3, 4, 5, 6
f(x) = 0     otherwise
This PMF satisfies both properties: each probability is between 0 and 1, and
the sum of all probabilities is 1.
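A minimal Python sketch of this PMF (the dictionary pmf and the helper f are names chosen here only for illustration) makes both properties easy to verify:

pmf = {x: 1/6 for x in range(1, 7)}   # f(x) = 1/6 for x = 1, ..., 6

def f(x):
    # Probability mass function: returns P(X = x), and 0 outside the support.
    return pmf.get(x, 0.0)

# Property 1: each probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1 (up to floating-point rounding).
assert abs(sum(pmf.values()) - 1.0) < 1e-12

print(f(3))   # 0.1666...
print(f(7))   # 0.0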
1 Fundamental Rules of Probability
Probability theory relies on a set of fundamental rules that provide the founda-
tion for understanding and calculating probabilities. These rules ensure consis-
tency and coherence in probabilistic reasoning, guiding us in making informed
decisions based on uncertain outcomes. Let’s explore each fundamental rule in
depth:
1. Non-negativity Axiom:
The non-negativity axiom states that probabilities must be non-negative.
In other words, the probability of an event occurring cannot be negative.
Mathematically, for any event E, we have:
P (E) ≥ 0
2. Normalization Axiom:
The normalization axiom states that the total probability of the sample space equals 1. Mathematically:
P(S) = 1, or equivalently Σ P(E) = 1,
where the sum is taken over all elementary events E in the sample space S.
Explanation: This rule ensures that the total probability space is fully
accounted for, leaving no room for uncertainty. It establishes a basis
for interpreting probabilities as proportions or fractions of certainty, with
1 representing complete certainty (i.e., certainty that something in the
sample space will occur).
3. Addition Rule for Disjoint Events:
The addition rule for disjoint events, also known as the sum rule, states
that if A and B are disjoint events (i.e., they cannot both occur simulta-
neously), then the probability of either event occurring is the sum of their
individual probabilities. Mathematically, for disjoint events A and B, we
have:
P (A ∪ B) = P (A) + P (B)
Explanation: This rule captures the idea that when events are mutually
exclusive (i.e., if one event occurs, the other cannot), the probability of
their union is simply the sum of their individual probabilities. It allows us
to calculate the probability of at least one of the events occurring; because
the events cannot overlap, nothing is double-counted.
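For instance, with the fair die above, the events A = "roll a 1" and B = "roll a 2" are disjoint, so P(A ∪ B) = 1/6 + 1/6 = 1/3. A minimal Python sketch of this calculation (names chosen only for illustration):

from fractions import Fraction

p = {x: Fraction(1, 6) for x in range(1, 7)}   # fair-die PMF

# A = "roll a 1" and B = "roll a 2" cannot occur together, so they are disjoint.
P_A = p[1]
P_B = p[2]

# Addition rule for disjoint events: P(A or B) = P(A) + P(B).
P_A_or_B = P_A + P_B
print(P_A_or_B)   # 1/3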
These fundamental rules provide a solid framework for reasoning about un-
certainty and making probabilistic predictions. Understanding and applying
these rules correctly are essential for proper interpretation and manipulation of
probabilities in various real-world scenarios, ranging from gambling and finance
to scientific research and decision-making processes.
2 Bayes’ Theorem
Bayes’ theorem relates the conditional probability P(A|B) to the reverse conditional probability P(B|A):
P(A|B) = P(B|A) × P(A) / P(B)
Where:
• P(A|B) is the posterior probability of A given that B has occurred.
• P(B|A) is the likelihood of observing B given A.
• P(A) is the prior probability of A.
• P(B) is the total probability of observing B.
Example: Suppose a disease (event A) affects 1% of a population, and a diagnostic test returns a positive result (event B) for 99% of people who have the disease. Then:
• Prior Probability: P(A) = 0.01 (1% of the population has the disease).
• Likelihood: P(B|A) = 0.99 (99% chance of a positive test given the disease).
• Total Probability: P(B) = P(B|A) × P(A) + P(B|¬A) × P(¬A)
Since the test can either correctly indicate the disease given its presence
(P (B|A)×P (A)) or incorrectly indicate the disease given its absence (P (B|¬A)×
P (¬A)), we can calculate P (B):
P(A|B) = P(B|A) × P(A) / P(B) ≈ 0.9519
So, given that the test indicates the presence of the disease, the probability
that the person actually has the disease is approximately 95.19%.
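A small Python sketch of this calculation is shown below. The prior and likelihood are the values stated above; the false-positive rate P(B|¬A) is not given in this excerpt, so the value used here is a purely hypothetical placeholder, and the printed posterior matches the ≈ 0.9519 figure only for whatever rate the original example used.

def posterior(p_A, p_B_given_A, p_B_given_not_A):
    # Bayes' theorem, with P(B) expanded by the law of total probability.
    p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)
    return p_B_given_A * p_A / p_B

p_A = 0.01                 # prior P(A): 1% prevalence (from the example)
p_B_given_A = 0.99         # likelihood P(B|A): 99% (from the example)
p_B_given_not_A = 0.0005   # hypothetical false-positive rate P(B|not A)

print(posterior(p_A, p_B_given_A, p_B_given_not_A))   # ~0.95 for this placeholder rate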
Significance and Applications:
Bayes’ theorem has widespread applications across various fields, including
but not limited to:
• Medical diagnosis
3 Independence of Events
Two events A and B are independent if the occurrence of one does not affect the probability of the other. Mathematically, A and B are independent if:
P(A ∩ B) = P(A) × P(B)
In other words, the probability of A given B (or vice versa) is the same as
the probability of A without considering B, and vice versa.
Example: Consider two events: flipping a fair coin and rolling a fair six-
sided die. The outcome of the coin flip (heads or tails) is independent of the
outcome of the die roll (1, 2, 3, 4, 5, or 6). The probability of getting heads on
the coin flip is P (heads) = 0.5, and the probability of rolling a 4 on the die is
P(4) = 1/6. The joint probability of getting heads on the coin flip and rolling a
4 on the die is P(heads ∩ 4) = P(heads) × P(4) = 0.5 × 1/6 = 1/12, demonstrating
independence.
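The same check can be done by enumerating the 12 equally likely (coin, die) outcomes; the Python sketch below (names chosen for illustration) confirms that P(heads ∩ 4) = P(heads) × P(4):

from itertools import product
from fractions import Fraction

coin = ['heads', 'tails']
die = [1, 2, 3, 4, 5, 6]
outcomes = list(product(coin, die))   # 12 equally likely joint outcomes

def prob(event):
    # Probability of an event = favourable outcomes / total outcomes.
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

P_heads = prob(lambda o: o[0] == 'heads')                          # 1/2
P_four = prob(lambda o: o[1] == 4)                                 # 1/6
P_heads_and_four = prob(lambda o: o[0] == 'heads' and o[1] == 4)   # 1/12

print(P_heads_and_four == P_heads * P_four)   # True -> independent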
[Block diagram: events A and B shown as separate branches conditioned on event C]
This block diagram represents events A and B being conditioned on event
C. If A and B are conditionally independent given C, they are represented as
separate branches from C, indicating that the occurrence of C does not affect
the relationship between A and B.
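A small numeric sketch (all probabilities below are made-up values chosen for illustration) shows the distinction: given C, the probabilities of A and B factorize, even though A and B need not be independent unconditionally.

from fractions import Fraction
from itertools import product

# Hypothetical construction: A and B are conditionally independent given C.
p_C = {0: Fraction(1, 2), 1: Fraction(1, 2)}            # P(C)
p_A1_given_C = {0: Fraction(1, 4), 1: Fraction(3, 4)}   # P(A = 1 | C)
p_B1_given_C = {0: Fraction(1, 4), 1: Fraction(3, 4)}   # P(B = 1 | C)

def joint(a, b, c):
    # Joint probability built so that A and B factorize given C.
    pa = p_A1_given_C[c] if a == 1 else 1 - p_A1_given_C[c]
    pb = p_B1_given_C[c] if b == 1 else 1 - p_B1_given_C[c]
    return p_C[c] * pa * pb

# Conditional independence given C = 1: P(A=1, B=1 | C=1) = P(A=1|C=1) * P(B=1|C=1).
p_ab_given_c1 = joint(1, 1, 1) / p_C[1]
print(p_ab_given_c1 == p_A1_given_C[1] * p_B1_given_C[1])   # True

# Unconditionally, A and B are NOT independent in this construction.
P_A1 = sum(joint(1, b, c) for b, c in product([0, 1], repeat=2))
P_B1 = sum(joint(a, 1, c) for a, c in product([0, 1], repeat=2))
P_A1_B1 = sum(joint(1, 1, c) for c in [0, 1])
print(P_A1_B1 == P_A1 * P_B1)   # False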
4 Continuous Random Variables
Continuous random variables are variables that can take on any value within
a certain range, often representing measurements or quantities that can vary
continuously. Unlike discrete random variables, which can only assume specific,
distinct values, continuous random variables can take on an infinite number of
values within their defined intervals. Understanding continuous random vari-
ables is essential in various fields such as physics, engineering, economics, and
statistics.
The cumulative distribution function (CDF) of a continuous random variable X gives the probability that X takes a value less than or equal to x:
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
Where:
• f(t) is the PDF of the continuous random variable X.
The CDF provides a convenient way to calculate probabilities for continuous
random variables and is particularly useful in statistical analysis and hypothesis
testing.
Example: For the normal distribution described earlier, the CDF can be
calculated by integrating the PDF from negative infinity to a specified value x:
F(x) = ∫_{−∞}^{x} (1 / (σ√(2π))) e^{−(t−µ)² / (2σ²)} dt
This integral gives the probability that the random variable X is less than or
equal to x, providing valuable information about the distribution of the variable.
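As a sketch, this CDF can be evaluated numerically by integrating the PDF, and cross-checked against the closed form in terms of the error function; the mean and standard deviation below are placeholder values chosen for the demonstration.

import math

mu, sigma = 0.0, 1.0   # placeholder parameters (standard normal)

def pdf(t):
    # Normal PDF with mean mu and standard deviation sigma.
    return math.exp(-(t - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def cdf_numeric(x, lower=-10.0, n=100_000):
    # Approximate F(x) with the trapezoidal rule from 'lower' to x;
    # 'lower' stands in for -infinity, where the PDF is negligible.
    h = (x - lower) / n
    total = 0.5 * (pdf(lower) + pdf(x)) + sum(pdf(lower + i * h) for i in range(1, n))
    return total * h

def cdf_closed_form(x):
    # Equivalent closed form of the normal CDF via the error function.
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

print(cdf_numeric(1.0))      # ~0.8413
print(cdf_closed_form(1.0))  # ~0.8413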
Measures such as the mean and variance, discussed in the next section, help characterize
the central tendency and spread of the distribution, providing valuable insights into
the behavior of the random variable.
5 Quantiles, Mean, and Variance
Quantiles
Quantiles are values that divide a probability distribution into equally-sized
intervals. They provide insight into the spread and distribution of data. The
most commonly used quantiles include the median (50th percentile), quartiles
(25th and 75th percentiles), and percentiles (any value between 0 and 100).
• Median: The median is the value that separates the lower and upper
halves of a dataset. It is the 50th percentile, meaning that 50% of the data
lies below it and 50% lies above it. The median is resistant to outliers and
provides a robust measure of central tendency.
• Quartiles: Quartiles divide a dataset into four equal parts. The first
quartile (Q1) is the value below which 25% of the data lies, the second
quartile is the median (Q2), and the third quartile (Q3) is the value below
which 75% of the data lies. Quartiles help assess the spread and skewness
of the data.
• Percentiles: Percentiles generalize the concept of quartiles to divide a
dataset into one hundred equal parts. For example, the 90th percentile rep-
resents the value below which 90% of the data lies. Percentiles are useful
for comparing individual data points to the overall distribution.
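A short Python sketch (using NumPy, with made-up sample data) shows how these quantiles are typically computed in practice:

import numpy as np

data = np.array([2, 4, 4, 5, 7, 9, 10, 12, 15, 21])   # made-up sample

median = np.percentile(data, 50)         # 50th percentile (median)
q1, q3 = np.percentile(data, [25, 75])   # first and third quartiles
p90 = np.percentile(data, 90)            # 90th percentile

print(median, q1, q3, p90)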
Mean
The mean (or expected value) of a random variable is the probability-weighted average of its possible values. For a discrete random variable X with probability mass function f(x), the mean µ is calculated as:
µ = Σ_x x · f(x)
The mean is sensitive to extreme values (outliers) and may not accurately
represent the central tendency if the distribution is skewed.
Variance
Variance measures the dispersion or spread of a probability distribution. It
quantifies how much the values of a random variable deviate from the mean.
For a discrete random variable X with probability mass function f (x), the
variance σ 2 is calculated as:
σ² = Σ_x (x − µ)² · f(x)
The square root of the variance, known as the standard deviation (σ), is
often used as a measure of spread, providing the same unit of measurement as
the random variable.
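Continuing the fair-die example, the sketch below computes the mean, variance, and standard deviation directly from the PMF (names chosen only for illustration):

import math

pmf = {x: 1/6 for x in range(1, 7)}   # fair six-sided die

mu = sum(x * p for x, p in pmf.items())                # mean: sum of x * f(x)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # variance: sum of (x - mu)^2 * f(x)
sigma = math.sqrt(var)                                 # standard deviation

print(mu, var, sigma)   # 3.5, ~2.917, ~1.708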
Interpretation
Quantiles, mean, and variance provide complementary information about
the distribution of data. While quantiles describe the spread of data at spe-
cific points, the mean represents the average value, and the variance quantifies
the dispersion around the mean. Together, these measures offer a comprehen-
sive understanding of the characteristics of a probability distribution and are
essential for statistical analysis and inference.