Probability Lecture Notes: 1 Definitions
1 Definitions
This section includes some basic definitions that we must go over to
understand the lecture.
• An experiment is a random process that leads to one of a set of
possible outcomes. e.g. the flipping of a coin is an experiment
• The entire set of outcomes is known as the sample space of said
experiment. e.g. the sample space of a coin flip is {Head, Tail}
• An event is a subset of the sample space. e.g. if you roll a
die, some possible events include: an even number is rolled, a
prime number is rolled, and a number between two and four
(inclusive) is rolled. In these cases, the sample space is
{1, 2, 3, 4, 5, 6}, and the corresponding subsets are {2, 4, 6},
{2, 3, 5}, and {2, 3, 4}, respectively
• The probability of an event quantifies how likely the event is to
occur. Probabilities must lie between 0 and 1 (inclusive). e.g. a
fair die (each outcome is equally likely) has six possible outcomes,
so each outcome has a probability of 1/6. The probability of an
event, A, is denoted as P(A)
• When all outcomes in a sample space are equally likely, we can
use the following formula to calculate the probability of an event,
A, occurring:
\[ P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in the sample space}} \]
• The probability that an event A occurs, given that an event B
occurs, is denoted as P(A|B)
• Two events are disjoint if their sets of outcomes do not overlap.
Example of disjoint events: rolling a 1 and rolling a 2 on a single
die roll. Probabilities of disjoint events/outcomes can be added:
P(A or B) = P(A) + P(B). Example of non-disjoint events: the
outcome of a die roll is prime and the outcome is odd. The
“additive” rule for any two events can be stated as
P(A or B) = P(A) + P(B) − P(A and B)
• Two events are independent if the occurrence of one event does
not affect the probability of the other. For two independent events
A, B: P(A|B) = P(A), P(B|A) = P(B), and P(A and B) =
P(A) × P(B). e.g. rolling multiple dice; flipping multiple coins;
rolling a die and then flipping a coin; same example as before,
but this time if the coin is heads the outcome of the die is doubled
(the roll and the flip themselves remain independent, even though
the final score depends on both)
• In general, P(A and B) = P(A|B) × P(B) = P(B|A) × P(A)
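The definitions above can be checked by counting outcomes directly. The following is a minimal sketch, assuming equally likely outcomes throughout; the helper name `prob` and the variable names are illustrative choices, not notation from these notes.

```python
from fractions import Fraction
from itertools import product


def prob(event, sample_space):
    """P(event) by counting, valid only when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Sample space and events for a single roll of a fair die.
die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
prime = {2, 3, 5}
odd = {1, 3, 5}

p_even = prob(even, die)

# Additive rule for two non-disjoint events: prime, odd.
p_prime_or_odd = prob(prime | odd, die)
p_additive = prob(prime, die) + prob(odd, die) - prob(prime & odd, die)

# Conditional probability: P(A|B) = P(A and B) / P(B).
p_prime_given_odd = prob(prime & odd, die) / prob(odd, die)

# Independence: roll a die, then flip a coin (12 equally likely pairs).
space = set(product(die, {"Head", "Tail"}))
rolled_even = {(d, c) for d, c in space if d in even}
got_head = {(d, c) for d, c in space if c == "Head"}
p_both = prob(rolled_even & got_head, space)
p_product = prob(rolled_even, space) * prob(got_head, space)

print(p_even, p_prime_or_odd, p_additive, p_prime_given_odd, p_both, p_product)
```

Using `Fraction` keeps every result exact, so the two sides of the additive rule and of the product rule can be compared with `==` rather than with a floating-point tolerance.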
three times eliminates your turn. In these examples, we are assuming
that after a token is killed, an extra turn is not given.
2.1 Example 1
Figure 2: The current state is shown on the top, while the two possible
moves are shown at the bottom.
Moving the second token. In this case, both of your tokens are
in a safe spot and you can’t be killed in the next turn:
\[ P(\text{Red kills blue}) = 2 \times \left( \frac{1}{6} + \frac{1}{6^2} + \frac{1}{6^3} \right) \approx 0.3981 \]
2.2 Example 2
It’s currently your turn and you’ve rolled a 3.
Moving the first token. In this case, your two tokens are
three and five spaces away from the red token. We can use the same
calculation as in the latter case of the last example to show that:
Moving the second token. In this case, your two tokens are
two and six spaces away from the red token:
Figure 3: The current state is shown on the top, while the two possible
moves are shown at the bottom.
\[ P(\text{Red kills blue}) = \frac{1}{6} + \frac{5}{6^2} + \frac{5}{6^3} \approx 0.3287 \]
Therefore, it is better to move the second token.
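The arithmetic in both examples can be verified with exact fractions. This is only a check of the numbers, not a model of the game rules. It assumes that moving the first token in Example 2 yields the same expression as in Example 1, as the text indicates; the variable names are illustrative.

```python
from fractions import Fraction

sixth = Fraction(1, 6)

# Example 1, moving the second token: 2 * (1/6 + 1/6^2 + 1/6^3)
p_example_1 = 2 * (sixth + sixth**2 + sixth**3)

# Example 2, moving the second token: 1/6 + 5/6^2 + 5/6^3
p_second_token = sixth + 5 * sixth**2 + 5 * sixth**3

# Assumption: moving the first token in Example 2 reuses the Example 1
# expression ("the same calculation as in the latter case").
p_first_token = p_example_1

print(p_example_1, round(float(p_example_1), 4))
print(p_second_token, round(float(p_second_token), 4))
print("second token is safer:", p_second_token < p_first_token)
```

Under that assumption, the lower kill probability for the second token is what supports the conclusion of Example 2.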
Figure 5: Different beats of musical notes and rests.
Assuming that all keys and beats are equally probable, the probability
that a given note is played with a given beat is:
\[ \frac{1}{89} \times \frac{1}{6} = \frac{1}{534} \]
The probability of getting one note right is already quite low. Given
that each note is played independently, one can say that for n notes
the probability of playing them all right is:
\[ \left( \frac{1}{534} \right)^{n} \]
There are 359 notes in Beethoven’s Fifth Symphony, meaning that
the probability that a monkey playing randomly on the piano is able
to stumble on Beethoven’s Fifth is:
\[ \left( \frac{1}{534} \right)^{359} \]
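To get a feel for how small this number is, it helps to work with its base-10 logarithm, since the value itself is far below what a floating-point number can represent. A minimal sketch, using the figures 89, 6 and 359 from the text (variable names are illustrative):

```python
import math
from fractions import Fraction

# Probability of one correct note: 1/89 * 1/6 = 1/534.
p_note = Fraction(1, 89) * Fraction(1, 6)
n_notes = 359

# A direct floating-point power underflows to zero.
p_float = float(p_note) ** n_notes

# log10 of (1/534)^359 is -359 * log10(534).
log10_p = -n_notes * math.log10(534)

print("float result:", p_float)
print(f"probability is about 10^{log10_p:.1f}")
```

The logarithm comes out between -980 and -979, so the probability has roughly 979 zeros after the decimal point.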
4 More Definitions
• A random variable is a variable whose value is determined
by the outcome of a random experiment. e.g. Let X be a random
variable dependent on the outcome of rolling a die:
X = the outcome of the die roll
• The Law of Large Numbers states that if a random experiment
is repeated many times, the average of the observed values of the
random variable approaches its expectation as the number of
repetitions grows
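The Law of Large Numbers can be illustrated with the die-roll random variable X from above, whose expectation is (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. The following is a simulation sketch; the function name, the seed and the sample sizes are arbitrary choices for illustration.

```python
import random


def average_of_rolls(n, rng):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
expectation = sum(range(1, 7)) / 6

# Averages for increasing numbers of repetitions.
averages = {n: average_of_rolls(n, rng) for n in (10, 1_000, 100_000)}
for n, avg in averages.items():
    print(f"n = {n:>7}: average = {avg:.4f} (expectation = {expectation})")
```

With few rolls the average can be far from 3.5; with many rolls it typically settles close to it.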
5 Application 3: Estimating Pi
Let’s say you have a unit circle inscribed within a 2 × 2 square. We will
now throw a dart at the square such that it is equally likely to land on
each point on the square. The probability that it will land in the circle
is the ratio of the area of the circle to the area of the square:
\[ P(\text{Dart lands in circle}) = \frac{\pi \times 1^2}{2^2} = \frac{\pi}{4} \]
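Since the probability of landing in the circle is π/4, multiplying the observed fraction of hits by 4 gives an estimate of π. A minimal simulation sketch, assuming darts are uniform on the square [-1, 1] × [-1, 1]; the function name, seed and number of darts are illustrative choices.

```python
import math
import random


def estimate_pi(n_darts, rng):
    """Estimate pi as 4 times the fraction of darts landing in the unit circle."""
    hits = 0
    for _ in range(n_darts):
        # Uniform point on the 2 x 2 square centred at the origin.
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            hits += 1
    return 4 * hits / n_darts


rng = random.Random(0)  # fixed seed so the run is repeatable
pi_estimate = estimate_pi(200_000, rng)
print(pi_estimate, math.pi)
```

The estimate improves only slowly as the number of darts grows, so this is a demonstration of the idea rather than a practical way to compute π.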
Now, let’s define a random variable:
Similarly, reinforcement learning via neural networks is done by
telling the network what state the game is in, and asking it to output
one of multiple valid moves to make within the game. The equation
behind this process is given in Figure 6. (Source:
medium.com/@jonathan_hui/rl-model-based-reinforcement-learning-3c2b6f0aa323).
This equation simply tells the network to increase the probability of
more favourable moves based on rewards given to the network.
in the first three flips - a probability of 1/8. However, if a fourth flip is
required, you will always lose because a tails would always precede a
sequence of three heads (actually two since your adversary would win
at the second heads).
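The 1/8 figure can be checked by simulation. The sketch below assumes a setup that is only implied by the passage: you are waiting for three heads in a row (HHH), your adversary is waiting for tails followed by two heads (THH), and a fair coin is flipped until one of the two sequences appears. The function name, seed and number of games are illustrative.

```python
import random


def first_to_appear(seq_a, seq_b, rng):
    """Flip a fair coin until seq_a or seq_b appears; return the one that did."""
    flips = ""
    while True:
        flips += rng.choice("HT")
        if flips.endswith(seq_a):
            return seq_a
        if flips.endswith(seq_b):
            return seq_b


rng = random.Random(0)  # fixed seed so the run is repeatable
n_games = 100_000

# Assumed sequences: "HHH" for you, "THH" for the adversary.
wins = sum(first_to_appear("HHH", "THH", rng) == "HHH" for _ in range(n_games))
win_rate = wins / n_games
print(win_rate)
```

Under these assumptions the observed win rate should sit close to 1/8 = 0.125, matching the argument that HHH can only win if it appears in the first three flips.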