Bayesian Statistics Probabilities Are Subjective


Bayesian statistics

So far we have thought of probabilities as the long-run
"success frequency": #successes / #trials → P(success).

In Bayesian statistics, probabilities are subjective!


Examples
* Probability that two companies merge
* Probability that a stock goes up
* Probability that it rains tomorrow

We typically want to make inference about a parameter θ, for
example µ, σ² or π. How is this done using subjective
probabilities?

1 lecture 8
Bayesian idea: We describe our "knowledge" about the
parameter of interest, θ, in terms of a distribution π(θ). This
is known as the prior distribution (or just the prior), as it
describes the situation before we see any data.

Example: Assume θ is the probability of success.


Prior distributions describing what value we think θ has:

Posterior
Let x denote our data. The conditional distribution of θ
given the data x is called the posterior distribution:

    π(θ | x) = f(x | θ) π(θ) / g(x)

Here f(x | θ) specifies how the data are distributed
conditional on θ, and g(x) is the marginal distribution of x.
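As a concrete sketch of the formula above, Bayes' rule can be applied on a small discrete grid of candidate θ values. The grid, prior weights and likelihood values below are illustrative assumptions, not from the slides:

```python
# Bayes' rule on a discrete grid of candidate theta values.
# All numbers here are illustrative, not from the lecture.

def posterior(prior, likelihood):
    """posterior_i = likelihood_i * prior_i / g, where g = sum_j likelihood_j * prior_j."""
    g = sum(l * p for l, p in zip(likelihood, prior))  # marginal g(x)
    return [l * p / g for l, p in zip(likelihood, prior)]

prior = [0.1, 0.2, 0.3, 0.4]           # pi(theta) for theta in {0.2, 0.4, 0.6, 0.8}
likelihood = [0.20, 0.35, 0.30, 0.15]  # f(x | theta) for the observed data x
post = posterior(prior, likelihood)
print(post)  # the posterior weights sum to 1
```

Dividing by g(x) is exactly what makes the posterior weights sum to one.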

Example:
Let x denote the number of successes in n trials.
Conditional on θ, x follows a binomial distribution:

    f(x | θ) = (n choose x) θ^x (1 − θ)^(n−x)
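A minimal sketch of evaluating this likelihood, with illustrative values of n, x and θ:

```python
# Binomial likelihood f(x | theta) = C(n, x) * theta^x * (1 - theta)^(n - x).
from math import comb

def binom_likelihood(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

n, x = 10, 3  # illustrative data
for theta in (0.1, 0.3, 0.5):
    print(f"theta={theta}: f(x|theta)={binom_likelihood(x, n, theta):.4f}")
```

The likelihood is largest at θ = x/n = 0.3.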
Posterior – some data
We now observe n = 10 experiments with x = 3 successes, i.e.
x/n = 0.3.
The posterior distributions describe our "knowledge" after
having seen the data.

[Figure: shaded area = prior distribution; solid line = posterior distribution]
Notice that the posteriors are moving towards 0.3.
Posterior – some data
We now observe n = 100 experiments with x = 30 successes,
i.e. x/n = 0.3.
The posterior distributions describe our "knowledge" after
having seen the data.

[Figure: shaded area = prior distribution; solid line = posterior distribution]
Notice that the posteriors are almost identical: with this
much data, the prior has little influence.
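This effect can be sketched numerically with the conjugate Beta update (the two priors Beta(2, 8) and Beta(8, 2) are illustrative assumptions, not the ones plotted on the slides):

```python
# Posterior mean under a Beta(alpha, beta) prior with x successes in n trials.
# The posterior is Beta(alpha + x, beta + n - x), with mean (alpha + x) / (alpha + beta + n).
def posterior_mean(alpha, beta, x, n):
    return (alpha + x) / (alpha + beta + n)

for x, n in [(3, 10), (30, 100)]:
    m_low = posterior_mean(2, 8, x, n)   # prior with mean 0.2
    m_high = posterior_mean(8, 2, x, n)  # prior with mean 0.8
    print(f"n={n}: posterior means {m_low:.3f} vs {m_high:.3f}")
```

With n = 10 the two posterior means are 0.250 and 0.550; with n = 100 they shrink to 0.291 and 0.345, so the data increasingly dominate the prior.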
Mathematical details
The prior is given by a so-called Beta distribution with
parameters α > 0 and β > 0:

    π(θ) = [Γ(α + β) / (Γ(α) Γ(β))] θ^(α−1) (1 − θ)^(β−1)   for 0 ≤ θ ≤ 1

The posterior then becomes

    π(θ | x) = [Γ(α + β + n) / (Γ(α + x) Γ(β + n − x))] θ^(α+x−1) (1 − θ)^(β+n−x−1)

i.e. a Beta distribution with parameters α + x and β + n − x.
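This conjugacy can be checked numerically: computing the unnormalised posterior f(x | θ)π(θ) on a fine grid and normalising it should reproduce the Beta(α + x, β + n − x) density. A sketch, assuming an illustrative Beta(2, 2) prior and the n = 10, x = 3 data:

```python
from math import comb, gamma

def beta_pdf(theta, a, b):
    # Beta density: Gamma(a+b) / (Gamma(a) Gamma(b)) * theta^(a-1) * (1-theta)^(b-1)
    return gamma(a + b) / (gamma(a) * gamma(b)) * theta**(a - 1) * (1 - theta)**(b - 1)

alpha, beta, n, x = 2.0, 2.0, 10, 3             # illustrative prior and data
grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints in (0, 1)
unnorm = [comb(n, x) * t**x * (1 - t)**(n - x) * beta_pdf(t, alpha, beta) for t in grid]
g = sum(unnorm) * 0.001                         # numerical integral, approximates g(x)
post_grid = [u / g for u in unnorm]             # normalised posterior on the grid
post_exact = [beta_pdf(t, alpha + x, beta + n - x) for t in grid]
print(max(abs(a - b) for a, b in zip(post_grid, post_exact)))  # close to 0
```

The grid posterior and the exact Beta(α + x, β + n − x) density agree up to the small error of the numerical integration.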

