Statistics for Decision Making: Dr. Rohit Joshi, IIM Shillong
PGP 17-19
Let us see some situations
Election Contestants
Operations Manager
Marketing Research
Collect data
ex. Survey
Present data
ex. Tables and graphs
Characterize data
ex. Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
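As a small illustration of characterizing data, the sketch below computes a sample mean directly from the formula (sum of the observations divided by n). The weights are hypothetical values made up for the example, not data from the course.

```python
from statistics import mean

# Hypothetical survey responses (weights in kg); illustrative values only.
weights_kg = [62.0, 70.5, 58.0, 66.5, 68.0]

# Sample mean by the formula: sum of the observations divided by n.
n = len(weights_kg)
sample_mean = sum(weights_kg) / n

# The standard library's mean() should agree with the hand computation.
library_mean = mean(weights_kg)

print(f"n = {n}, sample mean = {sample_mean:.2f} kg, library mean = {library_mean:.2f} kg")
```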
Inferential Statistics
Estimation
ex. Estimate the
population mean weight
using the sample average
weight
Hypothesis testing
ex. Test the claim that
the population mean
weight is 65 Kg
Drawing conclusions and/or making decisions
concerning a population based on sample results.
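The two inferential tasks above can be sketched in a few lines. The sample is hypothetical, and the one-sample t statistic used here is one common choice for testing a claim about a mean; the slides do not specify a particular test, so treat it as an assumption.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of weights in kg (illustrative values only).
sample_kg = [61.0, 67.5, 64.0, 70.0, 66.5, 63.0, 68.5, 65.5]

# Estimation: the sample average is the point estimate of the population mean.
n = len(sample_kg)
x_bar = mean(sample_kg)

# Hypothesis testing: compare the estimate with the claimed mean of 65 kg.
# ASSUMPTION: a one-sample t statistic is used as the test statistic.
mu_0 = 65.0
s = stdev(sample_kg)            # sample standard deviation (n - 1 divisor)
std_error = s / sqrt(n)         # standard error of the sample mean
t_stat = (x_bar - mu_0) / std_error

print(f"point estimate = {x_bar:.2f} kg, t statistic = {t_stat:.3f}")
```

A t statistic far from zero would count as evidence against the claim; how far is "far" depends on the chosen significance level and the n - 1 degrees of freedom.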
Basic Vocabulary of Statistics
VARIABLE
A variable is a characteristic of an item or individual.
DATA
Data are the different values associated with a variable.
POPULATION
A population consists of all the items or individuals about which you want to draw a
conclusion.
SAMPLE
A sample is the portion of a population selected for analysis.
PARAMETER
A parameter is a numerical measure that describes a characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a characteristic of a sample.
Population vs. Sample
[Figure: a population and a sample drawn from it]
Data
  Categorical (defined categories)
    Examples: Marital Status, Political Party, Eye Color
  Numerical
    Discrete (counted items)
      Examples: Number of Children, Defects per hour
    Continuous (measured characteristics)
      Examples: Weight, Voltage
Levels of Measurement
Nominal
Ordinal
Interval
Ratio
Nominal Scale
Lowest level of data measurement
Classifies data into distinct categories in which no
ranking is implied
Used to classify or categorize
A survey in the healthcare industry
Many changes continue to occur in the healthcare
industry. Because of increased competition for
patients among providers and the need to
determine how providers can better serve their
clientele, hospital administrators sometimes mail a
quality satisfaction survey to their patients after
the patient is released. The following types of
questions are sometimes asked on such a survey.
1. How long ago were you released from the hospital?
2. Which type of unit were you in for most of your stay?
   1. Intensive care
   2. Maternity care
   3. Medical unit
   4. Pediatric/children's unit
   5. Surgical unit
3. In choosing a hospital, how important was the hospital
   location? (Circle one)
   Very important / Somewhat important / Not very important / Not at all important
4. Rate the skill of the doctor:
   Excellent / Very good / Good / Fair / Poor
5. On the following scale from one to seven, rate the
   nursing care:
   Poor 1 2 3 4 5 6 7 Excellent
These questions will result in what level of data measurement?
Let us jump to SPSS
Probability
Empirical classical probability
Based on historical data
Computed after performing the experiment
Number of times an event occurred divided by the number of
trials
Objective -- everyone correctly using the method assigns an
identical probability
Subjective probability
different individuals may (correctly) assign different numeric
probabilities to the same event
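Empirical probability (number of times an event occurred divided by the number of trials) can be illustrated with a simulated experiment. This is only a sketch: the die rolls come from a seeded pseudo-random generator rather than historical data, and `empirical_probability` is a helper name invented for the example.

```python
import random

def empirical_probability(event, trials, rng):
    """Relative frequency: times the event occurred / number of trials."""
    hits = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return hits / trials

# A fixed seed makes the simulated "historical data" reproducible.
rng = random.Random(42)

# Estimate the probability of rolling a six from 60,000 simulated rolls.
p_six = empirical_probability(lambda face: face == 6, 60_000, rng)

print(f"empirical P(six) = {p_six:.4f}, theoretical value = {1/6:.4f}")
```

Anyone who applies the same method to the same recorded trials gets the same number, which is the sense in which the method is objective.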
Mutually Exclusive event
Collectively Exhaustive event
Equally Likely event
Random Variable
A random variable x takes on a defined set
of values with different probabilities.
For example, if you roll a die, the outcome is random
(not fixed) and there are 6 possible outcomes, each of
which occurs with probability one-sixth.
For example, if you poll people about their voting
preferences, the percentage of the sample that responds
“Yes on Proposition 100” is also a random variable
(the percentage will be slightly different every time
you poll).
[Figure: bar chart of p(x) against x = 1, 2, ..., 6; every bar has height 1/6]
$\sum_{\text{all } x} P(x) = 1$
Probability mass function (pmf)
x       p(x)
1       p(x=1) = 1/6
2       p(x=2) = 1/6
3       p(x=3) = 1/6
4       p(x=4) = 1/6
5       p(x=5) = 1/6
6       p(x=6) = 1/6
Total   1.0
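The pmf of a fair die can be written down as a small lookup table. The sketch below uses exact fractions so that the check that the probabilities sum to 1 is not affected by floating-point rounding.

```python
from fractions import Fraction

# pmf of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# A valid pmf has non-negative probabilities that sum to exactly 1.
total = sum(pmf.values())
is_valid = all(p >= 0 for p in pmf.values()) and total == 1

for face, p in pmf.items():
    print(f"p(x={face}) = {p}")
print(f"sum over all x = {total}, valid pmf: {is_valid}")
```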
Cumulative distribution function (CDF)
[Figure: step plot of cumulative P(x) against x = 1, 2, ..., 6, rising in steps of 1/6 from 1/6 at x = 1 to 1.0 at x = 6]
Cumulative distribution function
x P(x≤A)
1 P(x≤1)=1/6
2 P(x≤2)=2/6
3 P(x≤3)=3/6
4 P(x≤4)=4/6
5 P(x≤5)=5/6
6 P(x≤6)=6/6
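The CDF is the running total of the pmf. A minimal sketch for the fair die, again with exact fractions:

```python
from fractions import Fraction
from itertools import accumulate

faces = list(range(1, 7))
pmf = [Fraction(1, 6)] * 6          # p(x) for x = 1..6

# P(x <= A) is the sum of p(x) over all x up to and including A.
cdf = dict(zip(faces, accumulate(pmf)))

for face, cumulative in cdf.items():
    print(f"P(x <= {face}) = {cumulative}")
```

The fractions print in lowest terms, so 2/6 appears as 1/3 and 6/6 as 1.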
Practice Problem:
The number of patients seen in the ER in any given hour is
a random variable represented by x. The probability
distribution for x is:
x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1
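$f(x) = e^{-x}$
$\int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = 0 + 1 = 1$
For example, the probability of x falling within 1 to 2:
$P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left[-e^{-x}\right]_1^2 = -e^{-2} + e^{-1} = -0.135 + 0.368 = 0.23$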
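The probability that x falls between 1 and 2 under f(x) = e^(-x) can be checked numerically. The sketch below compares a simple midpoint-rule integration (a technique chosen here for illustration, not taken from the slides) with the closed form e^(-1) - e^(-2).

```python
from math import exp

def pdf(x):
    """Negative exponential density f(x) = e^(-x), defined for x >= 0."""
    return exp(-x)

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

numeric = integrate(pdf, 1.0, 2.0)   # area under the curve between 1 and 2
exact = exp(-1) - exp(-2)            # closed-form value of the same integral

print(f"numeric = {numeric:.4f}, exact = {exact:.4f}")
```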
Expected Value and Variance
All probability distributions are
characterized by an expected value
(mean) and a variance (standard
deviation squared).
Expected value, formally
Discrete case:
$E(X) = \sum_{\text{all } x_i} x_i \, p(x_i)$
Continuous case:
$E(X) = \int_{\text{all } x} x \, p(x)\,dx$
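The discrete formula can be applied to the ER practice problem distribution (x = 10..14 patients per hour with probabilities .4, .2, .2, .1, .1). This is a sketch of the arithmetic only; the variance is computed as the probability-weighted squared deviation from the mean.

```python
# Distribution of the number of ER patients per hour, from the practice problem.
values = [10, 11, 12, 13, 14]
probs = [0.4, 0.2, 0.2, 0.1, 0.1]

# The probabilities of a valid distribution must sum to 1.
assert abs(sum(probs) - 1.0) < 1e-12

# E(X) = sum over all x of x * p(x)
expected = sum(x * p for x, p in zip(values, probs))

# Var(X) = sum over all x of (x - E(X))^2 * p(x)
variance = sum((x - expected) ** 2 * p for x, p in zip(values, probs))

print(f"E(X) = {expected:.2f} patients per hour, Var(X) = {variance:.2f}")
```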
A Situation
Acme Fruit and Vegetable Wholesalers buys tomatoes,
then sells them to retailers. Acme currently pays Rs. 2000
per container. Tomatoes sold on the same day bring Rs.
5000 per container. Tomatoes are extremely perishable: any
container not sold on the same day is worthless and must
be disposed of (assume at no cost). The distribution
manager’s problem is to determine the optimum number of
containers he should order each day. On days when he
stocks more than he sells, his profit is reduced by the
cost of the unsold containers.
On the other hand, when retailers request more
containers than he has in stock, he loses sales and
makes a smaller profit than he could have.
Developing Pay-off table
Acme currently pays Rs. 2000 per container. Tomatoes
sold on the same day bring Rs. 5000 per container. Profit
= Rs. 3000 per container sold.
Pay-off table (in Rs. '00)
                   ACTIONS (Quantity ordered Q)
EVENTS (Demand)    Q1 = 10   Q2 = 11   Q3 = 12   Q4 = 13
D1 = 10            300       280       260       240
D2 = 11            300       330       310       290
D3 = 12            300       330       360       340
D4 = 13            300       330       360       390
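Each cell of the pay-off table follows from one rule: revenue on the containers actually sold minus the cost of all containers ordered. A minimal sketch (the function and variable names are illustrative):

```python
COST_PER_CONTAINER = 2000    # Rs. paid by Acme per container
PRICE_PER_CONTAINER = 5000   # Rs. received per container sold the same day

def payoff(demand, ordered):
    """Profit in Rs.: unsold containers are worthless, unmet demand is lost."""
    sold = min(demand, ordered)
    return PRICE_PER_CONTAINER * sold - COST_PER_CONTAINER * ordered

quantities = [10, 11, 12, 13]   # actions: quantity ordered
demands = [10, 11, 12, 13]      # events: quantity demanded

# Pay-offs in Rs. '00, one row per demand level, one column per order quantity.
table = {d: [payoff(d, q) // 100 for q in quantities] for d in demands}

for d, row in table.items():
    print(f"D = {d}: {row}")
```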
The expected value (EV) of decision alternative $d_i$ is defined as:
$EV(d_i) = \sum_{j=1}^{N} P(s_j)\, V_{ij}$
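To see the expected-value rule at work on the Acme pay-off table, the sketch below weights each column of pay-offs by the probability of each demand level. The demand probabilities are hypothetical, chosen only for illustration; the slides do not give them.

```python
# Pay-offs in Rs. '00 from the pay-off table: one row per demand event,
# one column per order quantity (10, 11, 12, 13).
quantities = [10, 11, 12, 13]
payoffs = {
    10: [300, 280, 260, 240],
    11: [300, 330, 310, 290],
    12: [300, 330, 360, 340],
    13: [300, 330, 360, 390],
}

# ASSUMPTION: hypothetical probabilities P(s_j) for each demand level.
prob = {10: 0.2, 11: 0.3, 12: 0.3, 13: 0.2}

# EV(d_i) = sum over demand events j of P(s_j) * V_ij
expected_value = {
    q: sum(prob[d] * payoffs[d][col] for d in prob)
    for col, q in enumerate(quantities)
}

# The action with the largest expected pay-off under these probabilities.
best_q = max(expected_value, key=expected_value.get)

for q, ev in expected_value.items():
    print(f"EV(Q = {q}) = {ev:.1f}")
print(f"best order quantity under the assumed probabilities: {best_q}")
```

A different set of demand probabilities can change which order quantity comes out best.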
$P(X) = \binom{n}{X}\, p^{X} (1-p)^{n-X}$
where
X = number of successes out of n trials
p = probability of success
1 - p = probability of failure
Binomial distribution: example
$\binom{20}{10}(0.5)^{10}(0.5)^{10} = 0.176$
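Binomial distribution: example
If I toss a coin 20 times, what’s the probability of
getting 2 or fewer heads?
$P(X=0) = \binom{20}{0}(0.5)^{0}(0.5)^{20} = \frac{20!}{20!\,0!}(0.5)^{20} = 9.5 \times 10^{-7}$
$P(X=1) = \binom{20}{1}(0.5)^{1}(0.5)^{19} = \frac{20!}{19!\,1!}(0.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$
$P(X=2) = \binom{20}{2}(0.5)^{2}(0.5)^{18} = \frac{20!}{18!\,2!}(0.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$
$P(X \le 2) = 9.5 \times 10^{-7} + 1.9 \times 10^{-5} + 1.8 \times 10^{-4} \approx 2.0 \times 10^{-4}$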
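Both coin-toss examples can be checked with a few lines of code. This sketch uses `math.comb` from the standard library for the binomial coefficient; `binomial_pmf` is a helper name introduced for the example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Probability of exactly 10 heads in 20 tosses of a fair coin.
p_ten_heads = binomial_pmf(10, 20, 0.5)

# Probability of 2 or fewer heads: add the terms for X = 0, 1 and 2.
p_two_or_fewer = sum(binomial_pmf(x, 20, 0.5) for x in range(3))

print(f"P(X = 10) = {p_ten_heads:.3f}")
print(f"P(X <= 2) = {p_two_or_fewer:.2e}")
```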
All probability distributions are characterized
by an expected value and a variance:
If X follows a binomial distribution with parameters n
and p: X ~ Bin(n, p)
Mean: $\mu = E(x) = np$
Variance: $\sigma^2 = np(1-p)$
Standard deviation: $\sigma = \sqrt{np(1-p)}$
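As a sanity check, the shortcut formulas np and np(1 - p) can be compared with the mean and variance computed directly from the binomial pmf. The values n = 20 and p = 0.5 are reused from the coin-toss example; the rest is an illustrative sketch.

```python
from math import comb, sqrt

n, p = 20, 0.5

# Full binomial pmf for X = 0..n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Mean and variance from the general definitions of E(X) and Var(X).
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))

# Shortcut formulas for the binomial distribution.
mean_formula = n * p
var_formula = n * p * (1 - p)
sd_formula = sqrt(var_formula)

print(f"mean: {mean_from_pmf:.4f} vs np = {mean_formula:.4f}")
print(f"variance: {var_from_pmf:.4f} vs np(1-p) = {var_formula:.4f}")
print(f"standard deviation = {sd_formula:.4f}")
```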
$P(X) = \frac{\binom{A}{X}\binom{N-A}{n-X}}{\binom{N}{n}}$
Where
N = population size
A = number of successes in the population
N – A = number of failures in the population
n = sample size
X = number of successes in the sample
n – X = number of failures in the sample
The Hypergeometric Distribution
Example
3 different computers are checked from 10 in the
department. 4 of the 10 computers have illegal
software loaded. What is the probability that 2 of the 3
selected computers have illegal software loaded?
So, N = 10, n = 3, A = 4, X = 2
$P(X = 2) = \frac{\binom{A}{X}\binom{N-A}{n-X}}{\binom{N}{n}} = \frac{\binom{4}{2}\binom{6}{1}}{\binom{10}{3}} = \frac{(6)(6)}{120} = 0.3$
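The computer example can be reproduced with a direct translation of the hypergeometric formula. The helper name `hypergeom_pmf` and its argument order are choices made for this sketch.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains A successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# N = 10 computers, A = 4 with illegal software, n = 3 checked, X = 2 found.
p_two = hypergeom_pmf(2, N=10, A=4, n=3)

# The probabilities over every possible X (0..3) should sum to 1.
total = sum(hypergeom_pmf(x, N=10, A=4, n=3) for x in range(4))

print(f"P(X = 2) = {p_two:.1f}, sum over all X = {total:.1f}")
```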
The mean is:
$\mu = E(x) = \frac{nA}{N}$
The standard deviation is:
$\sigma = \sqrt{\frac{nA(N-A)}{N^2}} \cdot \sqrt{\frac{N-n}{N-1}}$
where $\sqrt{\frac{N-n}{N-1}}$ is called the “Finite Population Correction Factor”.
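Applying the mean and standard deviation formulas to the computer example (N = 10, A = 4, n = 3) gives a quick numeric check. The comparison against the pmf is an extra verification added in this sketch.

```python
from math import comb, sqrt

N, A, n = 10, 4, 3   # population size, successes in population, sample size

# Formulas for the hypergeometric mean and standard deviation.
mean_formula = n * A / N
fpc = sqrt((N - n) / (N - 1))                      # finite population correction
sd_formula = sqrt(n * A * (N - A) / N**2) * fpc

# Cross-check against the definitions, using the hypergeometric pmf.
pmf = [comb(A, x) * comb(N - A, n - x) / comb(N, n) for x in range(n + 1)]
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))

print(f"mean = {mean_formula:.2f} (from pmf: {mean_from_pmf:.2f})")
print(f"sd = {sd_formula:.4f} (from pmf: {sqrt(var_from_pmf):.4f})")
```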
$P(X) = \frac{e^{-\lambda}\lambda^{X}}{X!}$
where:
P(X) = the probability of X events in an area of opportunity
λ = expected number of events
e = mathematical constant approximated by 2.71828…
An example
Suppose that, on average, 5 cars enter a parking lot
per minute. What is the probability that in a given
minute, 7 cars will enter?
$P(7) = \frac{e^{-\lambda}\lambda^{X}}{X!} = \frac{e^{-5}\,5^{7}}{7!} = 0.104$
Mean = Variance = λ
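The parking-lot example and the "mean = variance = λ" property can both be checked numerically. In this sketch the infinite sums are truncated at 100 terms, an assumption that is harmless for λ = 5 because the remaining tail is negligible.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with expected count lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 5   # on average, 5 cars enter the parking lot per minute

# Probability that exactly 7 cars enter in a given minute.
p_seven = poisson_pmf(7, lam)

# Mean and variance from the pmf, truncating the sum at x = 99.
xs = range(100)
mean_from_pmf = sum(x * poisson_pmf(x, lam) for x in xs)
var_from_pmf = sum((x - mean_from_pmf) ** 2 * poisson_pmf(x, lam) for x in xs)

print(f"P(7) = {p_seven:.3f}")
print(f"mean = {mean_from_pmf:.4f}, variance = {var_from_pmf:.4f}")
```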