Unit 4 & Unit 5
Nonparametric Statistics
Nonparametric Test Procedures
Advantages of Nonparametric Tests
Disadvantages of Nonparametric Tests
1. May waste information: a parametric model is more efficient
if the data permit
2. Difficult to compute by hand for large samples
3. Tables not widely available
Nonparametric Tests
One Sample and Two Sample Tests
K-Sample Tests (K ≥ 3)
1. Kruskal-Wallis Test - H
McNemar Test
Appropriate for Nominal Data
Relates to measurements taken before and after on one and the
same sample
PROCEDURE
1. Set up the null hypothesis that there is no change in
people's attitudes as a result of the treatment
2. Classify the data into the following 4 categories
I.
3. Calculate the test statistic, which is only a transformation
of χ²:
Example 1:
A researcher attempts to determine if a drug has an effect on a
particular disease. Counts of individuals are given in the table, with
the diagnosis (disease: present or absent) before treatment given
in the rows, and the diagnosis after treatment in the columns. The
test requires the same subjects to be included in the before-and-after
measurements. Test whether there is a significant effect of the
treatment.
                   After: present   After: absent   Row total
Before: present    101              121             222
Before: absent     59               33              92
Column total       160              154             314
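The test statistic itself did not survive on the slide, so the following is only a sketch under a stated assumption: it uses the common continuity-corrected McNemar form, chi-square = (|b - c| - 1)^2 / (b + c), where b and c are the two cells in which the diagnosis changed. The names `b`, `c` and the constant `CRITICAL_5PCT_DF1` are illustrative labels, not the slide's notation.

```python
# Cells of the table in which the diagnosis changed (labels b and c are assumed).
b = 121  # before: present, after: absent
c = 59   # before: absent,  after: present

# Continuity-corrected McNemar statistic (assumed form, 1 degree of freedom).
chi2_corrected = (abs(b - c) - 1) ** 2 / (b + c)

# Uncorrected variant, shown only for comparison.
chi2_uncorrected = (b - c) ** 2 / (b + c)

# Tabulated chi-square critical value for 1 d.f. at the 5% level.
CRITICAL_5PCT_DF1 = 3.841

print(f"corrected   chi-square = {chi2_corrected:.3f}")
print(f"uncorrected chi-square = {chi2_uncorrected:.3f}")
if chi2_corrected > CRITICAL_5PCT_DF1:
    print("Reject H0: the treatment appears to have an effect")
else:
    print("Do not reject H0")
```

Only the two "changed" cells enter the statistic; the 101 and 33 subjects whose diagnosis stayed the same carry no information about a shift.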
Critical Values of Chi-square Distribution
Run Test
Run Test is used to know whether the observations in a
given series can be regarded as random.
For example: Queue at a bus-stop
MFMFMFMFMFMFMFMFMFMF
Another sequence:
MMMMMMMFFFFFFF
Data is converted into runs
M/F/M/F/M/F/M/F/M/F/M/F/M/F/M/F/M/F/M/F
MMMMMMM / FFFFFFF
Example 1:
Marks of 15 students are
55,52,43,49,36,61,44,47,67,78,63,57,41,28 and 50
Taking deviations from 50 (a = 50 or above, b = below 50)
aa bbb a bb aaaa bb a
PROCEDURE (Small Sample <20 )
n1 = Number of occurrences of type one (say 8 a)
n2 = Number of occurrences of type two (say 7 b)
r = Total number of runs (7)
If the observed number of runs (r) lies between the two critical values,
accept the null hypothesis; otherwise reject it.
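As a check on the counts above, here is a minimal sketch that converts the 15 marks into a/b symbols and counts the runs. It assumes, as the slide's own a/b string implies, that a mark of exactly 50 is coded as "a". The critical values still have to be read from a runs-test table.

```python
from itertools import groupby

marks = [55, 52, 43, 49, 36, 61, 44, 47, 67, 78, 63, 57, 41, 28, 50]
cutoff = 50

# Code each mark: "a" if it is at or above the cutoff, "b" if below (assumed rule).
symbols = ["a" if m >= cutoff else "b" for m in marks]

# groupby collapses consecutive identical symbols, so each group is one run.
runs = ["".join(group) for _, group in groupby(symbols)]

n1 = symbols.count("a")  # occurrences of type one
n2 = symbols.count("b")  # occurrences of type two
r = len(runs)            # total number of runs

print(" ".join(runs))
print(f"n1 = {n1}, n2 = {n2}, r = {r}")
```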
PROCEDURE (Large Sample >20 )
If either n1 or n2 is greater than 20, the sample is said to be large
and the sampling distribution of the number of runs (r) is approximated
by the normal distribution. The mean (μ_r) and standard error (σ_r) of
the r statistic, and the resulting z value, are:

$$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

$$\sigma_r = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$

$$z = \frac{r - \mu_r}{\sigma_r}$$
Rejection regions: α/2 = 0.025 in each tail, with critical values
-z.025 = -1.96 and z.025 = 1.96
Example 2:
On a commuter train, the conductor wishes to see whether
the passengers enter the train at random. He observes the
first 30 people, with the following sequence of males (M)
and females (F):
FFFFMMFFFFMFMMFFFFFFMMFFFFFMMM
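The large-sample procedure can be sketched on this sequence as follows. This is an illustration under the usual normal approximation for the number of runs (mean 2n1n2/(n1+n2) + 1), at the 5% level with a two-tailed test. The variable names are mine.

```python
from itertools import groupby
from math import sqrt

sequence = "FFFFMMFFFFMFMMFFFFFFMMFFFFFMMM"

n1 = sequence.count("F")
n2 = sequence.count("M")
r = sum(1 for _ in groupby(sequence))  # each group of equal letters is one run

# Mean and standard error of r under the hypothesis of randomness.
mu_r = 2 * n1 * n2 / (n1 + n2) + 1
sigma_r = sqrt(
    (2 * n1 * n2) * (2 * n1 * n2 - n1 - n2)
    / ((n1 + n2) ** 2 * (n1 + n2 - 1))
)
z = (r - mu_r) / sigma_r

Z_CRITICAL = 1.96  # two-tailed, alpha = 0.05

print(f"n1 = {n1}, n2 = {n2}, r = {r}")
print(f"mu_r = {mu_r:.3f}, sigma_r = {sigma_r:.3f}, z = {z:.3f}")
print("reject H0" if abs(z) > Z_CRITICAL else "do not reject H0")
```

With these counts z falls inside the interval from -1.96 to 1.96, so under the stated assumptions the hypothesis of random boarding would not be rejected.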
Example 3:
OOOUOOUOUUOOUUOOOOUUOUUOOO
UUUOOOOUUOOUUUOUUOOUUUUUOO
OUOUUOOOUOOOOUUUOUUOOOUOOU
UOUOOUUUOUUOOOOUUUOOO
Mann-Whitney U Test
Procedure
Tip: U1 + U2 = n1n2
Take the smaller of U1 and U2 and compare it with the table value.
If the calculated value is less than or equal to the table value, reject
H0.
Example
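Mean and standard error of U:

$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{(n_1)(n_2)(n_1 + n_2 + 1)}{12}}$$

$$z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{(n_1)(n_2)(n_1 + n_2 + 1)}{12}}}$$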
Example 2:
A survey is conducted to test the difference between 2
alternative methods of teaching. A sample of 20 students
is selected at random. Two groups of 10 students each, of
equal ability, are formed and taught by different methods.
A standardized test is then given to both the groups and
the following marks (out of 100) are scored by the 10
students in each group:
Group A: 40,45,48,46,52,58,72,85,67,73
Group B: 42,68,45,64,85,78,87,62,84,90
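A minimal sketch of the U computation for these two groups is given below. It assumes the standard Mann-Whitney form U1 = n1n2 + n1(n1+1)/2 - R1 (and likewise for U2), with tied marks given the average of the ranks they occupy. The helper `average_rank` is a hypothetical name introduced for this illustration.

```python
from math import sqrt

group_a = [40, 45, 48, 46, 52, 58, 72, 85, 67, 73]
group_b = [42, 68, 45, 64, 85, 78, 87, 62, 84, 90]

pooled_sorted = sorted(group_a + group_b)


def average_rank(value):
    """1-based rank of value in the pooled sample; ties share the mean rank."""
    first = pooled_sorted.index(value) + 1
    ties = pooled_sorted.count(value)
    return first + (ties - 1) / 2


n1, n2 = len(group_a), len(group_b)
r1 = sum(average_rank(x) for x in group_a)  # rank sum of Group A
r2 = sum(average_rank(x) for x in group_b)  # rank sum of Group B

# Assumed standard formulas for the two U values.
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)  # the value compared with the table

# Large-sample normal approximation, shown for completeness.
z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(f"R1 = {r1}, R2 = {r2}")
print(f"U1 = {u1}, U2 = {u2}, U = {u}, z = {z:.3f}")
```

The check U1 + U2 = n1n2 holds here (69 + 31 = 100), which is a quick way to catch ranking mistakes before consulting the table.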
Kolmogorov-Smirnov Test D
Like the Chi-Square test, this test is used to find out
whether an empirical distribution agrees with an assumed
theoretical one, or whether two samples may reasonably be
regarded as coming from the same population.
Procedure:
1. Calculate cumulative frequencies for each class in
respect of both observed and theoretical categories
2. Convert the cumulative frequency of each class into
proportion in respect of both categories
3. Compute the difference between the observed and
theoretical proportions, ignoring the plus or minus sign
4. Compare the maximum difference (Dn) with the critical
value in the table at the desired level of significance.
If the calculated value is less than the table value, accept H0; if it
is greater, reject H0
Example 1:
A manufacturer of readymade garments
conducts a market survey to know the
choice of brands A,B,C and D of 100
prospective customers. The results show:
A=20, B=30, C=18, D=32
Find out if the customers have any distinct
brand preference.
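The four procedure steps can be sketched on this example as follows. Two things here are assumptions rather than part of the slide: the theoretical distribution is taken to be equal preference (25 customers per brand), and the 5% critical value uses the large-sample approximation 1.36 / sqrt(n) instead of a table lookup.

```python
from itertools import accumulate
from math import sqrt

observed = {"A": 20, "B": 30, "C": 18, "D": 32}
n = sum(observed.values())

# Assumed null hypothesis: no brand preference, so equal expected counts.
expected = {brand: n / len(observed) for brand in observed}

# Steps 1 and 2: cumulative frequencies converted to proportions.
obs_cum = [total / n for total in accumulate(observed.values())]
exp_cum = [total / n for total in accumulate(expected.values())]

# Step 3: absolute differences; Dn is the largest of them.
differences = [abs(o - e) for o, e in zip(obs_cum, exp_cum)]
d_n = max(differences)

# Step 4: compare with the critical value (large-sample approximation, 5% level).
critical = 1.36 / sqrt(n)

print(f"Dn = {d_n:.2f}, critical value = {critical:.3f}")
print("accept H0" if d_n < critical else "reject H0")
```

Under these assumptions Dn is well below the critical value, so the data would not indicate a distinct brand preference.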
Example 2:
A sample of 26 male patients and another 25 female patients
suffering from respiratory T.B. is randomly selected. The
following table gives the frequency distributions of these
samples according to age:
Age         Male       Female
0-5
5-15
15-25
25-35
35-45
45-55
55-65
Above 65
            N1 = 26    N2 = 25
Use K-S test to test the hypothesis that the age distribution of
males and females is the same.
Sign Test
This test is used in the study of both single and paired
samples.
Procedure:
1. The differences between the sample items and the
hypothesized mean/median are computed and
expressed as plus and minus signs. If a zero
difference is found, it is ignored and the sample size is
correspondingly reduced.
2. The number of times the less frequent sign (plus or minus)
occurs among the differences is counted as the number of
successes and denoted as r.
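The two steps can be sketched in code. Purely for illustration, this reuses the 15 marks from the run test example with a hypothesized median of 50. That pairing is my assumption, not an example from the slides. The p-value comes from the binomial distribution with p = 0.5, which is the usual basis of the sign test.

```python
from math import comb

marks = [55, 52, 43, 49, 36, 61, 44, 47, 67, 78, 63, 57, 41, 28, 50]
hypothesized_median = 50  # assumed value, chosen only for illustration

# Step 1: signed differences; zero differences are dropped, reducing n.
diffs = [m - hypothesized_median for m in marks if m != hypothesized_median]
n = len(diffs)
plus = sum(1 for d in diffs if d > 0)
minus = n - plus

# Step 2: r is the count of the less frequent sign.
r = min(plus, minus)

# Probability of r or fewer successes when each sign has probability 0.5.
p_one_tail = sum(comb(n, k) for k in range(r + 1)) / 2 ** n
p_two_tail = min(1.0, 2 * p_one_tail)

print(f"n = {n}, plus = {plus}, minus = {minus}, r = {r}")
print(f"two-tailed p-value = {p_two_tail:.3f}")
```

The mark of exactly 50 is discarded, so the effective sample size drops from 15 to 14.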
13, 16, 11, 12, 11, 11, 11, 10, 13, 13, 10, 13,
12, 14, 12, 15, 15, 12, 11, 19, 14, 12, 10
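$$\mu_T = \frac{n(n+1)}{4}, \qquad \sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$

$$z = \frac{T - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

where T is the signed-rank sum being tested.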
10
Before   61   62   55   62   59   74   62   57   64   62
After    59   63   52   54   59   70   67   65   59   71
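Assuming the Before/After values are paired observations meant for a signed-rank analysis (the accompanying question text is missing), a sketch of the calculation could look like this. Zero differences are dropped and tied absolute differences share the average rank. The helper `average_rank` and the symbol names are mine.

```python
from math import sqrt

before = [61, 62, 55, 62, 59, 74, 62, 57, 64, 62]
after = [59, 63, 52, 54, 59, 70, 67, 65, 59, 71]

# Paired differences, with zero differences discarded.
diffs = [b - a for b, a in zip(before, after) if b != a]
n = len(diffs)

abs_sorted = sorted(abs(d) for d in diffs)


def average_rank(value):
    """1-based rank of an absolute difference; ties share the mean rank."""
    first = abs_sorted.index(value) + 1
    ties = abs_sorted.count(value)
    return first + (ties - 1) / 2


t_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
t_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
t = min(t_plus, t_minus)

# Normal approximation; with n this small a table of critical values
# would normally be preferred, so z is shown only to illustrate the formula.
mu_t = n * (n + 1) / 4
sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t - mu_t) / sigma_t

print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, T = {t}")
print(f"mu_T = {mu_t}, sigma_T = {sigma_t:.3f}, z = {z:.3f}")
```

A quick consistency check is that T+ and T- add up to n(n+1)/2, here 45.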
3. Calculate H:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

where
n1, n2, ..., nk are the number of items in each of the K
samples.
N = n1 + n2 + ... + nk
R1, R2, ..., Rk are the sums of the ranks given to the
observations in each sample.
4. H is approximately distributed as χ². The calculated
value should be compared with the table value of χ² with
(K-1) degrees of freedom at the desired level of significance.
If the calculated value is less than the table value, accept H0.
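A sketch of the H calculation is shown below, using the standard Kruskal-Wallis form H = 12 / (N(N+1)) * sum(Ri^2 / ni) - 3(N+1) without a tie correction. Purely for illustration it reuses the Group A and Group B marks from the teaching-methods example, so K = 2 here. That choice of data is my assumption, and the same code accepts any number of samples.

```python
samples = {
    "Group A": [40, 45, 48, 46, 52, 58, 72, 85, 67, 73],
    "Group B": [42, 68, 45, 64, 85, 78, 87, 62, 84, 90],
}

pooled_sorted = sorted(x for values in samples.values() for x in values)
N = len(pooled_sorted)


def average_rank(value):
    """1-based rank in the pooled sample; tied values share the mean rank."""
    first = pooled_sorted.index(value) + 1
    ties = pooled_sorted.count(value)
    return first + (ties - 1) / 2


# Rank sum Ri for each sample.
rank_sums = {
    name: sum(average_rank(x) for x in values)
    for name, values in samples.items()
}

# H statistic (no tie correction applied in this sketch).
h = 12 / (N * (N + 1)) * sum(
    rank_sums[name] ** 2 / len(values) for name, values in samples.items()
) - 3 * (N + 1)

degrees_of_freedom = len(samples) - 1
CRITICAL_5PCT_DF1 = 3.841  # chi-square table value for 1 d.f., 5% level

print(f"rank sums = {rank_sums}")
print(f"H = {h:.4f} with {degrees_of_freedom} d.f.")
print("accept H0" if h < CRITICAL_5PCT_DF1 else "reject H0")
```

For a different number of samples the critical value must be looked up again for K-1 degrees of freedom.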
Example 1: School children taking coaching in three private
A School
B School
C School
33
32
55
38
15
68
39
87
27
48
32
38
58
22
46
70
63
52
61
56
76
41
57
45
44
10
49
Unit 5
Multiple Regression
Multiple regression equation involving two
Annual Savings (x1)    Annual Income (x2)    Family Size (x3)
10
16
13
10
21
10
13
Factor Analysis
Cluster Analysis
Discriminant Analysis
The objectives of discriminant analysis are as follows:
1. Development of discriminant functions, or linear combinations
of the predictor or independent variables, which will best
discriminate between the categories of the criterion or
dependent variable (groups).
2. Examination of whether significant differences exist among
the groups, in terms of the predictor variables.
Conjoint Analysis
Conjoint
is used to identify:
Image measurement
Market Segmentation
New Product Development
Assessing Advertising Effectiveness
Pricing Analysis