Unit 16 Analysis of Quantitative Data: Inferential Statistics Based on Non-Parametric Tests


Structure
16.1 Introduction
16.2 Objectives
16.3 Non-parametric Tests
16.4 Statistical Inference Based on Non-parametric Tests: Unrelated Samples
     16.4.1 The Chi Square (χ²) Test
     16.4.2 The Median Test
     16.4.3 The Mann-Whitney U Test
16.5 Statistical Inference Based on Non-parametric Tests: Related Samples
     16.5.1 The Sign Test
     16.5.2 The Wilcoxon Matched-Pairs Signed-Ranks Test
16.6 Statistical Inference Regarding Correlations Using Non-parametric Data
     16.6.1 Significance of Spearman's Rho (ρ) Correlation Coefficient
     16.6.2 Significance of Phi (φ) Correlation Coefficient
     16.6.3 Significance of Contingency Coefficient (C)
16.7 Let Us Sum Up
16.8 Unit-end Activities
16.9 Suggested Readings
16.10 Answers to Check Your Progress

16.1 INTRODUCTION
In the previous unit you learnt about the use of parametric tests in making inferences about the means computed from large and small samples. The use of Z and t tests was also explained for testing the significance of the difference between the means of two large and small samples. The application and use of analysis of variance and co-variance for testing the difference between the means of three or more samples were also discussed with the help of examples. The significance of Pearson's coefficient of correlation using Fisher's Z conversion was also explained, along with the use of the Z test for testing the significance of the difference between Pearson's coefficients of correlation computed from two samples.

In the use of parametric tests for making statistical inferences, we need to take into account certain assumptions about the nature of the population distribution, and also the type of measurement scale used to quantify the data. In this unit you will learn about another category of tests which do not make stringent assumptions about the nature of the population distribution. Tests in this category are called distribution-free or non-parametric tests. The use and application of various non-parametric tests involving unrelated and related samples will be explained in this unit. These include the chi-square test, median test, Mann-Whitney U test, sign test and Wilcoxon matched-pairs signed-ranks test.

Data Analysis and

Interpretation

16.2 OBJECTIVES
After studying this unit, you will be able to:
• explain the nature of non-parametric tests;
• state the use of non-parametric tests;
• draw statistical inferences pertaining to unrelated samples using the: (i) chi-square test; (ii) median test; and (iii) Mann-Whitney U test;
• make statistical inferences pertaining to related samples using the: (i) sign test; and (ii) Wilcoxon matched-pairs signed-ranks test; and
• test the statistical significance of Spearman's correlation coefficient, phi coefficient and contingency coefficient.

16.3 NON-PARAMETRIC TESTS


In the last unit you learnt that parametric tests are generally quite robust and remain useful even when some of their mathematical assumptions are violated. However, these tests are used only with data based upon ratio or interval measurements. In the case of counted or ranked data, we make use of non-parametric tests. It is argued that non-parametric tests have greater merit because their validity does not rest upon assumptions about the nature of the population distribution, assumptions that are frequently ignored or violated by researchers using parametric tests. It may be noted, however, that non-parametric tests are less precise and have less power than parametric tests.

Non-parametric tests do not make numerous or stringent assumptions about the nature of the population distribution, and hence they are called distribution-free tests. Non-parametric tests are used when:

1. The nature of the population from which samples are drawn is not known to be normal.
2. The variables are expressed in nominal form.
3. The data are measures which are ranked or expressed in numerical scores which have the strength of ranks.

16.4 STATISTICAL INFERENCE BASED ON NON-PARAMETRIC TESTS: UNRELATED SAMPLES

The non-parametric tests most frequently used in drawing statistical inferences in the case of unrelated or independent samples are: (i) the chi square test; (ii) the median test; and (iii) the Mann-Whitney U test. The use and application of these tests are discussed below.

16.4.1 The Chi Square (χ²) Test


The chi square test is applied only to discrete data, that is, data that are counted rather than measured. It is a test of independence and is used to estimate the likelihood that some factor other than chance accounts for the observed relationship. The chi square (χ²) is not a measure of the degree of relationship between the variables under study; the chi square test merely evaluates the probability that the observed relationship results from chance. The basic assumption, as in the case of other tests of statistical significance, is that the sample observations have been randomly selected. The formula for chi square (χ²) is:

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]

in which
fo = frequency of occurrence of observed or experimentally determined facts.
fe = expected frequency of occurrence.

To evaluate the significance of chi square, we enter Table E of chi square presented in the Appendix with the computed value of chi square and the appropriate number of degrees of freedom (df). The number of df = (r - 1)(c - 1), where r is the number of rows and c is the number of columns in which the data are tabulated. To illustrate the use of the formula, let us consider the following data based on the judgements of 350 judges. The judgements have been classified into five categories taken to represent a continuum of opinion:

Categories     I       II      III     IV      V       Total
Judgements     55      63      82      93      57      350

The hypothesis to be tested is the 'equal probability hypothesis', i.e. whether the judgements expressed in the five categories differ significantly or not. For this we have to compute the distribution of answers to be expected on the equality or null hypothesis. Since the total number of judgements is 350 and the number of categories is 5, the expected number of judgements in each category would be 350/5 = 70. The data in respect of observed (fo) and expected (fe) frequencies, along with the values (fo - fe), (fo - fe)² etc., can be arranged as under:

Categories                    I       II      III     IV      V       Total
Observed judgements (fo)      55      63      82      93      57      350
Expected judgements (fe)      70      70      70      70      70      350
(fo - fe)                     -15     -7      12      23      -13
(fo - fe)²                    225     49      144     529     169
(fo - fe)²/fe                 3.21    0.70    2.06    7.56    2.41    15.94

χ² = 3.21 + 0.70 + 2.06 + 7.56 + 2.41 = 15.94


The degrees of freedom for the table are calculated from the formula df = (r - 1)(c - 1) to be (5 - 1)(2 - 1) or 4.
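The computation laid out in the table can be checked with a short script. This is a minimal sketch in Python; the function name `chi_square` is an illustrative choice and not part of the unit.

```python
def chi_square(observed, expected):
    """Return the chi-square statistic: the sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Observed judgements in categories I to V, taken from the example.
observed = [55, 63, 82, 93, 57]

# Equal probability hypothesis: the judgements are spread evenly over the categories.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square(observed, expected)
print(round(chi2, 2))  # prints 15.94
```

The printed value is then compared with the tabled chi square for 4 degrees of freedom.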

Using the chi square Table E in the Appendix, we find in row df = 4 a χ² of 9.488 in the column headed .05. Since the obtained value of χ² = 15.94 is greater than the table value of 9.488, we reject the equal probability hypothesis and conclude that the judgements in the various categories differ significantly.

Suppose that instead of the 'hypothesis of equality' we wish to test the data expressed in the various judgement categories against the hypothesis of a normal distribution. In that case our hypothesis asserts that the judgement frequencies which we have observed really follow the normal distribution instead of being equally probable. Using the data of the above example, we have to find out how many of the 350 judgements may be expected to fall in each category on the hypothesis of a normal distribution. These are found by first dividing the base line of a normal curve (taken to extend over 6σ) into 5 equal segments of 1.20σ each. From the normal table (Table A) of the Appendix, the proportion of the normal distribution to be found in each of these segments would be as follows:

Between +3.00σ and +1.80σ = .0359
        +1.80σ and +0.60σ = .2384
        +0.60σ and -0.60σ = .4514
        -0.60σ and -1.80σ = .2384
        -1.80σ and -3.00σ = .0359

Fig. 16.1: Normal distribution of judgement frequencies.
These proportions of 350 have been calculated as 12.56, 83.44, 157.99, 83.44 and 12.56 and are entered in the row fe of the following table:

Categories        I         II        III       IV        V         Total
(fo)              55        63        82        93        57        350
(fe)              12.56     83.44     158.00    83.44     12.56     350
(fo - fe)         42.44     -20.44    -76       9.56      44.44
(fo - fe)²        1801.15   417.79    5776      91.39     1974.91
(fo - fe)²/fe     143.40    5.01      36.56     1.10      157.24    343.31
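The expected frequencies under the normal hypothesis follow directly from the five proportions. The sketch below, in Python, recomputes them and the resulting chi square; the variable names are illustrative. Because the expected frequencies here are not rounded before use, the result differs slightly in the decimals from the tabled figure.

```python
# Proportions of the normal curve in the five segments, as quoted from Table A in the text.
proportions = [0.0359, 0.2384, 0.4514, 0.2384, 0.0359]
observed = [55, 63, 82, 93, 57]
total = sum(observed)  # 350 judgements

# Expected frequency in each category = proportion x total number of judgements.
expected = [p * total for p in proportions]

# Chi square: the sum of (fo - fe)^2 / fe over the five categories.
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

print([round(fe, 2) for fe in expected])
print(round(chi2, 2))
```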


The value of χ² in Table E in the Appendix is 9.488 for df = 4 in the column headed .05, which is far less than the computed χ² value of 343.31. The difference between the observed and expected values is so great that the hypothesis of a normal distribution of the judgement categories must be rejected.

Let us now apply the chi square test to data which represent the number of boys and the number of girls who chose each of the three possible answers to an item on a personality inventory, to test whether the item differentiates significantly between boys and girls.

          Yes     No      Undecided   Total
Boys      14      66      10          90
Girls     27      66      7           100
Total     41      132     17          190

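For each observed frequency in the table, the expected frequency is computed by multiplying the corresponding row total by the column total and dividing by the grand total:

Row 1 (Boys):   (41 x 90)/190 = 19.42;    (132 x 90)/190 = 62.53;    (17 x 90)/190 = 8.05

Row 2 (Girls):  (41 x 100)/190 = 21.58;   (132 x 100)/190 = 69.47;   (17 x 100)/190 = 8.95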

The data in respect of observed and expected frequencies are arranged in the following table. The values in parentheses within the different cells are the expected frequencies.

Responses   Yes           No            Undecided    Total
Boys        14 (19.42)    66 (62.53)    10 (8.05)    90
Girls       27 (21.58)    66 (69.47)    7 (8.95)     100
Total       41            132           17           190
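The expected frequencies of a contingency table can be generated from its marginal totals. The following Python sketch, with illustrative variable names, applies the rule "row total x column total / grand total" to the boys and girls data and also computes the degrees of freedom.

```python
# Observed frequencies: rows are Boys and Girls; columns are Yes, No, Undecided.
observed = [
    [14, 66, 10],
    [27, 66, 7],
]

row_totals = [sum(row) for row in observed]          # [90, 100]
col_totals = [sum(col) for col in zip(*observed)]    # [41, 132, 17]
grand_total = sum(row_totals)                        # 190

# Expected frequency of a cell = (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Degrees of freedom = (r - 1)(c - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for row in expected:
    print([round(fe, 2) for fe in row])
print(df)
```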

The hypothesis to be tested is the null hypothesis, namely, that the item does not differentiate between the groups of boys and girls. Using the formula for χ²:

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 1.513 + 0.193 + 0.472 + 1.361 + 0.173 + 0.425 = 4.14 \]

The critical χ² values for 2 df given in Table E are 5.991 and 9.210 at the .05 and .01 levels of significance respectively. The χ² value of 4.14 computed from the frequencies tabulated above is lower than both of these values. The null hypothesis is therefore retained: these data do not show that the item of the personality inventory differentiates significantly between boys and girls.

Notes :a) Space is given below for your answer.

b) Compare your answer with the one given at the end of this unit.

The following judgements were classified into six categories taken to

.......................................................................................................................... ..........................................................................................................................
2. The following table represents the number of boys and the number of girls who chose each of the possible answers to an item in an attitude scale.

                                               Total
Boys      25      30      10      25      10   100
Girls     10      15      5       15      15   60

Do these data indicate a significant sex difference in attitude towards this question?

.......................................................................................................................... ..........................................................................................................................
16.4.2 The Median Test
The median test is used for testing whether two independent samples differ in central tendency. It gives information as to whether it is likely that two independent samples have been drawn from populations with the same median. It is particularly useful whenever the measurements for the two samples are expressed in an ordinal scale.

In using the median test, we first calculate the combined median for all measures (scores) in both samples. Then both sets of scores are dichotomized at the combined median and the data are set in a 2 x 2 table as presented below:
Table for Use of Median Test

                                                    Group I    Group II    Total
No. of measures (scores) above combined median      A          B           A+B
No. of measures (scores) below combined median      C          D           C+D
Total                                               A+C        B+D

Under the null hypothesis, we would expect about half of each group's scores to be above the combined median and about half to be below; that is, we would expect frequencies A and C to be about equal, and frequencies B and D to be about equal. In order to test this hypothesis, we calculate χ² using the following formula:

\[ \chi^2 = \frac{N\left(|AD - BC| - \frac{N}{2}\right)^2}{(A+B)(C+D)(A+C)(B+D)} \]

in which N = A + B + C + D is the total number of scores in both groups.

Let us illustrate the use of this formula with the help of the following example. Twenty male and fifteen female teacher educators of a teacher training institute were asked to express their attitude towards teacher education programmes offered through distance mode at the B.Ed. level. Both groups were administered an attitude scale and the common median attitude score was computed. The number of cases from both groups falling above and below the median score is shown in the following table:
Distribution of Male and Female Teacher Educators Below and Above the Common Median Attitude Score

                             Below Median    Above Median    Total
Female Teacher Educators     9               6               15
Male Teacher Educators       6               14              20
Total                        15              20              35

Using the formula for χ²:

\[ \chi^2 = \frac{35\left(|126 - 36| - 17.5\right)^2}{90000} = \frac{35(90 - 17.5)^2}{90000} = 2.044 \]

Since the obtained χ² value of 2.044 is less than 3.84, the critical value at the .05 level for df = 1, the null hypothesis is retained and we may conclude that there is no difference in the attitude of male and female teacher educators towards teacher education programmes at the B.Ed. level.
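The median test statistic for a 2 x 2 table can be sketched in Python as follows. The function name `median_test_chi_square` is illustrative; the formula is the corrected chi square used in the worked example, with N/2 subtracted from |AD - BC|.

```python
def median_test_chi_square(a, b, c, d):
    """Chi square for a 2 x 2 table, with N/2 subtracted from |AD - BC| as in the example."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Cell frequencies from the teacher educator example:
# female below = 9, female above = 6, male below = 6, male above = 14.
chi2 = median_test_chi_square(9, 6, 6, 14)
print(round(chi2, 3))  # prints 2.044
```

The statistic is unchanged if rows and columns are interchanged, so it does not matter whether the groups are laid out as rows or as columns.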

16.4.3 The Mann-Whitney U Test


The Mann-Whitney U test is more useful than the median test. It is a most useful alternative to the parametric t test when the parametric assumptions cannot be met and when the measurements are expressed in ordinal scale values. Suppose N1 is the number of individuals in one of the two independent groups and N2 the number of individuals in the other. In using the Mann-Whitney U test, we first combine the measures or scores from both groups and rank them in order of increasing size. In this ranking we have to consider the algebraic sign, that is, the lowest ranks are assigned to the largest negative numbers, if any. The ranks of each sample group are then summed individually and represented as ΣR1 and ΣR2. There are two U's, U1 and U2, which are calculated using the following formulas:

\[ U_1 = N_1 N_2 + \frac{N_1(N_1 + 1)}{2} - \sum R_1 \]

\[ U_2 = N_1 N_2 + \frac{N_2(N_2 + 1)}{2} - \sum R_2 \]

in which
N1 = number in one group
N2 = number in second group
ΣR1 = sum of ranks in one group
ΣR2 = sum of ranks in second group

The two U's are related by the equation:

\[ U_1 = N_1 N_2 - U_2 \]

Thus only one U needs to be calculated, for the other can be easily determined by this equation. The Z value of U can be computed by the formula:

\[ Z = \frac{U - \frac{N_1 N_2}{2}}{\sqrt{\frac{N_1 N_2 (N_1 + N_2 + 1)}{12}}} \]

It does not matter which U (the larger or the smaller) is used in the computation of Z. The sign of Z will depend on which U is used, but the numerical value will be identical.

The following example used by Koul (1997) illustrates the application of the Mann-Whitney U test. A researcher wished to evaluate the effectiveness of micro-teaching and simulation in developing certain teaching skills among student-teachers. Group A was trained through the micro-teaching approach and group B was trained through the simulation technique. After a period of two months' training, the student-teachers were rated on the teaching skills by supervisors. The rating scores of the student-teachers are given in Table 16.1:

Table 16.1: Scores of Student Teachers



All rating measures are ranked from lowest to highest, and the Mann-Whitney U test is used to test the null hypothesis at the .05 level of significance using the formulas for U1 and U2.

Using the equation U1 = N1N2 - U2, we check:

Since the obtained Z value of -1.61 does not exceed, in absolute value, the critical Z value of 1.96 at the .05 level, the null hypothesis is retained. It may be concluded that the data show no significant difference between the micro-teaching approach and the simulation technique in developing certain teaching skills among student-teachers.
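The ranking and the computation of U1, U2 and Z can be sketched in Python as below. The rank-sum forms used are written out in the code comments. The rating scores are hypothetical and serve only as an illustration; they are not the scores of Table 16.1, and the function names are illustrative as well.

```python
import math

def average_ranks(values):
    """Rank values from lowest to highest, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney(group_1, group_2):
    """Return (U1, U2, Z computed from U1)."""
    n1, n2 = len(group_1), len(group_2)
    ranks = average_ranks(group_1 + group_2)
    sum_r1 = sum(ranks[:n1])
    sum_r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - sum_r1   # U1 = N1N2 + N1(N1+1)/2 - sum of R1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - sum_r2   # U2 = N1N2 + N2(N2+1)/2 - sum of R2
    z = (u1 - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u1, u2, z

# Hypothetical rating scores, for illustration only.
group_a = [12, 15, 17, 20, 22, 25]
group_b = [14, 18, 21, 24, 27, 29, 30]
u1, u2, z = mann_whitney(group_a, group_b)
print(u1, u2, round(z, 2))
```

With these hypothetical scores U1 + U2 equals N1N2 = 42, which is the check described in the text. Using U2 instead of U1 in the Z formula only reverses the sign of Z.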


Check Your Progress


Notes: a) Space is given below for your answer.
b) Compare your answer with the one given at the end of this unit.

3. In answering a questionnaire the following scores were achieved by 10 men and 20 women:

Men: 22, 31, 38, 47, 48, 48, 49, 50, 52, 61

Women: 22, 23, 25, 25, 13, 33, 34, 35, 37, 40, 41, 42, 43, 44, 44, 46, 48, 53, 54

Do men and women differ significantly in their answers to this questionnaire? Apply the median test by taking the median = 41.5.

.......................................................................................................................

....................................................................................................................... .......................................................................................................................
......................................................................................................................
4. The performance scores of the students taught by method A and method B are given below:

Apply Mann-Whitney U test and test the significance between the performance of the students taught by method A and method B.

........................................................................................................................

........................................................................................................................

........................................................................................................................

16.5 STATISTICAL INFERENCE BASED ON NON-PARAMETRIC TESTS: RELATED SAMPLES


Various tests are used in drawing statistical inferences in the case of related samples. In this section we shall confine our discussion to the use of the sign test and the Wilcoxon matched-pairs signed-ranks test only.

16.5.1 The Sign Test


The sign test is the simplest test of significance in the category of non-parametric tests. It makes use of plus and minus signs rather than quantitative measures as its data. It is particularly useful in situations in which quantitative measurement is impossible or inconvenient, but in which it is possible to rank the two members of each pair with respect to each other on the basis of superior or inferior performance. The sign test is applicable either to the case of a single sample from which observations are obtained under two experimental conditions, where the researcher wants to establish that the two conditions are different, or to the case of two equivalent samples in which the subjects are matched with respect to the relevant extraneous variables. The use of this test does not involve any assumption about the form of the distribution of differences. The only assumption underlying this test is that the variable under investigation has a continuous distribution.

If the number of individuals in the single sample, or in each of the equivalent or related samples, is less than or equal to 25, we make use of the 'Table of Probabilities Associated with Values as Small as Observed Values of x (number of fewer signs) in the Binomial Test' (Siegel, 1956). When the number of individuals in the group is larger than 25, the normal approximation to the binomial distribution is used. The significance of difference is
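For small samples, the tabled probabilities of the binomial test can also be computed directly. The Python sketch below assumes that plus and minus signs are equally likely under the null hypothesis and that tied pairs have already been dropped; the function name and the data are hypothetical.

```python
from math import comb

def sign_test_probability(fewer_signs, n_pairs):
    """One-tailed probability of observing fewer_signs or fewer of the less
    frequent sign among n_pairs non-tied pairs, when plus and minus are equally likely."""
    favourable = sum(comb(n_pairs, k) for k in range(fewer_signs + 1))
    return favourable / 2 ** n_pairs

# Hypothetical data: 10 matched pairs, of which only 1 shows a minus sign.
p_one_tailed = sign_test_probability(1, 10)
p_two_tailed = min(1.0, 2 * p_one_tailed)
print(round(p_one_tailed, 4), round(p_two_tailed, 4))
```

The one-tailed value applies when the direction of the difference was predicted in advance; otherwise the two-tailed value is the one to compare with the chosen level of significance.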

16.5.2 The Wilcoxon Matched-Pairs Signed-Ranks Test


The Wilcoxon matched-pairs signed-ranks test is more powerful than the sign test because it tests not only the direction but also the magnitude of differences within pairs of matched groups. This test, like the sign test, deals with dependent groups made up of matched pairs of individuals and is not applicable to independent groups. The null hypothesis assumes that the direction and magnitude of the pair differences would be about the same. The application of the Wilcoxon matched-pairs signed-ranks test involves the following steps:
1. Let d_i be the difference score for any matched pair, representing the difference between the pair's scores under two treatments A and B. There would be one d_i for each pair of scores.

2. Delete all pairs for which d_i = 0.

3. Rank all the d_i's without regard to sign, giving rank 1 to the smallest d_i, rank 2 to the next smallest, etc. If two or more d_i's are of the same size, assign the same rank to such tied cases. The rank assigned would be the average of the ranks which would have been assigned if the d_i's had differed slightly from each other. For example, if three pairs yield d_i's of -1, -1 and +1, then each pair would be assigned the rank of 2, for (1 + 2 + 3)/3 = 2, and the next d_i in order would be assigned the rank of 4 because ranks 1, 2 and 3 have already been exhausted.

4. Indicate which ranks arose from negative d_i's and which ranks arose from positive d_i's by affixing to each rank the sign of the difference.

5. Sum the ranks for the positive differences and sum the ranks for the negative differences. Under the null hypothesis we would expect the two sums to be equal. In other words, if the sum of the positive ranks equals the sum of the negative ranks, we would conclude that the treatments A and B are not different. But if the sum of the positive ranks is very much different from the sum of the negative ranks, we would infer that treatment A differs from treatment B and thus we would reject the null hypothesis.
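The five steps can be sketched in Python as follows. The scores are hypothetical and chosen so that three pairs tie at a difference of 1, mirroring the tie example in step 3; the function name is illustrative.

```python
def signed_rank_sums(scores_a, scores_b):
    """Return (sum of positive ranks, sum of negative ranks, number of pairs ranked)."""
    # Steps 1 and 2: difference scores, dropping pairs with a zero difference.
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    # Step 3: rank by absolute size; tied differences share the average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    # Steps 4 and 5: attach the sign of each difference and sum the ranks by sign.
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return positive, negative, len(diffs)

# Hypothetical scores under treatments A and B, for illustration only.
treatment_a = [10, 12, 9, 15, 11, 14]
treatment_b = [11, 11, 10, 12, 11, 9]
pos, neg, n = signed_rank_sums(treatment_a, treatment_b)
print(pos, neg, n)
```

The smaller of the two sums is the quantity usually carried forward into the significance test.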

Let us illustrate the application of the Wilcoxon test with the help of the following example used by Koul (1997). Suppose a group of 26 delinquent children were initially rated for their social adjustment by a psychiatrist and sent to a juvenile jail. After a year they were rated again by a psychiatrist for social adjustment, and the initial and final adjustment rating scores were compared. The rating data are presented in Table 16.2:

Table 16.2: Rating Scores of Delinquent Children

The null hypothesis that there was no difference in the initial and final adjustment rating scores of the group was tested at the .05 level of significance using the following formula:

\[ Z = \frac{T - \frac{N(N + 1)}{4}}{\sqrt{\frac{N(N + 1)(2N + 1)}{24}}} \]

in which
N = number of pairs ranked
T = sum of ranks of the smaller of the like-signed ranks

In the example, T, the smaller of the sums of the like-signed ranks = 2.5 + 7.5 + 12.0 + 14.0 + 2.5 + 21 = 59.5.

Since the obtained Z value of 2.95 (in absolute value) exceeds the critical Z value of 1.96 at the .05 level, the null hypothesis is rejected and we may conclude that the environment in the juvenile jail has considerably improved the social adjustment of the delinquent children.
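The Z value of the example can be reproduced with the usual normal approximation for T. The Python sketch below takes T = 59.5 from the example and assumes that all 26 pairs were ranked, that is, that no pair had a zero difference; under that assumption the numerical value agrees with the one reported. The function name is illustrative.

```python
import math

def wilcoxon_z(t, n):
    """Z for the Wilcoxon matched-pairs signed-ranks test (normal approximation)."""
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean_t) / sd_t

# T = 59.5 is taken from the example; N = 26 assumes no pair had a zero difference.
z = wilcoxon_z(59.5, 26)
print(round(z, 2))  # prints -2.95
```

The sign is negative because T is the smaller of the two rank sums; only the numerical value is compared with the critical value of 1.96.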

Check Your Progress

Notes: a) Space is given below for your answer.
b) Compare your answer with the one given at the end of this unit.

5. List the uses of: (i) the sign test and (ii) the Wilcoxon matched-pairs signed-ranks test.

.......................................................................................................................

.......................................................................................................................

.......................................................................................................................

.......................................................................................................................

16.6 STATISTICAL INFERENCE REGARDING CORRELATIONS USING NON-PARAMETRIC DATA

The correlations which are computed from measurements based on nominal (enumerative) and ordinal (ranking) data give rise to Spearman's rho (ρ), phi (φ) and contingency (C) coefficients. In this section, we will discuss the procedure for testing the statistical significance of these coefficients.


16.6.1 Significance of Spearman's Rho (ρ) Correlation Coefficient


There is no generally accepted formula for estimating the standard error of ρ, which we need for testing its significance and determining its confidence limits. However, we can test the null hypothesis that the two variables under study are not associated in the population, and that the observed value of rho (ρ) differs from zero only by chance, in two ways.

1. When the size of the sample N is from 4 to 30, the interpretation is best made with the aid of Table L given in the Appendix, which gives the ρ coefficients significant at the .05 and .01 levels of confidence. This is a one-tailed table, that is, the stated probabilities apply when the observed value of ρ is in the predicted direction, either positive or negative. For a one-tailed test, if an observed value of ρ equals or exceeds the value of ρ shown in Table L, the observed value is significant at the level indicated.

2. When N is 10 or larger, the significance of an obtained ρ under the null hypothesis may be tested by the formula:

\[ t = \rho \sqrt{\frac{N - 2}{1 - \rho^2}} \]

The interpretation of the obtained value of t is made with the use of Table C of the Appendix, using (N - 2) degrees of freedom (df).
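The conversion of ρ to t for the larger-sample case can be sketched in Python as below, using the standard form t = ρ√[(N - 2)/(1 - ρ²)]. The values of ρ and N are hypothetical, and the function name is illustrative.

```python
import math

def rho_to_t(rho, n):
    """t = rho * sqrt((N - 2) / (1 - rho^2)), interpreted with N - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))

# Hypothetical values, for illustration only.
rho, n = 0.60, 20
t = rho_to_t(rho, n)
df = n - 2
print(round(t, 2), df)
```

The resulting t is then looked up in the t table at the row for the computed degrees of freedom.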

16.6.2 Significance of Phi (φ) Correlation Coefficient


The significance of the φ coefficient is determined using the relationship of φ to χ²:

\[ \chi^2 = N\phi^2 \]

This formula helps us to test an obtained φ against the null hypothesis. First we convert φ to an equivalent χ² and then test the significance of χ² by referring to the chi-square Table E. If χ² comes out to be significant at a particular level of confidence, the corresponding value of φ is also significant.
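The conversion of φ to χ² takes a single line of code. The Python sketch below uses hypothetical values of φ and N, and assumes that φ was computed from a 2 x 2 table, so that χ² has 1 degree of freedom; the critical values quoted are the standard ones for 1 df.

```python
def phi_to_chi_square(phi, n):
    """Convert a phi coefficient computed on N cases to chi square: N * phi^2."""
    return n * phi ** 2

# Hypothetical values, for illustration only.
chi2 = phi_to_chi_square(0.30, 100)

# Critical chi-square values for 1 df at the .05 and .01 levels.
CRITICAL_05, CRITICAL_01 = 3.841, 6.635
print(round(chi2, 2), chi2 > CRITICAL_05, chi2 > CRITICAL_01)
```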

16.6.3 Significance of Contingency Coefficient (C)


The significance of the contingency coefficient C is also determined through the relationship which it bears to χ², using the formula:

\[ \chi^2 = \frac{N C^2}{1 - C^2} \]

This formula helps us to test the significance of the obtained value of the C coefficient against the null hypothesis by first converting C to χ². The interpretation of an obtained chi-square is made with the use of the chi-square Table E. If chi-square is significant at a particular level of confidence, C is also significant.

Check Your Progress

Notes :a) Space is given below for your answer.


b) Compare your answer with the one given at the end of this unit.

6. Test the significance of rho (ρ) = 0.76 for N = 2

.......................................................................................................................... ..........................................................................................................................
7. The coefficient of contingency between father's eye colour and son's eye colour, computed on the basis of a 4 x 4 contingency table, came out to be 0.46. Test its significance at the .05 level.


.......................................................................................................................... ..........................................................................................................................

16.7 LET US SUM UP


Non-parametric tests do not make stringent assumptions about the nature of the population distribution. They are distribution-free tests. Non-parametric tests are used when: (i) the nature of the population from which samples are drawn is not known to be normal; (ii) the variables are expressed in a nominal scale of measurement; and (iii) the data are measures which are ranked or expressed in numerical scores which have the strength of ranks. The chi-square test, median test and Mann-Whitney U test are the non-parametric tests of significance most frequently used in the case of unrelated or independent samples. In the case of related or dependent samples, we make use of the sign test and the Wilcoxon matched-pairs signed-ranks test. The procedures for testing the significance of rho (ρ), phi (φ) and contingency (C) coefficients of correlation were also discussed.

16.8 UNIT-END ACTIVITIES


1. Discuss the uses of non-parametric tests.

2. Describe the uses of the chi square test, median test and Mann-Whitney U test.

3. Illustrate the use of the Wilcoxon matched-pairs signed-ranks test with the help of an example.

16.9 SUGGESTED READINGS


Garrett, H.E. (1962): Statistics in Psychology and Education. Bombay: Allied Pacific Pvt. Ltd.
Guilford, J.P. (1965): Fundamental Statistics in Psychology and Education. New York: McGraw Hill Book Company.
Koul, Lokesh (1997): Methodology of Educational Research. New Delhi: Vikas Publishing House Pvt. Ltd. (Third Revised Edition).
Siegel, S. (1956): Non-Parametric Statistics for Behavioural Sciences. Tokyo: McGraw-Hill Kogakusha Ltd.

16.10 ANSWERS TO CHECK YOUR PROGRESS


1. χ² = 346, the deviation from the normal distribution is significant.

2. χ² = 7.03, no significant sex difference in attitude towards the question.

3. No, χ² = 1.35

5. i) The sign test is particularly useful in situations in which quantitative measurement is impossible or impracticable, but in which ranking is possible on the basis of superior or inferior performance. It is applicable either to the case of a single sample from which observations are obtained under two experimental conditions and one wishes to establish that the two conditions are different, or to the case of two equivalent samples in which the subjects are matched with respect to the relevant extraneous variables.

   ii) The Wilcoxon test is more powerful than the sign test because it tests not only the direction but also the magnitude of differences within pairs of matched groups.

6. Significant at .01 level.

7. Not significant at .05 level.
