Learning Unit 10


IOP2601/MO001/4/2016

LEARNING UNIT 10: The normal distribution

As researchers we can investigate numerous variables that occur in our daily lives. The
height of a person is one such example. Let's for a moment consider height as a variable.
Very few people are exceptionally short and just as few are exceptionally tall. Most people are
of average height. We can say the same about many other human attributes. If we were
required to graphically illustrate the height distribution (where only a few people are
exceptionally short, most people are of average height and only a few are exceptionally tall),
the distribution would look like the graph below.

This distribution is called the normal distribution. Think of the bell-shaped normal distribution
as a continuous frequency distribution. If data are distributed in this normal form, it means
that a few individuals possess very little of the particular attribute, and just as few possess a
lot of the attribute. Most individuals possess an average amount of it. The normal distribution
is basic to the use of many of the statistical techniques that you will be learning about in this
module.
Normal distributions have the following characteristics, some of which have already
been mentioned in learning unit 4:
• Bell shape: A normal distribution assumes a clearly identifiable bell shape with one
  distinct peak, so it is unimodal.
• Symmetry: The figure forms an exact mirror image on either side of the centre line.
• The area or portion below the normal distribution equals 1,00 in decimals or 100% as a
  percentage. Since the figure is symmetrical, it means that the area of the left-hand half
  occupies 50% (or 0,5) and that of the right-hand half the other 50% (or 0,5).

Many psychological tests assume that the attribute(s) which they measure are normally
distributed. For example, most psychological tests of intelligence are based on the
assumption that intelligence is normally distributed. That is why norm tables for tests are
drawn up so that an individual's test performance can be interpreted in relation to that of a norm
group. It is assumed that the attribute is normally distributed in the norm group.
From this you can deduce that we use the normal distribution to determine a relative position
in a distribution, for instance to determine an individual's position relative to others in a normal
distribution.
You will see that this learning unit consists mainly of
activities. Doing these activities will show you how useful
the normal distribution is for practical purposes.

The standard normal distribution


In order to determine an individual's position relative to others in a normal distribution, we
need to know the mean and variance of the distribution. As you can imagine, the mean and
variance for height as a variable will differ from the mean and variance of weight as a variable
on a normal distribution. It is therefore impossible to know the values of the normal
distribution of each and every variable we want to investigate. So instead of having the
normal distribution scores for each and every variable we want to investigate, statisticians
have developed one standard normal distribution. The values on the standard normal
distribution are displayed as z-scores. Statisticians have also identified a way of transforming
a score of any variable to a z-score so the standard normal distribution can be used to
determine the position of a score relative to other scores of any variable we want to
investigate.

10.1
Work through the introduction to Tutorial 6 on pages 90 to 91 of Tredoux and Durrheim
(2013) in order to summarise what you have learnt so far about the normal distribution. Now
read the section on the standard normal distribution on pages 91 to 93 (before using tables of
z-scores).
Tredoux and Durrheim (2013) explain that when we deal with a number of normally
distributed data sets, we can obtain uniformity by converting them to a standard form.
This standard normal distribution has the same characteristics as the normal distribution and,
in addition to those characteristics, the following can be noted:
• A standard normal distribution is one with a mean of 0 and a standard deviation of 1.
• The standard normal distribution is defined in terms of standard deviation units, which are
  called z-scores.
• For practical purposes these z-scores range from three standard deviations below the mean
  to three standard deviations above it (about 99,7% of the area lies within this range).
Whenever you are working with z-scores, make a habit of
drawing a sketch of the normal distribution. Indicate the centre
on it as z = 0. All negative z-scores will lie in the left-hand half
and all positive z-scores in the right-hand half. Your sketch
should look like the graph at the beginning of this learning unit.

In order to be able to use the standard normal distribution, the z-scores are presented in the
form of a table.

10.2
Now work through the section on using the table of z-scores on pages 93 to 97 of Tredoux
and Durrheim (2013). (You need not study Box 6.1 on pages 97 to 98.) Figure 3 on page 93
of Tredoux and Durrheim (2013) represents only a small section of table A1.1 in Appendix 1.
Note, however, that because a normal distribution is symmetrical, only the values for the
positive half are given in the table of z-values. The values for the other half are identical
except that the signs are different. So when a negative z-score is obtained, use the
corresponding positive z-score in the table to read off the relevant proportion.
After working through this section in Tredoux and Durrheim (2013), you should be familiar
with table A1.1 and should be able to determine proportions related to various z-scores.

As indicated earlier in the learning unit, we are interested in the distribution of various
variables that occur in everyday life, not necessarily in the distribution of z-scores.

10.3
Work through the section in Tredoux and Durrheim (2013) from pages 98 to 99 that explains
how the statistical world (the standard normal distribution) can be used to interpret and make
conclusions about the real world (variables that occur in everyday life).

As mentioned earlier, and from this discussion in Tredoux and Durrheim (2013), you can
conclude that the standard normal distribution is very convenient, since any set of normally
distributed scores can be converted to standard scores to enable us to directly compare data
from different sets. The formula for this conversion is

$$z = \frac{x - \mu}{\sigma}$$

where x is the raw score, μ is the mean of the distribution and σ is its standard deviation.

Once scores have been converted to standard scores (z-scores), one can use the standard
table of z-scores. The conversion does not change the distribution of scores, only the way
they are represented. For example, compare "A" with "a". Both characters represent the first
letter of the alphabet, but the same information is represented in a different form.
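
The conversion can be sketched in a few lines of Python. This is only an illustration: the function names z_score and raw_score and the example values (a mean of 100 and a standard deviation of 15) are assumptions made for this sketch and do not come from the module or the prescribed book. Note that Python writes decimals with a point rather than a comma.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Convert a raw score x to a standard score: z = (x - mean) / sd."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


def raw_score(z: float, mean: float, sd: float) -> float:
    """Convert a standard score back to a raw score: x = mean + z * sd."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return mean + z * sd


# Hypothetical distribution with a mean of 100 and a standard deviation of 15.
z = z_score(130, 100, 15)
print(z)                      # 2.0
print(raw_score(z, 100, 15))  # 130.0
```

Converting a score to z and back returns the original score, which illustrates that only the representation changes, not the information.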

10.4
Study the section on pages 100 to 102 in Tredoux and Durrheim (2013) that explains the
conversion of x-scores to z-scores.

You now know the definition and characteristics of a normal distribution. You have learnt what
the characteristics and use of the standard normal distribution are. And you have also learnt
how to transform scores that occur as normal variables (x-scores) to scores that fit in the
standard normal distribution (z-scores).
Consider the following scenario. One of the characteristics that were measured as part of the
psychometric assessment of the contestants in the New Stars show is Rule Following. This
is the degree to which someone would be likely to follow the expected rules and regulations
of society. The researchers who were appointed to investigate various aspects of the New
Stars show want to know if the contestants on the show are likely to score less than the
average that could be expected for individuals in society in general. They decide to compare
the contestants' scores to the norm group data that they have available. The norm group data
are normally distributed and consist of a group of individuals between 18 and 36 years of age
(N = 1250) with a mean (μ) score of 7 and a standard deviation (σ) of 2 on the measurement
of Rule Following. The New Stars contestants scored less than 5 on average on Rule
Following, while extremely conscientious individuals have been known to score around 10 on
this measurement.
You are asked to help answer the following questions:
(a) Calculate the corresponding z-score for a person with a score of 5 on Rule Following.
(b) Determine the proportion of cases with a score of less than 5 on Rule Following.
(c) Determine the percentage of cases with a score of less than 5 on Rule Following.
(d) Calculate the corresponding z-score for a person with a score of 10 on Rule Following.
(e) Determine the proportion of cases with a score greater than 10 on Rule Following.
(f) Determine the percentage of cases with a score greater than 10 on Rule Following.
(g) Determine the number of cases with a score between 5 and 10 on Rule Following.

Before you try to answer these questions, work through the following activities. These
activities will equip you to answer the questions.
Opposite each z-score given in the table, three different values are given:
• Mean to z
• Larger portion
• Smaller portion

Remember that the total portion or area below the normal distribution equals 1,00. The area
of the left-hand half below the curve (which represents the negative z-scores) equals 0,5, and
so does the area of the right-hand half below the curve (positive z-scores).

Instructions
Draw a diagram representing a typical
standardised normal distribution and
indicate the portion below the curve as 0,5
for each of the two halves.

Commentary
The diagram is a sketch of a normal curve
with a vertical line through the centre
indicated as z = 0. This sketch serves
merely to represent the two halves of the
area below the curve.

When one wants to know about the portion "larger than" a given score, one has to look at the
portion to the right of that score's location. "Portion smaller than" refers to the portion to the
left of that point. To know where the point will be on the curve one first has to compute the
z-score. Areas can be indicated in five different ways:
• Positive z-score: larger than (to the right of) = "smaller portion" in table
• Positive z-score: smaller than (to the left of) = "larger portion" in table
• Negative z-score: larger than (to the right of) = "larger portion" in table
• Negative z-score: smaller than (to the left of) = "smaller portion" in table
• Two z-scores: between the two scores = use "mean to z" values in table
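
The five possibilities above can also be checked numerically. The sketch below is an illustration only and is not part of the prescribed material: it assumes Python 3.8 or later, uses NormalDist from the standard library in place of table A1.1, and the helper names table_row, area_left_of and area_right_of are invented for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution


def table_row(z: float) -> dict:
    """Return the three table values for |z| (the table lists positive z only)."""
    larger = STD_NORMAL.cdf(abs(z))
    return {
        "mean_to_z": larger - 0.5,
        "larger": larger,
        "smaller": 1.0 - larger,
    }


def area_left_of(z: float) -> float:
    """Proportion of cases smaller than z (to the left of z)."""
    return STD_NORMAL.cdf(z)


def area_right_of(z: float) -> float:
    """Proportion of cases larger than z (to the right of z)."""
    return 1.0 - STD_NORMAL.cdf(z)


row = table_row(1.5)
print(round(row["smaller"], 5))          # 0.06681
print(round(row["larger"], 5))           # 0.93319

# Positive z, larger than: the smaller portion.
print(round(area_right_of(1.5), 5))      # 0.06681
# Negative z, larger than: the larger portion.
print(round(area_right_of(-2.0), 5))     # 0.97725
# Negative z, smaller than: the smaller portion.
print(round(area_left_of(-2.0), 5))      # 0.02275
```

Because the curve is symmetrical, table_row gives the same three values for z = -2,0 and z = +2,0; only the side on which each portion lies changes.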

Instructions
Make five sketches of the normal distribution to represent the five possibilities mentioned
above graphically (diagrammatically).
Indicate z = +1,5 on the first two figures by means of a vertical line in the right-hand half.
Indicate z = -2 on the next two figures by means of a vertical line in the left-hand half.
Indicate two z-scores on the fifth diagram: z = +1,5 in the right-hand half and z = -2,0
in the left-hand half.

Commentary
Remember that a positive z-score is always to the right of the centre line (z = 0) and a
negative z-score to the left of the centre line (z = 0).
Remember in the rest of the activity that "larger than" (>) means to the right of and
"smaller than" (<) means to the left of.

This is how you should represent the different possibilities:


POSITIVE z-SCORE: READING PROPORTIONS
Instructions
First use the first two diagrams (where a positive z-score of 1,5 is indicated). Write
"larger than" (>) below the first diagram and "smaller than" (<) below the second.
Colour in the appropriate areas on the two diagrams.
Now look up the relevant value in table A1.1 to indicate the specific area in each
diagram as a portion.

Commentary
Where it says "larger than" (>) you should colour in the area to the right of the
marked z-score. Where it says "smaller than" (<) you should colour in the area to
the left of the marked z-score.
In the first diagram (positive z-score and "larger than" >) you will have coloured in
the smaller portion of the area. In the second diagram (positive z-score and
"smaller than" <) you will have coloured in the larger portion of the area.
The smaller portion for a z-score of 1,5 equals 0,06681 (see table A1.1) and the
larger portion for z = 1,5 equals 0,93319.

Now we show you how to colour in the portions in diagrams 3 and 4, where you were working
with negative z-scores.
NEGATIVE z-SCORE: READING PROPORTIONS
Instructions
Use diagrams 3 and 4 (where a negative z-score of -2 is indicated). Write "larger than"
(>) below the first of these diagrams and "smaller than" (<) below the second. Now colour in
the appropriate areas on the two diagrams.
Now look up the relevant value in table A1.1 to indicate the specific area in each
diagram as a portion.

Commentary
Where it says "larger than" (>) you should colour in the area to the right of the
marked z-score. Where it says "smaller than" (<) you should colour in the area to
the left of the marked z-score.
In the third diagram (negative z-score and "larger than" >) you will have coloured in
the larger portion of the area. In the fourth diagram (negative z-score and "smaller
than" <) you will have coloured in the smaller portion of the area.
The smaller portion for a z-score of 2,0 equals 0,02275 (see table A1.1) and the
larger portion for z = 2,0 equals 0,97725.

Now let's see what happens if you want to look at the area (portion) between two different
z-scores. Here we work with the area smaller than (to the left of) the one value but at the same
time also the area larger than (to the right of) the second value. Again it will become much
clearer if you make a graphic representation. In this part of the activity you use the "mean to z"
column in table A1.1 (although you could work out the same answer using other values).

TWO z-SCORES: PROPORTION BETWEEN TWO SCORES
Instructions
Use diagram 5 of the normal distribution that you made.
Indicate both z = 1,5 (to the right of z = 0) and z = -2,0 (to the left of z = 0) on this
diagram. Write -2,0 < z < 1,5 below this figure. Colour in the appropriate areas in the
diagram.
Now look up the relevant value in table A1.1 to indicate the specific area that you
coloured in the diagram as a portion.

Commentary
You should have coloured in the area between the two values (smaller than z = 1,5 and
larger than z = -2,0).
Here it is simpler to use the "mean to z" values in the table of z-scores. The area
between the two z-values is represented by two parts. The left-hand part is the portion from
z = -2,0 up to the centre line (z = 0); the right-hand part is the portion between z = 1,5
and the centre line (z = 0).
The "mean to z" portion for a z-score of -2,0 is 0,47725 (see table A1.1) and the "mean to z"
portion for a z-score of 1,5 is 0,43319. These two portions have to be summed to
obtain the proportion between the two scores. (The answer is 0,91044.)
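
As a rough check on this sum, the proportion between two z-scores can be computed directly. The sketch below is an assumption-laden illustration rather than the method of the prescribed book: it uses Python's standard-library NormalDist (Python 3.8 or later) instead of table A1.1, and the function names mean_to_z and area_between are invented for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def mean_to_z(z: float) -> float:
    """Area between the centre line (z = 0) and z; the same for -z and +z."""
    return abs(STD_NORMAL.cdf(z) - 0.5)


def area_between(z_low: float, z_high: float) -> float:
    """Proportion of cases lying between two z-scores."""
    if z_low > z_high:
        raise ValueError("z_low must not exceed z_high")
    return STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)


print(round(mean_to_z(-2.0), 5))                   # 0.47725
print(round(mean_to_z(1.5), 5))                    # 0.43319
print(round(mean_to_z(-2.0) + mean_to_z(1.5), 5))  # 0.91044
print(round(area_between(-2.0, 1.5), 5))           # 0.91044
```

Adding the two "mean to z" portions works here because the two z-scores lie on opposite sides of z = 0; area_between covers the general case.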

Any proportion can also be expressed as a percentage by multiplying by 100. For example:
0,97725 may be written as 97,72% and 0,02275 as 2,28%.
Once we have computed a portion, we can also compute how many people the portion
represents. This number is obtained by multiplying the portion (that is, the percentage
divided by 100) by the total number of instances.
NUMBER OF PEOPLE REPRESENTED BY A PROPORTION (OR PERCENTAGE)
Instructions
Compute the number of instances of a total of 80 which a portion of 0,7653 (or 76,53%)
represents.

Commentary
Multiply the total number of instances (80) by the portion (or by the percentage divided
by 100). This gives an answer of 61,22, rounded off to 61.
Remember that when you are working with people the answer is always rounded off to
the nearest whole number.
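
The same arithmetic can be written as a short Python sketch. The function names to_percentage and cases_from_proportion are hypothetical and chosen only for this illustration.

```python
def to_percentage(proportion: float) -> float:
    """Express a proportion (0 to 1) as a percentage."""
    return proportion * 100


def cases_from_proportion(proportion: float, total: int) -> int:
    """Number of cases that a proportion of `total` represents, as a whole number."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    # Python's round() sends exact halves to the nearest even integer,
    # which may differ from the rounding convention used by hand.
    return round(proportion * total)


print(round(to_percentage(0.7653), 2))    # 76.53
print(cases_from_proportion(0.7653, 80))  # 61
```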

Do the following steps whenever you work with z-scores:
• Sketch the normal distribution with z = 0 in the middle.
• Compute the z-score (use the formula).
• Plot the computed z-score on your diagram. Remember, positive z-scores are to the right
  of z = 0 and negative z-scores to the left of it.
• Determine whether you should colour in the area to the left of (<) or to the right of (>)
  the z-value.
• Read the correct proportion ("larger portion", "smaller portion" or "mean to z") from the
  table opposite the relevant z-value and enter it on your diagram.
• Write down the necessary computation and give your final answer in the appropriate format.

10.5
Now try to answer the questions asked earlier (in Activity 10.4).

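Your answers should look like this:

(a)

Calculate the corresponding z-score for a person with a score of 5 on Rule Following.
(X = 5)

$$z = \frac{X - \mu}{\sigma} = \frac{5 - 7}{2} = -1$$

[Figure: standard normal curve with the z-axis marked from -3 to 3; the area to the left of z = -1 is shaded grey.]
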
(b)

Determine the proportion of cases with a score of less than 5 on Rule Following.
• Look at the diagram in question (a).
• You have already computed z = -1 in question (a).
• Turn to table A1.1.
• Find z = 1 in the table.
• Refer to the graph above. The grey coloured area is what this question is about,
  and you can no doubt see that this is the smaller portion under this normal
  distribution.
• Look next to z = 1 under the "smaller portion" in the table and you will see the value
  0,15866.

Proportion of cases with a score less than 5 on Rule Following = 0,15866


(c)

Determine the percentage of cases with a score of less than 5 on Rule Following.
Proportion (your answer to question b) = 0,15866
Thus the percentage = 0,15866 x 100 = 15,87%

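(d)

Calculate the corresponding z-score for a person with a score of 10 on Rule Following.
(X = 10)

$$z = \frac{X - \mu}{\sigma} = \frac{10 - 7}{2} = 1,5$$

[Figure: standard normal curve with the z-axis marked from -3 to 3; z = 1,5 is indicated.]
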
(e)

Determine the proportion of cases with a score greater than 10 on Rule Following.
Proportion of cases with a score greater than 10 on Rule Following = 0,06681

(f)

Determine the percentage of cases with a score greater than 10 on Rule Following.
Proportion (your answer to question e) = 0,06681
Thus the percentage = 6,68%

(g)

Determine the number of cases with a score between 5 and 10 on Rule Following.

[Figure: standard normal curve showing the area between z = -1 and z = 1,5.]

Remember that the total area under the standard normal distribution = 1.
Therefore:
1 - (0,06681 + 0,15866)
= 1 - 0,22547
= 0,77453

You will get the same answer if you add the two "mean to z" proportions (0,43319 and
0,34134) in the figure.
N = 1250 (It is given at the beginning of the questions.)
The number of cases = 0,77453 x 1250
= 968,16
= 968 persons
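
The answers to questions (a) to (g) can be cross-checked with a short Python sketch. It is offered only as an illustration under stated assumptions: it uses the standard-library NormalDist (Python 3.8 or later) in place of table A1.1, the variable names are invented for this example, and only the values μ = 7, σ = 2 and N = 1250 come from the scenario.

```python
from statistics import NormalDist

# Norm group values given in the scenario.
MEAN, SD, N = 7.0, 2.0, 1250

std_normal = NormalDist()  # mean 0, standard deviation 1

z_low = (5 - MEAN) / SD    # (a)
z_high = (10 - MEAN) / SD  # (d)

below_5 = std_normal.cdf(z_low)          # (b) smaller portion for z = -1
above_10 = 1.0 - std_normal.cdf(z_high)  # (e) smaller portion for z = 1,5
between = 1.0 - (below_5 + above_10)     # (g) area between the two z-scores
cases_between = between * N

print(z_low, z_high)               # -1.0 1.5
print(round(below_5, 5))           # 0.15866
print(round(below_5 * 100, 2))     # 15.87  (c)
print(round(above_10, 5))          # 0.06681
print(round(above_10 * 100, 2))    # 6.68   (f)
print(round(cases_between))        # 968
```

The unrounded area between the two scores differs from 0,77453 in the fifth decimal place because the table values are already rounded; the number of persons is the same.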

From these calculations it appears that the contestants of the New Stars show fall within the
lowest 16% of individuals in society between the ages of 18 and 36 years regarding the
degree to which they would follow normal rules and regulations. Therefore, it seems they
might be more risk-taking and more likely to challenge the status quo than the average
individual in society in this specific age group.
Instead of only transforming x-scores to z-scores, we can also convert z-scores to x-scores.
Study pages 102 to 103 in Tredoux and Durrheim (2013) and then work through the following
exercise.

Suppose we have to determine the limits within which 80% of individuals' scores will fall if we
know that the mean of the scores equals 45,7 and the standard deviation equals 11,42.
LIMITS OF CONFIDENCE
Instructions
Draw another normal distribution with z = 0 in the middle.
Now mark two limits on either side of z = 0 and colour in the area between these limits.
Mark the coloured area as equalling 80% (0,80) with 40% to the left of z = 0 and 40%
to the right of z = 0.
Now find the corresponding z-scores in the table which will give the areas indicated on
your diagram.
If you do not find the exact value in the table, use the one that is numerically closest to the
value you are looking for.

Commentary
The 80% is distributed symmetrically around z = 0, with 40% to the left and 40% to the
right of z = 0.
The remaining 20% of the area (portion) below the curve will take the form of two
tails. So this 20% is divided in two, 10% in the left-hand tail and 10% in the right-hand
tail.
Look for the z-value that gives a "mean to z" score of 0,4000 (40%) so that the area
between the negative z-score with this value and the positive z-score with this value will
take up a total area of 80%. (Remember, the normal distribution is perfectly symmetrical.)
You could also have looked up the z-value with a "smaller portion" of 0,1000. This would
have given you exactly the same z-values.
The two z-values closest to 0,4000 (40%) are
z = 1,29 (mean to z = 0,40147)
z = 1,28 (mean to z = 0,39973)
Since 0,39973 is closer to 0,40, we take z = 1,28 as the correct answer.

To interpret this z-score correctly, refer to the diagram that you drew. We have worked out
that 80% of all scores will fall between z = -1,28 and z = +1,28. In practice this means that
80% of all values will fall between the following two computed values:
• The mean minus 1,28 multiplied by the standard deviation (indicating the lower limit).
• The mean plus 1,28 multiplied by the standard deviation (indicating the upper limit).

Remember we said that the mean is equal to 45,7 and the standard deviation is equal to 11,42.
So in this specific case 80% of all scores will fall between

• lower limit: 45,7 - 1,28(11,42) = 45,7 - 14,62 = 31,08, and
• upper limit: 45,7 + 1,28(11,42) = 45,7 + 14,62 = 60,32.

So 80% of the people's scores will fall between 31,08 and 60,32.
There are two fixed limits that are commonly used and which
you should remember: 95% and 99%. The z-values for 95%
are ±1,96 and for 99% they are ±2,58.
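
These limits can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the procedure of the prescribed book: it uses inv_cdf from the standard-library NormalDist (Python 3.8 or later) instead of searching table A1.1, rounds z to two decimals to mimic the table, and the function names z_limit and limits are invented for this example.

```python
from statistics import NormalDist


def z_limit(coverage: float, decimals: int = 2) -> float:
    """Positive z such that the central `coverage` proportion lies between -z and +z."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    tail = (1.0 - coverage) / 2.0  # area in each of the two tails
    return round(NormalDist().inv_cdf(1.0 - tail), decimals)


def limits(mean: float, sd: float, coverage: float) -> tuple:
    """Lower and upper raw-score limits that enclose the central `coverage` proportion."""
    z = z_limit(coverage)
    return mean - z * sd, mean + z * sd


print(z_limit(0.80))  # 1.28
print(z_limit(0.95))  # 1.96
print(z_limit(0.99))  # 2.58

lower, upper = limits(45.7, 11.42, 0.80)
print(round(lower, 2), round(upper, 2))  # 31.08 60.32
```

Rounding the computed z to two decimals happens to give the same values as choosing the nearest table entry in these three cases; with unrounded z-values the limits would differ slightly in the second decimal.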
Here is a graph depicting the areas under the normal distribution curve:

Go through the worked example in Tredoux and Durrheim (2013) on page 104.

In order to be able to use spreadsheets to do z-calculations,
work through Box 6.2 on pages 104 to 105 in Tredoux and
Durrheim (2013).

Having completed this learning unit you should be able to
• describe the normal distribution
• describe the standard normal distribution and depict it graphically
• perform linear transformations like computing z-values and reading values from tables
• determine limits of confidence for observations
