01 Nature of Statistics-1-1
CHAPTER ONE
THE NATURE OF STATISTICS
1.1 Some basic concepts
Like all fields of learning, statistics has its own vocabulary. Some of the words and
phrases encountered in the study of statistics will be new to those not previously
exposed to the subject. The following are some terms that we will use extensively in the
remainder of this book.
Data
The raw material of statistics is data. For our purpose, we may define data as numbers.
The two kinds of numbers that we use in statistics are numbers that result from taking a
measurement and those that result from the process of counting. For example, when a
nurse weighs a patient or takes a patient’s temperature, a measurement, consisting of a
number such as 30 kg or 37 °C, is obtained. A different type of number is obtained when
a hospital administrator counts the number of patients – perhaps 15 – discharged from
the hospital on a given day. Each of these three numbers is a datum, and the three
numbers taken together are data.
Example 1.1
Suppose we wish to study the body masses of all students of Methodist University. It
will take us a long time to measure the body masses of all students of the university and
so we may select 20 of the students and measure their body masses. Suppose we obtain
the measurements in Table 1.1.
In this study, we are interested in the body masses of all students of Methodist
University. The set of body masses of all students of Methodist University is called the
population of this study. The set of body masses in Table 1.1, W = {49, 56, 48, …, 53,
59}, is a sample from this population.
Definition 1.1
2 ELEMENTARY STATISTICAL METHODS
Example 1.2
In a certain study, 900 men were selected from Nsawam. It was found that 25 were
smokers.
(a) What is the population in this study?
(b) What is the sample size?
Solution
(a) The population consists of all men in Nsawam.
(b) The sample size is 900.
Remarks
1. If we wish to study the blood pressures of Ghanaians, then our population consists
of the blood pressures of all Ghanaians. If we are interested only in the blood
pressures of Ghanaian men, then we have a different population: the blood pressures
of all Ghanaian men.
2. In many situations, we cannot afford to study the entire population. Instead, we take
a sample from the population and study this sample. If the sample is representative
of the population, then the information from the sample can be applied to the whole
population. One way of obtaining a representative sample is discussed in
Section 1.6.
3. A population may be finite or infinite. If a population of values consists of a fixed
number of these values, the population is said to be finite; otherwise, it is infinite.
An infinite population consists of an endless succession of values. In practice, the
term infinite population is used to refer to a population that cannot be enumerated in
a reasonable period of time.
Example 1.3
Examples of finite populations include the following:
(i) Students studying Business Administration at the Methodist University.
(ii) All football clubs in the first and second divisions in Ghana.
(iii) All households in Nkawkaw.
Example 1.4
Examples of infinite populations include the following:
(i) The set of real numbers between two integers.
(ii) All the fish in the River Volta.
the efficacy of drugs in treating illnesses and to assess the likelihood of undesirable side
effects.
Statistical methods are also used in business practice, e.g. to forecast demand for
goods and services. Actuaries use statistical methods to assess risk levels and set
premium rates for the insurance and pension industries.
Statisticians also play a vital role in assessing employment levels and needs of the
population for health, economic and social services. Without accurate information from
agencies such as the Ghana Statistical Service, the Customs, Excise and Preventive
Service (CEPS) and the Environmental Protection Agency, the government cannot
effectively allocate its resources.
Research in statistical methods is carried out in universities, government agencies
and in private industry. Statisticians employed in these activities develop new ways to
collect and analyze data for the many types of data and experimental settings
encountered in practical studies.
This scale of measurement applies to quantitative data only. In this scale, the zero point
does not indicate a total absence of the quantity being measured. An example of such a
scale is temperature on the Celsius or Fahrenheit scale. Suppose the minimum
temperatures of three cities, A, B and C, on a particular day were 0 °C, 20 °C and 10 °C,
respectively. It is clear that we can find the differences between these temperatures.
For example, city B is 20 °C hotter than city A. However, we cannot say that city A has
no temperature. Note that city A has a temperature equivalent to 32 °F. Moreover, we
cannot say that city B is twice as hot as city C, just because city B is 20 °C and city C is
10 °C. The reason is that, in the interval scale, the ratio between two numbers is not
meaningful.
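The point about ratios can be checked with a little arithmetic. The sketch below, written in Python, converts the temperatures of the three cities to the Fahrenheit scale using the standard conversion F = 9C/5 + 32, and compares the ratio of city B to city C on each scale. The function and variable names are illustrative only.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Minimum temperatures of cities A, B and C, in degrees Celsius.
city_a_c, city_b_c, city_c_c = 0.0, 20.0, 10.0

# The same temperatures expressed in degrees Fahrenheit.
city_a_f = celsius_to_fahrenheit(city_a_c)
city_b_f = celsius_to_fahrenheit(city_b_c)
city_c_f = celsius_to_fahrenheit(city_c_c)

# Ratio of city B to city C on each scale.
ratio_celsius = city_b_c / city_c_c
ratio_fahrenheit = city_b_f / city_c_f

print(city_a_f)          # zero on the Celsius scale is 32 on the Fahrenheit scale
print(ratio_celsius)     # ratio computed from Celsius readings
print(ratio_fahrenheit)  # ratio computed from Fahrenheit readings
```

The ratio is 2 when the readings are in degrees Celsius but 68/50 = 1.36 when the same temperatures are in degrees Fahrenheit. Since the ratio depends on the arbitrary choice of zero point, it tells us nothing about how hot the cities really are.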
Exercise 1(a)
1. For each of the following variables, state whether it is quantitative or qualitative and
specify the measurement scale that is employed when taking measurements on each.
(a) gender of babies born in a hospital, (b) marital status,
(c) temperature measured on the Kelvin scale, (d) nationality,
(e) masses of babies in kg, (f) temperature in °C,
(g) prices of items in a shop, (h) position in an exam,
(i) the rank of an academic staff member in a university.
2. For each of the following situations, answer questions (a) through (d):
(a) What is the variable in the study? (b) What is the population?
(c) What is the sample size? (d) What measurement scale was used?
A. A study of 150 students from St. Ann School showed that 10% of the students
had blood group A.
B. A study of 100 patients admitted to St. Paul’s Hospital showed that 25 patients
lived 8 km from the hospital.
C. A study of 50 teachers in Town A showed that 5% of the teachers earn
GH¢800.00 per month.
3. Explain what is meant by descriptive statistics.
4. Explain what is meant by inferential statistics.
5. Define the following terms:
(a) population, (b) qualitative variable,
(c) discrete variable, (d) sample,
(e) continuous variable, (f) quantitative variable.
Experiments
Frequently, the data needed to answer a question are available only as a result of an
experiment. A researcher may wish to know which of several drugs is most effective
for treating headache. The researcher might conduct an experiment by assigning the
drugs to different patients. Subsequent evaluation of the responses to the different drugs
might enable the researcher to decide which drug is most effective for treating headache.
Surveys
In surveys, the aim of the researcher is to find a way of obtaining information from
individuals, referred to as respondents. Such information can be factual (for example,
the number of cars per household, age of respondents, or income) or can concern the
attitudes of the respondent (for example, his attitude to racial discrimination, or his
liking for a brand of cigarette).
A survey conducted on a whole population of interest is called a census and a
survey conducted on a sample from a population is called a sample survey. Surveys
involve the use of questionnaires to obtain desired information from respondents.
Questionnaires may be administered by post, by telephone, by e-mail or in person.
Personal interview
Here, we gather information through oral questioning.
Disadvantages
It can be very costly.
It requires specially trained interviewers.
Advantages
It usually yields a high proportion of returns because a well-trained enumerator can
establish the necessary rapport to ensure co-operation by the respondent.
Information on conceptually difficult items can be obtained since the enumerator
can explain what is required.
The information obtained is likely to be more accurate than that obtained by other
methods since the interviewer can clarify seemingly unclear questions by explaining
the questions to the respondent.
Visual materials to which the respondent is able to react can be presented.
Telephone interview
This is a variation of the personal interview.
Advantages
It saves time.
It is cheaper than personal interviews.
It is easy to train and direct interviewers.
Disadvantages
Telephone subscribers are usually not representative of the whole population. There
is therefore the risk of a biased survey, unless great care is taken in the use of the
method.
Sensitive questions cannot be asked in this type of enquiry.
Its use is limited to urban areas with efficient telephone services.
Postal survey
In a postal survey, questionnaires are posted to respondents, who complete them and mail
them back. The questionnaires are usually accompanied by a letter that explains
the survey, encourages complete and candid answers and sets a deadline for returning
responses. A stamped addressed envelope is customarily included to facilitate returns.
Advantages
It makes wide geographic coverage possible at comparatively little cost.
There is no need to train interviewers.
It encourages the respondent to answer questions frankly in the privacy of the home
and without the subjective influence of the interviewers.
There is no interviewer bias.
Disadvantages
One cannot be sure of the interpretation placed by the respondent on the questions
asked.
There may be a delay in receiving responses.
There is the problem of non-response to the survey. This non-response is certain to
affect the validity of the survey as it is most unlikely that the sections of the sample
that do and do not reply are similar in the characteristics under consideration.
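The effect of non-response can be illustrated with a small sketch in Python. The figures below are invented purely for illustration and do not come from any real survey: each pair gives a hypothetical monthly income and whether that person returned the questionnaire.

```python
# Hypothetical sample: (monthly income, returned the questionnaire?).
# These values are invented for illustration only.
sample = [
    (400, False), (500, False), (600, True), (700, False),
    (800, True), (900, True), (1000, True), (1100, True),
]

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

all_incomes = [income for income, _ in sample]
respondent_incomes = [income for income, replied in sample if replied]

mean_all = mean(all_incomes)                 # mean if everyone had replied
mean_respondents = mean(respondent_incomes)  # mean the survey actually sees
response_rate = len(respondent_incomes) / len(sample)

print(mean_all, mean_respondents, response_rate)
```

In this made-up sample, the people with lower incomes are less likely to reply, so the mean income of the respondents (880) overstates the mean of the full sample (750). The size and direction of such a distortion cannot be known in a real survey, because the values of the non-respondents are not observed.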
Libraries
A common place to look for secondary data is a library. Here, data can be obtained
from magazines, journals and newspapers.
Government agencies
Government data can be obtained from publications issued by local, state, national and
international governments. Such data include laws, regulations, statistics and consumer
information.
Internet
Secondary data can be found on the internet, using search engines such as Yahoo,
Google and MSN.com.
Sampling methods
A sampling method (or sampling design) is a definite plan for obtaining a sample from a
given population. Practical difficulties in handling certain parts of a population may
point to their elimination from the scope of a survey. Thus, any sample selection
procedure will give some individuals the chance to be included in the sample while
excluding others. The people who have a chance of being included among those
selected constitute a sample frame. Examples are: the Electoral Register of Ghana
(this contains the names of all those who can vote in Ghana), the list of members of
professional associations (statisticians, doctors, lawyers, etc.).
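Once a sample frame is available, a sample can be drawn from it. As a minimal sketch, assuming a hypothetical frame of 50 numbered members, the Python code below imitates a lottery draw in which every member of the frame has the same chance of being selected. The frame, the sample size of 5 and the seed are assumptions made for illustration only.

```python
import random

# Hypothetical sample frame: 50 members labelled 1 to 50.
frame = list(range(1, 51))

# A seeded generator makes the draw repeatable; omit the seed for a fresh draw.
rng = random.Random(2024)

# Draw 5 distinct members without replacement, as in a lottery.
sample = rng.sample(frame, k=5)

print(sorted(sample))
```

Because the draw is made without replacement, no member can appear in the sample twice. Anyone missing from the frame has no chance of selection, which is why the quality of the frame matters as much as the method of drawing.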
Exercise 1(b)
1. Give two reasons why it is sometimes necessary to take a sample from a population.
2. State two ways of obtaining primary data.
3. State two sources of secondary data.
4. State two advantages and two disadvantages of the lottery system for taking a simple
random sample from a population.
5. State two disadvantages and one advantage of telephone interview, as a means of
collecting data.
Revision Exercises 1
1. Briefly describe the difference between descriptive statistics and inferential
statistics.
2. A doctor examined a patient to determine the cause of a disease. He took a drop of
blood and used it to determine the state of health of the patient. What aspect of
statistics is the doctor employing in order to form a judgement?
3. In your own words, explain and give an example of each of the following statistical
terms:
(a) population, (b) sample.
4. Mrs. Akrong wants to check whether the pot of soup she is cooking has the right
taste and quantity of salt. She did this by tasting a small portion of the soup scooped
in a ladle. What aspect of statistics is she employing in order to form a judgement?
Briefly explain why she decided to use this particular method.
5. Explain the difference between qualitative and quantitative data. Give examples of
qualitative and quantitative data.
6. List the four levels of measurement and give examples.
7. Explain the difference between:
(a) nominal and ordinal data, (b) a census and a sample survey,
(c) discrete data and continuous data.