Sampling Theory
CMS 311: Business Statistics. Dr. Abraham Kiruga, Department of Finance & Accounting, CUEA
Introduction
Sampling theory is a branch of statistics that focuses on selecting and analyzing a subset (sample)
of data from a larger population to make inferences about the whole population. Its primary aim is
to understand how samples relate to populations, allowing researchers to make informed
conclusions about a population without needing to study every individual in it. Sampling theory
helps ensure that the sample is representative of the population, which is crucial for the accuracy
and reliability of the inferences.
Population and Sample: The population is the complete set of elements or observations relevant
to a particular study, while a sample is a subset of this population used to draw inferences.
Sampling Methods: Various methods can be used to select a sample, such as simple random
sampling, stratified sampling, cluster sampling, and systematic sampling. The choice of method
depends on factors like the study’s objectives, the nature of the population, and resource
constraints.
Sampling Distribution: This is the probability distribution of a given statistic (e.g., mean or
proportion) based on all possible samples of a particular size from the population. Sampling
distributions help estimate the variability of a statistic and form the foundation for statistical
inference.
Sampling Error: This error arises due to the difference between a sample statistic and the actual
population parameter. Sampling error decreases with larger sample sizes and more representative
sampling methods.
Central Limit Theorem (CLT): CLT is fundamental to sampling theory, stating that the sampling
distribution of the sample mean will approximate a normal distribution as the sample size
increases, regardless of the population's original distribution. This property allows researchers to
make probabilistic statements about population parameters using sample data.
Sampling theory is widely used across various fields, including market research, opinion polling,
and scientific studies, where it enables cost-effective and time-efficient data collection while
preserving the ability to make accurate generalizations.
Important Terms
❖ Sampling design: the set of decisions that must be made before the data are collected.
SAMPLING THEORY
Sample
By sample we mean the aggregate of objects, persons or elements, selected from the universe. It
is a portion or sub part of the total population.
The following two methods are used to collect information about the population:
❖ Census; and
❖ Sampling
Census: when each and every element or unit of the population is studied.
Sampling: when a small part of the population is selected for study.
Why Sampling?
Advantages
❖ Helps to collect vital information more quickly. Even small samples, when properly
selected, help to make estimates of the characteristics of the population in a shorter time.
❖ The modern world is highly dynamic; any study must therefore be completed in a short
time, otherwise by the time the survey is completed the situation, characteristics, etc. may
have changed.
❖ It cuts costs; enumeration of the total population is much more costly than a sample study.
❖ Sampling techniques often increase the accuracy of data. With a small sample, it becomes
easier to check the accuracy of the data. Some sampling methods make it
possible to measure the reliability of the sample estimates from the sample itself.
❖ From the administrative point of view, sampling is also easier because it involves
fewer staff, less equipment, etc.
Disadvantages
❖ Sampling is not feasible where knowledge about each element or unit of a statistical
universe is needed.
❖ The sampling procedures must be correctly designed and followed; otherwise what is
called a wild sample would crop up, with misleading results.
❖ Each type of sampling has its own limitations.
❖ There are numerous situations in which the units to be measured are highly variable. Here a
very large sample is required in order to yield enough cases for achieving statistically
reliable information.
❖ To know certain population characteristics, such as population growth rate and population
density, a census of the population at regular intervals is more appropriate than studying by
sampling.
SAMPLING METHODS/DESIGNS
❖ According to element selection:
○ Unrestricted = every element of the population has a chance of being included in the
sample
○ Restricted = only elements that meet certain qualifications have a chance of being
included in the sample
It is assumed that if the sample is chosen at random and if the size of the sample is sufficiently
large, it will represent all groups in the population.
Random sampling is of two types: sampling with replacement and sampling without replacement.
Sampling is said to be with replacement when, from a finite population, a sampling unit is drawn,
observed and then returned to the population before another unit is drawn. The population in
this case remains the same, and a sampling unit might be selected more than once.
If, on the other hand, a sampling unit is chosen and not returned to the population after it has
been observed, the sampling is said to be without replacement.
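The difference between the two types of random sampling can be sketched in a few lines of Python. This is only an illustration: the population of ten numbered units, the seed and the function names are assumptions made for the example, not part of the course material.

```python
import random

def sample_with_replacement(population, n, rng):
    """Draw n units; each unit is returned to the population before the next draw."""
    return [rng.choice(population) for _ in range(n)]

def sample_without_replacement(population, n, rng):
    """Draw n distinct units; a drawn unit is not returned to the population."""
    return rng.sample(population, n)

rng = random.Random(42)            # fixed seed so that the run is repeatable
population = list(range(1, 11))    # hypothetical population of 10 numbered units

with_repl = sample_with_replacement(population, 5, rng)
without_repl = sample_without_replacement(population, 5, rng)
print("with replacement:   ", with_repl)      # the same unit may appear twice
print("without replacement:", without_repl)   # every unit appears at most once
```

With replacement, a sample may even be larger than the population; without replacement, the sample size can never exceed the population size.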
Random samples may be selected with the help of the lottery method or a table of random numbers (such
as Tippett's table of random numbers, Fisher and Yates numbers or Kendall and Babington Smith
numbers).
Tippett's Numbers: The table was first developed by Prof. L. H. C. Tippett and has since been known by
his name. He developed a list of 10,400 randomly chosen sets of numbers, each set being of four digits.
These numbers are written on several pages in unsystematic order.
Grid Method: This method is applied in the selection of areas. Suppose we have to select a
number of areas from a town, or a number of towns from a province, for a survey. For selection,
first a map of the whole area is prepared. The area is then divided into different blocks. A
transparent plate is made, equal in size to the map, with several square holes
in it, each carrying a different number. By a random sampling method it is decided which
numbers are to be included in the sample.
Stratified sampling
In this case the population is divided into groups in such a way that units within each group are
as similar as possible in a process called stratification. The groups are called strata. Simple
random samples from each of the strata are collected and combined into a single sample. This technique
of collecting a sample from a population is called stratified sampling. Relevant criteria for
stratification are selected according to the nature of the problem. Among the possible stratifying criteria
are age, sex, family income, number of years of education, occupation, religion, race, place of
residence, etc. On the basis of such characteristics the universe can be divided into different
strata. Each stratum has to be homogeneous within itself. Such a division can be done on the
basis of any single criterion, e.g. on the basis of age we can divide people into below-25 and
above-25 groups, or on the basis of education into matriculates and non-matriculates.
Stratification can also be done on the basis of a combination of two or more criteria, e.g. on
the basis of sex and education we can divide the people into four groups:
❖ Educated women
❖ Uneducated women
❖ Educated men
❖ Uneducated men
Elements are then selected from each stratum through a simple random sampling method. An
estimate is made for each stratum separately. These estimates are combined to provide an
estimate for the entire population.
Purpose: The primary purpose is to increase the representativeness of the sample without
increasing the size of the sample, by drawing on greater knowledge of the population
characteristics.
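The procedure described above (stratify, sample within each stratum, estimate per stratum, then combine) can be sketched as follows. The example assumes proportional allocation and a weighted mean as the combined estimate; the income figures, stratum names and function names are hypothetical and serve only as an illustration.

```python
import random
from statistics import mean

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = max(1, round(n * len(units) / total))   # rounding may shift the total slightly
        sample[name] = rng.sample(units, k)
    return sample

def stratified_estimate(strata, sample):
    """Combine the per-stratum means, weighting each by its share of the population."""
    total = sum(len(units) for units in strata.values())
    return sum(len(strata[name]) / total * mean(values)
               for name, values in sample.items())

rng = random.Random(1)
# Hypothetical population: incomes (in thousands) for two age strata
strata = {
    "below_25": [rng.gauss(30, 5) for _ in range(300)],
    "25_and_above": [rng.gauss(60, 10) for _ in range(700)],
}
sample = stratified_sample(strata, 100, rng)
estimate = stratified_estimate(strata, sample)
population_mean = mean(v for units in strata.values() for v in units)
print(round(estimate, 2), round(population_mean, 2))
```

Because each stratum is sampled separately, both age groups are guaranteed to appear in the sample in the same 30:70 proportion as in the population.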
❖ Advantages
❖ The population is first stratified into different groups and then the elements of the sample
are selected from each group. Therefore, the different groups are sure to have
representation in the sample. In the case of a simple random sample, there is a possibility that bigger
groups have greater representation and the smaller groups are eliminated or
under-represented.
❖ With a more homogeneous population, greater precision can be achieved with fewer cases.
This saves time in collecting and processing the data. The method is more effective when a
detailed study of population characteristics is wanted.
❖ As compared to random samples, stratified samples can be geographically more
concentrated and thus save time, money and energy in moving from one address to
another.
❖ Disadvantages
❖ Unless there are extreme differences between the strata, the expected gain from proportional
representation would be small. Here simple random sampling may give a nearly proportional
representation.
❖ Even after stratification, the sample is selected from each stratum either by the simple
random sampling method or by the systematic sampling method; as such, the drawbacks of
both methods can be present.
❖ To apply the stratified method, one must know the characteristics of the
specified population in which the study is to be made. One must also know which
characteristics are related to the subject under investigation and can therefore be
considered relevant for stratification.
❖ The process of stratification becomes more and more complicated and difficult as the
number of characteristics used for stratification increases.
Advantage
❖ The definiteness of proportional representation.
Disadvantage
❖ The researcher may have poor judgment or inadequate information upon which to base
the stratification. The greater the number of characteristics on which we base our
stratification, and the more strata there are, the more complicated becomes the problem of
securing proportional representation of each stratum.
Systematic Sampling
In systematic sampling, the units of the population are first arranged in some order (for example
ascending or descending), and the sample is then drawn according to a predetermined pattern. Suppose a
population consists of 1000 units; then every 10th, 20th or 50th item is selected. This method is
very easy and economical. It also saves a lot of time.
Advantages
❖ It is frequently used because it is simple, direct and inexpensive.
❖ When a list of names or items is available, systematic sampling is often an efficient
approach.
Disadvantages
❖ Systematic sampling should not be used when exploring unfamiliar areas, because
listing of the elements is not possible.
❖ When there is a periodic fluctuation in the characteristic under examination in relation to
the order in which the items appear, the method is ineffective.
Multistage sampling
This is similar to stratified sampling except that the division is done on a geographical/location basis, e.g. a
country can be divided into counties and then the survey is done in 4 towns in each county. This
helps to cut travelling costs for a surveyor.
Area sampling
❖ It pertains to the grouping of the population into geographical divisions before selecting
the respondents. This sampling can be done if there exists a clear delineation of the
communities where the respondents can be found.
1. Cluster Sampling
This is where a few geographical regions, e.g. a location, town or village, are selected at random
and, say, every single household or shop in that area is interviewed. This again cuts costs.
❖ It involves the grouping or division of the elements of the population into heterogeneous
groups. It should be noted that each cluster is composed of respondents with
different perspectives and interests.
○ In STRATIFIED sampling, the grouping is done per department, which is why there
is homogeneity within each group.
○ In CLUSTER sampling, each group is composed of members drawn from different
departments, which is why there is heterogeneity within each group.
Not all the elements in these clusters need be included in the sample; the ultimate selection
from within the clusters may also be carried out on a simple or stratified sampling basis.
Purpose: The purpose of a cluster sample is to reduce cost and not necessarily to increase
precision.
Advantage
2. Judgment Sampling
Here the interviewer selects whom to interview, believing that their views are more relevant since
they might be directly affected, e.g. to find out the effects of public transport one may choose to
interview only people who don’t own cars and travel frequently to work.
3. Quota Sampling
Definition: Quota sampling involves dividing the population into subgroups (strata) based on
certain characteristics, and then selecting a specified number of participants from each subgroup.
The goal is to ensure the sample reflects the population in terms of specific characteristics (e.g.,
gender, age), but the participants within the quota are chosen non-randomly.
Example: A researcher wants to study the opinions of university students on a new online learning
system. They decide that the sample should include 50 male and 50 female students. Once they
meet the quota for each gender, they stop selecting more participants, even if more people want to
participate.
Advantage
❖ If properly planned and executed, a quota sample is most likely to give a highly
representative sample of the population.
❖ In purposive sampling one picks the cases that are considered to be typical of the
population in which one is interested.
❖ The cases are judged to be typical on the basis of the need of the researcher.
❖ Since the selection of elements is based upon the judgment of the researcher,
purposive sampling is also called judgment sampling.
❖ The researcher tries, in his sample, to match the universe in some of the
important known characteristics.
Disadvantage
❖ The defect of this method is that the researcher can easily make errors in judging
which cases are typical.
4. Convenience Sampling
Definition: In convenience sampling, participants are selected based on their availability and
willingness to participate. It is often used when quick or easy access to participants is necessary,
but it can lead to bias since the sample may not represent the broader population.
Example: A researcher standing outside a mall asks the first 100 people who walk by to fill out a
survey about their shopping preferences. The sample is based on convenience, as those selected
are nearby and available at that moment.
Example: A researcher conducting a survey on public transport usage happens to meet people at a
bus stop and asks them to participate in the study. These participants are selected "accidentally,"
as they were available at the time and location of the study.
6. Snowball Sampling
Definition: Snowball sampling is used when participants are difficult to locate. In this technique,
the researcher asks initial participants to refer others who meet the criteria for the study, and this
process continues until the sample size is sufficient.
Example: A study on people recovering from rare diseases could start with a few individuals found
through a support group. The researcher then asks these individuals to refer others who are also
recovering from the same condition, expanding the sample through referrals.
7. Purposive Sampling
Definition: Purposive sampling (also called judgmental sampling) involves selecting participants
based on specific characteristics or criteria that are relevant to the research question. The researcher
uses their judgment to choose participants who are most appropriate for the study.
Example: A study investigating the effectiveness of a leadership program may specifically select
participants who have held leadership positions for at least five years, as they are most likely to
provide relevant insights into the program’s impact.
Advantage
❖ Quota sampling is a stratified-cum-purposive sampling and thus enjoys the
benefits of both methods.
❖ If proper controls or checks are imposed, it is likely to give accurate results.
❖ It is the only useful method when no sampling frame is available.
The central limit theorem was introduced by Abraham de Moivre (1733). Pierre-Simon de Laplace (1810)
formulated the proof of the theorem, according to which:
if we select a large number of simple random samples of size n from any population and
determine the mean of each sample, the distribution of these sample means will tend to be
described by the normal probability distribution with mean µ and variance σ²/n.
This is true even if the population itself is not normally distributed. In other words, the sampling distribution of
sample means approaches a normal distribution irrespective of the distribution of the population
from which the samples are taken, and the approximation to the normal distribution becomes increasingly
close as the sample size increases.
The Central Limit Theorem (CLT) is a statistical concept which states that the distribution of the sample
mean of a random variable will be normal or nearly normal if the sample
size is large enough. In simple terms, the theorem states that the sampling distribution of
the mean approaches a normal distribution as the size of the sample increases, regardless of the
shape of the original population distribution (provided the population variance is finite).
As the sample size is increased to 30, 40, 50, etc., the graph of the sample means will
move towards a normal distribution. As a rule of thumb, a sample size of 30 or more is usually taken
to be large enough for the normal approximation to be adequate.
One of the most important components of the theorem is that the mean of the sample means equals the
mean of the entire population. If you calculate the means of multiple samples from the population, add
them up, and find their average, the result will be an estimate of the population mean.
The standard deviation behaves differently. The standard deviation of the sample means is not the
standard deviation of the entire population but the population standard deviation divided by the
square root of the sample size, σ/√n; this quantity is the standard error of the mean.
The central limit theorem forms the basis of much of statistical inference. It makes it easy to
understand how population estimates behave when subjected to repeated sampling. When plotted
on a graph, the theorem shows the shape of the distribution formed by the means of repeated
population samples.
As the sample sizes get bigger, the distribution of the means from the repeated samples tends to
normalize and resemble a normal distribution. The result remains the same regardless of what the
original shape of the distribution was. It can be illustrated in the figure below:
From the figure above, we can deduce that despite the fact that the original shape of the distribution
was uniform, it tends towards a normal distribution as the value of n (sample size) increases.
Apart from showing the shape that the distribution of sample means will take, the central limit theorem also gives
the mean and variance of that distribution. The mean of the sampling distribution is
the actual mean of the population from which the samples were taken.
The variance of the sampling distribution, on the other hand, is the variance of the population divided
by n. Therefore, the larger the sample size, the smaller the variance of the sample
mean.
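The two claims above (the sample means centre on the population mean, and their variance is close to the population variance divided by n) can be checked with a small simulation. The sketch below assumes a hypothetical uniform population of 10,000 values and 2,000 repeated samples per sample size; all names and figures are illustrative only.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)
# Hypothetical population: 10,000 values from a uniform distribution on [0, 1)
population = [rng.random() for _ in range(10_000)]
mu = mean(population)             # population mean
sigma2 = pvariance(population)    # population variance

def sample_means(n, repeats=2000):
    """Means of `repeats` simple random samples (drawn with replacement) of size n."""
    return [mean(rng.choices(population, k=n)) for _ in range(repeats)]

for n in (5, 30, 100):
    means = sample_means(n)
    # columns: n, mean of the sample means, their variance, and sigma^2 / n for comparison
    print(n, round(mean(means), 4), round(pvariance(means), 6), round(sigma2 / n, 6))
```

For each n, the second column should stay close to the population mean, while the third column should track the fourth and shrink as n grows. Plotting a histogram of `means` would show the bell shape emerging from the uniform population.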
An investor is interested in estimating the return of ABC stock market index that is comprised of
100,000 stocks. Due to the large size of the index, the investor is unable to analyze each stock
independently and instead chooses to use random sampling to get an estimate of the overall return
of the index.
The investor picks random samples of the stocks, with each sample comprising at least 30 stocks.
The samples must be random, and any previously selected stocks must be replaced before subsequent
samples are drawn to avoid bias.
If the first sample produces an average return of 7.5%, the next sample may produce an average
return of 7.8%. By the nature of randomized sampling, each sample will produce a different
result. As you increase the sample size and draw more samples, the sample means
will start forming their own distribution.
The distribution of the sample means will move toward normal as the value of n increases. The
average return of the stocks in the samples estimates the return of the whole index of 100,000
stocks, and this average return is approximately normally distributed.
Essentials of Sampling
If the sample results are to have any worthwhile meaning, the sample should possess the following
essentials.
• Representativeness: A sample should be so selected that it truly represents
Systematic Sampling: A systematic sample is formed by selecting one unit at random and
then selecting additional units at evenly spaced intervals until the sample has been formed.
This method is popularly used in those cases where a complete list of the population from which
the sample is to be drawn is available. The list may be prepared alphabetically, geographically,
numerically, etc. The items are serially numbered. The first item is selected at random, generally
by following the lottery method. Subsequent items are selected by taking every kth item from
the list, where 'k' refers to the sampling interval or sampling ratio:
k = N / n,
where N = size of universe
n = size of sample
k = sampling interval
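A minimal sketch of this procedure, assuming a serially numbered list of N = 1000 units and a desired sample of n = 100 (so k = 10). The function name and the use of the integer part of N / n are assumptions made for the illustration.

```python
import random

def systematic_sample(units, n, rng):
    """Pick a random start within the first interval, then take every k-th unit."""
    N = len(units)
    k = N // n                      # sampling interval (integer part of N / n)
    start = rng.randrange(k)        # random position among the first k units
    return [units[start + i * k] for i in range(n)]

rng = random.Random(7)
units = list(range(1, 1001))        # serially numbered list, N = 1000
sample = systematic_sample(units, 100, rng)   # here k = 1000 // 100 = 10
print(sample[:5])
```

Only the first unit is chosen at random; every later unit is fixed by the interval, which is why a periodic pattern in the list can distort the sample.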
Size of Sample
❖ An important decision that has to be taken in adopting a sampling technique is about the
size of the sample. Size of the sample means the number of sampling units selected from
the population to be investigated.
❖ Different opinions have been expressed by experts on this point. Some suggest that the
sample size should be 5 percent of the size of the population while others are of the opinion
that the sample size should be at least 10 percent. However, these views are of little use
in practice because no hard and fast rule can be laid down that the sample size should be 5
percent, 10 percent or 25 percent of the universe size.
❖ It may be pointed out that mere size alone does not ensure representativeness. A smaller
but well selected sample may be superior to a larger but badly selected sample.
Nevertheless, if the size of the sample is too small, it may not represent the universe and the
inference drawn about the universe may be misleading. On the other hand, if the size of the
sample is very large, it may be too burdensome financially, require a lot of time and
pose serious problems of management.
❖ Hence the sample size should be neither too small nor too large. It should be optimum.
The optimum size is one that fulfils the requirements of 'efficiency', 'representativeness',
'reliability' and 'flexibility'. The following factors should be considered while deciding
the size of the sample:
Nature of Study
• For an intensive and continuous study a small sample may be suitable. But for
studies, which are not likely to be repeated and are quite extensive in nature, a large
sample size may be required.
❖ The error arising out of drawing inferences about the population on the basis of a few
observations (sampling) is termed 'sampling error'.
❖ In a complete enumeration survey, since the whole population is surveyed, sampling
error in this sense is non-existent. However, the errors mainly arising at the stage of
ascertainment and processing of data, which are termed non-sampling errors, are common
to both complete enumeration and sample surveys.
Sampling Errors: Even if utmost care has been taken in selecting a sample, the results derived
from a sample study may not be exactly equal to the true value in the population. The reason is
that the estimate is based on a part and not on the whole, and samples are seldom, if ever, perfect
miniatures of the population. Hence sampling gives rise to certain errors known as sampling
errors. However, the errors can be controlled. The modern sampling theory helps in designing
the survey in such a manner that the sampling errors can be made small.
Sampling errors are of two types:
❖ biased, and
❖ un-biased
Biased Errors: These errors arise from any bias in selection, estimation, etc. For example, if in
place of simple random sampling, deliberate sampling has been used in a particular case, some
bias is introduced in the result, and hence such errors are called biased sampling errors.
Un-biased Errors: These errors arise due to "chance" differences between the members of the
population included in the sample and those not included. An error in statistics is the difference
between the value of a statistic and that of the corresponding parameter.
❖ Thus the total sampling error is made up of errors due to bias, if any, and the random
sampling error.
❖ The bias error forms a constant component of error that does not decrease, even in a large
population, as the size of the sample increases. Such error is also known as cumulative
or non-compensating error. The random sampling error, on the other hand, decreases,
on average, as the size of the sample increases. Such errors are, therefore, known as
non-cumulative or compensating errors.
Causes of Bias: Bias may arise due to:
❖ Faulty process of selection;
❖ Faulty work during the collection; and
❖ Faulty methods of analysis
Bias in Analysis: In addition to bias, which arises from faulty process of selection and faulty
collection of information, faulty methods of analysis may also introduce such bias. Such bias
can be avoided by adopting the proper method of analysis.
Avoidance of Bias: If the possibility of bias exists, fully objective conclusion cannot be drawn.
The first essential of any sampling or census procedure must, therefore, be the elimination of all
sources of bias.
Apart from reducing errors of bias, the simplest way of increasing the accuracy of a
sample is to increase its size. The sampling error usually decreases with an increase in
sample size, and in fact in many situations the sampling error is inversely proportional to the
square root of the sample size.
[Figure: sampling error (vertical axis) plotted against sample size (horizontal axis)]
From this diagram it is clear that though the reduction in sampling error is substantial for
initial increases in sample size, it becomes marginal after a certain stage. In other words,
considerably greater effort is needed after a certain stage to decrease the sampling error
than in the initial instance.
From this point of view it could be said that there is a strong case for resorting to a sample
survey to provide estimates within permissible margins of error instead of a complete
enumeration survey.
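The diminishing return from larger samples can be illustrated numerically. The sketch below assumes the sampling error is measured by the standard error of the mean, σ/√n, and uses a hypothetical population standard deviation of 200; both are assumptions made for the example.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for population standard deviation sigma."""
    return sigma / math.sqrt(n)

sigma = 200.0   # hypothetical population standard deviation
for n in (25, 100, 400, 1600):
    # each fourfold increase in sample size only halves the standard error
    print(n, standard_error(sigma, n))
```

Going from 25 to 100 units cuts the standard error from 40 to 20, but a further cut to 10 requires 400 units and a cut to 5 requires 1600, which is the flattening shape the diagram describes.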
Non-Sampling Errors
As regards non-sampling errors, they are likely to be greater in the case of a complete enumeration
survey than in the case of a sample survey. When a complete enumeration of units in the universe
is made, one would expect that it would give rise to data free from errors. However, in practice
it is not so. For example, it is difficult to completely avoid errors of observation or ascertainment.
Similarly, in the processing of data, tabulation errors may be committed, affecting the final
result. Errors arising in this manner are termed non-sampling errors. Non-sampling errors can
occur at every stage of planning and execution of a census or survey. Such errors can arise due to
a number of causes such as defective methods of data collection and tabulation, faulty
definitions, incomplete coverage, etc. More specifically, non-sampling errors may arise from one
or more of the following factors:
❖ Data specification may be inadequate and inconsistent with respect to the objectives of the study.
❖ Inaccurate or inappropriate method of interview, observation or measurement with
inadequate or ambiguous schedules.
❖ Lack of trained and experienced investigators.
❖ Lack of adequate inspection and supervision of primary staff.
❖ Errors due to non-response.
❖ Errors in data processing operations.
❖ Errors committed during presentation and printing of tabulated results.
Control of Non-Sampling Errors: In some situations, the non-sampling errors may be large
and deserve greater attention than sampling errors. While, in general, sampling error decreases
with an increase in sample size, non-sampling error tends to increase with the sample size.
In the case of complete enumeration, non-sampling errors, and in the case of sample surveys, both
sampling and non-sampling errors, need to be controlled and reduced to a level at which their
presence does not vitiate the use of the final results.
Reliability of Samples: The reliability of samples can be tested in the following ways.
More samples of the same size should be taken from the same universe and their results be
compared. If the results are similar, the sample will be reliable.
If the measurements of the universe are known, then they should be compared with the
measurements of the sample. In case of similarity of measurements, the sample will be reliable.
Types of distribution
Population distribution
It refers to the distribution of the individual values of population. Its mean is denoted by ‘µ’
Sample distribution
It is the distribution of the individual values of a single sample. Its mean is generally written as
x̄. It is not usually the same as µ.
Standard error of the mean: $S_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Note: this formula is satisfactory for large samples drawn from a large population, i.e. n > 30 and n < 5%
of N.
- The word ‘error’ is used in place of ‘deviation’ to emphasize that variation among sample means is
due to sampling errors.
- The smaller the standard error, the greater the precision of the sample value.
Statistical inference
It is the process of drawing conclusions about attributes of a population based upon information
contained in a sample (taken from the population).
It is divided into estimation of parameters and testing of hypotheses. The symbols for sample
statistics and population parameters are as follows.
                      Sample (statistic)    Population (parameter)
Arithmetic mean       x̄                     µ
Standard deviation    s                     σ
Number of items       n                     N
Statistical estimation
It is divided into point estimation (where an estimate of a population parameter is given by a single
number) and interval estimation (where an estimate of a population parameter is given by a range within which the
parameter may be considered to lie),
e.g. a bus meant to take a class of 100 students (population, N = 100) on a trip has a limit on the maximum
weight it can carry, 600kg. The teacher realizes he has to find out the weight of the class,
but without enough time to weigh everyone he picks 25 students at random (sample, n =
25).
These students are weighed and their average weight is recorded as 64kg (x̄, the mean of the sample)
with a standard deviation (s). Using these sample statistics, the mean of the sample (x̄) and the
standard deviation (s), the teacher intends to estimate the average weight of
the whole class (µ, the population mean).
Confidence Interval
The interval estimate, or ‘confidence interval’, consists of a range (an upper confidence limit and a
lower confidence limit) within which we are confident that a population parameter lies, and we
assign a probability that this interval contains the true population value.
The confidence limits are the outer limits of a confidence interval. The confidence interval is the
interval between the confidence limits. The higher the confidence level, the wider the confidence
interval. For example
1. LARGE SAMPLES
These are samples with a sample size greater than 30 (i.e. n > 30).
Example
The quality department of a wire manufacturing company periodically selects a sample of wire
specimens in order to test for breaking strength. Past experience has shown that the breaking
strengths of a certain type of wire are normally distributed with a standard deviation of 200 kg. A
random sample of 64 specimens gave a mean of 6200 kg. Estimate the population mean at the 95%
level of confidence.
Solution
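Population mean: $\mu = \bar{x} \pm 1.96\, S_{\bar{x}}$
Note that the sample size is already n > 30, whereas s and $\bar{x}$ are given; thus steps i), ii) and iv) are provided.
Here $\bar{x} = 6200$ kg and
$S_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{200}{\sqrt{64}} = 25$
Therefore
$\mu = 6200 \pm 1.96 (25) = 6200 \pm 49$
i.e. the population mean lies between 6151 kg and 6249 kg at the 95% level of confidence.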
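The arithmetic in the wire example can be checked with a short Python sketch. The function name `mean_confidence_interval` is only illustrative (it is not part of the original notes), and the Z value is passed in as read from the normal tables.

```python
import math

def mean_confidence_interval(sample_mean, std_dev, n, z):
    """Large-sample interval: sample_mean +/- z * std_dev / sqrt(n)."""
    standard_error = std_dev / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Wire example: mean 6200 kg, standard deviation 200 kg, n = 64, Z = 1.96 (95%)
lower, upper = mean_confidence_interval(6200, 200, 64, 1.96)
print(f"{lower:.0f} kg to {upper:.0f} kg")
```

With these inputs the standard error is 25 and the limits are 6200 ± 49.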
Example
A manager wants an estimate of the sales of salesmen in his company. A random sample of 100 out of 500 salesmen is selected and average sales are found to be Shs. 75,000. If the sample standard deviation is Shs. 15,000, find the population mean at the 99% level of confidence.
Solution
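Here N = 500, n = 100, $\bar{x} = 75000$ and s = 15000.
Because the sample is drawn from a small, finite population, the standard error of the mean includes the finite population correction:
$S_{\bar{x}} = \frac{s}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}}$
$= \frac{15000}{\sqrt{100}} \times \sqrt{\frac{500 - 100}{500 - 1}}$
$= \frac{15000}{10} \times \sqrt{\frac{400}{499}}$
$= \frac{15000}{10} (0.895)$
$= 1342.5$
At the 99% level of confidence, Z = 2.58, so
Population mean $= 75000 \pm 2.58 (1342.5) = 75000 \pm 3463.65$
i.e. between Shs. 71,536.35 and Shs. 78,463.65.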
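As a check on the salesmen example, the finite population correction can be sketched in Python as below. The function name `standard_error_finite` is a hypothetical helper introduced here for illustration only.

```python
import math

def standard_error_finite(s, n, population_size):
    """Standard error of the mean with the finite population correction."""
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return s / math.sqrt(n) * correction

# Salesmen example: s = 15000, n = 100, N = 500
se = standard_error_finite(15000, 100, 500)
margin = 2.58 * se  # Z = 2.58 for the 99% level, as read from normal tables
print(f"standard error = {se:.2f}")
print(f"interval = {75000 - margin:.2f} to {75000 + margin:.2f}")
```

The unrounded standard error comes out at roughly 1343, slightly above the 1342.5 obtained by hand, because the hand calculation rounds the correction factor to 0.895.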
Example
Given two samples A and B of 100 and 400 items respectively, with means $\bar{x}_1 = 7$ and $\bar{x}_2 = 10$ and standard deviations of 2 and 3 respectively, construct a confidence interval for the difference between the two means at the 70% confidence level.
Solution
Sample                A                  B
Mean                  $\bar{x}_1 = 7$    $\bar{x}_2 = 10$
Sample size           $n_1 = 100$        $n_2 = 400$
Standard deviation    $s_1 = 2$          $s_2 = 3$
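The standard error of the difference between the means of samples A and B is given by
$S_{(\bar{x}_A - \bar{x}_B)} = \sqrt{\frac{4}{100} + \frac{9}{400}}$
$= \sqrt{\frac{25}{400}} = \frac{5}{20}$
$= \frac{1}{4} = 0.25$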
At the 70% confidence level, the appropriate Z value is 1.04 (as read from the normal tables; a Z value of 1.04 corresponds to a two-sided probability of about 0.70).
$|\bar{x}_1 - \bar{x}_2| = |7 - 10| = |-3| = 3$
We take the absolute value of the difference between the means, i.e. the positive value of the difference.
The confidence interval is therefore given by
$= 3 \pm 1.04 (0.25)$
$= 3 \pm 0.26$
i.e. 2.74 to 3.26
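The calculation for samples A and B can be sketched in Python as follows. The function name `difference_of_means_interval` is an illustrative choice, not something defined in the notes; it takes the sample variances (standard deviation squared) and a Z value read from the normal tables.

```python
import math

def difference_of_means_interval(mean1, var1, n1, mean2, var2, n2, z):
    """Interval for the absolute difference between two sample means."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    difference = abs(mean1 - mean2)
    margin = z * standard_error
    return difference - margin, difference + margin, standard_error

# Samples A and B: standard deviations 2 and 3, so variances 4 and 9
low, high, se = difference_of_means_interval(7, 2**2, 100, 10, 3**2, 400, 1.04)
print(f"standard error = {se:.2f}, interval = {low:.2f} to {high:.2f}")
```

The same function should apply to any pair of large samples, provided the variances rather than the standard deviations are supplied.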
Example 2
A comparison of the wearing-out quality of two types of tyres was obtained by road testing. Samples of 100 tyres of each type were collected. The miles travelled until wear-out were recorded and the results were as follows:
Tyres       T1                              T2
Mean        $\bar{x}_1 = 26400$ miles       $\bar{x}_2 = 25000$ miles
Variance    $s_1^2 = 1440000$ miles$^2$     $s_2^2 = 1960000$ miles$^2$
Solution
$\bar{x}_1 = 26400$
$\bar{x}_2 = 25000$
Difference between the two means
$|\bar{x}_1 - \bar{x}_2| = (26400 - 25000)$
= 1,400
Again we take the absolute value of the difference between the two means.
We calculate the standard error as follows
$S_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
$= \sqrt{\frac{1440000}{100} + \frac{1960000}{100}} = \sqrt{34000}$
= 184.4
The Z value at the 70% confidence level is read from the normal tables as 1.04 (Z = 1.04).
Thus the confidence interval is calculated as follows
= 1400 ± (1.04) (184.4)
= 1400 ± 191.77
$1,208.23 \le \mu_1 - \mu_2 \le 1,591.77$
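The Z values used in these examples (1.04 for 70%, 1.96 for 95%, 2.58 for 99%) are read from normal tables. As a cross-check, they can also be computed with Python's standard library; the helper name `z_for_confidence` below is an assumption made for this sketch.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value: `level` of the area lies between -z and +z."""
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.70, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_for_confidence(level):.3f}")
```

The computed values agree with the table values to two decimal places; small differences in the third decimal come from rounding in the printed tables.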
$S_p = \sqrt{\frac{pq}{n}}$ = Standard error for sampling of population proportions
Where n is the sample size, p is the sample proportion and q = 1 – p.
The procedure for estimating a proportion is similar to that for estimating a mean; only the formula for calculating the standard error is different.
Example 1
In a sample of 800 candidates, 560 were male. Estimate the population proportion at 95%
confidence level.
Solution
Here
Sample proportion $p = \frac{560}{800} = 0.70$
q = 1 – p = 1 – 0.70 = 0.30
n = 800
$S_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.70)(0.30)}{800}} = 0.016$
Population proportion
= p ± 1.96 $S_p$, where 1.96 = Z
= 0.70 ± 1.96 (0.016)
= 0.70 ± 0.03
= 0.67 to 0.73
Example 2
A sample of 600 accounts was taken to test the accuracy of posting and balancing of accounts, in which 45 mistakes were found. Estimate the population proportion at the 99% level of confidence.
Solution
Here
n = 600; $p = \frac{45}{600} = 0.075$
q = 1 – 0.075 = 0.925
$S_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.075)(0.925)}{600}}$
= 0.011
Population proportion
= p ± 2.58 ($S_p$)
= 0.075 ± 2.58 (0.011)
= 0.075 ± 0.028
= 0.047 to 0.103
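Both proportion examples follow the same steps, which can be sketched in Python as below. The function name `proportion_interval` is illustrative only, and the Z values are those read from the normal tables.

```python
import math

def proportion_interval(successes, n, z):
    """Interval for a population proportion: p +/- z * sqrt(p * q / n)."""
    p = successes / n
    q = 1 - p
    standard_error = math.sqrt(p * q / n)
    margin = z * standard_error
    return p - margin, p + margin

candidates = proportion_interval(560, 800, 1.96)  # 95% level
accounts = proportion_interval(45, 600, 2.58)     # 99% level
print(f"candidates: {candidates[0]:.3f} to {candidates[1]:.3f}")
print(f"accounts:   {accounts[0]:.3f} to {accounts[1]:.3f}")
```

Because the code does not round the standard error before multiplying by Z, its limits can differ from the hand-worked figures in the third decimal place.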
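For the difference between two proportions, the confidence interval is
$= (P_1 - P_2) \pm Z \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}}$
Where $p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$ and q = 1 – p; always remember to convert $P_1$ and $P_2$ to the pooled proportion p.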
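The notes give no worked example for the difference between two proportions, so the Python sketch below uses made-up figures purely for illustration: 120 successes out of 400 in the first sample and 90 out of 500 in the second. These numbers, and the function name `proportion_difference_interval`, are assumptions and do not come from the notes.

```python
import math

def proportion_difference_interval(x1, n1, x2, n2, z):
    """Interval for P1 - P2 using the pooled proportion p in the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)  # convert P1 and P2 to p
    q = 1 - pooled
    standard_error = math.sqrt(pooled * q / n1 + pooled * q / n2)
    margin = z * standard_error
    return (p1 - p2) - margin, (p1 - p2) + margin, pooled

# Hypothetical data (not from the notes), 95% level so Z = 1.96
low, high, pooled = proportion_difference_interval(120, 400, 90, 500, 1.96)
print(f"pooled p = {pooled:.4f}, interval = {low:.4f} to {high:.4f}")
```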
2. SMALL SAMPLES
(a) Estimation of population mean
If the sample size is small (n < 30), the arithmetic means of small samples are not normally distributed. In such circumstances, Student's t distribution must be used to estimate the population mean.
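In this case
Population mean $\mu = \bar{x} \pm t S_{\bar{x}}$
where $\bar{x}$ = sample mean and
$S_{\bar{x}} = \frac{s}{\sqrt{n}}$
Here s is the sample standard deviation, calculated from the squared deviations $(x - \bar{x})^2$ of the sample values from the sample mean.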
Example
A random sample of 12 items is taken and is found to have a mean weight of 50 grams and a standard deviation of 9 grams.
What is the mean weight of the population
a) with 95% confidence
b) with 99% confidence
Solution
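$\bar{x} = 50$; s = 9; v = n – 1 = 12 – 1 = 11; $S_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{9}{\sqrt{12}}$
$\mu = \bar{x} \pm t S_{\bar{x}}$
At 95% confidence level
$\mu = 50 \pm 2.201 \left( \frac{9}{\sqrt{12}} \right)$
= 50 ± 5.72 grams
Therefore we can state with 95% confidence that the population mean is between 44.28 and 55.72 grams.
At 99% confidence level (t = 3.106 for 11 degrees of freedom)
$\mu = 50 \pm 3.106 \left( \frac{9}{\sqrt{12}} \right)$
= 50 ± 8.07 grams
Therefore we can state with 99% confidence that the population mean is between 41.93 and 58.07 grams.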
Note: To use the t distribution tables it is important to find the degrees of freedom (v = n – 1). In the example above v = 12 – 1 = 11.
From the tables we find that at the 95% confidence level, against 11 degrees of freedom and under 0.05, the value of t = 2.201.
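The small-sample calculation can be sketched in Python as below. The t values are passed in as read from the t tables for 11 degrees of freedom (2.201 at 95% and 3.106 at 99%), since the standard library does not provide the t distribution. The function name `t_interval` is an illustrative choice.

```python
import math

def t_interval(sample_mean, s, n, t_value):
    """Small-sample interval: sample_mean +/- t * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin

# 12 items, mean 50 g, standard deviation 9 g, v = 11 degrees of freedom
ci_95 = t_interval(50, 9, 12, 2.201)
ci_99 = t_interval(50, 9, 12, 3.106)
print(f"95%: {ci_95[0]:.2f} to {ci_95[1]:.2f} grams")
print(f"99%: {ci_99[0]:.2f} to {ci_99[1]:.2f} grams")
```

Passing the t value in explicitly mirrors the table lookup described in the note; a different sample size needs a different table entry.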