ST 221 Population Growth and Errors in Demographic Data


University of Dar es Salaam

ST 221: Population
Dynamics

Course Instructor: Mr. Kigahe


Office No: 008 IRA Basement
Department of Statistics
University of Dar es Salaam

Population Dynamics
Course Objectives
• The course focuses on trends and interactions of
demographic components of population growth.
Course Description
• The inter-relation between components of population
growth and socio-economic variables is explored. Detection
of typical errors in demographic data and their correction is
also dealt with.

Population Dynamics
Delivery:
• 30 Lectures and 15 Seminars
Assessment:
• 40% Coursework and 60% Final Examination

Population Dynamics
Topic 1: Population growth and errors in demographic data
1.1 Recapping components of population growth.
1.2 Common errors in demographic data; alternate measures
of their detection.
1.3 Correction of errors: alternative methods of smoothing of
age structures.

Population Dynamics
Topic 2: Stable population analysis and population projections
2.1 The stable population analysis: derivation of Lotka's equations
and application.
2.2 Intrinsic rates and age structure of typical age structures of
observed world populations.
2.3 Projection with growth rates; component projection. Building
the projection matrix and use in projections.
2.4 Packages: FORTRAN, PEOPLE, Others.

Population Dynamics
Topic 3: Population-development interactions
3.1 Demographic measures that show the effect of population
growth on development, and their merits and demerits:
size and density, growth rate and doubling time, capital-
output ratio, dependency ratio.
3.2 Dynamising the dependency ratio for long-term reality.

Population Dynamics
Topic 3: Population-development interactions
3.3 Introduction to population theories: schools of thought
on the effect of population on development:
Malthusian and neo-Malthusian, Boserupian and
Simonian, revisionism (the empirical school: Kuznets,
Easterlin, etc.). Lessons from population ageing in
industrial economies: origin, consequences, coping.
Kamuzora and the African labour-intensive reality, and
ageing considerations.

Recapping Components of
Population Growth

Population growth Vs Population Change


Population Growth
• Increase in population size
Population Change
• Increase (growth) or decrease (decay) in population size

Demography Vs Population Dynamics


• Demography = the study of human populations (demos =
people in Greek, graphein = to write)

• Population dynamics = the motions/changes in a population


= how a population changes over time (and space)
• Population dynamics can be described as the demographic
processes that lead to population change

Population and its related concepts


• Population:
− Size - Number of persons in a given area
− Structure - Age-sex composition
− Distribution - Arrangement in space
− Density - No. of persons per unit of land (sq. Km)
− Quality - Skills (education) and health

Population and its related concepts


• Population Dynamics:
− Change in size, age-sex composition, vital rates (death, birth &
migration rates), rural-urban levels, density, etc.
− Growth rate – rate of increase per annum, a function of vital rates
(births, deaths & migration)
− Population momentum – in-built momentum of growth due to age
structure rather than growth rates

World Population Dynamics & Distribution


Table 1: World Population – Distribution of Dynamic Indicators, 2012
Region     Size (millions)   TFR   Growth rate (%)   CBR   IMR   Life expectancy   % urban
World      7,058             2.4   1.2               20    41    70                51
DCs        1,243             1.6   0.1               11    5     78                75
LDCs       4,464             3.0   1.7               25    49    66                45
Africa     1,072             4.7   2.5               36    67    59                39
Tanzania   ====              ==    ==                ==    ==    ==                ==

Factors influencing pop distribution


• Spatial distribution
− Environmental factors, e.g. topography, water availability, climate,
occupational factors, etc.
• Dynamic distribution
− Levels of socio-economic development e.g. education, occupation,
contraceptive prevalence, culture, etc.

Exercise
1. Using Table 1 above, describe the variations in global
population dynamics and distribution
2. Discuss the factors that may have influenced the observed
world population dynamics and distribution
3. Describe population dynamics and distribution in Tanzania

Concept of population change


• Population change is caused by the interplay of:
− Births;
− Deaths;
− Migrations: internal & international.
• The first two together determine the natural increase of a population
− Natural increase is the ultimate source of population growth in the world as a whole
− Within a country, migration can be a main determinant of
population growth

Basic Model of Population Dynamics



Balancing Equation
• Population change can be decomposed using the formula:
Pt - Po = (B – D) + (I – O) = NI + NM
where
− Pt is the population at the end of the period,
− Po is the population at the beginning of the period,
− B stands for births,
− D stands for deaths,
− I refers to in-migration,
− O refers to out-migration,
− NI = B – D is natural increase,
− NM = I – O is net migration (negative when out-migration exceeds in-migration).
• This simple equation is called the "balancing equation" or component
equation.
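The balancing equation can be checked with a few lines of Python. This is a minimal sketch; the function name and all figures are hypothetical and chosen only to illustrate the arithmetic.

```python
def population_change(births, deaths, in_migrants, out_migrants):
    """Return (natural increase, net migration, total change) for one period."""
    natural_increase = births - deaths          # NI = B - D
    net_migration = in_migrants - out_migrants  # NM = I - O
    return natural_increase, net_migration, natural_increase + net_migration


# Hypothetical figures for illustration only (not real data).
p0 = 1_000_000
ni, nm, change = population_change(births=40_000, deaths=15_000,
                                   in_migrants=5_000, out_migrants=8_000)
pt = p0 + change  # population at the end of the period
print(ni, nm, pt)
```

Here net migration is negative, so it offsets part of the natural increase.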

General measures
• General population measures of population change exclude
specific measures of fertility, mortality & migration
• They include:
− Sex ratios: measures the balance of males & females
− Age dependency ratio: measures dependency level
− Population density: number of persons per area
− Population growth rate: annual population increase
− Doubling time: number of years for the size of a population to double
− Population pyramid: age-sex distribution of population

General sex ratio


• The sex ratio is the ratio of males to females in a given
population.
• Expressed as the number of males for every 100 females.
• Sex Ratio = (Males / Females) × 100
Example: 2002 census of Tanzania
• Males: 16,829,861; Females: 17,613,742
• Sex Ratio = (16,829,861 / 17,613,742) × 100
• Thus 95.5 males per 100 females

Sex Ratio (SR) at birth


• Ratio of male to female births in a population
• SR at birth = (Male births / Female births) × 100
Example: District A in 2007
• Male births = 16,517; Female births = 15,986
• Sex ratio at birth = (16,517 / 15,986) × 100
• Thus the sex ratio at birth was 103 male births per 100 female births
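Both sex-ratio examples above can be reproduced as follows. This is a small sketch; the function name `sex_ratio` is illustrative, and the inputs are the figures quoted on the slides.

```python
def sex_ratio(males, females):
    """Number of males per 100 females."""
    return 100.0 * males / females


# Tanzania, 2002 census (figures from the slide).
tz_2002 = sex_ratio(16_829_861, 17_613_742)

# District A, 2007 births (figures from the slide).
district_a = sex_ratio(16_517, 15_986)

print(round(tz_2002, 1), round(district_a))
```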

Interpretation of sex ratios


• When sex ratio is:
− 100 - number of males & females is the same
− more than 100 - more males than females
− less than 100 - more females than males.
• Sex ratios at birth typically range between 102 and 104 male births per 100 female births

Interpretation of sex ratios


• Sex ratios decrease with age.
− At ages 50+, sex ratios are below 100 in many countries, with exceptions such as
India & China, which is attributed to heavy female mortality.
• In-migration areas have higher sex ratios than areas of out-migration.
• Sex ratios have implications for:
− Labour force participation
− Communities’ socio-cultural & psychological balances

Age dependency ratio


• Ratio of persons under age 15 & over age 64 to those aged 15-64
years, usually expressed per 100 persons of working age.
• DR = (Number of dependents / Working-age population) × 100
• Shows the population in the “dependent” ages in relation to that in the
“productive” ages.
• Often used as an indicator of fertility level & economic burden.
• Countries with very high birth rates have high dependency
ratios & a high economic burden.
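As a quick illustration, the dependency ratio can be computed from three broad age groups. The function name and the counts below are hypothetical (thousands of persons in an imaginary population), and the per-100 scaling is the usual convention rather than something stated on the slide.

```python
def dependency_ratio(under_15, aged_15_64, aged_65_plus):
    """Dependents (under 15 and 65+) per 100 persons aged 15-64."""
    return 100.0 * (under_15 + aged_65_plus) / aged_15_64


# Hypothetical young population, in thousands.
dr = dependency_ratio(under_15=440, aged_15_64=520, aged_65_plus=40)
print(round(dr, 1))
```

A value above 90, as here, means there are almost as many dependents as persons of working age.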

Population growth rate


• Defined as rate at which the population is growing per annum
due to births, deaths & migration.
• Rate is expressed as ‘r’ indicating per cent growth per annum.
• Rates could be positive or negative
− +ve referring to increase & -ve decline
• Rates are established by using models that are based on
demographic statistics.
• Two models are commonly used:
− geometric & exponential

Population growth rate


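Geometric growth rate:
$$P_2 = P_1 (1 + r)^n \quad\Rightarrow\quad r = \left(\frac{P_2}{P_1}\right)^{1/n} - 1$$
Exponential growth rate:
$$P_2 = P_1 e^{rn} \quad\Rightarrow\quad r = \frac{1}{n}\ln\left(\frac{P_2}{P_1}\right)$$
where
r is the growth rate,
P1 is the population at the beginning of the interval,
P2 is the population at the end of the interval,
n is the interval between counts (in years),
e is the base of the natural logarithm (e ≈ 2.718).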
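The two growth models can be sketched in Python, assuming their standard forms P2 = P1(1 + r)^n (geometric) and P2 = P1·e^(rn) (exponential). The function names are illustrative, and the doubling-time helper uses the exponential model, which is an assumption made here.

```python
import math


def geometric_rate(p1, p2, n):
    """Annual growth rate r from P2 = P1 * (1 + r)**n."""
    return (p2 / p1) ** (1.0 / n) - 1.0


def exponential_rate(p1, p2, n):
    """Annual growth rate r from P2 = P1 * exp(r * n)."""
    return math.log(p2 / p1) / n


def doubling_time(r):
    """Years needed to double at a constant exponential rate r."""
    return math.log(2) / r


# Hypothetical populations constructed to grow at exactly 3% per year.
r_geo = geometric_rate(100.0, 100.0 * 1.03 ** 10, 10)
r_exp = exponential_rate(100.0, 100.0 * math.exp(0.03 * 10), 10)
print(round(r_geo, 4), round(r_exp, 4), round(doubling_time(0.03), 1))
```

For the same pair of counts the exponential rate is slightly lower than the geometric rate, because it assumes continuous compounding.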

Significance of growth rate models


• These two models are useful for estimating:
− rates of growth in inter-censal intervals.
− size of a population at points of time between censuses
− size of population at any time in future.
− time it would take for a population to reach given size

Population Pyramid
• A bar chart arranged horizontally showing distribution of a
population by age & sex.
• Conventionally pyramids are presented such that:
− younger ages are at the bottom
− males on left & females on right.
• The bars show numbers or proportions of males & females
in each single age or age group.

Population Pyramid
• Pyramids:
− may be shown using single years age or age groups data
− show a historical view of a population
− differ between countries with high fertility & those with low fertility

Population Pyramid of Tanzania, 2002 Census

[Figure: population pyramid by 5-year age group (0-4 up to 80+), males on the left and females on the right; horizontal axis in per cent.]

Population pyramids

Usefulness of pyramids
• Shows population structure, i.e. age-sex composition
• Measure age distribution in a society
• Shows influence of determinants of population change in a
community
− i.e. effects of fertility, mortality & migration
• Gives population history of a community

Pyramids reveal the impacts of a population’s past events


• World wars (deaths)
• Baby booms (after wars)
• Sino-Japanese war
• Hinoeuma (the “Fire Horse” year in the Japanese calendar), which recurs every 60 years

“Young” & “Old” Populations


• Young populations
− populations that have a large proportion of people in the younger
age groups (below 15 years)
− e.g. African countries.
• Old populations
− populations that have a large proportion of people in the older age
groups (ageing)
− e.g. developed countries

“Young” & “Old” Populations


• These two types of populations have:
− different proportions of the population in the labor force or in
school,
− different medical needs, consumer preferences, & crime patterns.

Population momentum
• Refers to the tendency of a population to continue to grow
after replacement-level fertility has been achieved.
• A population that has achieved replacement fertility may still
continue to grow for some decades.
• This is because past high fertility leads to a high
concentration of people in the youngest ages, who later
pass through the childbearing ages.

Specific Measures of Population Change


• Measures of Fertility
• Measures of Mortality
• Measures of Migration

Common errors in demographic data

Types of Error and Their Sources


• There are three main types of errors that can occur in any
demographic dataset, whether it represents an entire
population of interest (a census or a population register) or
a sample of this same population. A fourth type of error
only affects sample surveys.

Types of Error and Their Sources


• Coverage errors result from a certain segment of the
population being missed from the data collection. For
example, people living in remote areas, nomadic populations
or those travelling during the data collection period may be
missed. Under some circumstances, particularly in sample
surveys, the scope of a data collection may exclude certain
population groups which may be difficult or too costly to
canvass. Such deliberate exclusions are not considered as
coverage errors

Types of Error and Their Sources


• Response errors may result because the respondent may not
understand the question asked because of poor wording or
vagueness in the questionnaire. For example, if a survey asked
a voter if he/she liked a candidate it may be that some
voters in fact liked the candidate, but did not vote for this
particular person. A more appropriate question to
determine the voting intentions would have been: “Would
you vote for (name of candidate)?” Some questions are
sensitive and the respondents may deliberately give a vague
or incorrect response or even refuse to answer.

Types of Error and Their Sources


• Processing errors occur at various stages of data
processing, for example, when coding data into specific
categories or when transcribing original answers from one
medium (usually paper) to an electronic medium.
Sometimes these errors are caused by humans, while at
other times they result from failures of the software or
hardware used.

Types of Error and Their Sources


• The above three types of errors are collectively referred to
as non-sampling errors. Sampling errors, on the other hand,
are the result of obtaining answers from some but not all
members of the population of interest. For example, by
the luck of the draw, a sample may contain households whose
average income is higher than the average household income
of the entire population. This may happen even though
there was no coverage error, response error or processing
error.

Sampling Error
• Refers to the difference between the estimate derived from
a sample survey and the 'true' value that would result if a
census of the whole population were taken under the same
conditions.
• These errors arise because data have been collected
from a part, rather than the whole, of the population.
• Consequently, sampling errors are restricted to
sample surveys, unlike non-sampling errors, which can
occur in both sample surveys and censuses.

Sampling Error
• There are no sampling errors in a census because the
calculations are based on the entire population.
• They are measurable from the sample data in the case of
probability sampling.

Factors Affecting Sampling Error


Sampling error is affected by a number of factors, including:
Sample size
• In general, larger sample sizes decrease the sampling error;
however, this decrease is not directly proportional.
• As a rough rule of thumb, you need to increase the
sample size fourfold to halve the sampling error, but bear in
mind that non-sampling errors are likely to increase with
large samples.
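The fourfold rule follows from the fact that the standard error shrinks with the square root of the sample size. The sketch below assumes simple random sampling from a large population and uses the textbook standard error of a proportion, sqrt(p(1 − p)/n); the function name and the values of p and n are illustrative.

```python
import math


def se_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical survey: true proportion 0.5, sample sizes 400 and 1,600.
se_400 = se_proportion(0.5, 400)
se_1600 = se_proportion(0.5, 1600)
print(se_400, se_1600, se_400 / se_1600)
```

Quadrupling the sample from 400 to 1,600 halves the standard error.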

Factors Affecting Sampling Error


The sampling fraction
• This is of lesser influence, but as the sample size increases as
a fraction of the population, the sampling error should
decrease.
Sample design
• An efficient sampling design will help in reducing sampling
error.

Factors Affecting Sampling Error


The variability within the population.
• More variable populations give rise to larger errors as the
samples or the estimates calculated from different samples
are more likely to have greater variation.
• The effect of variability within the population can be
reduced by the use of stratification that allows explaining
some of the variability in the population.

Characteristics of the sampling error


• generally decreases in magnitude as the sample size
increases (but not proportionally).
• depends on the variability of the characteristic of interest in
the population.
• can be accounted for and reduced by an appropriate sample
plan.
• can be measured and controlled in probability sample
surveys.

Reducing sampling error


• If sampling principles are applied carefully within the
constraints of available resources, sampling error can be
kept to a minimum.

Error Detection and Correction in Demographic Data

Demographic Data Evaluation


• Data evaluation is the assessment of the quality of the data.
• In evaluating the data, sometimes it is adjusted in order to
ensure that it is of an acceptable standard.
• The adjustment is done on the basis of the responses to the
questions which were asked during the data collection.
• Questions such as
− Sex of members of household
− Age (in completed years) of members of household
− Residential status of household

Methods of Evaluation
• In general, two approaches are used to evaluate the quality
of data,
i. direct methods
ii. indirect methods.

Direct methods
• The direct method basically involves the carrying out of
what is referred to as a Post Enumeration Survey (PES).
• In a PES, a sample of households is revisited after the census
and data are again collected but on a smaller scale and later
compared with that collected during the actual census.
• The matching process of the two sets of data can then be
used to evaluate the quality of the census data.

Indirect methods
• Indirect methods usually employ the comparison of data
using both internal and external consistency checks.
• Internal consistency checks compare relationships of data
within the same census data, whereas
• external consistency checks compare census data with data
generated from other sources.
• For instance, one can compare data on education obtained
during a census with administrative data maintained by the
Ministry of Education.

Ways of Detecting Errors


• Various internal and external consistency checks
• Demographic modeling, using age and cohort information
• Check for non-random patterns of non-response and
missing data
• Simple symptoms such as heaping can suggest more serious
problems

Detection of Errors in Age Data


• Demographic data are usually classified by age and sex.
• Despite the importance of these classifications, a variety of
irregularities and misstatements have been noted in age-related
data.
• These irregularities must be detected and adjusted or corrected
before demographic data can be used for any meaningful
analysis.

Age heaping
• Demographers use data from single years of age to
determine whether there are irregularities or
inconsistencies in the data
• Age heaping happens if a population tends to report certain
ages (e.g., those ending in 0 or 5) at the expense of other
ages
• Age heaping tends to be more pronounced among
populations or population subgroups with low levels of
education

Examples of age heaping


• In some cultures, certain numbers and digits are avoided
• For example, “13” is frequently avoided in the West because
it is considered unlucky
− Hotels in the US and in some Western countries sometimes do
not have floors designated as 13
• The numeral “4” is avoided in Korea and China, since it has
the same sound as the word/character for “death”
− Many hotels in China, South Korea, and some other East Asian
countries do not have floors designated as 4

Errors in Single Years of Age


Measurement of Age and Digit Preference
i. Indexes of Age Preference
ii. Whipple’s Index
iii. Myers’s Blended Method (Myers’s Index)
iv. United Nations Age Sex Accuracy Index

Measurement of Age and Digit Preference


• See the distributed table and comment on the age distribution.
• Ages that end in “0” and “5” have higher
counts than their neighbouring ages.
• This phenomenon is called digit preference.
• To measure the degree of digit (age) preference, we apply
specific techniques.
• Consider the population aged 30.
• In the absence of misreporting, the population at this age is expected to be
smaller than the population at age 29 and larger than the population at age 31.

Measurement of Age and Digit Preference


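• To measure the bias in age reporting for this age (30), we can
compare the count at age 30 with the average of the three ages
centred on it, where $P_a$ is the reported population at age $a$:
$$\text{Index}_{30} = \frac{P_{30}}{\tfrac{1}{3}\left(P_{29} + P_{30} + P_{31}\right)} \times 100$$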

Measurement of Age and Digit Preference


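• Alternatively, we may compare age 30 with the average of the five
single ages 28, 29, 30, 31, and 32:
$$\text{Index}_{30} = \frac{P_{30}}{\tfrac{1}{5}\left(P_{28} + P_{29} + P_{30} + P_{31} + P_{32}\right)} \times 100$$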

Measurement of Age and Digit Preference


• In this case, the two indexes are similar whether a 3-year
group or a 5-year group is used; both indicate substantial
heaping on age 30.
• The higher the index, the greater the concentration on the
age examined;
• An index of 100 indicates no concentration on this age.
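The age-preference index can be sketched as the count at the age examined divided by the mean count over a 3-year or 5-year window centred on that age, times 100. The function name and the single-year counts below are hypothetical and are not taken from the distributed table.

```python
def age_preference_index(counts, age, window=3):
    """100 * count at `age` / mean count over a window centred on `age`."""
    half = window // 2
    ages = range(age - half, age + half + 1)
    mean_count = sum(counts[a] for a in ages) / window
    return 100.0 * counts[age] / mean_count


# Hypothetical single-year counts showing heaping on age 30.
counts = {28: 900, 29: 800, 30: 1500, 31: 700, 32: 850}
idx3 = age_preference_index(counts, 30, window=3)
idx5 = age_preference_index(counts, 30, window=5)
print(round(idx3, 1), round(idx5, 1))
```

Both values are well above 100, which points to concentration on age 30.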

Whipple’s Index
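• Whipple’s index is one of the most widely used indexes for measuring
age misreporting.
• The index was devised by the American demographer
George C. Whipple (1866–1924).
• It was developed to reflect preference for the terminal digits
0 and 5.
• Calculation of the index, where $P_a$ is the reported population at age $a$:
$$W = \frac{P_{25} + P_{30} + \dots + P_{60}}{\tfrac{1}{5}\sum_{a=23}^{62} P_a} \times 100$$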

Whipple’s Index
• The choice of the range 23 to 62 is largely arbitrary.
• In computing indexes of heaping, the ages of childhood and
old age are often excluded because they are more strongly
affected by other types of errors of reporting than by
preference for specific terminal digits
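A minimal sketch of Whipple's index over the age range 23 to 62 follows: the population reporting ages ending in 0 or 5 (25, 30, ..., 60) is divided by one-fifth of the total population aged 23 to 62, times 100. The function name and both test populations are hypothetical.

```python
def whipple_index(counts):
    """Whipple's index for ages 23-62 from a dict {age: count}."""
    heaped = sum(counts[a] for a in range(25, 61, 5))  # ages 25, 30, ..., 60
    total = sum(counts[a] for a in range(23, 63))      # ages 23, 24, ..., 62
    return 100.0 * heaped / (total / 5.0)


# No digit preference: every single age has the same count.
uniform = {a: 100 for a in range(100)}
# Extreme preference: everyone reports an age ending in 0 or 5.
all_heaped = {a: (500 if a % 5 == 0 else 0) for a in range(100)}

print(whipple_index(uniform), whipple_index(all_heaped))
```

The two cases give the limits of the index: 100 when there is no preference for 0 and 5, and 500 when only those digits are reported.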


Myers’s Blended Method (Myers’s Index)


• Myers’ method also measures preferences for each of the
ten possible digits and proposes a blended index.
• It is based on the principle that in the absence of age
heaping, the aggregate population of each age ending in one
of the digits 0 to 9 should represent 10% of the total
population.
• The index is calculated by summing the number of people
whose age ends with a particular digit for the population
aged 10 and over, and then for the population aged 20 and
over.

Myers’s Blended Method (Myers’s Index)


• Each series is then weighted and the results are added to
obtain a blended population.
• Myers’ blended index is obtained by summing the absolute
deviations between the aggregate and theoretical
distributions (10%)
• A summary index of preference for all terminal digits is
derived as one-half the sum of the deviations from 10.0%,
each taken without regard to sign.


Myers’s Blended Method (Myers’s Index)


• If age heaping is nonexistent, the index would approximate
zero.
• This index is an estimate of the minimum proportion of
persons in the population for whom an age with an
incorrect final digit is reported.
• The theoretical range of Myers’s index is 0, representing no
heaping, to 90, which would result if all ages were reported
at a single digit.
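The blended method can be sketched as follows. Two details are assumptions not given on the slides: the conventional Myers weights (d + 1 for the series starting at age 10 + d, and 9 − d for the series starting at age 20 + d), and the use of eight decades in each series (ages 10-89 and 20-99). The function name is illustrative.

```python
def myers_index(counts, n_decades=8):
    """Myers's blended index from a dict {single age: count}."""
    blended = []
    for d in range(10):  # terminal digit
        # Ages 10+d, 20+d, ... (the "aged 10 and over" series).
        sum_10 = sum(counts.get(10 + d + 10 * k, 0) for k in range(n_decades))
        # Ages 20+d, 30+d, ... (the "aged 20 and over" series).
        sum_20 = sum(counts.get(20 + d + 10 * k, 0) for k in range(n_decades))
        blended.append((d + 1) * sum_10 + (9 - d) * sum_20)
    total = sum(blended)
    shares = [100.0 * b / total for b in blended]
    # Half the sum of absolute deviations from the expected 10%.
    return 0.5 * sum(abs(s - 10.0) for s in shares)


uniform = {a: 100 for a in range(100)}  # no heaping
all_zero_digit = {a: (1000 if a % 10 == 0 else 0) for a in range(100)}
print(myers_index(uniform), myers_index(all_zero_digit))
```

The two hypothetical populations reproduce the theoretical limits quoted above: 0 with no heaping and 90 when every age is reported with the same final digit.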

United Nations Age Sex Accuracy Index


• This index, which was proposed by the United Nations, is
used for the evaluation of five-year age-sex data. The index is
also referred to as the Joint Score. It has three components:
i. Average sex ratio score (S)
ii. Average male age ratio score (M)
iii. Average female age ratio score (F)

• The index is then computed as: UNAI = 3(S) + M + F.



United Nations Age Sex Accuracy Index


• The index of sex-ratio score (SRS) is defined as: The mean
difference between sex ratios for the successive age groups,
averaged irrespective of sign
• The index of age-ratio score (ARS) is defined as: The mean
deviation of the age ratios from 100 percent, also
irrespective of sign

Age Ratios
• Age ratios for 5-year age groups are used as indices for
detecting possible age misreporting
• Normally age ratios are expected to be similar throughout
the age distribution, and all of them should be close to a
value of 100

Age Ratios
• An age ratio is defined as:
$${}_5AR_x = 100 \times \frac{{}_5P_x}{\tfrac{1}{2}\left({}_5P_{x-5} + {}_5P_{x+5}\right)}$$
where: ${}_5AR_x$ = age ratio for ages x to x+4
${}_5P_x$ = population at ages x to x+4

United Nations Age Sex Accuracy Index


• The reported age-sex data for a given population are
presumed to be accurate if the age-sex accuracy index is
below 20, inaccurate if the index is between 20
and 40, and highly inaccurate if the index is above 40.
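The joint score can be sketched from two lists of 5-year age-group counts, one per sex. The function name is illustrative, and applying the formula to every age group supplied is an assumption; in practice the calculation is usually restricted to a fixed set of age groups. Age ratios use the average of the two adjacent groups as the denominator.

```python
def un_age_sex_accuracy_index(males, females):
    """Return (joint score, S, M, F) from 5-year age-group counts by sex."""
    # Sex-ratio score: mean absolute difference between successive sex ratios.
    sex_ratios = [100.0 * m / f for m, f in zip(males, females)]
    diffs = [abs(b - a) for a, b in zip(sex_ratios, sex_ratios[1:])]
    s = sum(diffs) / len(diffs)

    def age_ratio_score(pop):
        # Mean absolute deviation of age ratios from 100 (interior groups only).
        devs = [abs(100.0 * pop[i] / (0.5 * (pop[i - 1] + pop[i + 1])) - 100.0)
                for i in range(1, len(pop) - 1)]
        return sum(devs) / len(devs)

    m = age_ratio_score(males)
    f = age_ratio_score(females)
    return 3.0 * s + m + f, s, m, f


# Hypothetical three-group example with a spike in the middle male group.
joint, s, m, f = un_age_sex_accuracy_index([100, 200, 100], [100, 100, 100])
print(joint, s, m, f)
```

The spike in the male series raises both the sex-ratio score and the male age-ratio score, while the female score stays at zero.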

• Example: use the United Nations age-sex accuracy index to assess the age-sex
reporting of the data shown in the table below

Errors in Grouped Data


• Intercensal means between two successive censuses.
• If we are dealing with a closed population (no migration),
the difference between the total number of population in
two successive censuses will be solely attributed to births
(increase) and deaths (decrease).
• The basics:
− Population increases through births and decreases through deaths
− Population increases through immigration and decreases through emigration

Errors in Grouped Data


• The expected population of age group a+10 in the second
census ($P^{t}_{a+10}$), taken 10 years after the first, is the population of
age group a in the first census, minus deaths in the cohort during the
intercensal period, plus migration into the country during the intercensal
period, minus migration out of the country during the intercensal period.

Errors in Grouped Data


• That is:
$$P^{t}_{a+10} = P^{0}_{a} - D_a(0,t) + I_a(0,t) - E_a(0,t)$$
where:
$P^{t}_{a+10}$ = population of age group a+10 in the second census
$P^{0}_{a}$ = population of age group a in the first census
$D_a(0,t)$ = deaths in the cohort during the intercensal period
$I_a(0,t)$ = migrants in the cohort who arrived in the country during the intercensal period
$E_a(0,t)$ = migrants in the cohort who left the country during the intercensal period
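The cohort relation can be used as a consistency check: the population enumerated at ages a+10 in the second census is compared with the population expected from the first census. All names and figures below are hypothetical.

```python
def expected_population(p0_a, deaths, immigrants, emigrants):
    """Expected size, at the second census, of the cohort aged a at the first."""
    return p0_a - deaths + immigrants - emigrants


# Hypothetical cohort followed over a 10-year intercensal period.
expected = expected_population(p0_a=50_000, deaths=2_500,
                               immigrants=1_200, emigrants=700)
enumerated = 49_500  # hypothetical count at ages a+10 in the second census
discrepancy = enumerated - expected
print(expected, discrepancy)
```

A positive discrepancy like this one would suggest error in at least one of the two censuses or in the recorded deaths and migration for that cohort.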

Reporting of Extreme Old Age


• Census age distributions at advanced ages, say for those 85
years old and over, suffer from serious reporting problems,
with age exaggeration in older ages generally considered to
be common.
• To avoid this problem, the last age group is usually made
open-ended.
• Example: 70+ or 80+

Methods for Correcting Age Misreporting


• While errors in an age-sex distribution are not always
matters of age misreporting, this type of error is common
for some populations. The analyst’s task is to distinguish among:
− age misreporting over parts of the age distribution,
− age misreporting over most or all of the age distribution,
− age-and-sex-selective reporting errors, and
− irregularities in an age-sex distribution due to real demographic
events.
• Smoothing techniques have frequently been used to
correct data for age misreporting.

Age Misreporting and Smoothing - Introduction


• In this part, we consider techniques for smoothing the
population age distribution when we believe there are errors in
age reporting.
• What do we hope to accomplish by smoothing?
• Reasons to smooth, and not to smooth
• Types of smoothing methods:
− Light vs. strong
− Preserve vs. modify slightly population totals
• Tips for deciding whether smoothing is needed and which
method might be most appropriate

Why Might We Want to Smooth?


Reasons to smooth:
• When ages are misreported, planning and policies that require
accurate counts by age may be affected. Examples:
− Children entering the school system
− Young males reaching military draft age
− Distribution of old-age public benefits

Why Might We Want to Smooth?


Reasons to smooth:
• Flawed age-sex structures, when projected into the future,
will also be flawed.
• Flawed age counts used as denominators in demographic
rates (e.g., child mortality) may bias those rates.

Methods for Correcting Age Misreporting


Light versus Strong
• Light (or “slight”) smoothing, which gently modifies
irregularities in the age structure
• Strong smoothing, which modifies most irregularities and
is therefore more likely to modify features which may
represent actual facts instead of errors

Smoothing Algorithms Compared


Light versus Strong
• Smoothing techniques may be lighter or stronger depending
on the formulas that construct the weighted averages:
− Light Smoothing – formulas that give the greatest weight to what
was reported for the age group in question and smallest weight to
adjacent age groups
− Strong Smoothing – formulas that give greater weight to adjacent
age counts and/or over wider age intervals. Resulting pattern does
not follow the contours of reported data as well as lighter
smoothing.

Light Smoothing Formula



Strong Smoothing Formula




Methods for Correcting Age Misreporting


• Light (or “slight”) smoothing, which gently modifies
irregularities in the age structure:
− Carrier-Farrag
− Karup-King-Newton
− Arriaga (“light” formula)
− United Nations

Methods for Correcting Age Misreporting


• Strong smoothing, which modifies most irregularities and
is therefore more likely to modify features which may
represent actual facts instead of errors:
− “Strong” Arriaga formula

Methods for Correcting Age Misreporting


• Light (or “slight”) smoothing, which gently modifies
irregularities in the age structure:
a) methods which preserve the enumerated population totals
within each 10-year age group; and
b) methods which modify the enumerated population
totals.

Methods for Correcting Age Misreporting


• Strong smoothing, which modifies most irregularities, is also
more likely to modify features which may represent actual
facts instead of errors.
− Also modifies the enumerated population totals within 10-year age
groups.

Methods for Correcting Age Misreporting


Preserve versus Modify Totals
• Light (or “slight”) smoothing, which gently modifies irregularities
in the age structure:
a) methods which preserve the enumerated population totals
within each 10-year age group:
Carrier-Farrag, Karup-King-Newton, Arriaga (“light” formula)
b) methods which modify the enumerated population totals in
10-year age groups: United Nations (and Arriaga’s strong
smoothing formula)

Methods for Correcting Age Misreporting


Preserve versus Modify Totals
• Strong smoothing, which modifies most irregularities, is also
more likely to modify features which may represent actual
facts instead of errors.
− Also modifies the enumerated population totals in 10-year age
groups (but not the overall population total): “Strong” Arriaga
formula

The Carrier-Farrag Technique


• The Carrier-Farrag technique is based on the assumption
that the relationship of a 5-year age group to its constituent
10-year age group is an average of similar relationships in
three consecutive 10-year age groups.
$${}_5P_{x+5} = \frac{{}_{10}P_x}{1 + \left({}_{10}P_{x-10} / {}_{10}P_{x+10}\right)^{1/4}} \quad\text{and}\quad {}_5P_x = {}_{10}P_x - {}_5P_{x+5}$$
where:
${}_5P_{x+5}$ represents the population at ages x+5 to x+9;
${}_{10}P_x$ represents the population at ages x to x+9; and
${}_5P_x$ represents the population at ages x to x+4.
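A sketch of the Carrier-Farrag split for one 10-year age group, given the totals of the preceding and following 10-year groups. The function name and the totals are hypothetical.

```python
def carrier_farrag_split(p10_prev, p10, p10_next):
    """Split a 10-year total into (ages x to x+4, ages x+5 to x+9)."""
    older = p10 / (1.0 + (p10_prev / p10_next) ** 0.25)
    younger = p10 - older
    return younger, older


# Hypothetical 10-year totals for three consecutive age groups.
younger, older = carrier_farrag_split(300.0, 200.0, 100.0)
print(round(younger, 2), round(older, 2))
```

The two halves always add up to the enumerated 10-year total, and when the population declines with age the younger half is the larger one.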

The Karup-King-Newton Formula


• The Karup-King-Newton formula assumes a quadratic
relationship among each three consecutive 10-year age
groups.
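$${}_5P_x = \frac{1}{2}\,{}_{10}P_x + \frac{1}{16}\left({}_{10}P_{x-10} - {}_{10}P_{x+10}\right) \quad\text{and}\quad {}_5P_{x+5} = {}_{10}P_x - {}_5P_x$$

where ${}_5P_x$ is the first of the two 5-year age groups comprising the 10-year
age group ${}_{10}P_x$.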
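The Karup-King-Newton split can be sketched in the same way. The function name and totals are hypothetical.

```python
def karup_king_newton_split(p10_prev, p10, p10_next):
    """Split a 10-year total into (ages x to x+4, ages x+5 to x+9)."""
    younger = 0.5 * p10 + (p10_prev - p10_next) / 16.0
    older = p10 - younger
    return younger, older


# Hypothetical 10-year totals for three consecutive age groups.
younger, older = karup_king_newton_split(300.0, 200.0, 100.0)
print(younger, older)
```

With these totals the split is 112.5 and 87.5, which again sums to the original 10-year total of 200.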

Arriaga’s Light Smoothing Formula


• Arriaga’s formula assumes that a second-degree polynomial
passes through the midpoints of each three consecutive 10-year
age groups and then integrates it over a 5-year age group.

Arriaga’s Light Smoothing Formula


• When the 10-year age group to be separated is the central
group of three, the following formulas (Arriaga, 1968) are
used:
$${}_5P_{x+5} = \frac{-\,{}_{10}P_{x-10} + 11\,{}_{10}P_x + 2\,{}_{10}P_{x+10}}{24} \quad\text{and}\quad {}_5P_x = {}_{10}P_x - {}_5P_{x+5}$$
where:
${}_5P_{x+5}$ is the population at ages x+5 to x+9;
${}_{10}P_x$ is the population at ages x to x+9; and
${}_5P_x$ is the population at ages x to x+4.

Arriaga’s Light Smoothing Formula


• When the 10-year age group to be separated is an extreme age
group (the youngest or the oldest), the formulas are different.
For the youngest age group, the following formulas are used:
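$${}_5P_{x+5} = \frac{8\,{}_{10}P_x + 5\,{}_{10}P_{x+10} - {}_{10}P_{x+20}}{24} \quad\text{and}\quad {}_5P_x = {}_{10}P_x - {}_5P_{x+5}$$

For the oldest age group, the coefficients are reversed:

$${}_5P_x = \frac{-\,{}_{10}P_{x-20} + 5\,{}_{10}P_{x-10} + 8\,{}_{10}P_x}{24} \quad\text{and}\quad {}_5P_{x+5} = {}_{10}P_x - {}_5P_x$$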
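Arriaga's three light-smoothing cases (central, youngest and oldest 10-year group) can be sketched together. The function names and the 10-year totals are hypothetical; each function returns the pair (ages x to x+4, ages x+5 to x+9).

```python
def arriaga_central(p10_prev, p10, p10_next):
    """Split the central group of three consecutive 10-year groups."""
    older = (-p10_prev + 11 * p10 + 2 * p10_next) / 24.0
    return p10 - older, older


def arriaga_youngest(p10, p10_next, p10_next2):
    """Split the youngest 10-year group using the two groups after it."""
    older = (8 * p10 + 5 * p10_next - p10_next2) / 24.0
    return p10 - older, older


def arriaga_oldest(p10_prev2, p10_prev, p10):
    """Split the oldest 10-year group using the two groups before it."""
    younger = (-p10_prev2 + 5 * p10_prev + 8 * p10) / 24.0
    return younger, p10 - younger


# Hypothetical 10-year totals that decline linearly with age.
totals = [300.0, 200.0, 100.0]
first = arriaga_youngest(*totals)
middle = arriaga_central(*totals)
last = arriaga_oldest(*totals)
print(first, middle, last)
```

Because the hypothetical totals decline linearly, the six resulting 5-year groups also decline in equal steps of 25, and each pair sums to its 10-year total.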



The United Nations Formula


• United Nations (Carrier and Farrag, 1959) developed the
following formula:
$${}_5P'_x = \frac{1}{16}\left(-\,{}_5P_{x-10} + 4\,{}_5P_{x-5} + 10\,{}_5P_x + 4\,{}_5P_{x+5} - {}_5P_{x+10}\right)$$
where:
${}_5P'_x$ represents the smoothed population at ages x to x+4.
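The United Nations formula is a five-term weighted average over 5-year age groups. In the sketch below the function name is illustrative, and leaving the first two and last two groups unchanged (because the formula needs two neighbours on each side) is an assumption made for this example.

```python
def un_smooth(p5):
    """Apply the five-term UN formula to a list of 5-year age-group counts."""
    smoothed = list(p5)
    for i in range(2, len(p5) - 2):
        smoothed[i] = (-p5[i - 2] + 4 * p5[i - 1] + 10 * p5[i]
                       + 4 * p5[i + 1] - p5[i + 2]) / 16.0
    return smoothed


# A linear series is left unchanged; a saw-tooth series is pulled towards its neighbours.
linear = un_smooth([100, 90, 80, 70, 60])
saw_tooth = un_smooth([100, 80, 100, 80, 100])
print(linear, saw_tooth)
```

The weights sum to 16, so a series with no irregularities passes through unaltered.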

Arriaga’s Strong Smoothing Formula


• If more aggressive smoothing is desired (Arriaga, 1968), this
can be achieved with the following formula:
$${}_{10}P'_x = \frac{{}_{10}P_{x-10} + 2\,{}_{10}P_x + {}_{10}P_{x+10}}{4}$$
where:
${}_{10}P'_x$ represents the smoothed population at ages x to x+9.
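The strong formula is a 1-2-1 weighted average over 10-year totals. In this sketch the function name is illustrative, and the first and last groups are left unchanged because they lack a neighbour on one side, which is an assumption made here. Splitting the smoothed 10-year totals back into 5-year groups would be a separate step.

```python
def arriaga_strong(p10):
    """Apply the 1-2-1 weighted average to a list of 10-year totals."""
    smoothed = list(p10)
    for i in range(1, len(p10) - 1):
        smoothed[i] = (p10[i - 1] + 2 * p10[i] + p10[i + 1]) / 4.0
    return smoothed


# Hypothetical totals with a deep trough in the middle group.
trough = arriaga_strong([300, 100, 300])
linear = arriaga_strong([300, 200, 100])
print(trough, linear)
```

The trough is raised from 100 to 200, which shows how strongly this formula can alter a 10-year total, whether the trough is an error or a real feature.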

What to Do?
• There is no generalized solution for all populations.
• The smoothing technique to be used will depend on the errors
in the age and sex distributions, and so the age structure must
be analyzed before deciding whether the smoothing should be
strong or light.
• While, as Arriaga and Associates (1994) note, differences in
results across procedures are small, a decision to use strong
smoothing should not be taken lightly.
• Recognize that the whole age distribution need not be
smoothed if only part is considered problematic.

What to Do?
• Make a graph of the age and sex distributions before making any
decision about whether or not smoothing is required and which
formula or technique would be appropriate for the particular
country's situation (Pyramid, Pyr2, GRPOP-YB).
• In general, a regular saw-tooth pattern across successive age
groups provides a good rationale for smoothing.
• Comparisons among successive censuses and a knowledge of
past trends of mortality, fertility, and migration will help in
appraising the accuracy of the reported age and sex structure of
the population.

To Smooth or Not to Smooth? - Cautions


1. Caution – Since strong smoothing may erase actual
demographic history, a decision to use it should be considered
very carefully.
2. Caution – Even if smoothing produces more plausible age
distributions, it may not improve distortions in sex ratios by
age (and vice versa).
3. Caution - The population age distribution may not need to be
fully smoothed across all age groups if only part of it is
considered problematic.
4. Caution – If underreporting exists at a particular age, instead
of smoothing, one may need “filling.”

Exercise
• Apply the various smoothing methods to the results of two
censuses from your country (if possible).
• Decide which method (if any) seems to work best for your
data.
• Does your data need smoothing at all ages?
