Dietz - SMMR - 1993
http://smm.sagepub.com/
Published by:
http://www.sagepublications.com
Additional services and information for Statistical Methods in Medical Research can be found at:
Citations: http://smm.sagepub.com/content/2/1/23.refs.html
The basic reproduction number $R_0$ is the number of secondary cases which one case would produce in a
completely susceptible population. It depends on the duration of the infectious period, the probability of
infecting a susceptible individual during one contact, and the number of new susceptible individuals
contacted per unit of time. Therefore $R_0$ may vary considerably for different infectious diseases but also for
the same disease in different populations. The key threshold result of epidemic theory associates the
outbreaks of epidemics and the persistence of endemic levels with basic reproduction numbers greater than
one. Because the magnitude of $R_0$ allows one to determine the amount of effort which is necessary either to
prevent an epidemic or to eliminate an infection from a population, it is crucial to estimate $R_0$ for a given
disease in a particular population. The present paper gives a survey of the various estimation methods
available.
1 What is $R_0$?
The concept goes back to demography where it is usually called the ’net reproduction
rate’. The head of the Statistical Office of Berlin, Richard Böckh,1 calculated in 1886
what he called the ’total propagation of the population’ (’die totale Fortpflanzung der
Bevölkerung’). Using a life-table for females of the year 1879 he summed the products of
the survival probabilities for all reproductive years between 14 and 53 and the rates of
giving birth to a girl. He concluded that 2.172 female babies would be born to a
representative woman who throughout her life would be subject to current age-specific
mortality and fertility rates. The formula for this quantity was given by Sharpe and
Lotka2 in 1911, but without symbol and name. At the time of writing this survey it is not
clear to the author where exactly Lotka introduced the notation $R_0$ and the name ’net
reproduction rate’ for the first time. In 1939 he published a summary of his contributions
to demography in French. In a comparison of the advantages and disadvantages of
various indices for natural increase he writes: ’La reproductivité nette, $R_0$, introduite
par Boeckh, a plus de mérite, car elle donne une mesure essentiellement indépendante
de la répartition de la population par âges.’ (’The net reproductivity, $R_0$, introduced
by Boeckh, has more merit, because it gives a measure that is essentially independent
of the age distribution of the population.’)
Let p(a) denote the probability of a woman surviving to age a and let $\beta(a)$ be the rate of
giving birth to a girl for an individual of age a; then
$$R_0 = \int_0^\infty p(a)\,\beta(a)\,da .$$
Address for correspondence: Professor K Dietz, Department of Medical Biometry, University of Tübingen,
Westbahnhofstr. 55, D-7400 Tübingen, Germany.
i.e. it is ’the number of female children that a female just born may expect to bear during
her reproductive life, ignoring the possibility of mortality’.4)
The index zero of $R_0$ is explained by the notation
$$R_n = \int_0^\infty a^n\,\varphi(a)\,da ,$$
i.e. it is the moment of order zero of the net maternity function $\varphi(a) = p(a)\,\beta(a)$.
The ’natural rate of increase’ of the population is usually denoted by r. It satisfies the
transcendental equation
$$\int_0^\infty e^{-ra}\,\varphi(a)\,da = 1 .$$
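As a rough numerical illustration of these two demographic quantities, the sketch below approximates the net reproduction rate by a sum of survival probability times female-birth rate over single years of age, and solves for the natural rate of increase r by bisection. It assumes that the transcendental equation meant here is the usual Euler-Lotka equation, i.e. that the integral of exp(-ra) p(a) beta(a) over all ages equals one. The survival and fertility schedules are invented for illustration (they are not the 1879 Berlin life table), and the function names are hypothetical.

```python
import math

def net_reproduction_rate(survival, fertility):
    """Sum over ages of survival probability times rate of bearing a girl."""
    return sum(p * b for p, b in zip(survival, fertility))

def natural_rate_of_increase(survival, fertility, lo=-0.5, hi=0.5, tol=1e-10):
    """Solve sum_a exp(-r*a) p(a) beta(a) = 1 for r by bisection.

    The left-hand side is strictly decreasing in r, so a sign change
    between lo and hi brackets the unique real root.
    """
    def excess(r):
        total = sum(math.exp(-r * a) * p * b
                    for a, (p, b) in enumerate(zip(survival, fertility)))
        return total - 1.0

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid          # root lies at a larger r
        else:
            hi = mid          # root lies at a smaller r
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical schedules for ages 0..59 (index = age in years).
survival = [math.exp(-0.01 * a) for a in range(60)]
fertility = [0.1 if 15 <= a < 45 else 0.0 for a in range(60)]

net_rate = net_reproduction_rate(survival, fertility)
growth_rate = natural_rate_of_increase(survival, fertility)
print(f"net reproduction rate: {net_rate:.3f}")
print(f"natural rate of increase: {growth_rate:.4f} per year")
```

With these made-up schedules the net reproduction rate exceeds one, so the solved rate of increase is positive; a net reproduction rate below one would give a negative r.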
human cases which one case could generate during the infectious period via the vector
population. The paper of Diekmann et al.,14 however, defines $R_0$ as the number of
secondary cases per generation, i.e. for host-vector diseases one would take the square
root of the classical basic reproduction number. In terms of the threshold condition
$R_0 = 1$ this makes no difference. If one defines $R_0$ like Macdonald,6 the minimum
proportion of the human population to be vaccinated, if one had a vaccine for the
prevention of a vector-borne disease, would be given by the expression
$$1 - \frac{1}{R_0} .$$
The same equations which describe the spread of malaria between humans and
mosquitoes can also be used to describe the spread of a sexually transmitted disease
(STD) in a heterosexual population. This analogy has already been noticed by Ross:17
’The venereal diseases may be looked upon as metaxenous diseases in which the two
sexes take the part of the two hosts.’
Hethcote and Yorke18 use the malaria equations of Ross for the description of the
heterosexual transmission of gonorrhea. They speak of the ’second generation contact
number’ which ’is the average number of women (second generation) adequately
contacted by men (first generation) who were adequately contacted by an average
infectious woman during her infectious period.’ If one retains the notion of $R_0$ as the
average number of secondary cases per generation, then one has to determine the
geometric mean of the numbers of secondary cases of the opposite sex generated by one
newly infected case. Similarly one would also define the basic reproduction number in
the case of vector-borne diseases as the geometric mean of the reproduction numbers per
generation.
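The difference between the two conventions can be sketched in a few lines. Assuming hypothetical one-step reproduction numbers from women to men and from men to women, the two-step number (in the sense of the ’second generation contact number’) is their product, and the per-generation number is their geometric mean. Both cross the threshold one together. The function names and numerical values below are illustrative only.

```python
import math

def two_step_reproduction_number(r_female_to_male, r_male_to_female):
    """Secondary cases of the same sex after a full female-male-female cycle."""
    return r_female_to_male * r_male_to_female

def per_generation_reproduction_number(r_female_to_male, r_male_to_female):
    """Geometric mean of the two one-step reproduction numbers."""
    return math.sqrt(r_female_to_male * r_male_to_female)

# Hypothetical one-step values, chosen only to show the threshold behaviour.
for r_fm, r_mf in [(4.0, 1.0), (4.0, 0.2), (0.5, 2.0)]:
    two_step = two_step_reproduction_number(r_fm, r_mf)
    per_gen = per_generation_reproduction_number(r_fm, r_mf)
    print(f"{r_fm} x {r_mf}: two-step {two_step:.2f}, per generation {per_gen:.2f}")
```

The same two functions apply unchanged to a host-vector cycle if the two arguments are read as the human-to-vector and vector-to-human reproduction numbers.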
The definition of the basic reproduction number does not depend on the assumptions
about the immunity after the infectious period. This means that Ro is the same for the
so-called SIS and SIR models, i.e. for models where an individual is either susceptible
again after the infection or completely immune (’resistant’).
Usually the transmission dynamics are described by a nonlinear system of differential
equations which is then linearized. Let $\kappa$ denote the number of persons contacted per
unit of time by one infectious individual. The proportion of contacted persons who are
infected given a contact is denoted by h. The rate of transfer to a non-infectious state
(either susceptible or immune) is denoted by $\gamma$. Let the initial size of the population be
denoted by $N_0$ and let the initial number of infectious individuals be equal to one. Then
the following differential equation describes the initial spread of the epidemic, where it
is assumed that $N_0$ is large enough to justify the use of differential equations. Let Y
denote the number of infectives and X the number of susceptibles; then
$$\frac{dY}{dt} = \kappa h\,\frac{X}{N_0}\,Y - \gamma Y .$$
Since X equals $N_0 - 1$ initially, we see immediately that Y will increase if and only if
$R_0 = \kappa h/\gamma$ is greater than one. The mean duration D of the infectious period equals
$1/\gamma$ and the total number of persons contacted during this period is $\kappa D$. The fraction h of
those individuals contacted will be infected. This deterministic model implicitly assumes
an exponentially distributed infectious period, and the rate of generating secondary cases
$\beta = \kappa h$ defines a Poisson process as long as the individual is infectious. This means that
the number of secondary cases per initial case has a geometric distribution:
$$P(n) = \frac{1}{1 + R_0}\left(\frac{R_0}{1 + R_0}\right)^{n}, \qquad n = 0, 1, 2, \ldots$$
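The geometric distribution can be checked numerically. The sketch below assumes the form P(n) = (1/(1 + R0)) (R0/(1 + R0))^n, which is what a Poisson process of rate beta stopped after an exponentially distributed infectious period of rate gamma produces, with R0 = beta/gamma. It compares that formula with a small simulation. The parameter values, the seed and the function names are arbitrary choices for illustration.

```python
import random

def secondary_case_pmf(n, r0):
    """Geometric probability of exactly n secondary cases, mean r0."""
    q = r0 / (1.0 + r0)
    return (1.0 - q) * q ** n

def simulate_secondary_cases(beta, gamma, rng):
    """Count Poisson(beta) events during one Exp(gamma) infectious period."""
    duration = rng.expovariate(gamma)
    elapsed = rng.expovariate(beta)      # time of the first transmission
    cases = 0
    while elapsed < duration:
        cases += 1
        elapsed += rng.expovariate(beta)  # waiting time to the next one
    return cases

beta, gamma = 0.5, 0.25                   # hypothetical rates, R0 = 2
r0 = beta / gamma
rng = random.Random(1)
draws = [simulate_secondary_cases(beta, gamma, rng) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
sim_p0 = draws.count(0) / len(draws)
print(f"simulated mean {sim_mean:.3f} vs R0 = {r0:.3f}")
print(f"simulated P(0) {sim_p0:.3f} vs formula {secondary_case_pmf(0, r0):.3f}")
```

Note that even for R0 well above one a substantial fraction 1/(1 + R0) of introduced cases produces no secondary case at all, which is why a single introduction need not start an epidemic.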
In the epidemiological literature the terms elimination and eradication are often used
synonymously. Both refer to a state of zero prevalence of the infection. But a prevalence
of zero may refer to a stable or an unstable equilibrium. Here stability refers to the result
of an introduction of an infectious case into the population.
If the zero equilibrium is stable then the introduction of one case will at most lead to a
small number of generations of secondary cases, and therefore to a return to the original
prevalence of zero. This situation can either occur naturally, without special interventions,
because of low local contact rates and infection probabilities per contact, or because
of permanent interventions like vaccination programmes which reduce the proportion
X/N of susceptibles sufficiently such that $R_0 X/N$ is less than one. Such a stable
prevalence of zero is called elimination.
If however the zero equilibrium is unstable the introduction of an infectious case will
lead to a major epidemic and potentially to a subsequent endemic state, depending on
the size of the population and the rate of introduction of new susceptibles. A prevalence
of zero would be the result of a time-limited intervention after which Ro is allowed to go
back to its original level. Such a programme is called eradication. A unique successful
example is the global smallpox eradication programme. Presently it is planned to
eradicate polio globally by the year 2000. Measles is targeted for elimination by the year
2000 in the USA and in Europe. Because of the ongoing risk of introduction of cases one
would have to continue vaccination for the maintenance of a stable prevalence of zero
until a global measles eradication programme was successful - which may mean forever.
completely immunized, i.e. the susceptibility is reduced to zero. For the remaining
fraction 1 - r, the susceptibility is reduced to the value 1 - s. If c denotes the coverage of
the vaccination programme, i.e. the proportion of the population vaccinated, then the
vaccination programme reduces the reproduction number to the value
$$R^* = R_0\,[\,1 - c + c\,(1 - r)(1 - s)\,] .$$
From this it follows that the minimum coverage for elimination or eradication is given by
$$c > \frac{1 - 1/R_0}{1 - (1 - r)(1 - s)} .$$
For many vaccines it may be appropriate to set s = 0, i.e. the vaccine has an all or
nothing effect: either the vaccine completely protects the individual (which happens
with probability r) or the individual is completely susceptible (which happens with
probability 1 - r). In the terminology of Smith et al.21 this corresponds to a ’Model 2’
vaccine. The other extreme assumes r = 0 such that all vaccinated individuals have the
same reduced susceptibility 1 - s. Such a model may be appropriate for inactivated polio
vaccines. The corresponding assumption is referred to as ’Model 1’. If r and s are smaller
than 1 then the lower bound for the coverage may be greater than 1, which would mean
that even for 100% coverage eradication or elimination is not possible. As the lower
bound given above shows, this conclusion depends very much on the value of the basic
reproduction number.
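The coverage bound can be explored with a short calculation. The sketch assumes that a vaccinated individual retains relative susceptibility (1 - r)(1 - s) on average, so that coverage c lowers the reproduction number to R0 [1 - c + c (1 - r)(1 - s)] and the minimum coverage is (1 - 1/R0) / (1 - (1 - r)(1 - s)). The function names and the example values are hypothetical.

```python
import math

def reproduction_number_under_vaccination(r0, coverage, r, s):
    """R* when a fraction `coverage` is vaccinated with parameters r and s."""
    residual = (1.0 - r) * (1.0 - s)      # mean susceptibility left after vaccination
    return r0 * (1.0 - coverage + coverage * residual)

def minimum_coverage(r0, r, s):
    """Smallest coverage giving R* = 1; values above 1 mean 'not achievable'."""
    protection = 1.0 - (1.0 - r) * (1.0 - s)
    if protection <= 0.0:
        return math.inf                   # a vaccine without effect never suffices
    return (1.0 - 1.0 / r0) / protection

# 'Model 2' (all or nothing, s = 0) and 'Model 1' (leaky, r = 0) examples.
for label, r0, r, s in [("Model 2", 4.0, 0.9, 0.0),
                        ("Model 1", 4.0, 0.0, 0.9),
                        ("Model 2", 10.0, 0.8, 0.0)]:
    c_min = minimum_coverage(r0, r, s)
    verdict = "feasible" if c_min <= 1.0 else "not feasible even at 100% coverage"
    print(f"{label}, R0 = {r0}: minimum coverage {c_min:.3f} ({verdict})")
```

In this simple homogeneous setting only the product (1 - r)(1 - s) matters, so the two model types give the same bound when their average protection is the same.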
2.3 Chemotherapy or reduction of the contact rate
If we consider a control programme which reduces either the duration of the
infectious period by chemotherapy or the contact rate by changes in behaviour then the
reproduction number R* is given by the following formula
$$R^* = \frac{R_0}{r_\kappa\, r_D} ,$$
where $r_\kappa$ and $r_D$ denote the reductions for the corresponding parameters indicated in the
index. For elimination or eradication R* has to be less than one. Therefore the product
$r_\kappa r_D$ has to be greater than $R_0$. This inequality clearly shows that $R_0$ can be interpreted
as the minimum absolute elimination or eradication effort, if we are dealing with a
homogeneous population and a control method which affects everybody in a non-selective
way. This means that the minimum proportional reduction of the susceptible
fraction is given by
$$1 - \frac{1}{R_0} ,$$
which is a highly non-linear function of $R_0$. For $R_0 > 10$, control programmes have to be
nearly perfect, requiring a reduction of the parameters by more than 90%, whereas for
values of $R_0$ between 1 and 2 a reduction of the parameters by less than 50% will already
be successful. This non-linear relationship between the minimum proportional reduction
in the transmission factors $\kappa$ and D, and the basic reproduction number $R_0$, is the
key to understanding the puzzle that apparently the same control programme may be
successful in one situation and not at all in other situations.
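The non-linearity is easy to tabulate. Assuming the minimum proportional reduction is 1 - 1/R0, which is what the condition R* < 1 implies when the control effort divides R0 by a common factor, the following sketch prints the required reduction for a few hypothetical values of R0. The function name is illustrative.

```python
def minimum_proportional_reduction(r0):
    """Smallest proportional reduction that brings the reproduction number to 1."""
    if r0 <= 1.0:
        return 0.0            # already at or below the threshold
    return 1.0 - 1.0 / r0

for r0 in (1.5, 2.0, 5.0, 10.0, 20.0):
    reduction = minimum_proportional_reduction(r0)
    print(f"R0 = {r0:5.1f}: reduce transmission by at least {reduction:.0%}")
```

Going from an R0 of 1.5 to an R0 of 2 raises the required reduction by about 17 percentage points, whereas going from 10 to 20 raises it by only 5, which illustrates why the same programme can succeed in one setting and fail in another.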
For the evaluation of control strategies it is not only important to estimate by how
much Ro exceeds 1 but also to determine how sensitive Ro is with respect to changes of
individual parameters entering into a formula of Ro . In the simple formula given above
this is straightforward. But if individual parameters enter in a nonlinear way, then
important conclusions may be drawn about the relative effect of different control
strategies. We shall illustrate this point in the next section in comparing larvicides versus
adulticides for malaria control, and use of condoms versus reduction of the number of
partners in the prevention of STDs.
The last parameter is sometimes broken down into two factors, one describing infectivity and
one describing the susceptibility.22 Obviously if all individuals are alike then one
cannot identify the individual components of this product. If however there is
heterogeneity with respect to infectivity and susceptibility, then the ratios between the
values in the corresponding subgroups may be estimated.
The estimation of Ro on the basis of the individual parameters will be particularly
relevant in populations where the infection is absent and there is a problem of assessing
the risk of an epidemic. It is obvious that an attempt to estimate $R_0$ is only meaningful
for diseases where contacts are clearly defined such that they can be counted. This
excludes whole classes of infectious diseases like measles and cholera, where the spread
is either airborne or due to contamination of food and/or water. Because of this
limitation, attempts to determine $R_0$ directly from individual parameters have been
restricted, on the one hand, to vector-borne infections, based on estimates of the number
of human bloodmeals which one vector takes per unit of time and the number of vectors
contacting one human host per unit of time; and, on the other hand, to STDs, based on
estimates of the number of new partners per person per unit of time and the number of
contacts per partner.
probability of choosing man as a host. In the mosquito there is a latency period $I_2$, the
so-called extrinsic cycle, which is necessary for the development of sporozoites after the
mosquito was infected by gametocytes. Because of the high death rate $\mu_2$ only a small
fraction $\exp(-\mu_2 I_2)$ survives this latency period. Let $D_1$ be the duration of the
infectious period in the human host and $D_2$ be the duration of the infectious period in the
vector, and let $h_1$ and $h_2$ be the probabilities per bloodmeal of infecting a susceptible
human host by an infectious mosquito and vice versa, respectively. Putting all these
parameters together one arrives at the following formula
*Dye C. The analysis of parasite transmission by bloodsucking insects. Annu Rev Entomol 1992; 37: 1-19.
Larvicides would reduce the emergence rate $V_2$ and adulticides would reduce the life
expectancy $D_2$ of the vector. The corresponding reductions are denoted by $r_{V_2}$ and $r_{D_2}$.
We can express R* on the one hand as a function of $r_{V_2}$ and alternatively as a function of
$r_{D_2}$. Because the two expressions have to be equal, one can derive the following formula
which expresses $r_{V_2}$ in terms of $r_{D_2}$:
All parameters which are not affected by the two control methods can be cancelled. It is
clear that the situation without control ($r_{V_2} = r_{D_2} = 1$) must be a solution of this equation.
But we also see that the reduction in mosquito emergence through larvicides $r_{V_2}$ which
would bring about the same value of R* as a given reduction in mosquito survival
described by $r_{D_2}$ is much greater. For $r_{D_2} = 2$ we obtain $r_{V_2} \approx 10$ and for $r_{D_2} = 4$ we have
cost and effectiveness of the intervention methods by which we can affect the various
factors.’
2 ’Models commonly assume more uniformity than there is in reality, and in terms of
control this usually leads to exaggerated optimism.’
Also the number of contacts during the infectious period may undergo a disease-induced
reduction. Therefore the present estimates for Ro in the papers quoted above have to be
considered very tentatively. New models have to be developed which would allow one to
derive estimates of Ro if one had the appropriate data. Also, further data are needed to
improve estimates of the infection probability per contact and the duration of the
infectious period. The models for Ro also need extension in terms of behavioural
variables such that alternative interventions can be compared with respect to their
impact in reducing Ro .
Models which take partnership duration into account produce highly nonlinear
formulae with respect to the parameters describing HIV transmission. For constant
infection probability per contact h one gets the following formula:
where
and
Here N is the number of new partners during the infectious period and hp is the infection
probability per partner. The individual parameters have the following definitions (in the
order of appearance in the formula for N):
The formula for the infection probability per partner can be considerably simplified if
one introduces the probability generating function f(x) for the number of contacts per
partner:
where c is the average number of contacts per partner. The expected number of contacts
of an infected partner with a susceptible partner is given by
because the average duration of such a partnership is $(\mu_0 + \mu_1 + a)^{-1}$. Let C denote the
total number of expected contacts with susceptible partners during the infectious period:
This simple formula clearly shows how $R_0$ depends jointly on the number of partners
and the number of contacts during the infectious period. If C = N, i.e. if only one
For a given number of total contacts C, this formula provides a lower bound for the
number of partners for Ro > 1:
where
more effective in reducing R* than the same reduction of h depends on q. For q > 1, i.e.
for N > hC, the best advice would be to reduce h, i.e. to use condoms. For q < 1, i.e. for
N < hC, a given proportional reduction of the number of partners N would reduce $R_0$
more than a similar reduction in h. In applying this conclusion the two caveats of
Molineaux30 given in Section 3.1 should be kept in mind.
Here D is the duration of the infectious period and td is the initial doubling time.
Cairns42 applies a more detailed model to establish a relationship between the initial rate
of growth and Ro which is now no longer explicitly determined because it has to be
calculated as the eigenvalue of a certain matrix, taking into account the heterogeneity in
the contact rates and variable infectivity.
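For the homogeneous case, a relation between the initial doubling time and the basic reproduction number can be sketched from the linearized model alone. If the number of infectives initially grows exponentially at rate (R0 - 1)/D, the doubling time is td = D ln 2 / (R0 - 1), so that R0 = 1 + D ln 2 / td. The code below is a minimal illustration of that conversion under these assumptions (exponentially distributed infectious period, no latent period, homogeneous mixing). It is not claimed to be the exact formula used in the papers cited here, and the function names and example numbers are hypothetical.

```python
import math

def r0_from_doubling_time(infectious_period, doubling_time):
    """R0 = 1 + D * ln(2) / td for the simple homogeneous model."""
    growth_rate = math.log(2.0) / doubling_time
    return 1.0 + growth_rate * infectious_period

def doubling_time_from_r0(infectious_period, r0):
    """Inverse relation; only defined above the threshold R0 = 1."""
    if r0 <= 1.0:
        raise ValueError("no exponential growth for R0 <= 1")
    return infectious_period * math.log(2.0) / (r0 - 1.0)

# Hypothetical example: infectious period of 10 time units, doubling every 14.
example_r0 = r0_from_doubling_time(infectious_period=10.0, doubling_time=14.0)
print(f"R0 is roughly {example_r0:.2f}")
```

Both arguments must be expressed in the same time unit. With heterogeneous contact rates or variable infectivity this closed form no longer applies, which is the situation treated by the more detailed eigenvalue calculation.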
This means for the values reported by Smith the estimates of $R_0$ would be between 1.4
and 1.6, with the consequence that immunizing about 38% of the population would have
been sufficient to prevent the epidemic. (If one takes the inverse of the proportion of
susceptibles one would require immunizing about 2/3 of the population.) It should be
stressed however that the formula given above assumes that the total population was
susceptible at the beginning of the epidemic. In general one has the following formula:43
$$R_0 = \frac{\ln u_0 - \ln u_\infty}{u_0 - u_\infty} ,$$
where $u_0$ denotes the initial proportion of susceptibles and $u_\infty$ the proportion of
susceptibles at the end of the epidemic. If for example due to previous
epidemics already 25% of the population were immune, i.e. $u_0 = 0.75$, then $R_0$
would be 1.9. The differences of the various estimates do not appear to be large, but one
has to keep in mind that small differences of $R_0$ between 1 and 2 imply important
differences of the resulting prevalence.
In 1967 an outbreak of smallpox occurred in a socially isolated community in
Abakaliki in south-eastern Nigeria. It was investigated by Thompson and Foege.44 It
would be interesting to find out more details about this epidemic because starting with
Bailey and Thomas45 many statistical analyses of this data set have been published (see
Becker46 for references). Bailey and Thomas45 list the number of days between the 30 cases.
Part of the population consisted of 120 members of a religious group who in principle
refused to be vaccinated. If one assumes that initially all individuals except the
introductory case were susceptible then the formula given above yields an estimate of
1.15 for $R_0$. Bailey and Thomas45 mention however that ’about a quarter of them
appeared from scars in fact to have been vaccinated at some time or other’. This suggests
that the initial proportion of susceptibles was 75%, which would bring the final
proportion of susceptibles to 50%, which corresponds to an $R_0$ of 1.62. This religious group
lived together with other individuals who did not belong to this group and who were
vaccinated. A re-analysis of this epidemic, based on the original report, is in preparation.
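The two Abakaliki estimates can be reproduced with a few lines of code. The sketch assumes the final-size relation R0 = (ln u0 - ln u_inf) / (u0 - u_inf), where u0 and u_inf are the proportions susceptible before and after the epidemic. This form is consistent with both numbers quoted in the text (30 cases among 120 members). The function names are hypothetical.

```python
import math

def r0_from_final_size(u_initial, u_final):
    """Final-size estimate from initial and final susceptible proportions."""
    if not 0.0 < u_final < u_initial <= 1.0:
        raise ValueError("need 0 < u_final < u_initial <= 1")
    return (math.log(u_initial) - math.log(u_final)) / (u_initial - u_final)

def critical_immunization_fraction(r0):
    """Fraction to immunize so that the reproduction number falls to 1."""
    return max(0.0, 1.0 - 1.0 / r0)

# 30 of 120 infected: 25% of the group.
all_susceptible = r0_from_final_size(1.00, 0.75)       # nobody immune at the start
quarter_vaccinated = r0_from_final_size(0.75, 0.50)    # a quarter already immune

print(f"all susceptible initially: R0 = {all_susceptible:.2f}")
print(f"a quarter vaccinated:      R0 = {quarter_vaccinated:.2f}")
print(f"critical fraction for the second estimate: "
      f"{critical_immunization_fraction(quarter_vaccinated):.0%}")
```

The same attack rate therefore yields a noticeably larger estimate once prior immunity is taken into account, because the epidemic then had fewer susceptibles to work with.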
The survey paper by Becker 47 in the present issue contains a description of his
estimation approach for Ro using martingale theory.
where $\mu_2$ and $I_2$ have the same meaning as above and $Y_2$ is the equilibrium prevalence of
mosquitoes with sporozoites in their salivary glands, i.e. the proportion of infectious
Dietz51,52 provides a system of formulae which relate the equilibrium variables of the
prevalence of infection in the human and the vector host and the force of infection with
the square of the basic reproduction number. The five models differ with respect to their
assumptions about density dependent regulation:
Model I: No superinfections in man and mosquito.
From the formula for $R_0$ based on the force of infection one can see that it corresponds
to Model II in the list given above. Najera27 used these formulae with data collected in a
field project in Nigeria, which was carried out to test whether malaria could be
eradicated in Africa with a combination of spraying DDT and mass drug administration.
For example, in the treated zone for the period mid-July to mid-September 1967 he gets
the following three estimates for Ro :
1.02 (based on the proportion of infectious mosquitoes);
2.88 (based on the force of infection and the proportion infectious in the human population);
135.77 (based on the vectorial capacity).
It is no surprise that the first two estimates differ because they are based on different
assumptions about the density-dependent regulation. The first one is based on Model III
and the second one on Model II.
The observation that estimates of Ro based on prevalences in equilibrium situations in
the human and the vector host tend to be smaller than expected from direct observations
of the transmission intensity has also been discussed by Barbour.53 He considers a model
for schistosomiasis which mathematically speaking is equivalent to Model III given
above. According to this model the basic reproduction number is only determined by the
proportion of susceptible snails. Since the proportions of infected snails are usually in the
order of 1 to 10% the corresponding values for $R_0$ are only slightly above 1. Because this is
considered to be unrealistic, Barbour takes into account immunity in the human host and derives
equations which would allow one to estimate $R_0^2$ by the inverse of the product of the
proportion of susceptible human hosts and susceptible vectors, suitably modified to
adjust for temporary immunity. By using data from the field he obtains an estimate of
$R_0^2 = 6.5$. Similar attempts to estimate $R_0^2$ for other vector-transmitted diseases have
been carried out by Rogers54 for the African trypanosomiases and by Hasibeder et al.55
for canine leishmaniasis. The particular emphasis of the second paper is the development
of methods for assessing the influence of heterogeneous biting rates of sandflies on
dogs. Formulae are derived which allow one to estimate $R_0$ if the proportion of
susceptibles in the various subgroups and the ratios of the contact rates between the
subgroups are known. The relationship between the equilibrium prevalence and the
heterogeneity in contact rates has also been investigated
by Dietz.51 Assuming that there is a gamma distribution of contact rates in the human
population and that vectors choose hosts randomly proportional to their contact rates
(which is called ’proportional mixing’ in the epidemiological literature), the
prevalence in equilibrium always decreases as the variance increases. If $\sigma^2$ denotes the
variance of the contact rate it is shown that the slope of the prevalence with respect to the
square of the basic reproduction number equals $(1 + 2\sigma^2)^{-1}$, i.e. it decreases for
increasing variance. This means that values for $R_0$ which are based on the assumption of
homogeneous mixing, and which use the equilibrium proportion of susceptibles,
invariably underestimate $R_0$.
6.2 Estimation of Ro based on age-specific prevalence data
In the epidemiology of infectious diseases the data available are often cross-sectional
surveys of populations with the aim of determining the age specific prevalence of
antibodies with respect to a particular infection indicating past infection. This means
that one has only censored data in the statistical sense, i.e. one only knows that for a
certain age of an individual the age of infection is less than the current age if antibodies
are present, and that the age of infection is greater than the current age if antibodies are
absent. Such age specific prevalence data show a typical increase which can be described
by so-called catalytic models according to Muench.56 Muench introduced the term ’force
of infection’ for the hazard function of a susceptible to be infected in an endemic
situation. Dietz12 showed that this hazard rate can be related to the basic reproduction
number in the following way:
$$R_0 = 1 + \frac{\lambda}{\mu} ,$$
where $\mu$ denotes the death rate of the population and $\lambda$ is the force of infection. In this
model it is assumed that there is a constant death rate and an age-independent force of
infection. If one calculates the average age A at infection one gets the following
expression:
$$A = \frac{1}{\lambda + \mu} .$$
If one denotes the life expectancy of a human host by $e_0 = \mu^{-1}$ then the formula for the
basic reproduction number becomes
$$R_0 = \frac{e_0}{A} .$$
This means that the basic reproduction number $R_0$ can simply be estimated by the ratio
of the life expectancy and the average age at infection. In the original paper Dietz12
denoted the average age at infection by $\lambda^{-1}$, which yields the formula $R_0 = 1 + e_0/A$.
Usually $\mu$ is negligible compared to $\lambda$ such that the difference can be neglected.
Anderson and May 57 provide numerous estimates of Ro based on this formula for
common childhood diseases in industrialized and developing countries.
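The ratio estimate is simple enough to verify directly. The sketch assumes a constant death rate mu and an age-independent force of infection lambda, so that R0 = 1 + lambda/mu, the average age at infection is A = 1/(lambda + mu), and the life expectancy is e0 = 1/mu. The numerical values are hypothetical and are not taken from any of the cited data sets.

```python
def r0_from_rates(force_of_infection, death_rate):
    """R0 = 1 + lambda / mu."""
    return 1.0 + force_of_infection / death_rate

def average_age_at_infection(force_of_infection, death_rate):
    """A = 1 / (lambda + mu): mean age at infection among those infected."""
    return 1.0 / (force_of_infection + death_rate)

def r0_from_average_age(life_expectancy, average_age):
    """R0 = e0 / A."""
    return life_expectancy / average_age

# Hypothetical setting: life expectancy 70 years, average age at infection 5 years.
death_rate = 1.0 / 70.0
force_of_infection = 1.0 / 5.0 - death_rate

age = average_age_at_infection(force_of_infection, death_rate)
print(f"average age at infection: {age:.1f} years")
print(f"R0 from the rates:        {r0_from_rates(force_of_infection, death_rate):.1f}")
print(f"R0 from e0 / A:           {r0_from_average_age(70.0, age):.1f}")
# If A is instead taken as 1/lambda, the estimate becomes 1 + e0 * lambda.
print(f"R0 with A = 1/lambda:     {1.0 + 70.0 * force_of_infection:.1f}")
```

All three routes give the same value here because they are algebraically identical. They only differ in which observable, the force of infection or the average age at infection, is treated as the measured quantity.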
In general, for age-dependent mortality and age-specific force of infection, one gets
the following formula for the average age A at infection:
where
and
This shows that Ro can no longer be simply estimated by the inverse of the proportion of
susceptibles in the population. This formula leads to the statistical problem of estimating
the force of infection on the basis of a data set where all observations are censored.
Grenfell and Anderson61 use the maximum-likelihood method for some polynomial
function describing the age-specific force of infection.
Recently, Keiding62 presented a nonparametric method to estimate the age-specific
prevalence and the corresponding smoothed force of infection. He applies this method to
a data set for hepatitis A in Bulgaria and obtains $R_0 = 3.8$.
7 Concluding remarks
The present review tries to give a summary of attempts to define Ro in a meaningful way,
to derive estimates of Ro based on epidemic and endemic situations and to interpret Ro
for the evaluation of control strategies. During the last twenty years Ro has emerged as a
basic concept in infectious disease epidemiology, but it has also become apparent how
difficult it is to apply in actual field situations. It is hoped that this survey will stimulate
further research in this direction.
Acknowledgements
This work has been supported in part by SIMS (Societal Institute of the Mathematical
Sciences) under support from the US National Institute on Drug Abuse (NIDA grant
DA 04722).
I thank Christopher Dye, Chin Long Chiang, Paul Fine, Helmut Knolle, Bob May
and particularly Herb Hethcote for useful comments on the first draft of this paper, and
Michael Haber for his help in locating the report on the smallpox epidemic in Abakaliki.
The article is based on a paper presented at the ASA meeting in Atlanta, 1991.
for Industrial and Applied Mathematics, 1975: 122-31.
14 Diekmann O, Heesterbeek JAP, Metz JAJ. On the definition and the computation of the basic reproduction ratio $R_0$ in models for infectious diseases in heterogeneous populations. Journal of Mathematical Biology 1990; 28: 365-82.
15 Elandt-Johnson RC. Definition of rates: some remarks on their use and misuse. American Journal of Epidemiology 1975; 102: 267-71.
16 Nåsell I. Hybrid models of tropical infections. Berlin: Springer, 1985.
17 Ross R. The prevention of malaria. Second edition with an addendum on the theory of happenings. London: John Murray, 1911: 685.
18 Hethcote HW, Yorke JA. Gonorrhea: transmission dynamics and control. Berlin: Springer, 1984: 53.
19 Svensson Å. Analyzing effects of vaccines. Mathematical Biosciences 1991; 107: 407-12.
20 Greenland S, Frerichs RR. On measures and models for the effectiveness of vaccines and vaccination programmes. International Journal of Epidemiology 1988; 17: 456-63.
21 Smith PG, Rodrigues LC, Fine PEM. Assessment of the protective efficacy of vaccines against common diseases using case-control and cohort studies. International Journal of Epidemiology 1984; 13: 87-93.
22 Scalia-Tomba G. Asymptotic final size distribution of the multitype Reed-Frost process. Journal of Applied Probability 1986; 23: 563-84.
23 Macdonald G. Theory of the eradication of malaria. Bulletin of the World Health
29 Bradley DJ. Epidemiological models - theory and reality. In: Anderson RM ed. The population dynamics of infectious diseases: theory and applications. London: Chapman and Hall, 1982: 324.
30 Molineaux L. The pros and cons of modelling malaria transmission. Transactions of the Royal Society of Tropical Medicine and Hygiene 1985; 79: 743-47.
31 Bonneux L, Howeling H. Is een epidemie van heteroseksueel overgedragen HIV-infectie mogelijk in Europa? Nederlands Tijdschrift voor Geneeskunde 1989; 133: 1922-26.
32 Blower SM, Anderson RM, Wallace P. Loglinear models, sexual behavior and HIV: epidemiological implications of heterosexual transmission. Journal of AIDS 1990; 3: 763-72.
33 Stigum H, Grønnesby JK, Magnus P, Sundet JM, Bakketeig LS. The potential of spread of HIV in the heterosexual population in Norway: a model study. Statistics in Medicine 1991; 10: 1003-23.
34 Knolle H. Age preference in sexual choice and the basic reproduction number of HIV/AIDS. Biometrical Journal 1990; 32: 243-56.
35 May RM, Anderson RM. Transmission dynamics of HIV infection. Nature 1987; 326: 137-42.
36 Peterman TA, Stoneburner RL, Allen JR, Jaffe HW, Curran JW. Risk of HIV transmission from heterosexual adults with transfusion-associated infections. Journal of American Medical Association 1988; 259: 55-63.
37 Kaplan EH. Modelling HIV infectivity: Must sex acts be counted. Journal of AIDS 1990; 3: