
Editorial ajog.org

Standard vs population reference curves in obstetrics: which one should we use?
Cande V. Ananth, PhD, MPH; Justin S. Brandt, MD; Anthony M. Vintzileos, MD

Abnormal fetal growth shapes disease risks later in the perinatal period, at infancy, and in childhood stages through chronic diseases and even death later in life.1 Identification of fetal growth abnormalities, at both extremes of growth and particularly at the lower threshold of fetal size, continues to be debated vigorously.2 There are at least 2 reasons that have fueled this ongoing debate. First, fetuses grow at different growth velocities, and divergence in growth is a phenomenon that is relatively well-observed at later gestations. It is therefore important that the heterogeneity in any given biometric parameter be taken into consideration when the percentile distributions are estimated. Fortunately, cutting-edge advances in the biostatistical modeling literature have paved ways to address this effectively.3,4 Second, and arguably a critical issue, is the selection of the right growth curve on which fetal size is assessed. After all, the cohort composition of all growth charts is not the same, and the prevalence of abnormal fetal growth varies across different growth charts.2,5,6

In this issue of the American Journal of Obstetrics and Gynecology, Hoftiezer et al7 develop a prescriptive (standard) birthweight chart based on approximately 1.6 million well-dated, singleton infants who were born in the Netherlands (2000–2014). The birthweight charts were constructed after the exclusion of preexisting maternal high-risk conditions and high-risk conditions that developed later in pregnancy. The charts were derived from infants born at 23–42 weeks gestation to “healthy” mothers after uncomplicated pregnancies and spontaneous onset of labor (ie, low-risk subjects). The authors concluded that their “standards” resemble the fetal weight “reference” charts and have greater ability to discriminate between normal and abnormal birthweights.7

What is a “standard,” and how does it differ from a “reference” curve? In our view, authors and readers poorly appreciate their distinctions. The goal of this Editorial is focused exclusively on clarification of the similarities and differences between standard and population reference curves and highlighting the strengths and weaknesses of each approach.

A note on terminology: standard vs (population) reference curves
A growth “standard” is constructed by the selection of only a part of the population, often those with no complications or with normal outcomes (ie, “low-risk” subjects). In contrast, a “reference” (population) curve is based on an unselected group of subjects and combines both low- and high-risk subjects and cases with normal and abnormal outcomes, thus being equated with a “population” curve. We also must define the commonly used terms “nomogram” and “normogram.” The term nomogram is a mathematical term that was derived from the Greek words “νόμος,” which means “law,” and “γραμμή,” which means “line,” and describes how the data points can be used (https://www.merriam-webster.com/dictionary/nomogram-SEE), which has nothing to do with the nature of the studied population. The term normogram, which is derived from the word “normal,” is a graph that depicts the distribution or range of normal values, regardless of the type of population studied. Therefore, both terms can be applied to both standard and reference population curves. The clinical implications of the use of standard vs reference curves depend on the specific clinical scenarios of interest. Thus, an understanding of the pros and cons for each approach is of paramount importance. A glossary of various terms related to the assessment of size and growth is described in the Table.
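
The distinction between the two kinds of curve can be made concrete with a small simulation. What follows is a minimal sketch and not the method of any cited study: the cohort, the 20% high-risk fraction, and the birthweight means and standard deviation are invented for illustration, and the function names are hypothetical. It derives a 10th percentile cutoff twice, once from the low-risk subset only (a “standard”) and once from the whole cohort (a “reference”), and then applies both cutoffs to the unselected cohort.

```python
import random
import statistics


def simulate_births(n=20000, high_risk_fraction=0.2, seed=0):
    """Return (birthweight_g, is_high_risk) pairs for a hypothetical term cohort."""
    rng = random.Random(seed)
    births = []
    for _ in range(n):
        high_risk = rng.random() < high_risk_fraction
        # Assumed, illustrative distributions; not estimates from any real cohort.
        mean_g = 3150.0 if high_risk else 3450.0
        births.append((rng.gauss(mean_g, 450.0), high_risk))
    return births


def tenth_percentile(weights):
    # quantiles(n=10) returns 9 cut points; the first one is the 10th percentile.
    return statistics.quantiles(weights, n=10)[0]


births = simulate_births()
all_weights = [w for w, _ in births]
low_risk_weights = [w for w, high_risk in births if not high_risk]

reference_cutoff = tenth_percentile(all_weights)      # unselected population
standard_cutoff = tenth_percentile(low_risk_weights)  # selected low-risk subset


def sga_prevalence(cutoff):
    """Fraction of the whole (unselected) cohort that falls below the cutoff."""
    return sum(w < cutoff for w in all_weights) / len(all_weights)


print(f"reference cutoff: {reference_cutoff:.0f} g, "
      f"prevalence below it: {sga_prevalence(reference_cutoff):.3f}")
print(f"standard cutoff:  {standard_cutoff:.0f} g, "
      f"prevalence below it: {sga_prevalence(standard_cutoff):.3f}")
```

Under these assumed distributions the standard-derived cutoff is the higher of the two, so it labels more than 10% of the unselected cohort as small, whereas the reference-derived cutoff labels about 10% by construction.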
The pros and cons of the use of standard curves
We illustrate a few key considerations that guide the construction of a standard curve. First, given the exclusion of high-risk subjects, as was done in the study by Hoftiezer et al,7 an important consideration is the potential for applicability of the nomogram to the general population. These standard curves are developed for a specific purpose, and the clinical impact of the application to pregnancies outside of their intended study population remains uncertain. Extreme caution should be exercised when such standard curves are applied clinically to birthweight references. This was the conclusion in a large Canadian study8 that compared the INTERGROWTH-21st standards9–11 to Canadian birthweight reference curves and the conclusion of others12 who have cautioned regarding the premature adoption of the standards.

From the Division of Epidemiology and Biostatistics (Dr Ananth) and the Division of Maternal-Fetal Medicine (Dr Brandt), Department of Obstetrics, Gynecology, and Reproductive Sciences, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ; the Environmental and Occupational Health Sciences Institute, Rutgers Robert Wood Johnson Medical School, Piscataway, NJ (Dr Ananth); and the Department of Obstetrics and Gynecology, NYU Winthrop Hospital, NYU Long Island School of Medicine, Mineola, NY (Dr Vintzileos).
Received Feb. 18, 2019; revised Feb. 18, 2019; accepted Feb. 27, 2019.
The authors report no conflict of interest.
Corresponding author: Cande V. Ananth, PhD, MPH. ananthcv@rwjms.rutgers.edu
0002-9378/free
© 2019 Elsevier Inc. All rights reserved.
https://doi.org/10.1016/j.ajog.2019.02.060
Related article, page 383.
APRIL 2019 American Journal of Obstetrics & Gynecology 293
TABLE
Glossary of terms to define growth and size

Term: Definition

Reference: Statistical summary of the frequency distribution of fetal size of a reference population (descriptive); this descriptive depiction of fetal size across gestational age is based on an unselected population.

Standard: Nomogram of the frequency distribution of fetal size of an ideal population (prescriptive); this prescriptive depiction of fetal size across gestational age is based on a selected, low-risk population and reflects aspirational fetal size.

Nomogram: A graphic representation that consists of several lines marked off to scale and arranged in such a way that, by the use of a straightedge to connect known values on 2 lines, an unknown value can be read at the point of intersection with another line (Merriam Webster’s definition); a nomogram can be applied to both standards and references.

Growth chart: Reference or standards of fetal size, not velocity; the data may be based on cross-sectional or longitudinal ascertainment of ultrasound or birthweight data.

Descriptive chart: Reference of fetal size that is based on a specific unselected population.

Prescriptive chart: Standard of fetal size that is based on a selected, low-risk population.

Birthweight chart: Reference of newborn infant size based on birthweights that are assumed to correlate with gestational age at delivery; because preterm infants are more likely to be pathologically small, birthweight-for-gestational-age charts generally underdiagnose small for gestational age at preterm gestations.

Ultrasound-based chart: Reference or standards of fetal size based on sonographic biometric parameters of fetal size that correlate with birthweight.

Individualized chart: Individualized assessment of fetal size based on early ultrasound assessments and projected trajectory of fetal growth.

Customized chart: Statistical summary of the frequency distribution of fetal size that incorporates maternal and fetal physiologic parameters (such as maternal race, parity, body mass index, and fetal sex).

Size: Quantitative assessment of fetal size or estimated weight at a specific gestational age (usually from cross-sectional assessments).

Velocity: Quantitative assessment of the change in fetal size over time, which reflects fetal growth (usually from a longitudinal study).

Growth: Quantitative assessment of velocity (rate of change) that is ascertained longitudinally.

Fetal growth restriction: Estimate of fetal size at a specific gestational age that is below a predefined threshold (usually the bottom percentile) based on a specific reference or nomogram; although this distinction is intended to identify fetuses who are at risk for adverse perinatal outcomes, some of these fetuses are not at risk for these complications; the prevalence of fetal growth restriction is dependent on the reference or nomogram.

Small for gestational age: Birthweight at a specific gestational age that is below a predefined threshold (usually the bottom percentile) based on a specific reference or nomogram; although this distinction is intended to identify neonates who are at risk for adverse neonatal outcomes, some small-for-gestational-age neonates are not at risk for these complications; the prevalence of small for gestational age is dependent on the reference or standard.

Pathologic small size: Distinction of small fetal size associated with adverse perinatal outcomes; optimally, only these fetuses would be identified as fetal growth restricted; however, in clinical practice, some may be characterized as normally grown and not exposed to antenatal surveillance.

Constitutional small size: Distinction of small fetal size that denotes normal fetal size; optimally, none of these fetuses should be identified as fetal growth restriction because they are not at risk for adverse perinatal outcomes; however, in clinical practice, some of these fetuses may be characterized as fetal growth restricted and exposed to potential, iatrogenic risks that are associated with false-positive antenatal surveillance and associated interventions.

Ananth. Standard vs population reference curves in obstetrics. Am J Obstet Gynecol 2019.
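
The glossary entries for pathologic and constitutional small size describe, in effect, the missed cases and the false-positive results of a size-based test. The sketch below illustrates how the choice of chart moves those counts for a fixed 10th percentile cutoff. It rests on loud assumptions: the cohort is simulated, the 10% pathologic fraction and the 600 g deficit are invented, the “standard” is built from births without the simulated pathology (an idealized stand-in for a low-risk selection), and all function names are hypothetical.

```python
import random
import statistics


def simulate_cohort(n=20000, pathologic_fraction=0.10, seed=1):
    """Return (birthweight_g, is_pathologic) pairs for a hypothetical cohort."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        pathologic = rng.random() < pathologic_fraction
        weight = rng.gauss(3450.0, 450.0)  # assumed constitutional weight
        if pathologic:
            weight -= 600.0                # assumed deficit from impaired growth
        cohort.append((weight, pathologic))
    return cohort


def tenth_percentile(weights):
    # quantiles(n=10) returns 9 cut points; the first one is the 10th percentile.
    return statistics.quantiles(weights, n=10)[0]


def sensitivity_specificity(cohort, cutoff):
    """Treat 'weight below cutoff' as a positive test for pathologic small size."""
    tp = sum(1 for w, p in cohort if p and w < cutoff)
    fn = sum(1 for w, p in cohort if p and w >= cutoff)
    tn = sum(1 for w, p in cohort if not p and w >= cutoff)
    fp = sum(1 for w, p in cohort if not p and w < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


cohort = simulate_cohort()
# "Standard": cutoff estimated from births without the simulated pathology.
standard_cutoff = tenth_percentile([w for w, p in cohort if not p])
# "Reference": cutoff estimated from the whole, unselected cohort.
reference_cutoff = tenth_percentile([w for w, _ in cohort])

sens_std, spec_std = sensitivity_specificity(cohort, standard_cutoff)
sens_ref, spec_ref = sensitivity_specificity(cohort, reference_cutoff)

print(f"standard:  sensitivity={sens_std:.3f}, specificity={spec_std:.3f}")
print(f"reference: sensitivity={sens_ref:.3f}, specificity={spec_ref:.3f}")
```

In this toy setting the standard-derived cutoff flags more of the pathologic cases and also more of the constitutionally small ones, while the reference-derived cutoff does the reverse; the size of the gap depends entirely on the assumed numbers.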

Second, should the curves that are generated be based on the exclusion of a particular set of high-risk conditions and not others? For instance, the standard curves by Hoftiezer et al7 were based on the exclusion of maternal high-risk conditions only and not fetal or neonatal factors. The authors stated that the “final low-risk study population consisted of live-born singleton infants, born to ‘healthy’ mothers after uncomplicated pregnancies and spontaneous onset of labor.” However, their study included preterm births that started at 23 weeks gestation. Are such newborn infants really low-risk or are births at preterm gestations an abnormal phenomenon? It would seem that any growth chart based on preterm
birthweights would have to be “descriptive” in nature; yet, Hoftiezer et al7 have created a standard that appears to be more effective at the identification of growth abnormalities than the Dutch birthweight charts. Another approach, which is mainly relevant to fetal growth curves, could have excluded not only all mothers with complications or high-risk conditions but also all neonates who experienced complications at or after birth. But again, is the prospective clinical application of such curves appropriate when the final outcome is not yet known? In short, the inclusion criteria for a standard curve may reflect selection bias that limits the clinical applicability of the standards across specific populations.

The selection bias that remains inherent in the development of each standard also prevents appropriate comparisons between the standards. This point is emphasized by the recent confusion and debate regarding the adoption into clinical practice of the newly developed growth standards (the INTERGROWTH-21st project,9–11 the WHO Multicentre Growth Reference Study,13 or the NICHD’s Fetal Growth Studies14). Because the scientific community has grappled with the uncertain clinical utility of these standards, researchers have started to compare them with established reference curves. As we have demonstrated earlier, the dramatically different assumptions and the inherent selection bias complicate these comparisons.

For example, INTERGROWTH-21st was applied to all singleton live births in Canada (excluding Quebec) from 2002–2012. With the use of birthweight-for-gestational age, the frequency of small for gestational age (SGA) and the associated neonatal morbidity/mortality rates were determined and compared with the Canadian birthweight reference. The study found important differences in the frequencies of SGA and neonatal morbidity and mortality rates that were associated with specific percentile categories. Although it is possible that the difference reflects real “biologic” differences, it is more likely that these differences are the consequence of varying cohort composition.

Third, the prospective clinical application of standard curves that are created based on exclusion of preexisting high-risk conditions may be problematic because they do not correct for complications that may occur later in gestation. In real life, a proportion of women who are recruited early in pregnancy inevitably will experience (pathologic) complications that are likely to affect fetal size and growth. These pathologies, depending on the gestational age at which they occur or are diagnosed, will have important implications in the identification of newborn infants that are SGA or large for gestational age for pathologic vs constitutional reasons.15 Should these patients be later excluded because of the development of high-risk conditions such as preeclampsia or gestational diabetes mellitus? Thus, the practicability of standard curves should be examined cautiously. It is important to emphasize that, no matter which curves are used, we may never be able to capture all at-risk fetuses with a single assessment of fetal size and that longitudinal assessments of fetal growth that include individualized and customized growth charts may be reasonable alternatives.16–18

Fourth, and perhaps most underappreciated, is the purpose for which a nomogram (standard or reference) is intended. For a given fixed percentile cutoff, the use of standard curves will increase the sensitivity in the classification of fetuses as growth-restricted or newborn infants as SGA; inevitably, the specificity will be decreased. Thus, standard curves may be viewed as “screening” tests (“Clinical implications”).

The pros and cons of using population reference curves
Population reference curves are developed for a defined population and without any subject exclusions. The advantage of the use of population reference curves is that these can be used prospectively to evaluate all patients, low- and high-risk, without the need to know the outcome. The disadvantage of the use of population curves is that these may not be comparable from population to population, given the differences in racial and ethnic composition and other biologic factors that affect growth. Thus, investigators are “forced” to construct and use fetal and birthweight reference curves from their own population.

Population curves based on birthweights have been used widely to identify fetuses with pathologic growth. As a consequence of including preterm deliveries, these references underdiagnose impaired fetal growth at preterm gestations when there are high rates of growth restriction. On the other hand, the ultrasound curve of Hadlock et al,19 which remains the most widely used fetal growth reference in the United States, was based on ongoing pregnancies, thus including normal fetuses. This is the most likely reason that it resembles the standard curves of Hoftiezer et al.7

In contrast to a standard curve, for a given fixed percentile cutoff, the use of population reference curves will increase specificity but may result in lower sensitivity in the detection of fetuses or newborn infants with abnormal growth. Thus, population reference curves may be viewed as “diagnostic” tests20 (“Clinical implications”).

Clinical implications
What are the clinical implications for the adoption of a standard vs a population reference curve? Although the answer may be seemingly trivial, an important, yet underappreciated, implication pertains to an understanding of the aims and scope of the generated curves. Charts from unselected (reference curves) vs selected (standard) populations each have a distinct role with respect to clinical implications. Application of a chart from a standard curve to assess “newborn infant size” invariably will render the newborn infant relatively “large” (ie, higher percentile distribution) in comparison with a reference curve. Although such distinctions may be obvious for well-grown newborn infants, the problem surfaces at the extremes of growth (percentiles at <5 or >95). It is in these percentile ranges where the application of an appropriate chart matters.

To put this in perspective, consider the evaluation of the effectiveness of a new test to identify subjects with a disease.21

If the test is used as a screening tool and applied to the general population of unselected subjects, then the intent of the test is to maximize specificity. In contrast, if the test is applied to a select (high-risk) population, then the test may be classified as a “diagnostic” test. The intent is to maximize the sensitivity.

An understanding of the distinctions between a screening vs a diagnostic test is akin to the development of reference curves in an unselected population (reference curves) vs a selected population of low-risk subjects (standards). If reference curves are developed in an unselected population, then the intent (albeit implicit) is more so to identify normal newborn infants that are not classified as being small or growth-restricted. This approach will yield increased specificity and lower sensitivity of the reference curve. In contrast, if the curve is developed in a selected population of low-risk women, then the intent is to identify most, if not all, fetuses or newborn infants who are small or growth-restricted. This approach will yield increased sensitivity in the identification of small or growth-restricted newborn infants. Of course, the sensitivity and specificity trade-offs can be modified by the choice of different percentile cut-offs for the biometry.

Conclusions
The philosophies that underlie fetal growth standard and reference curves are different, and an understanding of these differences for clinical or research purposes should guide the selection of the appropriate growth curves. For a fixed percentile cutoff, the use of standard curves as screening tests will result in the maximum sensitivity in the identification of fetuses or neonates with pathologic growth abnormalities, but at the expense of more false-positive results. The reverse is true for reference population curves. The appropriate percentile cutoff depends on the balance between sensitivity and false-positive rates, which can be dictated only by the clinical scenarios and the objective of the research. We caution that any comparison of growth standard with population reference curves will remain confounded by differences in cohort compositions, which may explain the differences in test characteristics and the underlying biologic differences.

REFERENCES
1. Crispi F, Miranda J, Gratacos E. Long-term cardiovascular consequences of fetal growth restriction: biology, clinical implications, and opportunities for prevention of adult disease. Am J Obstet Gynecol 2018;218(suppl):S869–79.
2. Grantz KL, Hediger ML, Liu D, Buck Louis GM. Fetal growth standards: the NICHD fetal growth study approach in context with INTERGROWTH-21st and the World Health Organization Multicentre Growth Reference Study. Am J Obstet Gynecol 2018;218(suppl):S641–55.e28.
3. Ohuma EO, Altman DG, for the International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st Project). Design and other methodological considerations for the construction of human fetal and neonatal size and growth charts. Stat Med 2018. https://doi.org/10.1002/sim.8000 [Epub ahead of print].
4. Ohuma EO, Altman DG, for the International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st Project). Statistical methodology for constructing gestational age-related charts using cross-sectional and longitudinal data: The INTERGROWTH-21st project as a case study. Stat Med 2018. https://doi.org/10.1002/sim.8018 [Epub ahead of print].
5. McCowan LM, Figueras F, Anderson NH. Evidence-based national guidelines for the management of suspected fetal growth restriction: comparison, consensus, and controversy. Am J Obstet Gynecol 2018;218(suppl):S855–68.
6. Sovio U, Smith GCS. The effect of customization and use of a fetal growth standard on the association between birthweight percentile and adverse perinatal outcome. Am J Obstet Gynecol 2018;218(suppl):S738–44.
7. Hoftiezer L, Hof MHP, Dijs-Elsinga J, Hogeveen M, Hukkelhoven CWPM, van Lingen RA. From population reference to national standard: new and improved birthweight charts. Am J Obstet Gynecol 2019;220:383.e1–17.
8. Liu S, Metcalfe A, Leon JA, et al. Evaluation of the INTERGROWTH-21st project newborn standard for use in Canada. PLoS One 2017;12:e0172910.
9. Villar J, Altman DG, Purwar M, et al. The objectives, design and implementation of the INTERGROWTH-21st Project. BJOG 2013;120(suppl 2):9–26, v.
10. Villar J, Cheikh Ismail L, Victora CG, et al. International standards for newborn weight, length, and head circumference by gestational age and sex: the Newborn Cross-Sectional Study of the INTERGROWTH-21st Project. Lancet 2014;384:857–68.
11. Papageorghiou AT, Kennedy SH, Salomon LJ, et al. The INTERGROWTH-21st fetal growth standards: toward the global integration of pregnancy and pediatric care. Am J Obstet Gynecol 2018;218(suppl):S630–40.
12. Zeitlin J, Vayssiere C, Ego A, Goffinet F. More validation is needed before widespread adoption of INTERGROWTH-21st fetal growth reference standards in France. Ultrasound Obstet Gynecol 2017;49:547–8.
13. WHO Multicentre Growth Reference Study Group. WHO child growth standards based on length/height, weight and age. Acta Paediatr Suppl 2006;450:76–85.
14. Grewal J, Grantz KL, Zhang C, et al. Cohort profile: NICHD fetal growth studies-singletons and twins. Int J Epidemiol 2018;47:25–25l.
15. Ananth CV, Vintzileos AM. Distinguishing pathological from constitutional small for gestational age births in population-based studies. Early Hum Dev 2009;85:653–8.
16. Francis A, Hugh O, Gardosi J. Customized vs INTERGROWTH-21st standards for the assessment of birthweight and stillbirth risk at term. Am J Obstet Gynecol 2018;218(suppl):S692–9.
17. Gardosi J. Counterpoint. Am J Obstet Gynecol 2019;220:74–82.
18. Gardosi J, Francis A, Turner S, Williams M. Customized growth charts: rationale, validation and clinical benefits. Am J Obstet Gynecol 2018;218(suppl):S609–18.
19. Hadlock FP, Harrist RB, Martinez-Poyer J. In utero analysis of fetal growth: a sonographic weight standard. Radiology 1991;181:129–33.
20. Burkhardt T, Schaffer L, Zimmermann R, Kurmanavicius J. Newborn weight charts underestimate the incidence of low birthweight in preterm infants. Am J Obstet Gynecol 2008;199:139.e1–6.
21. Ananth CV, Smulian JC, Vintzileos AM. Epidemiology of antepartum fetal testing. Curr Opin Obstet Gynecol 1997;9:101–6.
