Statistical Literacy
data
Author:
Abstract
In this article, we report on a typology of the demands of statistical and mathematical products
(StaMPs) embedded in media items related to the COVID-19 (coronavirus) pandemic. The
typology emerged from a content analysis of a large purposive sample of diverse media items
selected from digital news sources based in four countries. The findings encompass nine
categories of StaMPs: (1) descriptive quantitative information, (2) models, predictions, causality
and risk, (3) representations and displays, (4) data quality and strength of evidence, (5)
demographics and comparative thinking, (6) heterogeneity and contextual factors, (7) literacy
and language demands, (8) multiple information sources, and (9) critical demands. We illustrate
these categories via selected media items, substantiate them through relevant research
literature, and point to categories that encompass new or enhanced types of demands. Our
findings offer insights into the rich set of capabilities that citizens (including both young people
and adults) must possess in order to engage these mass media demands, critically analyze
statistical and mathematical information in the media, evaluate the meaning and credibility of
news reports, understand public policies, and make evidence-informed judgments. Our
conclusions point to the need to revise current curricular frameworks and conceptual models
(e.g., regarding statistical and probability literacy, adult numeracy), to better incorporate
notions such as blended knowledge, vagueness, risk, strength of evidence, and criticality.
Furthermore, more attention is needed to the literacy and language demands of media items
involving statistical and mathematical information. Implications for further research and
educational practice are discussed. Although statistical literacy has become a key competence
in today’s data-driven society, it is usually not a part of statistics education. To address this
issue, we propose an innovative concept for a conference-like seminar on the topic of statistical
literacy. This seminar draws attention to the relevance and importance of statistical literacy,
and moreover, students are made aware of the process of science communication and are
introduced to the peer review process for the assessment of scientific papers.
Keywords
Real-world data, statistical literacy, use, misuse, COVID-19 pandemic
I- Introduction
COVID-19 has swept across the world, overwhelming healthcare systems and raising countless
questions about how best to diagnose patients, treat infections, save lives, and contain the
pandemic. In short order, researchers have launched randomized trials to uncover
pharmacologic interventions that hold the promise of preventing or lessening the severity of
the disease. But getting results takes time. And time was a luxury that doctors on the frontlines
of the coronavirus fight could ill afford in the early months of the pandemic. Desperate for
medical insights without delay—and hoping to address other questions not answerable in a
specialized research environment—researchers, pharmaceutical companies, and government
agencies immediately turned to health information captured through insurance claims,
electronic medical records, patient registries, and other so-called “real-world” data sources.
By analyzing trends in COVID-19 datasets, the research community rapidly helped fill in
knowledge gaps around disease symptoms, risk factors, racial disparities, and more. Such
observational methods also hinted at which treatments seemed to be making an impact—and
which were not—all in near real time. But harnessing this type of real-world data is a tricky
business. It requires high-quality data collection and proper methodological considerations.
There are established guidelines on how best to plan, execute, and report observational studies
in a way that ensures the validity and relevance of the evidence gathered. Yet researchers and
clinicians can sometimes neglect those guidelines, especially during a health crisis in which the
rush to publish has spawned some suspect research practices, according to some observers.
The pandemic thus presents an unprecedented opportunity to leverage diverse, real-world data
sources to inform medical and regulatory responses to the public health emergency. “That’s
[the] balance that needs to be maintained,” says Winterstein, who served as president of the
International Society for Pharmacoepidemiology until this past August. “On the one hand, you
need real-world data in order to have complete evidence for decision making. But at the same
token, you have to follow proper epidemiological methods and consider and address the biases
in the data before making any causal inferences.”
To pave the way for a successful inclusion of statistical literacy into statistics education, it is
necessary to present concrete concepts that make the teaching of statistical literacy exciting for
students and lecturers. There are various recent papers on teaching statistical literacy but there
is no concrete teaching concept proposing a specific lecture or seminar on statistical literacy.
For instance, Valentini, Carbonara, and De Candia presented a gradual approach adopted by the
Italian National Statistical Institute (Istat) to promote statistical literacy to university students.
Christensen discussed demands placed on statistics training in the 21st century against the
backdrop of statistical (il)literacy by means of exemplary case studies. The article of Kadijevich
and Stephens deals with data discovery using automated analytics and is directed toward
statistics educators to make them (more) aware of the global context concerning modern
statistical literacy and data science. Watson and Callingham are concerned that statistics and
probability still receive inadequate attention in the classroom, leaving students without the
statistical literacy needed to make sense of the claims being made in the current COVID-19
situation.
In addition to the current literature, we would like to present in this article an innovative
concept for a conference-like seminar on statistical literacy entitled “Statistical Literacy—
Misuse of Statistics and Its Consequences.” The seminar pursues several goals: The participants
elaborate the contents and results of a specific scientific publication and uncover erroneous
conclusions in media reports drawn from these studies. Students not only criticize media
reports, but also write a “press release” about the publication under investigation in the
framework of their term paper. Furthermore, students are introduced to the (double-blind)
peer review process for the assessment of scientific papers by evaluating term papers of other
participants with the help of guidelines and guiding questions provided by the supervisors. As a
part of the final event of the seminar, participants present their topics and hold a subsequent
scientific discussion.
The mass media is the primary vehicle through which most citizens are informed of current
affairs on key social and economic matters (Miller & Krosnick, 2000). Consequently, the critical
interpretation of media reports has been a focus of research within mathematical literacy, adult
numeracy, and statistical literacy. The rapid progress of the COVID-19 pandemic has affected
many facets of life, prompting governments to both monitor and make predictions about
the crisis in order to formulate and enact evidence-based public policy aimed at protecting their
citizens. This has required the analysis of vast amounts of quantitative data and information. As
a result, citizens have been exposed to a wide range of pandemic-related media items
published by various actors in the public and private sphere.
Despite the importance of such capabilities, we argue that there has been limited systematic
research on the actual demands of media items with which adults need to engage. Our goal is
to begin addressing this knowledge gap, since identifying the characteristics of StaMPs and their
associated demands is essential for revising current conceptual frameworks for vital adult
capabilities or basic skills, making educational decisions, and designing effective curricula.
Furthermore, since StaMPs have been used in the media to document and predict the impact of
the pandemic on many facets of life across the globe, our typology may be relevant to the
evaluation of other current or future national and international challenges and crises.
II- Presentation of Answers to the Research Question
In this section, we review research literature related to the role of StaMPs in constructing and
supporting claims made in the news media, and to the associated capabilities required for
informed, critical, and responsible citizenship and for responding to changing government
guidelines aimed at reducing the spread and impact of the pandemic.
At the same time, the capabilities adults need to act as “smart consumers” of the news media
have been the subject of significant theory development, including conceptual frameworks that
describe key competencies, such as adult numeracy and statistical literacy—these encompass
multiple knowledge bases and dispositions (Gal, 2002; Geiger et al., 2015; PIAAC Numeracy
Expert Group, 2009; Tout et al., 2021). In fields such as journalism and political science, interest
in “numbers in the media” has generated suspicion that numbers are often used only as
“rhetorical devices” that appeal to readers’ emotions, prejudices, and fears (e.g., Roeh &
Feldman, 1984). Such devices have been identified as pivotal within media constructions of
social problems (Himmelstein, 2014) and the exercise of political power (Rosa et al., 2016;
Rose, 1991).
There has also been an increasing emphasis in the literature on the need for citizens to
develop critical capabilities. Calls have been made for educational approaches that can promote
an understanding of how mathematics and statistics are used to serve social power structures,
or manipulate and shape public opinion (e.g., Frankenstein, 1989; Geiger et al., 2020;
Weiland, 2017). For example, a recent essay by Skovsmose (2021) describes three
different relationships between mathematics and crises (such as the current pandemic): (1)
modelling or picturing of crises as a way of representing reality; (2) construing a crisis in that
mathematics is integrated into the development of a phenomenon; and (3) formatting a crisis in
a way where mathematics is used to shape the reading and interpretation of a crisis. The third
relationship is particularly relevant to the ways in which StaMPs are used by the media in
shaping readers’ interpretation of developments related to the COVID-19 pandemic.
The COVID-19 crisis has seen StaMPs used in the media (both print and digital) not just for
descriptive or predictive purposes, but also as evidence to justify unprecedented interventions
by governments into citizens’ daily lives, such as the restriction of travel and work options,
social distancing, and the use of digital data for surveillance (Zuboff, 2019). StaMPs have also
been used to explain the impact of the pandemic on the economy (e.g., layoffs, business
closures) and to justify why public resources must be diverted to new causes. Citizens’
compliance with government directions has saved lives but has also had detrimental financial,
health, and personal consequences. If citizens from all walks of life, including from vulnerable
groups (Gal et al., 2020), are to make informed decisions and act responsibly within a pandemic
context, they must be capable of critically evaluating the meanings and credibility of StaMPs
within media items (cf. Glik, 2007; Steen, 1999). They should also be able and willing to engage
with opinions of experts, ask relevant questions, evaluate the quality of given evidence, draw
conclusions, and make decisions (Fischer, 2000; Kollosche & Meyerhöfer, 2021).
In order to develop a comprehensive empirically based typology of the characteristics of
StaMPs in pandemic-related media items, we have adopted a rigorous exploratory approach to
data collection. Our approach extends, or differs from, previous research on the demands of
COVID-19 media in the following ways: (1) content analysis was conducted on a larger sample
of 300 media items, published on the websites of four leading news outlets from different
countries, with heterogeneous demographic, geographical, and economic characteristics and
different patterns of how the pandemic evolved in each country; (2) multiple media outlets
were purposefully selected that represent diverse political orientations (left, right, and
centrist) and appeal to heterogeneous audiences; and (3) the salience of chosen media items
for readers was purposefully enhanced by selecting about half of the items from lead or
section-lead articles (i.e., items that appeared as the headline or a highlighted article on the
outlet’s website for a specific day) and the others from key sections related to news, health,
COVID-19, and business.
Consistent with the explorative nature of our research question, our approach to the content
analysis of selected media items was reductive, by way of inductive category formation
(Mayring, 2014). Krippendorff (2004) suggests that this method is appropriate when different
authors write texts about a common theme—as is the case with articles about pandemic-
related issues. The process of induction was initiated by a first pass over a subsample of media
items, consistent with an open coding technique drawn from grounded theory (Strauss &
Corbin, 1998). This pass was aimed at identifying tentative categories of mathematical and
statistical entities or ideas within each media item. The approach was informed by, but not
limited to, ideas drawn from conceptual frameworks that describe foundational elements of
statistical literacy, civic statistics, and adult numeracy. Emergent broad categories of StaMPs
were identified and defined, including several ideas not included in such prior frameworks.
Then, additional items in the sample were examined, using a process involving constant
comparison and ongoing discussion between the researchers, in order to refine the emerging
categories. Consistent with Mayring’s (2015) approach, this process was re-applied, leading to
further refinement (reducing redundancy, overlap, or lack of clarity), until all categories and
their definitions stabilized.
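
The coding step described above can be pictured with a small sketch. It assumes each media item has already been assigned one or more of the nine StaMPs categories listed in the abstract and simply tallies how often each category occurs. The item identifiers and code assignments are hypothetical placeholders, not data from the study, and the function name `tally_categories` is our own.

```python
from collections import Counter

# The nine StaMPs categories, numbered as in the abstract.
CATEGORIES = {
    1: "descriptive quantitative information",
    2: "models, predictions, causality and risk",
    3: "representations and displays",
    4: "data quality and strength of evidence",
    5: "demographics and comparative thinking",
    6: "heterogeneity and contextual factors",
    7: "literacy and language demands",
    8: "multiple information sources",
    9: "critical demands",
}

# Hypothetical coding result: item id -> set of category numbers.
# These assignments are invented for illustration only.
coded_items = {
    "item_001": {1, 3},
    "item_002": {1, 2, 4},
    "item_003": {3, 7, 9},
}


def tally_categories(coded):
    """Count how many items were assigned each category."""
    counts = Counter()
    for item_id, cats in coded.items():
        unknown = set(cats) - set(CATEGORIES)
        if unknown:
            raise ValueError(f"{item_id} has unknown categories: {sorted(unknown)}")
        counts.update(cats)
    return counts


if __name__ == "__main__":
    counts = tally_categories(coded_items)
    for number, label in CATEGORIES.items():
        print(f"({number}) {label}: {counts[number]}")
```

Because an item may carry several categories at once, the counts can sum to more than the number of items. Re-running such a tally after each coding pass is one simple way to see whether the category scheme is still changing.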
The Hydroxychloroquine Affair
The good, bad, and ugly of real-world data can be seen in stark relief when considering the
story of hydroxychloroquine, an antimalarial drug once touted by US President Donald Trump
for its supposed antiviral effects. Randomized trials later showed that hydroxychloroquine
offers little if any benefit to coronavirus patients and could possibly be harmful.
The good: After initial reports of hydroxychloroquine’s promise, researchers working within
both public and private health systems dug into their hospital records and found that the drug
failed to reduce the risk of ventilation or death.
The bad: Unfortunately, this relatively quick “correction” within the scientific community was
not quick enough. Infectious disease specialists had already started widely prescribing the drug.
The FDA also issued an emergency use authorization (EUA)—and all because of little more than
a preliminary observational study involving just 36 patients. The agency later revoked the
emergency waiver and doctors stopped administering hydroxychloroquine, but not before the
government had stockpiled more than 60 million doses and tens of thousands of patients
needlessly received the useless medicine. Unfounded assertions from politicians about the
drug’s effectiveness didn’t help matters.
The ugly: A little-known data analytics firm called Surgisphere Corporation in Palatine, IL,
reported a chilling mortality risk among patients with COVID-19 who took hydroxychloroquine
—a finding that almost derailed randomized testing of the drug. Articles were quickly retracted
and trials restarted after critics challenged the study’s methods and the legitimacy of the
company’s dataset, purportedly amassed from medical records of nearly 100,000 infected
patients treated in 671 hospitals worldwide. It is still unclear whether, as some suspect, any
authors engaged in deliberate fraud—a deception that experts say would be easier with
hospital record collections than with prospective trial data amassed under the oversight of
monitoring committees and review boards. Either way, experts say the incident damaged public
trust and delayed scientific progress.
It’s not that observational studies and real-world data analytics don’t have their place in the
pandemic response. When it comes to drug treatments, these methods can, for example, help
researchers better understand how risk factors such as age, sex, underlying health conditions,
and medication use drive outcomes of coronavirus infection. And even if limited in their
capacity to definitively establish the efficacy of therapeutic interventions, they can evaluate
safety or help confirm the results of randomized trials regarding toxicity and clinical benefit in
daily medical practice or in different patient populations.
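
One reason a study of just 36 patients is weak evidence can be shown with a short calculation. The sketch below computes a 95% Wilson score interval for a proportion, first for a sample of 36 and then for a sample one hundred times larger. The observed counts (18 of 36 and 1800 of 3600) are hypothetical values chosen for illustration and are not taken from any hydroxychloroquine study. Note also that a confidence interval captures sampling uncertainty only; it does nothing about the confounding and selection biases that affect observational data.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts: the same observed proportion at two sample sizes.
    for successes, n in [(18, 36), (1800, 3600)]:
        low, high = wilson_interval(successes, n)
        print(f"n={n}: observed {successes / n:.2f}, 95% interval {low:.3f} to {high:.3f}")
```

With the same observed proportion, the interval for the small sample is many times wider than for the large one, which is why early small-sample findings call for cautious wording in media reports.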
III- Conclusion and Recommendation
Our findings about the broad and rich demands of StaMPs in the pandemic media have
significant implications for teaching and learning in relation to developing citizens’ capacity to
engage with numerate environments and practice. These implications are relevant to all
contexts of formal and informal education across a diverse range of information needs.
The outcomes of this study also have implications for future research and educational agendas,
such as the following:
1. Refining our typology of categories of StaMPs that citizens encounter in the media,
including their demands and characteristics. This requires the extension of the current
categories to accommodate the demands of media items on other significant civic
topics, such as global warming or equity, and using additional and more diverse sources
of media outlets. Furthermore, it is important for future studies to go beyond analysis of
text (print or digital) and examine the nature of StaMPs communicated via gestures and
oral (spoken) modes of expression, such as by experts or presenters on TV news, radio
programs, or podcasts.
2. Promoting educational programs aimed at developing the capability of citizens to use
blended knowledge and critically evaluate both the meaning and strength of evidence
that are used to support the claims and arguments of experts and other commentators
in the media. An example is developing the ability to evaluate whether evidence from
mathematical and statistical modelling is used in a credible way in the media, when
predicting the progression of a crisis.
3. Generating educational programs that go beyond simple and abstract notions of
probability, and include comprehension of levels of vagueness, uncertainty and risk as
they are communicated in the media.
4. Examining citizens’ actual practices and behaviors when engaging with and evaluating the
different categories of statistical and mathematical products, i.e., StaMPs, in media
items that report on key civic topics (e.g., a pandemic, global warming) and on related
policy decisions.
The above implications illustrate key foci for research and education that are of relevance to
both current circumstances and future crises. Ongoing work on the topics outlined in this article
is essential to enable mathematics and statistics education to extend beyond existing
theoretical models and serve the expanding needs of diverse societies around the globe.
In addition to criticism of media reports and/or statistical methods used, students should be
able to write a “press release” for their own term paper. In this way, students become
sensitized to the important process of science communication. The press release should
contain a concise presentation of the main contents and results of the underlying publication. It
should arouse the curiosity of the potential reader and make the content of the term paper
particularly interesting. The press release is aimed at a broad, interested, but not statistically
literate audience, and should have the length of 600–700 characters including spaces. Particular
emphasis is to be placed on the simultaneous consideration of statistical correctness and
interest-arousing formulation.
The press release should be detached from the sections of the term paper, and should be
placed between the title page and the table of contents. It is suggested that students receive all
press releases in collected form before the start of the final event, to be prepared for the
presentations.
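
The length requirement for the press release (600 to 700 characters including spaces) is easy to check automatically. The following is a minimal sketch; the function name `press_release_length_ok` and the placeholder draft are our own illustrations, not part of the seminar materials. It counts every character, so line breaks in the draft count as well.

```python
def press_release_length_ok(text, min_chars=600, max_chars=700):
    """Return True if the text length, counting spaces, is within the limits."""
    return min_chars <= len(text) <= max_chars


if __name__ == "__main__":
    # Hypothetical placeholder draft: 130 repetitions of a 5-character string.
    draft = "word " * 130
    print(len(draft), press_release_length_ok(draft))
```

A check like this covers only the formal length limit; statistical correctness and readability still have to be judged by the reviewers.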
Statistical literacy requires not only a series of basic skills (such as reading, comprehension, and
communication) but also higher order cognitive skills of interpretation, prediction, and critical
thinking. The ability to interpret statistics critically and to refute claims is not innate; these skills
need to be taught adequately if students are to become informed individuals. To address this
issue and to contribute to the improvement of statistics education, we have presented an
innovative concept for a seminar on statistical literacy in this article. This seminar not only
creates an awareness of statistical literacy among the participants, but is also versatile and
exciting for participants and lecturers because of its manifold objectives including performing
detective work (revealing false conclusions in media reports) as well as writing of a press
release and a review report. Since the only prerequisite of participation is basic knowledge of
descriptive and inferential statistics, the proposed seminar can be conducted cross- and
extracurricularly.
We successfully carried out a seminar according to this concept in the summer term of 2020.
Accordingly, we plan to hold one seminar per year, in the summer term. Moreover, the
upcoming seminar will focus even more on current problems and is planned to be dedicated to
big data literacy.
Finally, with this article, we hope to have provided a concept for the direct incorporation of
statistical literacy into statistics education that will set a standard. We are very interested in an
exchange with and feedback from other lecturers on our seminar concept.
IV- References
1. Berger, M. L., Sox, H., Willke, R. J., Brixner, D. L., Eichler, H. G., Goettsch, W., ... &
Mullins, C. D. (2017). Good practices for real-world data studies of treatment and/or
comparative effectiveness: recommendations from the joint ISPOR-ISPE Special Task
Force on real-world evidence in health care decision making. Value in Health, 20(8),
1003–1008.
2. Pottegård, A., Kurz, X., Moore, N., Christiansen, C. F., & Klungel, O. (2020).
Considerations for pharmacoepidemiological analyses in the SARS-CoV-2 pandemic.
Pharmacoepidemiology and Drug Safety, 29(8), 825–831.
3. Hanel, P. H. P., Maio, G. R., & Manstead, A. S. R. (2019). A new way to look at the data:
Similarities between groups of people are large and important. Journal of Personality
and Social Psychology, 116(4), 541–562.
4. Kahneman, D. (2013). Thinking, fast and slow. New York: Farrar, Straus and Giroux.
5. Moore, D., & Notz, W. I. (2006). Statistics: concepts and controversies (6th ed.). New
York: W. H. Freeman.
6. Neylon, C. (2009). Scientists lead the push for open data sharing. Research Information.
Europa Science 41(8), 22–23.
7. Stuart, A., & Ord, N. (2009). Research design and statistical analysis (2nd ed.).
London: Lawrence Erlbaum.