Framework for Instruction and Assessment on Elementary Inferential Statistics Thinking
Angustias VALLECILLOS
University of Granada. Spain
[email protected]
Antonio MORENO
I.E.S. Trevenque. Granada. Spain
[email protected]
ABSTRACT
The main objective of this paper is to describe a framework to characterize and assess the learning of
elementary statistical inference. The key constructs of the framework are: populations and samples and their
relationships; inferential process; sample sizes; and sampling types and biases.
To refine and validate this scheme we collected data from a sample of 49 secondary students
using a questionnaire with 12 items in three different contexts: concrete, narrative and numeric. Theoretical
analysis of the results obtained in this first research phase has allowed us to establish the key constructs
described below and to determine levels within them. It has also allowed us to determine the students’
conceptions of the inference process and their perceptions of possible sampling biases and their
sources.
The framework is a theoretical contribution to knowledge of the inferential statistical thinking domain
and to the planning of teaching in the area.
2nd International Conference on the Teaching of Mathematics
Crete, Greece, July 1-6, 2002
1. Introduction
One of the characteristic features of today's society is the enormous technological
development that has been applied to the social and economic improvement of its citizens. In this
technological society information and communication play key roles, and education should provide
citizens with the elements they need to develop within it. Access to information, the use and
analysis of data, informed decision-making in uncertain situations, and the ability to understand and
criticize the information provided by the media all form part of the new educational needs of
citizens in today's world. In response to these needs, educational systems in many countries have
introduced curricular reforms that affect statistical education at all teaching levels, for example
MEC (1990), Junta de Andalucía (1992; 1994; 1997) and NCTM (2000). One of the novelties of the
reforms in Spain has been the introduction of statistical inference in the curricula for compulsory
secondary education (ESO, 12-16 years old) and the Bachillerato (16-18 years old). The introduction
of more statistical content, at increasingly elementary teaching levels, creates a greater need for
research on the learning of this content and its development throughout the student's schooling
years. Although some research results are already available in the fields of data analysis and
probability, this field can still be considered mainly emergent and developing (Shaughnessy, 1992;
Mokros and Russell, 1995; Gal and Garfield, 1997; Batanero et al., 1994; Jones et al., 2000).
In the field of statistical inference research is even scarcer (Watson and Moritz, 2000; Jacobs,
1996; Moreno and Vallecillos, 2001; Vallecillos, 1998; in print). Jones et al. (2000) propose a
framework to characterize children’s statistical thinking based on the cognitive development model
described by Biggs and Collis (1991). In our work we have tried to develop a similar framework for
statistical inference thinking, so that a general framework applicable to elementary descriptive and
inferential statistics is eventually available. To do so, drawing on a review of previous research
and on our own research experience on the topic, we built an initial theoretical framework to
evaluate the learning of statistical inference by secondary education students. We then elaborated a
questionnaire, which 49 students of this level completed, and analyzed the results obtained. Finally,
by incorporating the information obtained, we refined the initial framework and drew the conclusions
of this phase of the study.
This theoretical framework, developed to evaluate the learning of basic statistical inference,
has been validated with secondary level students, but it can also be used to plan teaching of the
topic and to evaluate student learning in introductory courses at university level.
2. Theoretical Framework
Teachers need good knowledge of how students understand statistical concepts and how
they engage in solving problems. Students exhibit statistical thinking across the different school
levels, and this thinking develops over time. The framework is therefore situated in a general
cognitive development model (Biggs and Collis, 1982; 1991). These authors describe three levels of
observed learning outcome:
1. Unistructural responses, those taking into consideration only one aspect of the concept or
task considered;
2. Multistructural responses, those in which several aspects of the concept or task are
considered but not all, and
3. Relational responses, those in which all aspects are considered and integrated, exhibiting an
integrated understanding and meaningful learning.
Situated in this general cognitive model (Biggs and Collis, 1982; 1991), Jones et al. (2000)
formulate a framework to characterize children’s statistical thinking. They define four constructs:
describing, organizing, representing, and analyzing and interpreting data. Within each of these
constructs they establish four thinking levels representing a continuum from idiosyncratic to analytic
reasoning. According to the authors, the results of their study confirm that children’s statistical
thinking can be described according to the framework. Our initial framework for inferential
statistical thinking is also situated in the general cognitive model (Biggs and Collis, 1982; 1991)
and, like the framework of Jones et al. (2000), has four constructs with four thinking levels within
each one. However, we consider two related aspects in determining the constructs and levels of the
framework: the statistical content and the results of the questionnaire completed by the students.
In the initial framework we determined the constructs and levels on the basis of the statistical
content; afterwards we also took the students’ responses to the questionnaire into account in order
to establish the constructs and levels of the inferential statistical framework. We have established
four constructs, populations and samples and their relationships (PS), inferential process (IP),
sample sizes (SS) and sampling types and biases (ST), and four thinking levels in each one.
3. Method
3.1. Aims of this research
The objectives of this paper are mainly three: a) to develop an initial framework to characterize
and assess the learning of basic statistical inference; b) to elaborate a questionnaire to assess
statistical inference learning at secondary level; c) to test the framework, in pursuit of the first
objective, and to validate and refine it with the questionnaire results.
3.2. The constructs
We seek first to describe and fix the key elements and concepts of statistical inference for the
basic training of students at introductory teaching levels. To do so we use the term “construct”,
which is used in Psychology to describe complex phenomena that are difficult to define, such as
personality or motivation. For us each “construct” represents a category of concepts gathered under
a single heading under which they can be described. We believe that the description of samples, the
populations from which they have been drawn and their relationships, questions related to sample
size, selection methods and the possible sources of bias in the selection of samples are important
conceptual nuclei that underlie learning in statistical inference and can be described in these
terms. All these elements have previously been recognized as such by teachers, researchers or
curricular documents. Our proposal includes a novelty: we have included as a separate construct the
one we have called “Inferential process”, because we find that it deserves special mention. Indeed,
students sometimes do not distinguish well between population and sample and are therefore not aware
that the conclusions obtained from the study of a sample are not, in themselves, those we need. It is
necessary to make them aware that a generalization is carried out under the conditions of the study
and that it is therefore subject to certain limitations and to the possibility of error. At other
times students do not accept the possibility of generalization and trust only in carrying out a
census, so it is necessary to make them reflect on the impossibility of a census in certain
situations, such as destructive tests or studies with prohibitive time or economic costs. We
describe the key constructs below:
A) Populations and samples and their relationships
We try to understand the students' ideas about the concepts of sample and population and the
relationship between them. These concepts are used intuitively in many settings of daily life
outside school. Concepts such as variability and sample representativeness bear on many aspects of
social life. Kahneman et al. (1982) have investigated these aspects thoroughly and find that people
reason using heuristics that lead them to erroneous conclusions most of the time. The presence of
thinking heuristics has also been detected among secondary level students (Rubin et al., 1991;
Moreno and Vallecillos, 2001). We are also interested in discovering whether, and how, the
‘part-whole’ scheme used in teaching numerical content such as fractions and rational numbers is
also used in this context.
B) Inferential process
We try to understand how students conceive the process that allows them to describe a
population on the basis of the information obtained from observing one of its samples. To do so we
have determined the students’ conceptions (Artigue, 1990) of the process, understood as theoretical
models that presumably guide the students’ answers.
C) Sample sizes
For students to learn the sample concept well, two essential aspects must be kept in mind:
making students sensitive to the importance of sample size, and having them take it into account
when judgements are made or decisions are taken on the basis of samples. The work of Kahneman and
his colleagues identified the “law of small numbers” as a very widely held belief, even among people
with statistical training. This belief is part of the representativeness heuristic, which leads
people to believe that samples, even very small ones, always reproduce the characteristics of the
population from which they come, showing insensitivity to sample size (Kahneman et al., 1982).
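The effect of sample size on sampling variability can be illustrated with a short simulation. The
sketch below is only an illustration and not part of the study: it assumes a hypothetical bag of 100
balls with 60 red and 40 green (the composition, the sample sizes and the function name
`sample_proportions` are assumptions made here) and compares how much the proportion of red balls
varies across repeated samples of different sizes.

```python
import random
import statistics


def sample_proportions(population, sample_size, n_samples, rng):
    """Draw n_samples samples without replacement; return the share of 'red' in each."""
    proportions = []
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)
        proportions.append(sample.count("red") / sample_size)
    return proportions


# Hypothetical population (assumption): 100 balls, 60 red and 40 green.
population = ["red"] * 60 + ["green"] * 40
rng = random.Random(0)  # fixed seed so the run is reproducible

spread_by_size = {}
for size in (5, 25, 50):
    proportions = sample_proportions(population, size, 2000, rng)
    # Standard deviation of the sample proportions: how far samples stray from 0.60.
    spread_by_size[size] = statistics.pstdev(proportions)
    print(f"n={size:2d}  mean={statistics.mean(proportions):.3f}  sd={spread_by_size[size]:.3f}")
```

Under these assumptions the spread of the sample proportions shrinks as the sample size grows,
which is the behaviour that the “law of small numbers” overlooks.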
D) Sampling types and biases
Sampling based on randomization of the statistical units provides samples that are
representative of the populations under study. In this section we consider two aspects that are
basic to good teaching of the topic: making students sensitive to the importance of randomization in
selecting samples, and to the presence of bias in any other case, together with the harmful effects
of using biased samples.
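The contrast between a random and a biased sample can be sketched with a short simulation. The
example below is a minimal illustration under invented assumptions: a hypothetical population of
1000 units in two groups (labelled `young` and `old`) whose answers differ, a simple random sample
of 200 units, and a convenience sample of 200 units taken from one group only.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible

# Hypothetical population (assumption): 600 young units, 30% of whom answer yes (1),
# and 400 old units, 80% of whom answer yes.
population = (
    [("young", 1)] * 180 + [("young", 0)] * 420
    + [("old", 1)] * 320 + [("old", 0)] * 80
)
true_prop = sum(answer for _, answer in population) / len(population)


def proportion_yes(sample):
    """Share of units in the sample that answered yes."""
    return sum(answer for _, answer in sample) / len(sample)


# Simple random sample: every unit has the same chance of selection.
random_sample = rng.sample(population, 200)

# Biased convenience sample: only units from the 'old' group can be selected.
old_only = [unit for unit in population if unit[0] == "old"]
biased_sample = rng.sample(old_only, 200)

random_error = abs(proportion_yes(random_sample) - true_prop)
biased_error = abs(proportion_yes(biased_sample) - true_prop)
print(f"true={true_prop:.2f}  random error={random_error:.3f}  biased error={biased_error:.3f}")
```

In this sketch the biased sample misses the population proportion by a wide margin however many
units it contains, because the error comes from the selection method and not from the sample size.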
3.3. The inferential statistical thinking framework
The framework is described in Table 1.
3.4. Participants
The participants are 49 secondary students from two Spanish high schools, distributed across two
different courses: 30 students from 3º de ESO (14-15 years old) with no previous training in
statistics and 19 from COU (17-18 years old). COU is the last course at secondary level, and these
students had some statistical knowledge.
3.5. Questionnaire
The questionnaire consisted of two parts, each with 12 questions on elementary inference
concepts. Items are presented in three contexts: concrete, narrative and numeric. For the reader's
illustration we include two items, one from Part I and one from Part II of the questionnaire. The
complete version of the questionnaire may be obtained from the authors on request.
Item I.1. We have a bag with 100 red and green balls. We want to
study the number of balls of each color. To do that we take 25 balls from the bag
and we observe that 14 of them are red and 11 are green. Write:
a) The set of objects we are studying:
b) The sample observed:
Item II.3. The town council is starting a campaign to explain to citizens
what they may do when they need to get rid of old furniture. They want to know if
the instructions have been clear and understandable. The population of the city is
300,000 people and so they decide to ask 2000 adult citizens for their opinion.
They ask people in small and large neighbourhoods, some male and some female, some old
and some young, some who live in flats and some who live in family houses, and so
on. They think that they have a varied group of people. 73% of these
people say that the given instructions are clear and 27% say they are not.
What can you say to the town council about the percentage of adults in the
whole city who think that the given instructions are clear?:
a) 50% because probably half of the people think the instructions are
clear and the other half think they are not.
b) 73% because the adults asked gave a general idea about the results as
if the whole population were asked.
c) I can’t say anything because the result of the inquiry could have been
anything.
d) I can’t say anything because I can’t ask all the adults in the city.
e) Were ................................. because.....................................
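As a rough illustration of the reasoning behind option b), and not as part of the questionnaire,
the sketch below computes an approximate 95% margin of error for the observed 73% among 2000
respondents. It assumes, hypothetically, that the respondents behave like a simple random sample,
which the item itself does not guarantee, and it uses the usual normal approximation for a
proportion; the variable names are chosen here for illustration.

```python
import math

n = 2000        # number of adults asked (from the item)
p_hat = 0.73    # observed proportion who found the instructions clear (from the item)
z = 1.96        # approximate 95% quantile of the standard normal distribution

# Standard error of a sample proportion under simple random sampling (assumption).
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"margin of error: {margin:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 71% to 75%, so an answer close to 73% is
reasonable while still being subject to error.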
3.6. Procedure
The third-year ESO students completed Part I of the questionnaire in one 60-minute session and
Part II in another 60-minute session. For some questions in Part I the researcher intervened to
handle the concrete material required or to explain to the students what they were being asked. The
students then completed the questionnaire individually. The COU students used a single 60-minute
session to complete both parts of the questionnaire individually.
3.7. Results
A) Populations and samples and their relationships (PS): many students did not correctly identify
the sample and the population studied, although the success percentages on the corresponding items
differ notably across contexts. The older group (COU) obtained better overall results than the ESO
group, and also in the numerical context. About two thirds of the ESO students could identify
neither population nor sample, while in COU only a fifth could not do so.
B) Inferential process (IP): we have grouped the students’ responses under four headings,
each characterizing a particular conception. They are summarised below:
C1) Correct conception: the inference process is governed by chance and does not allow the
characteristics of the population to be determined precisely on the basis of the information
obtained from one of its samples.
C2) Identity conception: the inference process allows us to describe the population as having
characteristics identical to those of one of its samples.
C3) Previous conception: the population has the characteristics described by previous ideas and not
those observed in the extracted sample.
C4) Deterministic conception: the population can only be described by carrying out a census and not
by studying samples extracted from it.
In this category we found very large differences between contexts: not all conceptions
appear in every context, e.g., the previous conception does not appear in the narrative context and
the deterministic conception appears only in the narrative context.
C) Sample sizes (SS): in the younger group (ESO) about 50% of the students do not take sample
size into consideration; in the COU group the success percentage is a little better, but only
a quarter of all the students relate sample size to the estimation of the population characteristic.
D) Sampling types and biases (ST): most of the students recognize the different sampling types,
and most of the students in the older group also recognize the different kinds of random sampling,
e.g., simple versus stratified sampling.
4. Conclusions
In this paper we have presented an initial framework of inferential statistical thinking for
instruction and for assessing secondary students' learning of the topic. We have summarized the four
constructs, and the four levels within each one, that make up the scheme. We have tested it with
secondary students from two different courses in Spain. With the results of the questionnaire they
completed we have revised and completed the inferential statistical framework described above. As a
first general conclusion, we have encountered several difficulties, mainly in two areas, one
theoretical and one didactic. In the theoretical area, the difficulty is to determine the
theoretical aspects, concepts or constructs that are basic and essential, and that should therefore
be included in any general elementary curriculum of statistical education for all citizens, so that
people become aware and are able to take informed decisions. In the didactic area, once the
appropriate curriculum content has been determined, the question is how students can achieve the
best results. The inferential statistical framework is our current personal contribution to
these problems. This research is now completing its instructional strand, developing classroom
resources for testing the framework and for revising it globally.
Acknowledgement: To the Research Projects PB97-0827 and BS02000-1507, financed by the
Ministry of Science and Technology, Madrid, Spain.
REFERENCES
Artigue, M. (1990). Épistémologie et Didactique. Recherches en Didactique des Mathématiques, 10(2-3),
241-286.
Batanero, C.; Godino, J. D.; Vallecillos, A.; Green, D. R. and Holmes, P. (1994). Errors and difficulties in
understanding elementary statistical concepts. International Journal of Mathematical Education in Science and
Technology, 25(4), 527-547.
Biggs, J. B. and Collis, K. F. (1982). Evaluating the quality of learning: The SOLO taxonomy. New York:
Academic Press.
Biggs, J. B. and Collis, K. F. (1991). Multimodal learning and intelligent behavior. In H. Rowe (Ed.):
Intelligence: Reconceptualization and measurement, (pp. 57-76). Hillsdale, NJ: Lawrence Erlbaum
Associates Inc.
Gal, I. and Garfield, J. B. (1997). The Assessment Challenge in Statistics Education. Amsterdam: IOS Press.
Jacobs, V. (1996). Children’s informal interpretation and evaluation of statistical sampling in surveys. Ph. D.
University of Wisconsin-Madison.
Jones, G. A.; Thornton, C. A.; Langrall, C. W.; Mooney, E. S.; Perry, B. and Putt, I. J. (2000). A Framework for
Characterizing Children’s Statistical Thinking. Mathematical Thinking and Learning, 30(5), 269-309.
Junta de Andalucía (1992). Decreto 106/1992 de 9 de Junio (BOJA del 20) por el que se establecen las
enseñanzas correspondientes a la ESO en Andalucía.
Junta de Andalucía (1994). Decreto 126/1994 de 7 de Junio (BOJA del 26 de Julio) por el que se establecen las
enseñanzas correspondientes al Bachillerato en Andalucía.
Junta de Andalucía (1997). Currículo de Bachillerato en Andalucía.
Kahneman, D.; Slovic, P. and Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases.
Cambridge: Cambridge University Press.
Mokros, J. and Russell, S. J. (1995). Children’s concepts of average and representativeness. Journal for
Research in Mathematics Education, 26, 2-39.
Moreno, A. and Vallecillos, A. (2001). Exploratory Study on Inferential Concepts Learning in Secondary Level
in Spain. In M. van den Heuvel-Panhuizen (Ed.): Proceedings of the 25th Conference of the International Group
for the Psychology of Mathematics Education (PME), p. 343. The Netherlands: Freudenthal Institute
and Utrecht University.
MEC (1990). Ley Orgánica 1/1990 de Ordenación General del Sistema Educativo (LOGSE, BOE de 4 de Octubre).
Madrid.
NCTM (2000). Principles and Standards for School Mathematics. Reston, VA: NCTM.
Rubin, A.; Bruce, B. and Tenney, Y. (1990). Learning About Sampling: Trouble at the Core of Statistics.
Proceedings of the ICOTS III. University of Otago, Dunedin, New Zealand.
Shaughnessy, J. M. (1992). Research in Probability and Statistics: Reflections and Directions. In D. Grouws
(Ed.): Handbook on Research in Mathematics Education, pp. 465-494. London: Macmillan Publishing
Co.
Vallecillos, A. (1998). Research and Teaching of Statistical Inference. Proceedings of the First International
Conference on the Teaching of Mathematics, pp. 296-298. Boston: J. Wiley & Sons, Inc.
Vallecillos, A. (in print). Some Empirical Evidences concerning the Difficulties and Misconceptions that Occur
in the Learning of the Logic of Hypothesis Testing. International Statistical Review.
Watson, J. M. and Moritz, J. B. (2000). Developing Concepts of Sampling. Journal for Research in
Mathematics Education [Online], 31(1), 44-70.