
Inferential statistics and the graphics calculator

1998, Proceedings of the Third Asian Technology …


Barry Kissane
The Institute of Education
Murdoch University, Australia

ABSTRACT: Recent teaching and learning of elementary statistics have been influenced by the use of statistical packages on microcomputers, which have permitted data storage and flexible data analysis. Scientific calculators have routinely provided statistical capabilities for some time, but generally they have been too limited to be used as alternatives to computer statistics packages, at least at the undergraduate level. In this paper, graphics calculators are regarded as devices that combine some of the advantages of each of these two kinds of technology for early work in statistics. The first generation of graphics calculators, while providing significant data analysis opportunities, were insufficient for the needs of students of early undergraduate statistics, since the important aspects of inferential statistics were not accessible. Later models include capabilities dealing with hypothesis testing, the construction of confidence intervals and the tabulation of probability distributions. It is suggested that these meet most of the statistical needs of introductory courses. The small size and cost of graphics calculators increase the prospect that individual students will have ready access to them at all times, with significant curriculum implications identified. Programming capabilities of graphics calculators permit student explorations dealing with important concepts in statistical inference to be conducted, some examples of which are described in the paper.

Introduction

In the space of one generation, the practice of statistics has changed considerably as a consequence of developments in technology. The computer has had three significant effects on statistical practice. Firstly, data storage and statistical computations are now routinely handled by computers rather than more tedious and error-prone human equivalents.
Secondly, the processing capabilities of the computer have given rise to new data analysis opportunities, most notably exploratory data analysis, first popularised by John Tukey in the 1970's. The third change wrought by computers, which allow for visual displays of data to be readily made, is to provide graphical as well as numerical representations of data. Whereas computers with suitable software for statistical purposes were at first only available on mainframes, located remotely from users, it is now quite common for statistical analyses to be conducted on desktop microcomputers equipped with powerful statistical software. Changes of such magnitude can reasonably be expected to exert considerable influence on the undergraduate statistics curriculum. It is clearly important that students learn statistics in realistic ways, so that they become adept at making use of suitable technology, such as computer software packages. Employers of graduates these days naturally expect that students will have acquired some competence with suitable combinations of hardware and software by the time they enter the job market. To bring this goal about, students need to be taught to use particular combinations of software and hardware efficiently and some space for this new need must be created in the curriculum. As well as curriculum space, physical equipment must also be provided in order for students to have reasonable access to appropriate computers and software; this is often much more difficult for institutions to provide adequately, since much elementary statistics teaching involves large undergraduate classes. One solution to this problem is to insist that students purchase their own computer and suitable statistical software, but there are many students for whom this is not yet a reasonable expectation, particularly students in developing countries or from less affluent families in developed countries. 
Questions of access to technology are clearly of critical importance since, without reasonable access for all students, it is unreasonable and unrealistic to demand student competence. In many circumstances, access to technology is much easier to provide with graphics calculators than with microcomputers, as argued elsewhere (e.g., Kissane, Bradley & Kemp 1994; Kissane, 1995). Even in relatively affluent countries such as Australia and the United States, student access to computers is often rather meagre; the situation is clearly much worse in most parts of most developing countries and in many parts of Asia. The case for graphics calculators as an alternative to computers seems particularly strong in economic situations where resources for teaching and learning mathematics are limited. In addition, it seems unwise to consider appropriate hardware and software for statistics without acknowledging the need to consider technological needs of other aspects of the mathematics curriculum, as Biehler (1991) noted:

"A major problem with mathematical tools is that each one requires specific knowledge, for both students and teachers. As time in school is very limited, more complicated tools for probability education cannot be chosen without taking into account which tools should be used in the entire mathematics curriculum. This is why the usefulness outside probability is important." (p.181)

With reasonable access to technology, we might expect that the teaching of elementary inferential statistics could focus on statistical concepts and thinking more than was previously possible, taking advantage of the potential of technology to take care of most of the relevant routine computation. As well as appropriate content concerned with using the software efficiently, the modern statistics curriculum must also develop the underlying statistical ideas so that students understand them well enough to be able to critically interpret the results of data analyses.

The evolution of the statistical calculator

Early models of scientific calculators included some statistical capabilities dealing with univariate data or, for more expensive models, bivariate data. These calculators typically contained commands for calculating means, standard deviations (both biased and unbiased estimates) as well as summary data for using raw score formulas for computing statistics of these kinds. In the case of calculators equipped for handling bivariate data, least squares linear regression coefficients and Pearson product-moment correlation coefficients were also calculated automatically. Some calculators included commands for accessing cumulative normal distribution tables, obviating the need for students to refer to tables of values.

The major advantage of scientific calculators was associated with arithmetic: students permitted to use such a calculator did not have to engage in a great deal of tedious arithmetic computation in order to calculate statistics of interest. The computational easing also meant that data could be analysed directly, rather than having to be rounded to produce more convenient numbers. Actual measurements could be used, rather than integer approximations for computational convenience, and it was no longer necessary to group data into convenient intervals for computational ease. In theory, although not necessarily in practice, students could analyse their data directly, given access to such a calculator. For the purpose of conducting inferential tests, the calculator provided most of the necessary raw materials for computing statistics (such as a t-statistic), and required of users only that they manipulate the calculator results appropriately.

The main limitation of these calculators concerned data storage. Because of memory limitations, scientific calculators do not store individual data points as they are entered.
Rather, appropriate raw score summary statistics are obtained as data are entered, and these are used to compute statistics on request. A consequence of this is that students do not know for sure what data have been entered, and most key punching errors are not detectable. Similarly, although mechanisms for deleting data are available, it is difficult to undertake data editing and not possible to undertake data transformations. As well as storage limitations, scientific calculators did not allow flexibility of data analysis; users were restricted to a small range of numerical descriptors of a data set.

These limitations were all overcome with the development of graphics calculators, mainly because data were stored in the calculators. Early models, in the mid and late 1980's, provided for the storage of data, thus allowing easy checking, editing and transformations of data to be carried out. The second generation of graphics calculators, in the early 1990's, incorporated space for several variables (usually six) and organised them into data lists for ease of access and handling. As the adjective 'graphics' suggests, an additional advantage of these was their capacity to create graphical displays of data as well as numerical summaries. To an extent, this generation of graphics calculators allowed students to take advantage of technology in the three ways noted above: automatic storage and computation, exploratory data analysis and visual displays.

Compared with a computer statistics package, graphics calculator limitations were considerable: data sets could not be too large, screens were small and (generally) monochromatic, and only a limited range of analyses (concerned only with descriptive statistics) was available. Despite these limitations, graphics calculators seemed promising for student work in statistics, as their small size and relatively low cost made it easier to provide technological support for large numbers of students than is the case for computers.

Recently, a third generation of graphics calculators has appeared, offering a little more support for students of introductory statistics. The substantial differences from the second generation for statistics include the provision of hypothesis testing capabilities, automatic computation of confidence intervals and the routine availability of the main probability distributions likely to be encountered by students in the early years of their study of statistics. Three of the four manufacturers of graphics calculators aimed at the upper secondary and lower undergraduate levels (Casio, Sharp and Texas Instruments) have produced third generation calculators with these capabilities, with, naturally enough, small differences in styles of operation and capabilities. For each calculator, data stored (by key entry, electronic transfer or calculator simulation) can be subjected to inferential statistical analysis in flexible and quite powerful ways. In most cases, an intuitive local syntax is used to effect suitable analyses.

Examples of inferential capabilities

To illustrate some of these capabilities, the screens below are taken from a Casio cfx-9850G Plus graphics calculator. Figure 1 shows the calculator instructions needed to perform a one-way analysis of variance when there are three levels of a particular treatment involved. The raw data have been entered into Lists 1, 2 and 3 as indicated.

Figure 1: Setting up a one-way analysis of variance on a Casio cfx-9850G Plus

The resulting calculator output is shown in Figure 2. In this case, the calculator screen size does not permit all of the output to be displayed on a single screen, and it is necessary for the user to scroll down to obtain the last three rows (as shown on the second screen of Figure 2). Other information of relevance to such an analysis, such as cell means and standard deviations, is available in the calculator in the usual way, but is not reported at the same time as the ANOVA summary table.
This information can be used to conduct post hoc comparison tests or to interpret better the results of the ANOVA.

Figure 2: ANOVA summary table

Some kinds of hypothesis testing (such as t-tests and z-tests for means) are aided by graphical displays of results as well as numerical tests of significance. Another example is Chi-squared hypothesis testing associated with contingency tables, such as tests of independence and tests of fit to a model. Unlike tests of means, raw data for contingency tables are stored in data matrices, using the matrix area of calculators. Figure 3 shows the output for a particular analysis in graphical rather than numerical form, to illustrate this kind of possibility. As for other analyses, more information is available from the calculator than is routinely provided by the significance test. In this case, a matrix of expected values is automatically constructed and can be recalled by the user if desired.

Figure 3: Graphical display of a Chi-squared test on a Casio cfx-9850G Plus

Figure 4 shows an example of the construction of a confidence interval, once data have been entered into the calculator. Third generation calculators typically include both one-sample and two-sample confidence intervals, using either z or t distributions. The choices of which confidence interval to construct are, of course, made by the user.

Figure 4: Calculator screens showing the construction of a confidence interval about a single mean on a Casio cfx-9850G Plus

Figure 5 shows an example of the third embellishment of graphics calculators, involving values for the main probability distributions of relevance to introductory study of inferential statistics.
Although some graphics calculators provided normal distribution values (as did some scientific calculators in fact), third generation models routinely provide information about the distributions typically encountered by students when first studying inferential statistics: normal, Student's t, binomial, Poisson and Chi-squared. The screen on the left of Figure 5 displays the details of the probability distribution sought (in this case, the cumulative binomial, with p = 0.25 and n = 20), while the screen on the right shows the actual value, in this case Prob (x ≤ 5) ≈ 0.61717.

Figure 5: Cumulative binomial distribution values from a Casio cfx-9850G Plus

For all three kinds of inferential analysis (hypothesis testing, confidence interval construction and probability determination), communication between the user and the calculator is fairly intuitive, so that students familiar with the operations of their calculator for basic descriptive statistics will not need a lot of extra time to master the commands needed.

Learning with a calculator

One of the reasons for delaying the study of inferential statistics until the end of secondary school or the early undergraduate years is that it is notoriously difficult for students to learn, on account of the probabilistic ideas involved. This point is emphasised in a recent major treatise on learning probability:

"... probability is a difficult subject to learn and teach. ... In probability, paradoxes or counterintuitive ideas occur at the very heart of the subject, in the definition, and subsequently in relatively simple applications. This is borne out by the vast psychological research on people's misconceptions as well as in the difficulty experienced by pupils in applying probabilistic notions in problems. People do not seem to have probabilistic intuition in the same way they have geometric or visual intuition."
(Kapadia & Borovcnik, 1991, p.2)

The first and most obvious implication of the use of a modern graphics calculator for statistical inference is that less time may be needed for many computational aspects, such as constructing confidence intervals and conducting hypothesis tests. While clearly a graphics calculator will be quicker and more convenient than performing relevant computations by hand, it is likely for most students that it will also be more affordable, accessible and available than a computer statistics package. The convenience of the calculator, assuming that it is either owned by or on long-term loan to a student, far exceeds that of a computer for very many students (especially those without personal access to appropriate hardware and software at home). This is particularly so at the introductory levels, often the first undergraduate year, where class enrolments frequently exceed the number of available computers on a campus, hours of use are somewhat restricted, security issues make computer access inconvenient and, even when the computer is readily available, statistical software is often quite sophisticated and a little daunting for students at the introductory levels. As noted above, it is frequently necessary for students eventually to develop some expertise with computer software for statistics, however inconvenient it may be to access computers on which to use it; this does not necessarily mean, however, that it is a wise educational strategy to start with such software.

Mathematics teachers (at all levels and in all countries) seem to agree that there is insufficient time available for teaching and learning. So, activities that seem likely to demand less student time than the present activities are probably advantageous.
The time saved by avoiding hand computations, waiting for access to a computer or navigating the menu structures or data requirements of a large statistics package can be used to engage in learning activities to develop better intuitions about inferential statistics. Concepts such as those of sampling distribution, hypothesis test, confidence interval, statistic, error rates and statistical significance are critically important to understanding statistical inference and very difficult to acquire, as suggested in the quote above. Providing more time to develop the concepts, activities to illuminate them and opportunities to explore them will not resolve the difficulties for students, but it is not unreasonable to expect that it may lessen them. The following sections briefly describe some of the learning possibilities available with a modern graphics calculator.

Sampling distributions

The notion of a sampling distribution is critical to classical statistical inference. One way to help students develop intuitions about this concept is to provide them with the means of exploring what happens when successive random samples are taken from a population. A short calculator program is needed to do this, using the pseudo-random number generator built into all graphics calculators. An example of such a program for the Casio cfx-9850G is shown in Figure 6; the program assumes that a (finite) population of M values is stored in List 1 and has the effect of selecting a simple random sample of size N from this population, storing it in List 6 and displaying the mean of the sample. (The program includes only eight lines of code, but two calculator screens are necessary to display them all.)

Figure 6: A program to select simple random samples

Students can either be given program steps and asked to enter them into their calculator, or electronic transfer of programs between calculators or between a computer and calculators is possible.
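
The sampling activity can also be sketched away from the calculator. The Python fragment below is not the calculator program of Figure 6; it is a minimal illustration that assumes sampling without replacement, and it uses a hypothetical population of the integers 1 to 100 standing in for List 1. The function names are invented for this sketch.

```python
import random
import statistics


def sample_mean(population, n, rng):
    """Select a simple random sample of size n (assumed here to be
    without replacement) and return its mean."""
    sample = rng.sample(population, n)
    return statistics.mean(sample)


def empirical_sampling_distribution(population, n, repeats, rng):
    """Repeat the sampling step and collect the sample means."""
    return [sample_mean(population, n, rng) for _ in range(repeats)]


if __name__ == "__main__":
    rng = random.Random(1998)          # fixed seed so a run can be repeated
    population = list(range(1, 101))   # hypothetical population, M = 100

    for n in (5, 25):
        means = empirical_sampling_distribution(population, n, 500, rng)
        print("n =", n,
              "mean of means =", round(statistics.mean(means), 2),
              "sd of means =", round(statistics.stdev(means), 2))
```

Running the loop for two sample sizes lets the spread of the sample means be compared directly, which is the comparison a class would make when pooling its results.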
When executed, the program selects a sample of the desired size, stores it and then displays the mean of the sample. For example, output from three successive runs of the program is shown in Figure 7.

Figure 7: Means of three samples from a population obtained from a program

There are at least two ways of using such a program to help develop a notion of a sampling distribution. One possibility is for each of the students in a class to select a sample, find its mean and report the result. The set of results can then be analysed as an empirical sampling distribution. It is not a very large jump from this idea to that of the general notion of a sampling distribution, and even a class of thirty students will readily generate a distribution in this way that shows some of the important features. A second way of using such a program is for students to generate sets of samples by themselves and construct empirical sampling distributions. In either case, explorations of the effects of varying the sample size are relatively easy to carry out.

Ideas of this kind are not new. (They were suggested in Kissane (1981), for example.) What is new is the prospect that they might be accessible easily to all students in a class. Programs of this kind can also be written to automatically select a number of samples of a certain size and then to generate an empirical distribution of the resulting means, effectively automating the process of repetitive sampling. A similar sort of approach can be taken to explore resampling methods for inferential statistics, such as those described by Borovcnik & Peard (1996, p.260).

Confidence intervals

The idea of a confidence interval is not an easy one to grasp and the construction of confidence intervals can be a tedious operation. With a modern graphics calculator, the construction of a confidence interval is easy, freeing up some space to think about the ideas involved.
Thus, students can use a random sample chosen as described above to readily construct a confidence interval for the mean of a population, and compare their interval with the actual population mean. As with sampling distributions, each member of the class can do this empirically and the results of all compared, or the same student can repeat the operation several times and record the outcome each time. Roughly speaking (since the results of course only hold in general rather than in particular cases), a class will find that about 90% of their 90% confidence intervals will actually contain the population mean, while almost all of their 99% confidence intervals will do so.

Another way in which a graphics calculator may be helpful in developing students' concepts of confidence intervals arises from its capacity to quickly explore the relationships between confidence intervals and levels of confidence or between confidence intervals and hypothesis testing. To illustrate some of these possibilities, Figure 8 shows a pair of confidence intervals constructed about the difference between the means of two independent samples, using a t-distribution. Students can use the graphics calculator commands to generate quickly a succession of confidence intervals to different levels of confidence, in order to see that the width of the intervals increases with increasing confidence. Discussions about why that should be so will help students understand the ideas involved.

Figure 8: 95% and 99% confidence intervals for the difference of two means

Similarly, comparisons of confidence intervals with the associated t-test can lead to fruitful discussions exploring the relationship between the construction of a confidence interval and the notion of statistical significance, as realised with a hypothesis test. In this case, a graphical representation of the relevant two-tailed test is shown in Figure 9.
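
As a rough check on the coverage claim made earlier in this section, the repeated construction of intervals can be simulated. The sketch below is an illustration only: to stay within the Python standard library it uses a z-interval with a known population standard deviation rather than the t-intervals of Figure 8, and the population parameters (mean 50, standard deviation 10, samples of size 20) are hypothetical values chosen for the example.

```python
import random
import statistics
from statistics import NormalDist


def z_interval(sample, sigma, level):
    """Confidence interval for a mean, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    centre = statistics.mean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(mu, sigma, n, level, repeats, rng):
    """Proportion of simulated intervals that contain the true mean mu."""
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        if low <= mu <= high:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    rng = random.Random(1998)
    for level in (0.90, 0.95, 0.99):
        rate = coverage(mu=50.0, sigma=10.0, n=20, level=level,
                        repeats=2000, rng=rng)
        print("level", level, "observed coverage", round(rate, 3))
```

Because the same sample gives a wider interval at a higher level, the printout also illustrates why higher confidence goes with greater width.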

Figure 9: Graphical representation of t-test of difference between two means

Students can use the calculator to find the smallest probability level that would lead to a confidence interval that included zero; such a question will demand that they explore the connections between the relevant ideas.

Strictly speaking, of course, these kinds of activities can only suggest what the relationships between important concepts are; the theoretical results concern infinite populations and rest in part on assumptions of normality of population distributions. More sophisticated students might explore informally the effects of violating such assumptions, by comparing results with populations (approximately) normally distributed with those from populations which are clearly not normally distributed. For such explorations, it is most convenient to work with simulated data, using the calculator's random number generation capabilities. While some graphics calculators provide a special command for simulating random normal deviates, useful for these sorts of explorations, for others a transformation of uniform random numbers is needed. One possibility is $z = \sqrt{-2\ln b}\,\cos(2\pi a)$, which generates a random variable z approximately distributed as N(0,1) from a pair of independent random variables a and b, each of which is uniformly distributed on the interval (0,1).

Hypothesis testing

There are several ways in which informal calculator activities may help to develop student intuitions about hypothesis testing. As noted earlier, some of these may involve exploring the links between hypothesis testing and the construction of confidence intervals. In a similar way, students can use a random sampling program like the one above to select a sample and to then conduct a hypothesis test. If each of the students in a class goes through the same exercise, it is possible to see that the type 1 error rate describes how often a null hypothesis is rejected when it is true.
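
The class exercise on the type 1 error rate can be imitated by simulation. The sketch below is illustrative rather than a calculator procedure: it assumes the transformation of uniform random numbers mentioned above is the standard Box-Muller form, and it uses a two-tailed z-test with known standard deviation (an assumption made to keep the example within the Python standard library). All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def box_muller(rng):
    """One N(0,1) deviate from two independent uniform random numbers."""
    a = rng.random()
    b = 1.0 - rng.random()   # lies in (0, 1], so log(b) is always defined
    return math.sqrt(-2.0 * math.log(b)) * math.cos(2.0 * math.pi * a)


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-tailed z-test of H0: mean = mu0, with sigma assumed known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def type1_error_rate(n, alpha, repeats, rng):
    """Sample repeatedly from a population where H0 is true and
    report how often H0 is (wrongly) rejected."""
    rejections = 0
    for _ in range(repeats):
        sample = [box_muller(rng) for _ in range(n)]
        if z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / repeats


if __name__ == "__main__":
    rng = random.Random(1998)
    print("observed rejection rate:",
          type1_error_rate(n=15, alpha=0.05, repeats=4000, rng=rng))
```

With the null hypothesis true by construction, the observed rejection rate should settle near the chosen significance level as the number of repetitions grows.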
As before, results are only approximations, of course, but will help students to appreciate the meanings of the relevant concepts.

Another kind of possibility is to explore the relationships between significance tests and data, especially for small data sets, to develop an intuitive feel for the fragility of results from small samples. For example, for the contingency table shown in Figure 10, the null hypothesis of no association will be rejected at the 5% level of significance.

Figure 10: A Chi-squared test of association on a 2 x 3 table

To help interpret the meaning of such a significance test, students can readily change the raw data and conduct further tests of the same kind. For example, Figure 11 shows that the null hypothesis of no association would not have been rejected at the 5% level, had the frequencies of 18 and 9 instead been 17 and 10.

Figure 11: Examining the effect on the test of a slight change in the original data

Similar kinds of explorations can be conducted for other hypothesis tests such as t-tests and z-tests. Such activities may help students to appreciate some of the limitations of classical hypothesis testing, as well as providing some powerful encouragement to students for making sure that data entries are checked carefully.

Conclusion

Graphics calculators appear to be powerful and flexible tools for introductory teaching and learning of inferential statistics, and may be more accessible to students than are computers. The limited requirements of first courses involving inferential statistics are probably met by most modern graphics calculators, which are also useful for other mathematical purposes. This paper has suggested a number of ways in which the calculators might be used to enhance students' intuitions about important inferential ideas such as sampling distributions, confidence intervals and hypothesis testing.

References

Biehler, R. 1991, Computers in probability education, In R. Kapadia & M.
Borovcnik (Eds), Chance Encounters: Probability in Education, Dordrecht, Kluwer Academic, pp 169-211.
Borovcnik, M. & Peard, R. 1996, Probability, In Bishop, A.J., Clements, K., Keitel, C., Kilpatrick, J. & Laborde, C. (Eds), International Handbook of Mathematics Education, Dordrecht, Kluwer Academic, pp 239-287.
Kapadia, R. & Borovcnik, M. 1991, The educational perspective, In R. Kapadia & M. Borovcnik (Eds), Chance Encounters: Probability in Education, Dordrecht, Kluwer Academic, pp 1-26.
Kissane, B. 1981, Activities in inferential statistics, National Council of Teachers of Mathematics Yearbook, Reston, VA.: NCTM, 182-193.
Kissane, B. 1995, The importance of being accessible: The graphics calculator in mathematics education, Proceedings of the First Asian Technology Conference in Mathematics, Association of Mathematics Educators: Singapore, 161-170.
Kissane, B. 1997, Chance and data: New opportunities provided by the graphics calculator, in W.C. Yang & Y.A. Hasan (Eds), Computer Technology in Mathematical Research and Teaching, Penang, Malaysia, School of Mathematical Sciences, pp 80-88.
Kissane, B., Bradley, J., & Kemp, M. 1994, Graphics calculators, equity and assessment, Australian Senior Mathematics Journal, 8(2), 31-43.

Original Source

This paper is reproduced with permission from: Kissane, B. 1998, Inferential statistics and the graphics calculator, in W.C. Yang, K. Shirayanagi, S.-C. Chu & G. Fitz-Gerald (Eds) Proceedings of the Third Asian Technology Conference in Mathematics, Tsukuba, Japan, Singapore: Springer. (ISBN 981-4021-15-6), pp 111-121. http://www.cs.runet.edu/~atcm/EPATCM98
Please see http://www.atcminc.com for further details.