Part 7 PDF
REVIEW ARTICLE
Descriptive Statistics
The Specification of Statistical Measures and Their Presentation in Tables and Graphs
Albert Spriestersbach, Bernd Röhrig, Jean-Baptist du Prel, Aslihan Gerhold-Ay, Maria Blettner
SUMMARY
Background: Descriptive statistics are an essential part of biometric analysis and a prerequisite for the understanding of further statistical evaluations, including the drawing of inferences. When data are well presented, it is usually obvious whether the author has collected and evaluated them correctly and in keeping with accepted practice in the field.
Methods: Statistical variables in medicine may be of either the metric (continuous, quantitative) or categorical (nominal, ordinal) type. Easily understandable examples are given. Basic techniques for the statistical description of collected data are presented and illustrated with examples.
Results: The goal of a scientific study must always be clearly defined. The definition of the target value or clinical endpoint determines the level of measurement of the variables in question. Nearly all variables, whatever their level of measurement, can be usefully presented graphically and numerically. The level of measurement determines what types of diagrams and statistical values are appropriate. There are also different ways of presenting combinations of two independent variables graphically and numerically.
Conclusions: The description of collected data is indispensable. If the data are of good quality, valid and important conclusions can already be drawn when they are properly described. Furthermore, data description provides a basis for inferential statistics.
Key words: statistics, data analysis, biostatistics, publication

Cite this as: Dtsch Arztebl Int 2009; 106(36): 578–83
DOI: 10.3238/arztebl.2009.0578

Institut für Medizinische Biometrie, Epidemiologie und Informatik, Johannes Gutenberg-Universität Mainz: Prof. Dr. rer. nat. Blettner, Spriestersbach, Gerhold-Ay
Zentrum Präventive Pädiatrie, Zentrum für Kinder- und Jugendmedizin, Universitätsmedizin der Johannes Gutenberg-Universität Mainz: Dr. med. du Prel, M.P.H.
MDK Rheinland-Pfalz, Referat Rehabilitation/Biometrie, Alzey: Dr. rer. nat. Röhrig

A set of medical data is based on a collection of the data of individual cases or objects, also called observation units or statistical units. Every case, for example every study participant, patient, experimental animal, tooth or cell, shows comparable parameters (such as body weight, gender, erosion, pH). Each of these parameters, also called variables, has a specific parameter value (gender = male, age = 30 years, weight = 70 kg) for each observation unit (for example the patient). The aim of descriptive statistics is to summarize the data, so that they can be clearly illustrated (1–3).
The property of a parameter is specified by its so-called scale of measure. Generally two types of parameters are distinguished. A variable has a metric level (= quantitative data) if it can be counted, measured or weighed in a physical unit (as in cm or kg) or at least can be recorded in whole numbers. Data with a metric scale of measure can be further classified into continuous and discrete variables. In contrast to discrete variables, continuous variables can take any value within their range. Examples of metric continuous parameters are body height in cm, blood pressure in mmHg or the creatinine concentration in mg/L. One example of a metric discrete parameter is the number of erythrocytes per microliter of blood.
A person's gender cannot be measured, but is classified into two categories. Parameters which can be classified into two or more categories are described as categorical parameters (= qualitative data). Categorical parameters are further classified into nominal characteristics (unordered) and ordinal characteristics (ordered according to rank). Good basic portrayals of the descriptive statistics of medical data can be found in textbooks (4–9). Figure 1 gives an overview of the types of parameters, together with the graphs and statistical measures to be used.
Different procedures are necessary for the statistical evaluation of metric and categorical parameters in graphic and tabular forms. The graphs used here and the evaluation tables were created with the statistics package SPSS for WINDOWS (Version 15). As an example, we use data from 176 sportsmen and sportswomen.
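
The classification of variables described above can be sketched as a small data structure. The following Python fragment is only an illustration under stated assumptions: the names `Scale`, `MEASURES`, `is_metric` and `measures_for` are hypothetical, the lookup of suitable measures is a simplification that does not reproduce Figure 1, and treating age as a continuous variable is an assumption.

```python
from enum import Enum


class Scale(Enum):
    """Scale of measure of a variable, following the classification in the text."""
    NOMINAL = "nominal"        # categorical, unordered (e.g. gender)
    ORDINAL = "ordinal"        # categorical, ordered according to rank
    DISCRETE = "discrete"      # metric, countable (e.g. erythrocytes per microliter)
    CONTINUOUS = "continuous"  # metric, any value in a range (e.g. body height in cm)


# Assumed, simplified lookup of measures that are meaningful per scale level.
# It is an illustration only and does not reproduce Figure 1.
MEASURES = {
    Scale.NOMINAL: ["absolute frequency", "relative frequency"],
    Scale.ORDINAL: ["absolute frequency", "relative frequency",
                    "median", "quartiles"],
    Scale.DISCRETE: ["median", "quartiles",
                     "arithmetic mean", "standard deviation"],
    Scale.CONTINUOUS: ["median", "quartiles",
                       "arithmetic mean", "standard deviation"],
}


def is_metric(scale):
    """Return True for quantitative (metric) scale levels."""
    return scale in (Scale.DISCRETE, Scale.CONTINUOUS)


def measures_for(scale):
    """Return the assumed list of suitable measures for a scale level."""
    return MEASURES[scale]


# One observation unit with the parameter values used as examples in the text.
patient = {"gender": "male", "age_years": 30, "weight_kg": 70}

# Assumed scale assignment for these three variables.
scales = {
    "gender": Scale.NOMINAL,
    "age_years": Scale.CONTINUOUS,
    "weight_kg": Scale.CONTINUOUS,
}

for name, value in patient.items():
    scale = scales[name]
    print(name, value, scale.value, measures_for(scale))
```

Recording the scale level together with each variable makes it harder to request an unsuitable measure, for example an arithmetic mean for a purely ordinal parameter.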
FIGURE 1
FIGURE 2
1.1.1. The box plot diagram
Box plots offer a visual impression of the position of the first and third quartiles (25th and 75th percentiles) and of the median (central value). The minimum, the maximum, and the breadth of scatter of all case values of a continuous parameter are also recognizable. 50% of the values of the distribution lie within the box (= interquartile range). A box with a greater interquartile range indicates greater scatter of the values. Figure 2 shows an example of the
1.2. Numerical description of a continuous parameter
The distribution of a recorded continuous parameter can be numerically described with the following statistical measures: minimum, maximum, quartiles (with median), range (difference between maximum and minimum), skewness (indicates whether the distribution is symmetric or not), arithmetic mean value, and standard deviation (= square root of the variance) (6). When the parameter "skewness" is between -1 and +1, the distribution is
TABLE 2
Distribution of the parameters smoking and cough

                              Cough
                              No      Yes     Total
Smokers  No     Number        21      8       29
                % of smokers  72%     28%     100%
                % with cough  75%     29%     52%
         Yes    Number        7       20      27
                % of smokers  26%     74%     100%
                % with cough  25%     71%     48%
Total           Number        28      28      56
                % of smokers  50%     50%     100%
                % with cough  100%    100%    100%

FIGURE 7
Example of grouped box plots

Discussion
The exact description of data collected in a study is sensible and important. The correct descriptive presentation of the results is the first step in evaluating and graphically presenting the results (7–9, 11). The description is the basis of the biometric evaluation and is the indispensable starting point for further methodological procedures such as statistical significance tests. The descriptive presentation of study results usually occupies most of the space in publications. The description covers the graphical and tabular presentation of the results. The exact assessment of the scale level of the parameter is important, as the scale level determines the type and procedure of the evaluation, both in the descriptive and in the explorative (= hypothesis-generating) and confirmatory (= biometric hypothesis-testing) evaluation. The selection of a suitable statistical test procedure for testing significance is determined by the scale level of the investigated parameters.
In normally distributed data, the arithmetic mean value is the same as the median, and the skewness has the value zero. Unfortunately, normal distributions are rare in natural systems, for example in parameters collected from patients. It is sensible to give the arithmetic mean value as well as the median for continuous data. A normal distribution cannot be assumed when the two values are very different. The arithmetic mean value cannot be calculated for data with a purely ordinal scale. It is often asked whether graphical or numerical presentations are preferable in data description. Graphics serve to give a first impression and to visually illustrate the position of the distribution parameters. It can be difficult to read the exact values of the median or the percentiles on the y-axis in a box plot diagram. For this reason, calculation and presentation of the exact statistical characteristic values is indispensable.
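
The comparison of arithmetic mean, median and skewness discussed above can be illustrated with a short calculation. The following Python sketch uses only the standard library. The function name `describe` and both samples are hypothetical and are not taken from the study data. The skewness is computed as the simple moment-based coefficient (third central moment divided by the second central moment to the power 1.5), which is an assumption and may differ slightly from the bias-corrected value reported by packages such as SPSS.

```python
import statistics


def describe(values):
    """Return basic descriptive measures for a continuous variable."""
    n = len(values)
    mean = statistics.fmean(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Moment-based skewness: m3 / m2**1.5 (no small-sample correction).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "iqr": q3 - q1,
        "mean": mean,
        "sd": statistics.stdev(values),  # sample standard deviation
        "skewness": skewness,
    }


# Hypothetical samples, chosen only to contrast the two situations.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 3, 4, 4, 5, 6, 30]

for label, sample in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    result = describe(sample)
    print(label, "mean:", result["mean"], "median:", result["median"],
          "skewness:", round(result["skewness"], 2))
```

In the symmetric sample the mean and the median coincide and the skewness is zero. In the right-skewed sample a single large value pulls the mean well above the median, which is the situation in which a normal distribution should not be assumed.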
In individual cases, further biometric measures, including ones not mentioned in this article, are of course valuable. Examples would be effect sizes, confidence intervals, Cohen's kappa, relative risk, and cumulated values.
The use of suitable validated statistics software like
SPSS or SAS is recommended for statistical evaluation
of data.
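
As an illustration of such a software-based evaluation, the row and column percentages of Table 2 can be recomputed from the four cell counts. The sketch below uses Python purely for illustration; it is not one of the packages named by the authors, and the function name `crosstab_percentages` is hypothetical. Only the counts 21, 8, 7 and 20 are taken from Table 2.

```python
# Cell counts from Table 2: outer key = smoker (No/Yes), inner key = cough (No/Yes).
counts = {
    "No": {"No": 21, "Yes": 8},
    "Yes": {"No": 7, "Yes": 20},
}


def crosstab_percentages(table):
    """Return totals plus row and column percentages of a two-way table."""
    col_names = list(next(iter(table.values())))
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {c: sum(table[r][c] for r in table) for c in col_names}
    grand_total = sum(row_totals.values())
    # Row percentages: share of each cell within its smoker category.
    row_pct = {r: {c: 100 * table[r][c] / row_totals[r] for c in col_names}
               for r in table}
    # Column percentages: share of each cell within its cough category.
    col_pct = {r: {c: 100 * table[r][c] / col_totals[c] for c in col_names}
               for r in table}
    return row_totals, col_totals, grand_total, row_pct, col_pct


row_totals, col_totals, grand_total, row_pct, col_pct = crosstab_percentages(counts)

for smoker in counts:
    for cough in counts[smoker]:
        print(f"smoker={smoker} cough={cough} n={counts[smoker][cough]} "
              f"% of smokers={row_pct[smoker][cough]:.0f} "
              f"% with cough={col_pct[smoker][cough]:.0f}")
print("total:", grand_total)
```

The rounded results agree with the percentages printed in Table 2, for example 72% of non-smokers without cough and 74% of smokers with cough.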
Manuscript received on 4 February 2009, revised version accepted on 16 March 2009.
Translated from the original German by Rodney A. Yeates, M.A., Ph.D.

REFERENCES
1. Greenfield MLVH, Kuhn JE, Wojtys EM: A statistics primer: descriptive measures for continuous data. Am J Sports Med 1997; 25: 720–3.
2. McHugh ML: Descriptive statistics, part I: level of measurement. JSPN 2003; 8: 35–7.
3. Overholser BR, Sowinski KM: Biostatistics primer: part I. Nutr Clin Pract 2007; 22: 629–35.
4. SPSS Incorporated: SPSS 16.0 Schneller Einstieg. Dublin: SPSS Inc. 2007; 55–62.
5. Bortz J: Statistik für Sozialwissenschaftler. Berlin Heidelberg New York: Springer 1999; 5. Auflage: 17–47.
6. Sachs L: Angewandte Statistik: Anwendung statistischer Methoden. Berlin, Heidelberg, New York: Springer 2004; 11. Auflage: 1–177.
7. Trampisch HJ, Windeler J: Medizinische Statistik. Berlin, Heidelberg, New York: Springer 2000; 2. Auflage: 52–82.
8. Hilgers RD, Bauer P, Schreiber V: Einführung in die Medizinische Statistik. Berlin, Heidelberg, New York: Springer 2003; 3–43.
9. Altman DG: Practical statistics for medical research. Boca Raton, London, New York, Washington D.C.: Chapman & Hall/CRC 1999; 10–45.
10. Zawalski R: Messung der Hautfaltendicke am Handrücken mit Hilfe einer Mikrometerschraube [Dissertation]. Mainz: Fachbereich Medizin der Johannes Gutenberg-Universität; 1997.
11. Du Prel JB, Röhrig BBM: Kritisches Lesen wissenschaftlicher Artikel. Dtsch Arztebl Int 2009; 106: 100–5.

Corresponding author
Prof. Dr. rer. nat. Maria Blettner
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Johannes Gutenberg-Universität
55101 Mainz, Germany
[email protected]