A Brief Introduction To Error Analysis and Propagation: Georg Fantner February 2011


A brief introduction to error analysis and propagation

Georg Fantner

February 2011

Contents
1 Acknowledgements
2 Random and systematic errors
3 Determining random errors
3.1 Instrument Limit of Error (ILE) and Least Count
3.2 Estimated Uncertainty
3.3 Average Deviation: Estimated Uncertainty by Repeated Measurements
3.4 How to Compute the Standard Deviation
3.4.1 Why n-1?
3.4.2 When should the SD be computed with a denominator of n?
3.5 Conflicting Uncertainty Numbers
3.6 Why make many measurements? Standard Error in the Mean.
3.6.1 When to use the standard error and when the standard deviation?
3.6.2 How are Standard Error and Standard Deviation Related?
4 Relative vs. Absolute Errors
5 Propagation of Errors
5.1 Addition and Subtraction: z=x+y or z=x-y
5.2 Multiplication by an exact number
5.3 Multiplication and Division: z = x y or z = x/y
5.4 Products of Powers z = x^m y^n
5.5 Mixtures of multiplication, division, addition, subtraction, and powers.
6 Significant Digits
7 Rounding off answers in regular and scientific notation

1 Acknowledgements
This text is based on texts written by Vern Lindberg and others.

2 Random and systematic errors


No measurement is ever exact. The accuracy (correctness) and precision (number of
significant figures) of a measurement are always limited by the degree of refinement
of the apparatus used, by the skill of the observer, and by the basic physics in the
experiment. In doing experiments we are trying to establish the best values for certain
quantities, or trying to validate a theory. We must also give a range of possible true
values based on our limited number of measurements.

Why should repeated measurements of a single quantity give different values? Mistakes on
the part of the experimenter are possible, but we do not include these in our discussion. A
careful researcher should not make mistakes! (Or at least she or he should recognize and
correct them.) We use the terms uncertainty, error, and deviation synonymously to represent
the variation in measured data.

Two types of errors are possible. Systematic error is the result of a mis-calibrated device,
or a measuring technique which always makes the measured value larger (or smaller) than the
"true" value. An example would be using a steel ruler at liquid nitrogen temperature to
measure the length of a rod. The ruler will contract at low temperatures and therefore
overestimate the true length. Careful design of an experiment will allow us to eliminate or
to correct for systematic errors.

Even when systematic errors are eliminated there will remain a second type of variation in
measured values of a single quantity. These remaining deviations will be classed as random
errors, and can be dealt with in a statistical manner. This document does not teach
statistics in any formal sense, but it should help you to develop a working methodology for
treating errors.

3 Determining random errors


How can we estimate the uncertainty of a measured quantity? Several approaches can be
used, depending on the application.

3.1 Instrument Limit of Error (ILE) and Least Count


The least count is the smallest division that is marked on the instrument. Thus a meter
stick will have a least count of 1.0 mm, and a digital stop watch might have a least count
of 0.01 sec. The instrument limit of error, ILE for short, is the precision to which a
measuring device can be read, and is always equal to or smaller than the least count.
Very good measuring tools are calibrated against standards maintained by the National
Institute of Standards and Technology. The Instrument Limit of Error is generally taken
to be the least count or some fraction (1/2, 1/5, 1/10) of the least count. You may wonder
which to choose: the least count, half the least count, or something else. No hard and
fast rules are possible; instead you must be guided by common sense. If the space between
the scale divisions is large, you may be comfortable in estimating to 1/5 or 1/10 of the
least count. If the scale divisions are closer together, you may only be able to estimate to
the nearest 1/2 of the least count, and if the scale divisions are very close you may only be
able to estimate to the least count.
For some devices the ILE is given as a tolerance or a percentage. Resistors may be specified
as having a tolerance of 5%, meaning that the ILE is 5% of the resistor’s value.

3.2 Estimated Uncertainty


Often other uncertainties are larger than the ILE. We may try to balance a simple
beam balance with masses that have an ILE of 0.01 grams, but find that we can vary the
mass on one pan by as much as 3 grams without seeing a change in the indicator. We
would use half of this range as the estimated uncertainty, giving an uncertainty of 1.5 grams.
Another good example is determining the focal length of a lens by measuring the distance
from the lens to the screen. The ILE may be 0.1 cm; however, the depth of field may be
such that the image remains in focus while we move the screen by 1.6 cm. In this case the
estimated uncertainty would be half the range, or 0.8 cm.

3.3 Average Deviation: Estimated Uncertainty by Repeated Measurements
The statistical method for finding a value with its uncertainty is to repeat the
measurement several times, find the average, and find either the average deviation or the
standard deviation. Suppose we repeat a measurement several times and record the different
values. We can then find the average value, here denoted by a symbol between angle
brackets, <t>, and use it as our best estimate of the reading. How can we determine the
uncertainty? Let us use the data in Table 1 as an example. Column 1 shows a
time in seconds.
A simple average of the times is the sum of all values (7.4+8.1+7.9+7.0) divided by the
number of readings (4), which is 7.6 sec. We will use angle brackets around a symbol to
indicate an average; an alternate notation places a bar over the symbol.
Column 2 of Table 1 shows the deviation of each time from the average, (t − <t>). A
simple average of these is zero, and does not give any new information.
To get a non-zero estimate of deviation we take the average of the absolute values of the
deviations, as shown in Column 3 of Table 1. We will call this the average deviation, ∆t.
Column 4 has the squares of the deviations from Column 2, making the answers all positive.
The sum of the squares is divided by 3 (one less than the number of readings), and the
square root is taken to produce the sample standard deviation. An explanation of why we
divide by (N-1) rather than N is found below. The sample standard deviation is slightly
different from the average deviation, but either one gives a measure of the variation in the
data.

Table 1: Values showing the determination of average, average deviation, and standard
deviation in a measurement of time. Notice that to get a non-zero average deviation we
must take the absolute value of the deviation.

Time t/s     (t − <t>)/s     |t − <t>|/s     (t − <t>)^2/s^2
7.4          -0.2            0.2             0.04
8.1           0.5            0.5             0.25
7.9           0.3            0.3             0.09
7.0          -0.6            0.6             0.36

Averages:    <t> = 7.6    <t − <t>> = 0    <|t − <t>|> = 0.4    <(t − <t>)^2> = 0.247 (sum divided by N-1 = 3)
Standard deviation: sqrt(<(t − <t>)^2>) = 0.5

For a second example, consider a measurement of length shown in Table 2. The average
and average deviation are shown at the bottom of the table.

Table 2: Example of finding an average length and an average deviation in length. The
values in the table have an excess of significant figures. Results should be rounded as
explained in the text. Results can be reported as (15.5 ± 0.1) m or (15.47 ± 0.13) m. If
you use standard deviation the length is (15.5 ± 0.2) m or (15.47 ± 0.18) m.

Length x/m     |x − <x>|/m     (x − <x>)^2/m^2
15.4           0.06667         0.004445
15.2           0.26667         0.071112
15.6           0.13333         0.017777
15.7           0.23333         0.054443
15.5           0.03333         0.001111
15.4           0.06667         0.004445

Average: 15.46667 m     Average deviation: 0.13333 m     St. dev.: 0.17512 m

3.4 How to Compute the Standard Deviation
To calculate the standard deviation:

1. Compute the square of the difference between each value and the sample mean.

2. Add those values up.

3. Divide the sum by n-1. This is called the variance.

4. Take the square root to obtain the Standard Deviation.
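
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the data are the four times from Table 1. The average deviation from section 3.3 is included for comparison.

```python
import math

def sample_standard_deviation(values):
    """Sample standard deviation, following the four steps (n-1 denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_diffs = [(v - mean) ** 2 for v in values]  # step 1
    total = sum(squared_diffs)                         # step 2
    variance = total / (n - 1)                         # step 3
    return math.sqrt(variance)                         # step 4

def average_deviation(values):
    """Average of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

times = [7.4, 8.1, 7.9, 7.0]  # times in seconds, from Table 1

print(round(sum(times) / len(times), 2))           # mean, about 7.6
print(round(average_deviation(times), 2))          # about 0.4
print(round(sample_standard_deviation(times), 2))  # about 0.5
```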

3.4.1 Why n-1?


Why divide by n-1 rather than n in the third step above? In step 1, you compute the
difference between each value and the mean of those values. You don’t know the true
mean of the population; all you know is the mean of your sample. Except for the rare cases
where the sample mean happens to equal the population mean, the data will be closer to
the sample mean than it will be to the true population mean. So the value you compute
in step 2 will probably be a bit smaller (and can’t be larger) than what it would be if you
used the true population mean in step 1. To make up for this, divide by n-1 rather than n.

But why n-1? If you knew the sample mean, and all but one of the values, you could
calculate what that last value must be. Statisticians say there are n-1 degrees of freedom.
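
The claim that the sum of squares about the sample mean cannot be larger than the sum about the true mean can be checked numerically. The sketch below uses the times from Table 1; the "true" mean of 7.5 s is a hypothetical value chosen only for illustration, since in a real experiment it is unknown.

```python
def sum_of_squares(values, center):
    """Sum of squared differences between each value and a chosen center."""
    return sum((v - center) ** 2 for v in values)

times = [7.4, 8.1, 7.9, 7.0]   # times in seconds, from Table 1
sample_mean = sum(times) / len(times)
assumed_true_mean = 7.5        # hypothetical population mean, for illustration only

about_sample_mean = sum_of_squares(times, sample_mean)
about_true_mean = sum_of_squares(times, assumed_true_mean)

# The difference equals n * (sample_mean - true_mean)^2, which is never negative.
excess = len(times) * (sample_mean - assumed_true_mean) ** 2

print(round(about_sample_mean, 2))  # about 0.74
print(round(about_true_mean, 2))    # about 0.78
print(round(excess, 2))             # about 0.04
```

Whatever value is assumed for the true mean, the second sum is at least as large as the first, which is why dividing the first sum by n would tend to underestimate the spread.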

3.4.2 When should the SD be computed with a denominator of n?


Statistics books often show two equations to compute the SD, one using n, and the other
using n-1, in the denominator. Some calculators have two buttons.

The n-1 equation is used in the common situation where you are analyzing a sample of
data and wish to make more general conclusions. The SD computed this way (with n-1 in
the denominator) is your best guess for the value of the SD in the overall population.

If you simply want to quantify the variation in a particular set of data, and don’t plan
to extrapolate to make wider conclusions, then you can compute the SD using n in the
denominator. The resulting SD is the SD of those particular values. It makes no sense to
compute the SD this way if you want to estimate the SD of the population from which
those points were drawn. It only makes sense to use n in the denominator when there is
no sampling from a population and no desire to make general conclusions.

The goal of science is always to generalize, so the equation with n in the denominator
should not be used. The only example I can think of where it might make sense is in
quantifying the variation among exam scores. But much better would be to show a scatterplot
of every score, or a frequency distribution histogram.
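
The two "calculator buttons" have direct counterparts in Python's standard library: `statistics.stdev` divides by n-1 and `statistics.pstdev` divides by n. A brief sketch, again using the Table 1 times as example data:

```python
import statistics

times = [7.4, 8.1, 7.9, 7.0]  # times in seconds, from Table 1

sd_sample = statistics.stdev(times)       # n-1 in the denominator
sd_population = statistics.pstdev(times)  # n in the denominator

print(round(sd_sample, 3))      # about 0.497
print(round(sd_population, 3))  # about 0.430
```

With only four readings the two results differ noticeably; the difference shrinks as the number of readings grows.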

3.5 Conflicting Uncertainty Numbers


In some cases we will get an ILE, an estimated uncertainty, and an average deviation and
we will find different values for each of these. We will be pessimistic and take the largest
of the three values as our uncertainty. [When you take a statistics course you should learn
a more correct approach involving adding the variances.] For example we might measure
a mass required to produce standing waves in a string with an ILE of 0.01 grams and an
estimated uncertainty of 2 grams. We use 2 grams as our uncertainty.

The proper way to write the answer is:

1. Choose the largest of (i) ILE, (ii) estimated uncertainty, and (iii) average or standard
deviation.

2. Round off the uncertainty to 1 or 2 significant figures.

3. Round off the answer so it has the same number of digits before or after the decimal
point as the uncertainty.

4. Put the answer and its uncertainty in parentheses, then put the power of 10 and unit
outside the parentheses.

3.6 Why make many measurements? Standard Error in the Mean.


We know that by making several measurements (4 or 5) we should be more likely to get a
good average value for what we are measuring. Is there any point to measuring a quantity
more often than this? When you take a statistics course you will learn that the standard
error in the mean is affected by the number of measurements made.

The standard error in the mean in the simplest case is defined as the standard
deviation divided by the square root of the number of measurements.

The following example illustrates this in its simplest form. I am measuring the length
of an object. Notice that the average and standard deviation do not change much as the
number of measurements changes, but that the standard error does dramatically decrease
as N increases.

Table 3: Influence of the number of samples on the standard deviation and standard error.

Number of Measurements, N     Average      Standard Deviation     Standard Error
5                             15.52 cm     1.33 cm                0.59 cm
25                            15.46 cm     1.28 cm                0.26 cm
625                           15.49 cm     1.31 cm                0.05 cm
10000                         15.49 cm     1.31 cm                0.013 cm
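
The last column of Table 3 follows from the definition above (standard deviation divided by the square root of N). A small sketch that recomputes it from the N and standard deviation columns; the function name is chosen for this example only:

```python
import math

def standard_error(sd, n):
    """Standard error in the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# (N, standard deviation in cm) pairs taken from Table 3
table3 = [(5, 1.33), (25, 1.28), (625, 1.31), (10000, 1.31)]

for n, sd in table3:
    print(n, round(standard_error(sd, n), 3))
# prints approximately 0.595, 0.256, 0.052 and 0.013 cm
```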

3.6.1 When to use the standard error and when the standard deviation?
We can consider the difference between standard deviation and standard error like this:

• The standard deviation (SD) is how spread out THINGS in the population are, and
this is calculated (somehow) from the data in your sample. It is useful in describing
the population itself.

• The standard error (SE) is how spread out the SAMPLE MEAN will be around the
true population mean. It is useful in describing how close your results will be to the
right answer.

As a simple rule, we can choose between the standard deviation and the standard
error by asking whether we are measuring one value multiple times (use the standard
error) or measuring one quantity in multiple cases (use the standard deviation). Another
way to decide: imagine you could make a perfect measurement. If you would always get the
same number, use the standard error.

This means that when we want to describe, for example, a population of cells by
measuring their length, we will calculate the mean and the standard deviation, because there
is no "right length". When we want to measure the temperature in our incubator, we
will calculate the average temperature and the standard error, because there is a "right
temperature".

3.6.2 How are Standard Error and Standard Deviation Related?


It so happens that there is a very simple relationship between SD and SE. You can calculate
SE by the following formula:

\[ SE = \frac{SD}{\sqrt{n}} \]

Now this is really quite a simple and beautiful little formula.

• It starts out with the way the world IS (that’s SD - how spread out the data are, and
there is virtually NOTHING you can do about it).

• It then talks about how hard you WORK (that’s the sample size "n"), and you ARE
in control of that. (Please note that this is how hard you work, not how smart.)

• It then tells you HOW GOOD your average is likely to be with that amount of effort
(the Standard Error).

4 Relative vs. Absolute Errors


When stating the error of a measurement, we can state it as the absolute value of
the error we calculated (absolute error), for example R = (33 ± 1.65) kΩ, or we can state
it as a percentage, R = (33 ± 5%) kΩ. The relative error is the fractional uncertainty,
that is, the absolute error divided by the measured value. Percentage error is the
fractional error multiplied by 100%. While strictly speaking the relative error and the
percentage error are different things, the terms are often used synonymously. In practice,
either the percentage error or the absolute error may be provided. Thus in machining an
engine part the tolerance is usually given as an absolute error, while electronic
components are usually given with a percentage tolerance.
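
Converting between the two forms is a one-line calculation each way. The sketch below reproduces the resistor example from this section; the function names are invented for illustration.

```python
def to_absolute(value, percent_error):
    """Absolute error from a value and its percentage error."""
    return abs(value) * percent_error / 100.0

def to_percent(value, absolute_error):
    """Percentage error from a value and its absolute error."""
    return abs(absolute_error / value) * 100.0

resistance_kohm = 33.0

print(round(to_absolute(resistance_kohm, 5.0), 2))  # about 1.65 (kOhm)
print(round(to_percent(resistance_kohm, 1.65), 2))  # about 5.0 (percent)
```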

5 Propagation of Errors
Suppose two measured quantities x and y have uncertainties, ∆x and ∆y, determined by
procedures described in previous sections: we would report (x ± ∆x) and (y ± ∆y).
From the measured quantities a new quantity, z, is calculated from x and y. What is
the uncertainty, ∆z, in z? There are two ways to get an estimate for the error of z. In
the simplified version the guiding principle in all cases is to consider the most pessimistic
situation. In this case we add the individual uncertainties. This certainly gives us the
safe limit of our estimate, but sometimes we want to be more restrictive in our answers.
In the proper statistical treatment of error propagation we use the standard deviations to
calculate the resulting uncertainty.
The examples included in this section also show the proper rounding of answers. The
examples use the propagation of errors using average deviations.

5.1 Addition and Subtraction: z=x+y or z=x-y


Derivation: We will assume that the uncertainties are arranged so as to make z as far from
its true value as possible.

With average deviations, ∆z = |∆x| + |∆y| in both cases. With more than two numbers added
or subtracted we continue to add the uncertainties.

Using average errors:

\[ \Delta z = |\Delta x| + |\Delta y| + \dots \]

Using standard deviations:

\[ \Delta z = \sqrt{(\Delta x)^2 + (\Delta y)^2 + \dots} \]

Example: w = (4.52 ± 0.02) cm, x = (2.0 ± 0.2) cm, y = (3.0 ± 0.6) cm. Find z = x + y − w
and its uncertainty.

z = x + y − w = 2.0 + 3.0 − 4.52 = 0.48 cm ≈ 0.5 cm

For the simplified method we get:

∆z = ∆x + ∆y + ∆w = 0.2 + 0.6 + 0.02 = 0.82 cm

which rounds to 0.8 cm, so z = (0.5 ± 0.8) cm.
When using the standard deviations we get:

\[ \Delta z = \sqrt{0.2^2 + 0.6^2 + 0.02^2} = 0.633\ \text{cm} \]

so z = (0.5 ± 0.6) cm.
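
The two rules for sums and differences can be compared side by side in code. This is a minimal sketch with made-up function names, using the uncertainties of x, y and w from the example above.

```python
import math

def add_worst_case(*uncertainties):
    """Simplified (pessimistic) rule: add the absolute uncertainties."""
    return sum(abs(u) for u in uncertainties)

def add_in_quadrature(*uncertainties):
    """Standard-deviation rule: square root of the sum of squares."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))

dx, dy, dw = 0.2, 0.6, 0.02  # uncertainties in cm

print(round(add_worst_case(dx, dy, dw), 3))     # about 0.82
print(round(add_in_quadrature(dx, dy, dw), 3))  # about 0.633
```

The quadrature result is dominated by the largest single uncertainty (0.6 cm here), which is why the small uncertainty in w barely matters.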

5.2 Multiplication by an exact number


When multiplying a measured value by an exact number, multiply the uncertainty
by the same exact number.
Example: The radius of a circle is r = (3.0 ± 0.2)cm. Find the circumference and its
uncertainty.

C = 2πr = 18.850cm
∆C = 2π∆r = 1.257cm (The factors of 2 and π are exact)
C = (18.8 ± 1.3)cm

We round the uncertainty to two figures since it starts with a 1, and round the answer to
match.

5.3 Multiplication and Division: z = x y or z = x/y
Derivation: We can derive the relation for multiplication easily. Take the largest values
for x and y, that is:

z + ∆z = (x + ∆x)(y + ∆y) = xy + x∆y + y∆x + ∆x∆y


Usually ∆x << x and ∆y << y so that the last term is much smaller
than the other terms and can be neglected. Therefore:
z = xy,
∆z = y∆x + x∆y

which we write more compactly by forming the relative error, that is the ratio ∆z/z.

Using average errors:

\[ \frac{\Delta z}{z} = \frac{\Delta x}{x} + \frac{\Delta y}{y} + \dots \]

Using standard deviations:

\[ \frac{\Delta z}{z} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2 + \dots} \]

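Example: w = (4.52 ± 0.02) cm, x = (2.0 ± 0.2) cm. Find z = w · x and its uncertainty.

\[ z = w \cdot x = (4.52)(2.0) = 9.04\ \text{cm}^2 \]

\[ \frac{\Delta z}{9.04\ \text{cm}^2} = \frac{0.02\ \text{cm}}{4.52\ \text{cm}} + \frac{0.2\ \text{cm}}{2.0\ \text{cm}} = 0.1044 \]

\[ \Delta z = 0.1044 \cdot (9.04\ \text{cm}^2) = 0.944\ \text{cm}^2 \Rightarrow 0.9\ \text{cm}^2 \]

Using the standard deviations:

\[ \frac{\Delta z}{9.04\ \text{cm}^2} = \sqrt{\left(\frac{0.02\ \text{cm}}{4.52\ \text{cm}}\right)^2 + \left(\frac{0.2\ \text{cm}}{2.0\ \text{cm}}\right)^2} = 0.1 \]

\[ \Delta z = 0.9\ \text{cm}^2 \]

therefore
z = (9.0 ± 0.9) cm^2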
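
The same example can be worked numerically. The sketch below is one possible way to code the product rule; the function name and the `quadrature` flag are assumptions made for this illustration.

```python
import math

def product_with_uncertainty(x, dx, y, dy, quadrature=False):
    """Return (z, dz) for z = x * y, combining the relative errors of x and y."""
    z = x * y
    rel_x, rel_y = abs(dx / x), abs(dy / y)
    if quadrature:
        rel_z = math.hypot(rel_x, rel_y)  # sqrt(rel_x^2 + rel_y^2)
    else:
        rel_z = rel_x + rel_y             # simplified (pessimistic) rule
    return z, abs(z) * rel_z

z, dz_simple = product_with_uncertainty(4.52, 0.02, 2.0, 0.2)
_, dz_quad = product_with_uncertainty(4.52, 0.02, 2.0, 0.2, quadrature=True)

print(round(z, 2))          # about 9.04 (cm^2)
print(round(dz_simple, 3))  # about 0.944 (cm^2)
print(round(dz_quad, 3))    # about 0.905 (cm^2)
```

Both uncertainties round to 0.9 cm^2 here because the relative error in x (10%) is far larger than that in w (about 0.4%).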

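5.4 Products of Powers z = x^m y^n
Using average errors:

\[ \frac{\Delta z}{z} = |m| \frac{\Delta x}{x} + |n| \frac{\Delta y}{y} + \dots \]

Using standard deviations:

\[ \frac{\Delta z}{z} = \sqrt{\left(\frac{m \Delta x}{x}\right)^2 + \left(\frac{n \Delta y}{y}\right)^2 + \dots} \]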
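
As an illustration of the power rule, the sketch below computes the relative error of a product of powers. The function name and the (value, uncertainty, exponent) input format are assumptions made for this example; the numbers are y = (3.0 ± 0.6) cm with exponent 2, i.e. z = y^2.

```python
import math

def power_product_relative_error(terms, quadrature=False):
    """Relative error of z = x**m * y**n * ...

    terms is a list of (value, uncertainty, exponent) tuples.
    """
    contributions = [abs(m) * abs(dx / x) for x, dx, m in terms]
    if quadrature:
        return math.sqrt(sum(c ** 2 for c in contributions))
    return sum(contributions)

y, dy = 3.0, 0.6                                      # cm
rel_error = power_product_relative_error([(y, dy, 2)])
abs_error = rel_error * y ** 2

print(round(rel_error, 2))  # about 0.4
print(round(abs_error, 2))  # about 3.6 (cm^2)
```

With a single term the two rules give the same answer; they differ only when several measured quantities are combined.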

5.5 Mixtures of multiplication, division, addition, subtraction, and powers.
If z is a function which involves several terms added or subtracted we must apply the above
rules carefully. This is best explained by means of an example.
Example: w = (4.52 ± 0.02) cm, x = (2.0 ± 0.2) cm, y = (3.0 ± 0.6) cm. Find z = wx + y^2.

First we compute v = wx to get v = (9.0 ± 0.9) cm^2.

Next we compute ∆(y^2):

\[ \frac{\Delta(y^2)}{y^2} = \frac{2 \Delta y}{y} = \frac{2 \cdot 0.6\ \text{cm}}{3.0\ \text{cm}} = 0.40 \]

\[ \Delta(y^2) = 0.40 \cdot (9.00\ \text{cm}^2) = 3.6\ \text{cm}^2 \]

Finally we compute

\[ \Delta z = \Delta v + \Delta(y^2) = 0.9 + 3.6 = 4.5\ \text{cm}^2 \Rightarrow 4\ \text{cm}^2 \]

z = (18 ± 4) cm^2

6 Significant Digits
The rules for propagation of errors hold true for cases when we are in the lab, but doing
propagation of errors is time consuming. The rules for significant figures allow a much
quicker method to get results that are approximately correct even when we have no
uncertainty values.

A significant figure is any digit 1 to 9 and any zero which is not a place holder. Thus,
in 1.350 there are 4 significant figures since the zero is not needed to make sense of the
number. In a number like 0.00320 there are 3 significant figures; the first three zeros
are just place holders. However the number 1350 is ambiguous. You cannot tell if
there are 3 significant figures (the 0 is only used to hold the units place) or if there
are 4 significant figures and the zero in the units place was actually measured to be zero.

How do we resolve ambiguities that arise with zeros when we need to use zero as a place
holder as well as a significant figure? Suppose we measure a length to three significant
figures as 8000 cm. Written this way we cannot tell if there are 1, 2, 3, or 4 significant
figures. To make the number of significant figures apparent we use scientific notation,
8 × 10^3 cm (which has one significant figure), or 8.00 × 10^3 cm (which has three
significant figures), or whatever is correct under the circumstances.

We start then with numbers each with their own number of significant figures and compute a
new quantity. How many significant figures should be in the final answer? In doing running
computations we maintain numbers to many figures, but we must report the answer only to
the proper number of significant figures.

In the case of addition and subtraction we can best explain with an example. Suppose one
object is measured to have a mass of 9.9 gm and a second object is measured on a different
balance to have a mass of 0.3163 gm. What is the total mass? We write the numbers with
question marks at places where we lack information. Thus 9.9???? gm and 0.3163? gm.
Adding them with the decimal points lined up we see

     9.9????
  +  0.3163?
  ----------
    10.2????  = 10.2 gm

In the case of multiplication or division we can use the same idea of unknown digits. Thus
the product of 3.413? and 2.3? can be written in long hand as

       3.413?
  ×      2.3?
  -----------
        ?????
  +   10239?0
  +   6826?00
  -----------
     7.8?????  = 7.8

The short rule for multiplication and division is that the answer will contain a number of
significant figures equal to the number of significant figures in the entering number having
the least number of significant figures. In the above example 2.3 had 2 significant figures
while 3.413 had 4, so the answer is given to 2 significant figures. It is important to keep
these concepts in mind as you use calculators with 8 or 10 digit displays if you are to avoid
mistakes in your answers and to avoid the wrath of physics instructors everywhere. A good
procedure is to use all digits (significant or not) throughout calculations, and
only round off the final answers to the appropriate number of "sig figs".

7 Rounding off answers in regular and scientific notation
In the examples we were careful to round the answers to an appropriate number of significant
figures. The uncertainty should be rounded off to one or two significant figures. If the
leading figure in the uncertainty is a 1, we use two significant figures, otherwise we use one
significant figure. Then the answer should be rounded to match.
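
The rounding rule can be automated. The following is a minimal sketch, not a general-purpose formatter: the function name is invented, the leading digit is read from Python's exponent formatting, and the case where rounding itself changes the leading digit (for example 0.96 rounding up to 1) is not treated specially.

```python
import math

def round_result(value, uncertainty):
    """Round (value, uncertainty) using the rule described above.

    Two significant figures are kept in the uncertainty if its leading
    digit is 1, otherwise one; the value is rounded to the same place.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    mantissa, exponent_text = f"{uncertainty:e}".split("e")
    exponent = int(exponent_text)      # power of ten of the leading digit
    leading_digit = int(mantissa[0])
    sig_figs = 2 if leading_digit == 1 else 1
    decimals = (sig_figs - 1) - exponent
    return round(value, decimals), round(uncertainty, decimals)

# Circumference example from section 5.2: r = (3.0 ± 0.2) cm
print(round_result(2 * math.pi * 3.0, 2 * math.pi * 0.2))  # (18.8, 1.3)
# Product example from section 5.3
print(round_result(9.04, 0.944))                           # (9.0, 0.9)
```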
