# 1.5 Measurement Errors

No matter how hard we try, it isn't possible to make a measurement with absolute accuracy. Our tools lack perfect calibration and perfect resolution, and the nature of the universe adds in its own amount of uncertainty. This means that to be complete, a measurement should have an error attached. The theory of observational errors was worked out by the German mathematician Carl Friedrich Gauss in the early 19th century.

At the simplest level, error can be assessed just by considering the accuracy of our tools. If we measure the width of a table with a tape measure, we might get the result 89 mm. If the smallest division on the tape measure is 1 mm, then the probable (or standard) error in our measurement is 0.5 mm. So we would quote the measurement as 89 ± 0.5 mm.

Error can also come from having to calculate values based on known values that already have error (e.g. we don't actually know the temperature of the center of the Sun, but derive it using theoretical calculations that take into account the mass, radius, and surface temperature of the Sun). It also comes from having to deal with variations over time (the wind speed in the evening may average 10 mph, but have an error due to lulls and gusts). The error is always expressed the same way: with a ± symbol and the value (or values, if the error is asymmetric) of the error. Here are some examples:

• 78.65 ± 0.02 K implies a range of 78.67 to 78.63 K

• 0.0022 ± 0.0006 kg implies a range of 0.0028 to 0.0016 kg

• 670 ± 50 Mpc implies a range of 720 to 620 Mpc

• 0.17893 ± 0.00023 J implies a range of 0.17916 to 0.17870 J

• 33 ± 8 ms implies a range of 41 to 25 ms

• 91,000 ± 3500 W implies a range of 94,500 to 87,500 W
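The pattern in these examples can be sketched in Python (the function name is only for illustration):

```python
# Sketch: compute the range implied by a measurement quoted as value ± error.
def implied_range(value, error):
    """Return the (high, low) bounds for a symmetric error bar."""
    return value + error, value - error

high, low = implied_range(670, 50)   # 670 ± 50 Mpc
print(high, low)                     # 720 620
```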

We can often improve on our results by making a number of separate measurements and taking the average. The result might be 88.2 mm. The standard error is given by the scatter in the individual measurements — usually represented by the Greek letter σ (sigma). Let's say it was 0.7 mm. The new and more accurate result is 88.2 ± 0.7 mm. Notice that our repeated measurements have allowed us to quote the width to a higher precision of three significant figures. Scientists always try to increase the reliability of their data by making multiple observations. Often, this is the only way they can reduce the uncertainty in their measurements.
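As a sketch, averaging repeated measurements and computing their scatter can be done with Python's standard library. The widths below are hypothetical values chosen to reproduce the 88.2 ± 0.7 mm result:

```python
import statistics

# Hypothetical repeated measurements of the table width, in mm.
widths = [89.1, 87.5, 88.7, 87.9, 87.8]

mean = statistics.mean(widths)    # best estimate of the width
sigma = statistics.stdev(widths)  # scatter (sample standard deviation)

print(f"{mean:.1f} ± {sigma:.1f} mm")   # 88.2 ± 0.7 mm
```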

In symbols, if a measurement X has a standard error σ, the result should be quoted as X ± &sigma. What does this actually mean? It means that the true value of the quantity we have measured is probably in the range X+&sigma to X-σ. The quantity X can be any measurement — it might be the strength of the Earth's magnetic field, or the temperature of the core of the Sun, or the number of stars in the Milky Way Galaxy. For example, the true value of the width of the table from our previous measurement, 88.2 ± 0.7 mm, is probably in the range 88.2+0.7 to 88.2-0.7, or 88.9 to 87.5 mm.

In Gauss's theory of errors, if the errors have a random cause, then the measurements will have a well-defined distribution around the true value. This is called the Gaussian or normal distribution, or the "bell curve." The bell curve falls off rapidly in either direction. This means that random errors are unlikely to throw a measurement very far off the true value. By convention, the standard error σ is defined as the region that encloses 68%, or about 2/3, of the measurements. In other words, if the measured value of a quantity is X, there is a 68% chance that the true value lies within the range X+σ to X-σ. There is a 95% chance that the true value lies within the range X+2σ to X-2σ and a 99.7% chance that the true value lies within the range X+3σ to X-3σ. To be cautious, scientists can quote a 3σ error range on a measurement, since the true value is very unlikely to lie outside that range.
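These percentages come from the area under the bell curve. A short Python sketch using the error function reproduces them:

```python
import math

# The fraction of a normal distribution lying within ± n standard
# deviations of the mean is erf(n / sqrt(2)).
def fraction_within(n_sigma):
    return math.erf(n_sigma / math.sqrt(2))

for n in (1, 2, 3):
    print(f"±{n}σ encloses {fraction_within(n):.1%} of measurements")
```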

Here are some examples of measurements with their associated errors. The following list has the measurement, its probable error, and the implied range on the true value at three levels of confidence: ±1σ, ±2σ, and ±3σ. In other words, if we make multiple observations, we expect that 68%, 95%, and 99.7% of measurements will lie within these ranges.

• Measurement: 0.67 ± 0.05 kg, 68% within range 0.72 to 0.62 kg, 95% within range 0.77 to 0.57 kg, 99.7% within range 0.82 to 0.52 kg

• Measurement: 157.8 ± 6.2 pc, 68% within range 164.0 to 151.6 pc, 95% within range 170.2 to 145.4 pc, 99.7% within range 176.4 to 139.2 pc

• Measurement: 8800 ± 100 N, 68% within range 8900 to 8700 N, 95% within range 9000 to 8600 N, 99.7% within range 9100 to 8500 N

• Measurement: 0.0045 ± 0.0013 s, 68% within range 0.0058 to 0.0032 s, 95% within range 0.0071 to 0.0019 s, 99.7% within range 0.0084 to 0.0006 s

• Measurement: 59.1 ± 0.4 m, 68% within range 59.5 to 58.7 m, 95% within range 59.9 to 58.3 m, 99.7% within range 60.3 to 57.9 m
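A sketch of how these ranges are generated, using the 8800 ± 100 N example (the function name is illustrative):

```python
# Sketch: the implied ranges at 1σ, 2σ, and 3σ for a measurement value ± sigma.
def confidence_ranges(value, sigma):
    return {n: (value + n * sigma, value - n * sigma) for n in (1, 2, 3)}

print(confidence_ranges(8800, 100))
# {1: (8900, 8700), 2: (9000, 8600), 3: (9100, 8500)}
```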

Astronomers gather light from the distant universe, and they are often working at the limits of detection. We can see what this means in terms of standard error. Suppose that we measure the magnetic field of a distant star to be 90 ± 100 Gauss. You can see that the range of the magnetic field within the standard error reaches zero. In other words, there is a 1/3 chance that the star has no magnetic field at all! We might take more measurements or use a more accurate technique to get a new value of 84 ± 41 Gauss. The new measurement is slightly more than two times the standard error (or 2σ) above zero. There is still a nearly 5% chance that the star has no magnetic field. Scientists often demand a measurement of three or four times the standard error (3σ or 4σ) above zero before they are confident that they have detected something.
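The detection argument above can be sketched as a simple ratio of the measurement to its standard error (the function name is just for illustration):

```python
# Sketch: how many standard errors a measurement lies above zero,
# i.e. the "significance" of a detection.
def significance(value, sigma):
    return value / sigma

print(significance(90, 100))   # 0.9σ above zero: not a believable detection
print(significance(84, 41))    # just over 2σ above zero: still marginal
```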

Scientists can use a similar logic to test a theory. Suppose in the previous example that we have a theory that predicts that a certain type of star will have a magnetic field of 200 Gauss. Our first measurement, 90 ± 100 Gauss, is low, but it is too crude to test the theory. The reason is that the measurement disagrees with the theoretical prediction by only 200-90 = 110 Gauss, or 1.1 times the standard error (1.1σ). So there is nearly a 1/3 chance that the true magnetic field actually agrees with the theory. The second measurement is much more accurate, 84 ± 41 Gauss. Now we disagree with the theoretical prediction by 200-84 = 116 Gauss, or 116/41 ≈ 2.8 times the standard error (2.8σ). In other words, there is a less than 1% chance that the measurement agrees with the theory. This is what scientists mean when they talk about "fitting a model" and "testing a hypothesis." Scientists often use this type of argument to test theories and advance to a better understanding of nature.
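The same comparison against a theoretical prediction can be sketched in the same way:

```python
# Sketch: how far a measurement lies from a theoretical prediction,
# in units of the standard error.
def discrepancy(measured, sigma, predicted):
    return abs(predicted - measured) / sigma

print(discrepancy(90, 100, 200))   # (200-90)/100 = 1.1σ: inconclusive
print(discrepancy(84, 41, 200))    # (200-84)/41 ≈ 2.8σ: theory disfavored
```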

Most discussion of errors assumes that their nature is random. A random error is equally likely to displace a measurement to the high or to the low side of the true value. If errors are random, we can be confident that more measurements will reduce the standard error and allow us to home in on the true value of a quantity. Unfortunately, scientists often encounter a more difficult problem: systematic errors. A systematic error is any kind of error that does not reduce in size when more measurements are made.

To see some examples of systematic error, let us imagine measuring the width of a table with a tape measure. Suppose that the machine that printed the scale on the tape measure had been programmed wrong, so that the divisions were all slightly too big. You would then measure all distances to be slightly smaller than they really are. Or we know that metal expands when it is hot, so if the tape measure had been calibrated when it was still hot, you would measure all distances to be slightly larger than they really are, because the tape measure had shrunk since its manufacture. Or you might systematically misread the scale. None of these problems would be revealed by repeated measurements. The only way scientists can guard against systematic errors is by measuring a quantity using two or more different techniques.

The discussion of measurement error is not a minor aside — the correct treatment of errors or uncertainty is at the heart of the scientific method. Most scientists spend as much time trying to understand the error in a measurement as they do making the measurement in the first place! One final caution: most calculators do their calculations with a large number of significant figures (as many as 16) and can be set to show an arbitrary number of digits following the decimal point. But the numbers you get out of a calculator are only as accurate as the numbers you put in. So if your calculation involves quantities that are only accurate to 2 significant figures, you should only read off your answer to a precision of 2 significant figures.
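As a sketch, rounding a calculator result to a chosen number of significant figures can be done with a short helper (this function is hypothetical, not part of the standard library):

```python
import math

# Hypothetical helper: round x to a given number of significant figures,
# so the quoted answer doesn't claim more precision than the inputs had.
def to_sig_figs(x, figs):
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, figs - 1 - exponent)

print(to_sig_figs(0.178934, 2))   # 0.18
print(to_sig_figs(91234.0, 2))    # 91000.0
```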