I was wondering what the term "rule of thumb" actually means in statistics. Why did they select this name, for example, for sample size calculation? Is it like an approximation based on practice rather than theory?
-
It's just an expression we borrowed from general use, which means the same thing - an approximate solution. – user2974951, Jun 21 at 7:48
-
Here is another reference, and good discussion about the origin of the expression. Of interest here is the fairly recent but unwarranted association of the expression with domestic abuse. – jginestet, Jun 21 at 16:25
-
yikes: en.wikipedia.org/wiki/Rule_of_thumb#19th-century_United_States – Taylor, Jun 21 at 18:34
3 Answers
The term "rule of thumb" predates statistics. The most widely accepted theory of its origin is that people would measure things with their thumbs. Such measurements are not exact, of course, and thus "rule of thumb" came to mean "approximate".
The first known use in writing is from a 17th-century sermon, excoriating builders for using the "rule of thumb" rather than measuring exactly, but the phrase is surely much older. Indeed, the Imperial measures for length (inch, foot, yard, etc.) are based on body parts, and measuring things with parts of the body started long before those units existed.
Its meaning in statistics is "generally reasonably accurate" or "a decent approximation". We use rules of thumb when (a) we don't have an analytic solution, (b) we don't have the tools to use an analytic solution or any exact alternative, or (c) we just want a quick approximation.
van Belle book
Note an entire book: Statistical Rules of Thumb by Gerald van Belle.
https://onlinelibrary.wiley.com/doi/book/10.1002/9780470377963
I guess anybody with experience would rate some of van Belle's rules as bang on and some as misguided or beyond their personal experience; and that would apply too to anybody else's book.
Parker book
I like this book from some years back.
Parker, Tom. 1988. Rules of Thumb. Wellingborough: Equation.
The book was first published in the United States. I have selected some of its rules that seem to apply either to statistically based research or to this community. The numbers are those given to the rules in the book, which contains 1406 in all.
266. Predicting behaviour: The best predictor of future behaviour is past behaviour.
442. Offending people: The people who offend others most easily are often the most easily offended themselves.
157. Measuring things: The first joint of your thumb measures about 1 inch, your foot measures about 1 foot, and your pace measures about 1 yard.
415. The library rule of 20/80: Twenty per cent of a library's patrons account for 80 per cent of the library's use. Twenty per cent of the books in a library account for 80 per cent of the library's use. (Also applies to contributors and funds, products and profits ... cf. 492, 493.)
1298. Looking for engineering correlations: If you are trying to describe a phenomenon rigorously, correlate aggregate variables in such a way that the units cancel out. For example, don't study the effect of changing pipe diameter, which has units of distance. Study changes of pipe diameter divided by pipe length, which has units of distance divided by distance. The result is dimensionless. These correlations are more resilient to changes in materials and scale.
31. Measuring snow: One inch of rain would make ten inches of snow.
61. Determining the age of a spruce tree: You can determine the approximate age of a spruce tree by counting the layers of limbs on its trunk. A tree that has ten layers of limbs is roughly ten years old.
626. Cleaning a park: The number of people and the amount of litter decrease with the cube of the vertical distance and the square of the horizontal distance to the trailhead.
880. Protecting your data: In the computer world, make a copy of anything that's important. If it's really important, make two copies.
1209. Looking over a computer manual: If a manual's table of contents lists names of programs or components instead of tasks, the manual isn't user friendly.
350. Picking a programmer: Never hire a computer programmer who knows only one programming language.
373. Writing computer software: A software writer can be expected to generate about ten lines of debugged, high-order language a day.
1253. Writing a computer program 1: When writing a long computer program, figure out the data storage first, the input and output next, and only then write the parts of the program that actually do the work.
1254. Writing a computer program 2: Write the documentation for a program before you write the program itself. In other words, figure out how you are going to explain the program to the user, then write the program to fit the documentation.
1255. Writing a computer program 3: To write a good program, write and debug the entire program, get it documented and working perfectly, then start over again from scratch based on what you learned the first time through. This process can be repeated as many as four times and still be cost-effective, but you should always do it at least once.
1256. Writing a computer program 4: In most computer programs, 10 per cent of the program accounts for 90 per cent of the processing time. Finding and re-writing this part of the program so that it runs fast is always cost-effective.
1257. Writing a computer program 5: No good computer program can be written by more than ten people. The best programs are written by one or two people.
1338. Illustrating your data: If your data include fewer than twenty pieces of information, a graphic presentation is unnecessary.
Anonymous: I.J. Good?
This paper appeared in a mostly serious, partly facetious collection. The Editor, I.J. Good, was a highly productive and moderately quirky Bayesian, and it's possible that Good himself contributed to this list.
Anon. 1962. Bloggins's working rules. In Good, I.J. (Ed.) The Scientist Speculates: An Anthology of Partly-Baked Ideas. London: Heinemann, 212--213.
[p.212]
Murphy's edict -- if something can go wrong it will.
If a problem has less than three variables it is not a problem. If it has more than eight, you cannot solve it.
Parkinson's Laws state:
(i) Work expands to fill the time available for its completion, especially when it is interesting.
(ii) A man starts to lose his grip five years before retirement age, whatever this may be.
Hartree's Law states that whatever the state of a project, the time a project-leader will estimate for completion is constant. A task always takes twice as long as one might reasonably expect.
All reports require three drafts.
The 20:80 rules: 20 per cent. of the people drink 80 per cent. of the beer. It is prudent to assume the same concentration of effort elsewhere, and Holt's Rule to forecast time series states:
New forecast $=$ 0.2 (Last result) + 0.8 (Last forecast).
When there are unknown scale factors, assume a 0.70 power law.
Numbers in real life usually have a 25 per cent. coefficient of variation and rarely less than 10 per cent. Data usually have at least 1 percent. of gross errors. This applies to people too.
[p.213]
The best experts resist innovation, for they wish to remain experts, and they are right only three-quarters of the time.
The variance of cumulative chance events is practically infinite.
Edie's Limit. Pooled Services may be more effective in theory, but they are soon degraded by difficulties of switching and scanning. Edie found 4--6 channels the most efficient group size for toll booths. The same number must often apply elsewhere.
Anyone more than two years younger than oneself is inexperienced. Anyone more than five years older is past his best.
Any useful classification has 3--6 sub-categories, but a thirty-fold division provides a fine monument to hard work.
Really top brass takes one year to make up its mind in matters in which you are interested.
Do not ask questions on which people have no real opinions, or which they will not answer truthfully. Socratic dialogue is more potent than any arithmetic.
There are never less than three conflicting criteria of merit. At best, operational research is nearly right.
Life is
(i) Discrete.
(ii) Non-linear.
(iii) Non-zero sum.
(iv) Non-commutative, and positively irreversible.
(v) Multiplicative rather than additive; the log normal distribution is more normal than the normal.
One nearly always has prior knowledge, but optimisation? It is a delusion. Probabilities are always conditional -- very conditional.
There are no decision rules to choose decision rules.
The only practical problem is what to do next.
The art of being correct lies in making the weakest possible statements.
[ends]
Notes by NJC:
Edie, Leslie C. 1954. Traffic delays at toll booths. Journal of the Operations Research Society of America 2(2): 107--138.
Leslie C. Edie (1914--1990) was a pioneer in operations research and transportation science.
Douglas Rayner Hartree (1897--1958) was a mathematician and physicist who contributed to numerical analysis and its application to atomic physics and to the development of computing machinery.
Charles C. Holt (1921--2010) is known for modelling and forecasting methods using exponential smoothing.
Cyril Northcote Parkinson (1909--1993) was a naval historian and author of many books, most famously advancing Parkinson's Law.
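The forecasting rule that the list attributes to Holt is simple exponential smoothing with a weight of 0.2 on the latest result. Here is a minimal Python sketch of it. The function name `smooth_forecasts`, the choice to start from the first result when no earlier forecast exists, and the numbers are illustrative assumptions and do not come from the 1962 list.

```python
def smooth_forecasts(results, weight=0.2, initial=None):
    """One-step-ahead forecasts from the rule
    new forecast = weight * (last result) + (1 - weight) * (last forecast).
    """
    if not results:
        return []
    # Assumption: with no earlier forecast available, start from the first result.
    forecast = results[0] if initial is None else initial
    forecasts = []
    for result in results:
        forecast = weight * result + (1 - weight) * forecast
        forecasts.append(forecast)
    return forecasts


# Hypothetical series, purely for illustration.
demo = smooth_forecasts([100, 110, 90, 120])
print([round(f, 2) for f in demo])  # [100.0, 102.0, 99.6, 103.68]
```

A weight of 0.2 makes each forecast move only a fifth of the way towards the latest result, which is why the jump to 120 shifts the forecast by only about 4.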
-
"A software writer can be expected to generate about ten lines of debugged, high-order language a day." stands out as a sore thumb - pun intended! :D (Commented Jun 22 at 10:57)
A rule of thumb is a heuristic:
(of an approach to problem solving, learning, or discovery) That employs a practical method not guaranteed to be optimal or perfect; either not following or derived from any theory, or based on an advisedly oversimplified one.
Examples include:
Have at least 20 observations per regression degree of freedom to avoid overfitting
You can use a Gaussian approximation to the distribution of the sample mean when the sample size is 30 or more
The common theme is that rules of thumb are based on "the kind of data people commonly want to analyze". The danger of relying on them is that they may well have been formulated with regard to areas of application differing wildly from yours.
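One way to see this danger in the "sample size of 30" rule: for independent, identically distributed observations with finite third moment and skewness g, the sample mean has skewness g / sqrt(n). How Gaussian the mean looks at n = 30 therefore depends on the data. A small Python sketch follows; the function name and the choice of distributions are illustrative assumptions, and the skewness values are the standard population results.

```python
import math


def skewness_of_mean(skewness, n):
    """Skewness of the mean of n iid observations whose skewness is given."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return skewness / math.sqrt(n)


# Population skewness of a few distributions (standard results).
population_skewness = {
    "uniform": 0.0,
    "exponential": 2.0,
    "lognormal, sigma = 1": (math.e + 2) * math.sqrt(math.e - 1),
}

for name, g in population_skewness.items():
    print(f"{name}: skewness of the mean at n = 30 is {skewness_of_mean(g, 30):.2f}")
```

At n = 30 the mean of exponential data still has skewness of about 0.37, and the mean of lognormal data with sigma = 1 has about 1.13. The same threshold that is generous for symmetric data can be far too small for heavily skewed data.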
I wouldn't call an approximation itself a rule of thumb. The Rule of Three, say, is an approximation, no more; a rule of thumb would condone its use in place of the exact binomial when 12 successes or more have been observed. (Perhaps that's mere pedanticism.)
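To make the distinction concrete, here is a short Python sketch comparing the Rule of Three with the exact binomial bound. It assumes the usual statement of the rule: when n independent trials all give the same outcome, 3/n is an approximate 95% upper bound on the probability of the other outcome. The function names are illustrative.

```python
def rule_of_three_upper(n):
    """Approximate 95% upper bound on p after n trials with no events."""
    return 3.0 / n


def exact_upper(n, alpha=0.05):
    """Exact one-sided upper bound: the p that solves (1 - p)**n = alpha."""
    return 1.0 - alpha ** (1.0 / n)


for n in (10, 30, 100, 1000):
    approx = rule_of_three_upper(n)
    exact = exact_upper(n)
    print(f"n = {n:4d}: rule of three {approx:.4f}, exact {exact:.4f}, "
          f"ratio {approx / exact:.3f}")
```

The approximation is always slightly conservative and the ratio approaches 1 as n grows. Deciding at which n it is "good enough" to replace the exact calculation is the part that would count as a rule of thumb.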