
Journal of Forensic Accounting

1524-5586/Vol.V(2004), pp. 17-34


2004 R.T. Edwards, Inc.
Printed in U.S.A.

The Effective Use of Benford's Law to Assist in Detecting Fraud in Accounting Data

Cindy Durtschi1, William Hillison2 and Carl Pacini3


1 Utah State University, Logan, UT USA
2 Florida State University, Tallahassee, FL USA
3 Florida Gulf Coast University, Ft. Myers, FL USA

Benford's law has been promoted as providing the auditor with a tool that is simple and effective
for the detection of fraud. The purpose of this paper is to assist auditors in the most effective
use of digital analysis based on Benford's law. The law is based on a peculiar observation
that certain digits appear more frequently than others in data sets. For example, in certain data
sets, it has been observed that more than 30% of numbers begin with the digit one. After
discussing the background of the law and development of its use in auditing, we show where
digital analysis based on Benford's law can most effectively be used and where auditors should
exercise caution. Specifically, we identify data sets which can be expected to follow Benford's
distribution, discuss the power of statistical tests, types of frauds that would be detected and
not be detected by such analysis, the potential problems that arise when an account contains
too few observations, as well as issues related to base rate of fraud. An actual example is
provided demonstrating where Benford's law proved successful in identifying fraud in a
population of accounting data.

INTRODUCTION

In the past half-century, more than 150 articles have been published about Benford's law, a
quirky law based on the number of times a particular digit occurs in a particular position in
numbers (Nigrini 1999). In the past 10 years a subset of these articles has promoted the use
of this law (the study of digits or digital analysis) as a simple, effective way for auditors
not only to identify operational discrepancies, but also to uncover fraud in accounting numbers.
Recent audit failures and the issuance of Statement on Auditing Standards No. 99,
Consideration of Fraud in a Financial Statement Audit (AICPA 2002), have set the profession
in search of analytical tools and audit methods to detect fraud. Specifically, SAS No.
99 (paragraph 28) reiterates SAS No. 56 in requiring auditors to employ analytical procedures
during the planning phase of the audit with the objective of identifying the existence of
unusual transactions, events and trends. At the same time, SAS No. 99 cautions that relying
on analytical procedures, which are traditionally done on highly aggregated data, can
provide only broad indications of fraud. The purpose of this paper is to assist auditors in the
most effective use of digital analysis based on Benford's law. When used properly, digital
analysis conducted on transaction-level data, rather than aggregated data, can assist auditors
by identifying specific accounts in which fraud might reside so that they can then analyze
the data in more depth.

Specifically, we provide guidance to auditors to help them distinguish between those
circumstances in which digital analysis might be useful in detecting fraud and those
circumstances where digital analysis cannot detect fraud. Further, we provide guidance on how to
interpret the results of such tests so auditors can better assess the amount of reliance they
should place on digital analysis as a way to detect fraud. Coderre (2000) states, "Auditors
should use discretion when applying this method as it is not meant for all data analysis
situations." Our paper attempts to provide auditors with the facts needed to exercise that
discretion.

We begin by providing an overview of Benford's law, and how it applies to accounting and
in particular fraud detection. We next detail the types of accounting numbers that might be
expected to conform to a Benford distribution and which accounting numbers may not, thus
highlighting situations where digital analysis is most useful. We next discuss which digital
analysis tests should be conducted and how the results are interpreted. The following section
discusses the limitations of digital analysis for detecting certain types of fraud. We
provide some insight into the overall effectiveness of digital analysis conditioned on the
underlying fraud rate. We include an example of digital analysis using data from an actual entity
and show the results of a normal account versus an account that was later determined to
contain fraud. In summary, we conclude that if used appropriately, digital analysis can increase
an auditor's ability to detect fraud.

ORIGIN OF BENFORD'S LAW

In 1881, Simon Newcomb, an astronomer and mathematician, published the first known
article describing what has become known as Benford's law in the American Journal of
Mathematics. He observed that library copies of books of logarithms were considerably
more worn in the beginning pages, which dealt with low digits, and progressively less worn
on the pages dealing with higher digits. He inferred from this pattern that fellow scientists
used those tables to look up numbers which started with the numeral one more often than
those starting with two, three, and so on. The obvious conclusion was that more numbers
exist which begin with the numeral one than with larger numerals. Newcomb calculated that
the probability that a number has any particular non-zero first digit is:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right) \tag{1}$$

Where: d is a digit 1, 2, ..., 9; and
P is the probability.
Using his formula, the probability that the first digit of a number is one is about 30 percent,
while the probability that the first digit is a nine is only 4.6 percent. Table 1 shows the expected
frequencies for all digits 0 through 9 in each of the first four places in any number.
Table 1
Expected Frequencies Based on Benford's Law

Digit   1st place   2nd place   3rd place   4th place
0       -           .11968      .10178      .10018
1       .30103      .11389      .10138      .10014
2       .17609      .10882      .10097      .10010
3       .12494      .10433      .10057      .10006
4       .09691      .10031      .10018      .10002
5       .07918      .09668      .09979      .09998
6       .06695      .09337      .09940      .09994
7       .05799      .09035      .09902      .09990
8       .05115      .08757      .09864      .09986
9       .04576      .08500      .09827      .09982

Source: Nigrini, 1996.
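Formulas for expected digital frequencies (here d_1 d_2 denotes the two-digit number formed by
the first two digits, that is 10 d_1 + d_2, not a product):

For first digit of the number:
$$P(D_1 = d_1) = \log_{10}\left(1 + \frac{1}{d_1}\right); \quad d_1 \in \{1, 2, \ldots, 9\}$$

For second digit of the number:
$$P(D_2 = d_2) = \sum_{d_1 = 1}^{9} \log_{10}\left(1 + \frac{1}{d_1 d_2}\right); \quad d_2 \in \{0, 1, \ldots, 9\}$$

For two-digit combinations:
$$P(D_1 D_2 = d_1 d_2) = \log_{10}\left(1 + \frac{1}{d_1 d_2}\right)$$
$$P(D_2 = d_2 \mid D_1 = d_1) = \frac{\log_{10}\left(1 + 1/(d_1 d_2)\right)}{\log_{10}\left(1 + 1/d_1\right)}$$

Where
D_1 represents the first digit of a number,
D_2 represents the second digit of a number, etc.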
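
The first- and second-digit formulas can be evaluated directly. The following is a minimal Python sketch, not part of the original paper; the function names are illustrative, and the two-digit number d1d2 is computed as 10 * d1 + d2.

```python
import math


def first_digit_prob(d1: int) -> float:
    """Expected proportion of numbers whose first digit is d1 (1-9)."""
    return math.log10(1 + 1 / d1)


def second_digit_prob(d2: int) -> float:
    """Expected proportion of numbers whose second digit is d2 (0-9).

    Sums over every possible first digit; 10 * d1 + d2 is the two-digit
    number written d1d2 in the formulas.
    """
    return sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))


# Expected frequencies rounded to five places, as in a Benford table.
first = {d: round(first_digit_prob(d), 5) for d in range(1, 10)}
second = {d: round(second_digit_prob(d), 5) for d in range(0, 10)}

for d in range(0, 10):
    print(d, first.get(d, "-"), second[d])
```

Each column sums to one, which gives a quick check on any tabulated value.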

Newcomb provided no theoretical explanation for the phenomenon he described and his article
went virtually unnoticed. Then, almost 50 years later, apparently independently of
Newcomb's original article, Frank Benford, a physicist, also noticed that the first few pages
of his logarithm books were more worn than the last few. He came to the same conclusion
Newcomb had arrived at years prior: that people more often looked up numbers that began
with low digits rather than high ones. He also posited that there were more numbers that
began with the lower digits. Unlike Newcomb, however, he attempted to test his hypothesis by
collecting and analyzing data. Benford collected more than 20,000 observations from such diverse
data sets as areas of rivers, atomic weights of elements, and numbers appearing in Reader's
Digest articles (Benford 1938). Diaconis and Freedman (1979) offer convincing evidence
that Benford manipulated round-off errors to obtain a better fit to a logarithmic law, but even
the non-manipulated data are a remarkably good fit (Hill 1995). Benford found that numbers
consistently fell into a pattern with low digits occurring more frequently in the first
position than larger digits. The mathematical tenet defining the frequency of digits became
known as Benford's law.

For almost 90 years mathematicians and statisticians offered various explanations for this
phenomenon. Raimi's (1976) article describes some of the less rigorous explanations, which
range from Goudsmit and Furry's (1944) thesis that the phenomenon is the result of the
way we write numbers, to Furlan's (1948) theory that Benford's law reflects a profound
harmonic truth of nature. It wasn't until 1995 that Hill, a mathematician, provided a proof
for Benford's law and demonstrated how it applied to stock market data, census
statistics, and certain accounting data (Hill 1995). He noted that Benford's distribution, like
the normal distribution, is an empirically observable phenomenon. Hill's proof relies on the
fact that the numbers in sets that conform to the Benford distribution are second-generation
distributions, that is, combinations of other distributions. If distributions are selected at
random and random samples are taken from each of these distributions, then the
significant-digit frequencies of the combined samplings will converge to Benford's distribution, even
though the individual distributions may not closely follow the law (Hill 1995; Hill 1998).
The key is in the combining of numbers from different sources. In other words, combining
unrelated numbers gives a distribution of distributions, a law of true randomness that is
universal (Hesman 1999).

Boyle (1994) shows that data sets follow Benford's law when the elements result from random
variables taken from divergent sources that have been multiplied, divided, or raised to
integer powers. This helps explain why certain sets of accounting numbers often appear to
closely follow a Benford distribution. Accounting numbers are often the result of a
mathematical process. A simple example might be an account receivable, which is a number of
items sold (which comes from one distribution) multiplied by the price per item (coming
from another distribution). Another example would be the cost of goods sold, which is a
mathematical combination of several numbers, each of which comes from a different source.
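
The effect of multiplying numbers from different sources can be illustrated with a small simulation. The sketch below is an illustration only, not the procedure of Boyle (1994) or Hill (1995): it assumes each observation is the product of ten independent uniform random factors (the factor count, sample size, and seed are arbitrary choices) and compares the observed first-digit proportions with those of equation (1).

```python
import math
import random
from collections import Counter


def first_digit(x: float) -> int:
    """Leading non-zero digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])


def simulate_products(n_obs: int = 100_000, n_factors: int = 10, seed: int = 0) -> dict:
    """First-digit proportions for products of independent uniform factors."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_obs):
        value = 1.0
        for _ in range(n_factors):
            value *= 1.0 - rng.random()  # uniform on (0, 1], so never zero
        counts[first_digit(value)] += 1
    return {d: counts[d] / n_obs for d in range(1, 10)}


observed = simulate_products()
for d in range(1, 10):
    # digit, simulated proportion, Benford expectation
    print(d, round(observed[d], 4), round(math.log10(1 + 1 / d), 4))
```

A single uniform factor on its own does not follow the law; it is the repeated multiplication that pulls the first digits toward the logarithmic pattern.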

Although the mathematical proof is beyond the needs of our discussion, intuitively the law
is not difficult to understand. Consider the market value of a firm. If it is $1,000,000, it will
have to double in size before the first digit is a 2; in other words, it needs to grow 100
percent.1 For the first digit to be a 3, it only needs to grow 50 percent. To be a 4 the firm
must only grow 33 percent and so on. Therefore, in many distributions of financial data,
which measure the size of anything from a purchase order to stock market returns, the first
digit one is much further from two than eight is from nine. Thus, the observed finding is
that for these distributions, smaller values of the first significant digits are much more
likely than larger values.2
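
The growth argument can be tied back to equation (1) with a short worked step. This aside rests on an assumption the paper does not state: that the value grows at a constant percentage rate. Under that assumption, the share of time spent with first digit d is the time needed to grow from d to d + 1 divided by the time needed to grow from 1 to 10.

```latex
\[
\frac{\ln\bigl((d+1)/d\bigr)}{\ln 10}
  \;=\; \log_{10}\!\left(1 + \frac{1}{d}\right),
\qquad
\text{required growth from } d \text{ to } d+1 \;=\; \frac{1}{d}.
\]
% d = 1: growth of 100 percent, share log10(2)   ~ 0.301
% d = 2: growth of  50 percent, share log10(1.5) ~ 0.176
% d = 3: growth of  33 percent, share log10(4/3) ~ 0.125
```

The larger the growth needed to leave a first digit, the longer a steadily growing value keeps that digit, which is why one is the most common first digit.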

BENFORD'S LAW APPLIED TO AUDITING AND ACCOUNTING

Auditors have long applied various forms of digital analysis when performing analytical
procedures. For example, auditors often analyze payment amounts to test for duplicate
payments. They also search for missing check or invoice numbers. Benford's law as applied to
auditing is simply a more complex form of digital analysis.3 It looks at an entire account to
determine if the numbers fall into the expected distribution.
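
Looking at an entire account in this way amounts to tabulating first digits and setting them beside the expected proportions. The following is a minimal sketch under stated assumptions: the helper names and the sample amounts are hypothetical, zero amounts are skipped because they have no leading digit, and negative amounts are treated by their absolute value.

```python
import math
from collections import Counter


def first_digit(amount: float) -> int:
    """Leading non-zero digit of a non-zero amount (sign ignored)."""
    return int(f"{abs(amount):.15e}"[0])


def first_digit_table(amounts):
    """Rows of (digit, observed proportion, Benford expected proportion)."""
    usable = [a for a in amounts if a != 0]  # zero has no leading digit
    counts = Counter(first_digit(a) for a in usable)
    n = len(usable)
    return [(d, counts[d] / n, math.log10(1 + 1 / d)) for d in range(1, 10)]


# Hypothetical amounts, far too few for a real test; shown only for the mechanics.
sample = [123.45, 0.0178, 1999, 250, -31.2, 0]
for digit, observed, expected in first_digit_table(sample):
    print(digit, round(observed, 3), round(expected, 3))
```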

Although Varian (1972), an economist, suggests that Benford's law can be used as a test of
the honesty or validity of purportedly random scientific data in a social science context, it
wasn't picked up by accountants until the late 1980s. At that time, two studies relied on
digital analysis to detect earnings manipulation. Carslaw (1988) found that earnings numbers
from New Zealand firms did not conform to the expected distribution. Rather, the numbers
contained more zeros in the second digit position than expected and fewer nines, thus
implying that when a firm had earnings such as $1,900,000, it rounded up to $2,000,000.
Although Carslaw used the Benford distribution as his expectation, he referred to it as
Feller's Proof. Thomas (1989) discovered a similar pattern in the earnings of U.S. firms.

1 The growth rate from $1,000,000 to $2,000,000 is determined as ($2,000,000-$1,000,000)/$1,000,000 = 100
percent.
2 A similar description, in British pounds, is found in Carr (2000).
3 An excellent operational description of the application of Benford's law for auditing is found in Drake and
Nigrini (2000).

Nigrini appears to be the first researcher to apply Benford's law extensively to accounting
numbers with the goal of detecting fraud. According to an article published in Canadian
Business (1995), Nigrini first became interested in the work on earnings manipulation by
Carslaw and Thomas, then separately came across Benford's work and wed the two ideas
together for his dissertation. His dissertation used digital analysis to help identify tax
evaders (Nigrini 1996). More recently, papers have been published which detail practical
applications of digital analysis, such as descriptions of how an auditor performs tests on sets
of accounting numbers, how an auditor uses digital analysis computer programs, and case
studies for training students (Nigrini and Mittermaier 1997).4

The academic literature is somewhat cautious in making claims about the effectiveness of
procedures based on Benford's law to detect fraud. In particular, such work cautions that a
data set which, when tested, does not conform to Benford's law, may show only operating
inefficiencies or flaws in systems rather than fraud (Etteridge and Srivastava 1999). Our
paper expands on those studies to discuss why certain data sets are appropriate for digital
analysis and others are not. We explain why some types of fraud cannot be identified by
digital analysis. We show how tests of the results of digital analysis can be interpreted as
well as why care must be taken in the interpretation. All this will inform auditors'
discretion as they apply digital analysis to a particular work environment.

WHEN TO USE DIGITAL ANALYSIS

When an auditor chooses to use digital analysis in an attempt to detect fraud, several issues
should be considered. First, on which types of accounts might Benford analysis be expected
to be effective? While most accounting-related data sets conform to a Benford distribution,
there are some exceptions. And since digital analysis is only effective when applied to
conforming sets, auditors must consider whether a particular data set should be expected to
fall into a Benford distribution prior to conducting digital analysis. Second, what tests
should be run and how should the results of those tests be interpreted? Since there are high
costs associated with false positives (identifying a fraud condition when none is present) as
well as false negatives (failing to identify a fraud condition when one exists), one must
consider the level of significance, or threshold beyond which accounts are deemed contaminated
and selected for further investigation. Third, when is digital analysis ineffective? In
other words, are there categories of fraud that cannot be signaled using digital analysis?
Finally, how much assistance can auditors expect to receive from Benford's law in their
ability to identify suspect accounts for further investigation? Each of these issues is addressed
in the subsequent sections.

4 York (2000) provides directions for conducting a Benford analysis in Excel. Nigrini (1999) demonstrates how
a routine in Audit Command Language (ACL) can be used to perform digital analysis. Lanza (2000) describes
DATAS, a statistical analysis tool, that performs Benford digital analysis. Etteridge and Srivastava (1999)
provide three assignments and a completed Excel program for students to use to understand digital analysis.

Choosing Appropriate Data Sets for Analysis

Most accounting-related data can be expected to conform to a Benford distribution, and thus
will be appropriate candidates for digital analysis (Hill 1995). Such is the case because
typical accounts consist of transactions that result from combining numbers. For example,
accounts receivable is the number of items purchased multiplied by the price per item.
Similarly, accounts payable and most revenue and expense accounts are expected to
conform. Account size, meaning the number of entries or transactions, also matters. In
general, results from Benford analysis are more reliable if the entire account is analyzed rather
than sampling the account. This is because the larger the number of transactions or items in
the data set, the more accurate the analysis.

Benford analysis will reveal various underlying peculiarities in an account. Therefore, not
all accounts labeled as non-conforming will be fraudulent. For example, we ran digital
analysis on various accounts of a large medical center. In this analysis, the laboratory
expenses were flagged as not conforming to a Benford distribution when there was no reason
a priori to believe they would not. Further investigation revealed that certain authorized,
but repetitive transactions caused the account to fail the statistical tests. Specifically, there
were numerous purchases for $11.40, which turned out to be the cost of liquid nitrogen
ordered by dermatologists, as well as numerous purchases for $34.95, which were costs for
cases of bottled water. Once these entries were removed, the data set conformed to the
expected distribution.
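
Repetitive amounts of this kind are easy to surface before or after a digit test. The sketch below is one possible way to do so, not the authors' procedure; the function name and the ledger contents are hypothetical, with only the $11.40 and $34.95 amounts taken from the example above.

```python
from collections import Counter


def most_repeated_amounts(amounts, top=5):
    """Return the `top` most frequent amounts as (amount, count, share of entries)."""
    # Count in whole cents so floating-point noise cannot split identical amounts.
    cents = Counter(round(a * 100) for a in amounts)
    n = len(amounts)
    return [(c / 100, k, k / n) for c, k in cents.most_common(top)]


# Hypothetical ledger: a few varied purchases plus many repeats of two amounts.
ledger = [11.40] * 40 + [34.95] * 25 + [128.17, 9.99, 2450.00, 73.25, 310.60]

for amount, count, share in most_repeated_amounts(ledger, top=3):
    print(f"${amount:,.2f}  x{count}  ({share:.0%} of entries)")
```

An auditor could then decide whether the recurring amounts are authorized and, if so, rerun the digit test with them set aside.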

Some populations of accounting-related data do not conform to a Benford distribution. For
example, assigned numbers, such as check numbers, purchase order numbers, or numbers
that are influenced by human thought, such as product or service prices, or ATM
withdrawals, do not follow Benford's law (Nigrini and Mittermaier 1997). Assigned numbers
should follow a uniform distribution rather than a Benford distribution. Prices are often set
to fall below psychological barriers; for example, $1.99 has been shown to be perceived as
much lower than $2.00, thus prices tend to cluster below psychological barriers. ATM
withdrawals are often in pre-assigned, even amounts. Other accounts which might not follow a
Benford distribution will be firm specific. For example, in the medical center the patient
refund account did not conform because most refunds involved co-payments which were
often pre-assigned amounts and applied to large numbers of patients.5 Other examples of
accounts which would not be expected to conform to a Benford distribution would be those
that have a built-in maximum or minimum value. For example, a list of assets that must
achieve a certain materiality level before being recorded would have a built-in minimum and
therefore would not likely conform.

In addition to an auditor's judgment in determining which populations fit a Benford
distribution, there exist some tests that reveal whether or not Benford's law applies to a
particular data set. Wallace (2002) suggests that if the mean of a particular set of numbers is
larger than the median and the skewness value is positive, the data set likely follows a Benford
distribution. It follows that the larger the ratio of the mean divided by the median, the more
closely the set will follow Benford's law. This is true since observations from a Benford
distribution have a predominance of small values. The difficulty in relying only on such tests
as a screening process, before applying digital analysis, is that if an account contains
sufficient bogus observations it could fail the tests; thus, digital analysis would not be applied
when, in fact, it should. Table 2 summarizes when it is appropriate to use Benford analysis,
and when to use caution.
Table 2
When Benford Analysis Is or Is Not Likely Useful

When Benford Analysis Is Likely Useful | Examples
Sets of numbers that result from mathematical combination of numbers - Result comes from two distributions | Accounts receivable (number sold * price), Accounts payable (number bought * price)
Transaction-level data - No need to sample | Disbursements, sales, expenses
On large data sets - The more observations, the better | Full year's transactions
Accounts that appear to conform - When the mean of a set of numbers is greater than the median and the skewness is positive | Most sets of accounting numbers

When Benford Analysis Is Not Likely Useful | Examples
Data set is comprised of assigned numbers | Check numbers, invoice numbers, zip codes
Numbers that are influenced by human thought | Prices set at psychological thresholds ($1.99), ATM withdrawals
Accounts with a large number of firm-specific numbers | An account specifically set up to record $100 refunds
Accounts with a built-in minimum or maximum | Set of assets that must meet a threshold to be recorded
Where no transaction is recorded | Thefts, kickbacks, contract rigging
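
The mean, median, and skewness screen attributed to Wallace (2002) can be computed with the standard library. This is a rough sketch under assumptions: the paper does not say which skewness estimator is intended, so the population (moment) skewness is used here, and the function name and both sample data sets are hypothetical.

```python
import statistics


def wallace_screen(values):
    """Mean, median, mean/median ratio, skewness, and whether the screen is passed."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)
    # Population (moment) skewness; an assumption, since other estimators exist.
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3)
    return {
        "mean": mean,
        "median": median,
        "ratio": mean / median,
        "skewness": skew,
        "likely_benford": mean > median and skew > 0,
    }


# Hypothetical right-skewed invoice amounts versus evenly spread assigned numbers.
skewed = [12.5, 18.0, 23.4, 31.0, 47.9, 66.2, 120.0, 310.5, 980.0, 4200.0]
flat = list(range(100, 200))

print(wallace_screen(skewed))
print(wallace_screen(flat))
```

The sketch assumes a non-zero median and at least some variation in the data; a constant or zero-median account would need separate handling.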

5 An analysis of patient refunds from a large medical center showed a higher number of entries which began with
the digit 2 and a lower than expected number of entries which began with the digits 6-9.

Interpreting Results of the Statistical Tests

Two underlying concepts should be considered when deciding how effective digital analysis
based on Benford's law might be. First, the effectiveness of digital analysis declines as
the level of contaminated entries drops, and not all accounts which contain fraud contain a
large number of fraudulent transactions. Second, in many instances accounts identified as
non-conforming do not contain fraud. An example of this was given previously in the case
of the operating supplies for the medical center. These facts are particularly important when
considering the usefulness of statistical tests.

As in any statistical test, digital analysis compares the actual number of items observed to
the expected and calculates the deviation. For example, in a Benford distribution, the
expected proportion of numbers which contain the digit one in the first position is 30.103
percent. The actual proportion observed will most likely deviate from this expected amount
due to random variation. While no data set can be expected to conform precisely, at what
point is the deviation considered large enough to be a significant indication of fraud?

The expected distribution of digit frequency, based on Benford's law, is a logarithmic
distribution that appears visually like a Chi-square distribution. Such a distribution deviates
significantly from a normal or uniform distribution. The standard deviation for each digit's
expected proportion is:

$$s_i = \left[\frac{p_i (1 - p_i)}{n}\right]^{1/2} \tag{2}$$

Where: s_i is the standard deviation of each digit, 1 through 9;
p_i is the expected proportion of a particular digit based on Benford's law; and
n is the number of observations in the data set (particular account).
A z-statistic can be used to determine whether a particular digit's proportion from a set of
data is suspect. In other words, does a digit appear more or less frequently in a particular
position than a Benford distribution would predict? The z-statistic is calculated as follows
(Nigrini 1996):

$$z = \frac{|p_o - p_e| - \frac{1}{2n}}{s_i} \tag{3}$$

Where: p_o is the observed proportion in the data set;
p_e is the expected proportion based on Benford's law;
s_i is the standard deviation for a particular digit; and
n is the number of observations (the term 1/(2n) is a continuity correction factor and is
used only when it is smaller than the absolute value term).
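
Equations (2) and (3) translate into a few lines of code. The following is a minimal sketch for a single first digit; the function name and the 1,000-entry account with 340 entries beginning with the digit 1 are hypothetical.

```python
import math


def benford_z(observed_count: int, n: int, digit: int) -> float:
    """z-statistic for one first digit, following equations (2) and (3)."""
    p_e = math.log10(1 + 1 / digit)         # expected proportion under Benford's law
    p_o = observed_count / n                # observed proportion in the account
    s = math.sqrt(p_e * (1 - p_e) / n)      # equation (2)
    gap = abs(p_o - p_e)
    correction = 1 / (2 * n)                # continuity correction factor
    if correction < gap:                    # applied only when smaller than |p_o - p_e|
        gap -= correction
    return gap / s                          # equation (3)


# Hypothetical account: 1,000 entries, 340 of which begin with the digit 1.
z = benford_z(observed_count=340, n=1000, digit=1)
print(round(z, 2))
```

In this hypothetical case the statistic exceeds 1.96, so the digit would be flagged for a closer look at the conventional 5 percent level.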

A z-statistic of 1.96 would indicate a p-value of .05 (95 percent confidence) while a
z-statistic of 1.64 would suggest a p-value of .10 (90 percent confidence). For the proportion of
observations to be significantly different from that expected, the deviation must be in the tail
of the distribution. Thus arise two concerns, one intuitive and one statistical. First,
intuitively, if there are only a few fraudulent transactions, a significant difference will not be
triggered even if the total dollar amount is large. Second, statistically, if the account being
tested has a large number of transactions, it will take a smaller proportion of inconsistent
numbers to trigger a significant difference from expected than it would take if the account
had fewer observations. This is why many prepackaged programs which include a
Benford's law-based analytical test urge auditors to test the entire account rather than
taking a sample from the account.

To understand the second point, consider two accounts: one contains 10,000 transactions
while the second contains only 1,000 transactions. If all transactions within the
10,000-transaction account are used, a minimum difference of 75 transactions is required before a
z-statistic would signal that the account is deviant. This translates into a proportion of .75
percent of the total account. By contrast, in the 1,000-entry account, there would need to be
23 fraudulent entries (or a proportion of 2.3 percent deviant entries) before the same
z-statistic flagged it as possibly fraudulent. If a 200-entry sample was drawn, it would require six
deviant entries or 3 percent before it would be seen as statistically different from expected.6
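
The thresholds for the two full accounts can be approximated from the formula in footnote 6, (N - np) / (npq)^(1/2) = z. The sketch below assumes that p is the expected proportion of the first digit one (about .301), which the footnote does not state explicitly; under that assumption the formula gives roughly 75.2 excess entries for the 10,000-entry account and roughly 23.8 for the 1,000-entry account, in line with the 75 and 23 quoted above.

```python
import math


def min_excess_entries(n: int, p: float, z: float) -> float:
    """Solve (N - n*p) / sqrt(n*p*q) = z for the excess count N - n*p."""
    return z * math.sqrt(n * p * (1 - p))


p1 = math.log10(2)  # assumed: expected proportion of first digit 1

for n in (10_000, 1_000):
    excess = min_excess_entries(n, p1, z=1.64)  # 90 percent confidence
    print(n, round(excess, 1), f"{excess / n:.2%}")
```

The required excess grows with the square root of the account size, so the required proportion of deviant entries shrinks as the account gets larger.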

Such a result occurs because the size of the variance of the sample proportion is dependent
on the number of observations in the data set being tested. If lower confidence levels are
used to detect the presence of fraud, more false positives will be signaled with an
accompanying higher cost of investigation. For example, in an account which contains 10,000
entries, if the confidence level is set to 80 percent (a z-statistic of 1.28) only 58 deviant
transactions would be needed to signal the possibility of fraud. In general, there is a tradeoff. The
more discriminatory the test, the less likely that fraud will be detected when it is present,
and the less discriminatory the test, the more likely the test will return false positives,
indicating fraud when none is present.

6 We solved for N (the number of deviant entries required) using the following formula:
(N - np) / (npq)^(1/2) = z-statistic, where n = sample size, p = expected proportion of the first
digit, and q = (1 - p). The z-statistic used was 1.64 for a 90 percent confidence level, which put
5 percent in each tail of the distribution.

An extension of the z-test, which tests only one digit at a time, is a chi-square test. The
chi-square test combines the results of testing each digit's expected frequency with actual
frequency into one test statistic that indicates the probability of finding the result. If the
chi-square test rejects the hypothesis that the probability of all digits conform to a Benford
distribution, then one knows the entire account warrants further examination. In general, the
chi-square test will be less discriminatory than the individual z-test results but will result in
fewer false positives.
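
A first-digit chi-square test can be sketched as follows. This is an illustration rather than the routine of any particular audit package: the function name and the digit counts are hypothetical, and the statistic is compared with 15.507, the upper 5 percent point of a chi-square distribution with 8 degrees of freedom (nine digits minus one).

```python
import math

CHI2_CRITICAL_8DF_05 = 15.507  # upper 5 percent point, 8 degrees of freedom


def benford_chi_square(first_digit_counts: dict) -> float:
    """Chi-square statistic comparing first-digit counts (keys 1-9) with Benford."""
    n = sum(first_digit_counts.get(d, 0) for d in range(1, 10))
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (first_digit_counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical first-digit counts for a 1,000-entry account.
counts = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
stat = benford_chi_square(counts)
print(round(stat, 3), "reject" if stat > CHI2_CRITICAL_8DF_05 else "do not reject")
```

Because the statistic pools all nine digits, one mildly unusual digit can be offset by eight well-behaved ones, which is one way to read the remark that the test is less discriminatory than the individual z-tests.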

Limitations Based on the Type of Fraud

Benford analysis tests for fraudulent transactions based on whether digits appear in certain
places in numbers in the expected proportion. Therefore, a significant deviation from
expectations occurs under only two conditions: the person perpetrating the fraud has either added
observations or has removed observations but on a basis that would not conform to a
Benford distribution. Each action would result in an observable deviation from
expectations, provided the number relative to the sample was large enough for statistical detection.
Therefore, when a fraud is such that transactions are never recorded (i.e., off-the-books
fraud), as in the case of bribes, kickbacks or asset thefts, digital analysis cannot be
expected to detect the absence of transactions. This deficiency is noted by the ACL for Windows
Workbook: "It is very difficult to discover any clues from records that are unexpectedly
absent." (ACL 2001, p. 221).

In addition, other types of fraud exist that cannot be detected by Benford analysis because
the data sets under examination are not appropriate for such analysis. For example,
duplicate addresses or bank accounts cannot be detected, yet two employees with similar
addresses might signal ghost employees, or an employee's address which is also a vendor's address
might signal a shell company. Other examples include duplicate purchase orders or invoice
numbers that could signal duplicate payments fraud or shell companies. Further, Benford
analysis will not detect such frauds as contract rigging, defective deliveries, or defective
shipments.

The question arises as to what additional tests might complement Benford analysis. We
would suggest that, rather than using Benford's law as the primary tool around which other
analytical tools are chosen, auditors should consider it another tool to be added to the
arsenal they already employ. Such an arsenal includes personal observations of assets, outside
verification, keen awareness of corporate culture, an awareness of the examined firm's
performance relative to others in the industry and a healthy skepticism toward management
explanations of deviations in their records.

Base Rates and Conditional Probabilities

The value of Benford's law is in its use as a signaling device to identify accounts more
likely to involve fraud, thus improving on the random selection process auditors generally
employ when assessing the validity of a firm's reported numbers. An auditor who decides
to rely on the results of digital analysis to detect fraud is making a decision in the presence
of two kinds of uncertainty. First, it is unknown exactly how accurate digital analysis will
be with real data. Second, it is unknown what the base probability of fraud is in real data.
To accurately determine the effectiveness of Benford's law, it would be essential to compare
the empirical distribution of accounts known to contain fraud with accounts known to be
fraud free. Unfortunately, that data is very difficult to obtain as most firms do not want to
make public their particular accounts even when no fraud has occurred. Therefore, we are
left to speculate as to the accuracy. To do this, we rely on Bayes' theorem, which represents
probability under uncertainty as:
$$P(F \mid S) = \frac{P(S \mid F)\,P(F)}{P(F)\,P(S \mid F) + P(NF)\,P(S \mid NF)} = \frac{P(S \mid F)\,P(F)}{P(S)} \tag{4}$$

Where: F is fraud present;
NF is no fraud present;
S is the signal of fraud; and
P is the probability.
Thus, the probability of fraud existing, given the signal of fraud from digital analysis based
on Benford's law, is the probability a signal will be given if fraud exists multiplied by the
probability of fraud (the base rate) divided by the probability of a signal of fraud. The
probability of a fraud signal, P(S), is the percent of times the test correctly identifies fraud plus
the percent of times the test incorrectly signals fraud. Thus, the usefulness of Benford's law
for fraud detection can be summarized as accurate fraud signals divided by total fraud
signals.

As noted above, the probability or base rate of fraud, P(F), is a necessary consideration in
evaluating the usefulness of a Benford analysis. Although the Association of Certified Fraud
Examiners, in its 1996 Report to the Nation on Occupational Fraud and Abuse, reports certain
summary statistics, the real base fraud rate is unknown. This is due to several reasons: (1)
firms are often reluctant to report that they have been victimized; (2) auditors and law
enforcement know only about the frauds that have been detected or reported; and (3) no one knows
the extent of undetected fraud. Further, the base rate of fraud is likely environment specific,
with certain environments being more conducive to fraud (Bologna and Lindquist 1995).

It is likely that base rates of fraud in specific populations of transactions are fairly small. As
an example, assume that the base rate is 3 percent, that Benford analysis correctly identifies
accounts which contain fraud 75 percent of the time, and that it incorrectly signals fraud in
25 percent of fraud-free accounts.7 In this case the probability of finding
a fraud would be calculated as follows:

$$
P(F \mid S) = \frac{P(S \mid F)\,P(F)}{P(F)\,P(S \mid F) + P(NF)\,P(S \mid NF)} = \frac{.75 \times .03}{(.03 \times .75) + (.97 \times .25)} = .085 \qquad (5)
$$

The conditional probability is .085, or approximately 9 percent, meaning that an account
signaled by the analysis has roughly a 9 percent chance of actually containing fraud.8 This
may be judged a significant improvement over unassisted random sampling, which would
be successful 3 percent of the time. It should be
noted that there has been no widespread testing of digital analysis, nor is there any way to
assess its true success rate, since an auditor would seldom know of the frauds the analysis
failed to detect. The above analysis does, however, provide insights into how to evaluate the
effectiveness of the procedure given certain assumptions.
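
The calculation in equation (5) can be reproduced, and repeated under other assumptions, with a short script. The following is a minimal sketch rather than part of the original analysis: the function name and parameter names are illustrative, and the 75 percent detection rate and 25 percent false-signal rate are the assumed values from the example above, not measured properties of digital analysis.

```python
def posterior_fraud_probability(base_rate, hit_rate, false_signal_rate):
    """Return P(F|S) from Bayes' theorem.

    base_rate         -- P(F), assumed share of accounts containing fraud
    hit_rate          -- P(S|F), assumed chance the test signals a fraudulent account
    false_signal_rate -- P(S|NF), assumed chance the test signals a fraud-free account
    """
    correct_signals = base_rate * hit_rate
    incorrect_signals = (1.0 - base_rate) * false_signal_rate
    # Accurate fraud signals divided by total fraud signals.
    return correct_signals / (correct_signals + incorrect_signals)


# Values assumed in the text: 3% base rate, 75% detection, 25% false signals.
print(round(posterior_fraud_probability(0.03, 0.75, 0.25), 3))  # 0.085

# Illustrative sensitivity check over hypothetical base rates.
for base_rate in (0.01, 0.03, 0.10):
    p = posterior_fraud_probability(base_rate, 0.75, 0.25)
    print(f"base rate {base_rate:.2f} -> P(F|S) = {p:.3f}")
```

Varying the base rate in this way shows how strongly the value of a fraud signal depends on how common fraud is in the population being tested.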

AN EXAMPLE OF DIGITAL ANALYSIS

We conducted digital analysis on two accounts of a large medical center in the Western
United States.9 Figure 1 shows the distribution of the first digits of
check amounts written for the office supplies account. While digits two and seven appeared
to be significantly different than expected, the overall deviation falls within the conforming
range.10 Nevertheless, subsequent analysis was conducted on the two non-conforming digits.
The analysis indicated that the variation was due to legitimate payments and did not
represent a fraud.
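
A first-digit comparison of this kind, together with the conformity measure described in footnote 10 (the average absolute difference between actual and expected first-digit proportions), can be sketched as follows. This is an illustration under stated assumptions, not the DATAS software used for the analysis: the function names are hypothetical, the sample amounts are invented, the code works in proportions rather than percentages, and no conformity threshold is implied. The expected proportion for first digit d is taken as log10(1 + 1/d).

```python
import math


def benford_expected():
    """Expected first-digit proportions under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(amount):
    """Leading non-zero digit of a non-zero amount."""
    # In scientific notation the first character is the leading digit.
    return int(f"{abs(amount):.15e}"[0])


def first_digit_proportions(amounts):
    """Observed proportion of each first digit, ignoring zero amounts."""
    counts = {d: 0 for d in range(1, 10)}
    total = 0
    for amount in amounts:
        if amount == 0:
            continue
        counts[first_digit(amount)] += 1
        total += 1
    if total == 0:
        raise ValueError("no non-zero amounts to analyze")
    return {d: count / total for d, count in counts.items()}


def mean_absolute_deviation(actual, expected):
    """Sum of absolute differences over the nine digits, divided by nine."""
    return sum(abs(actual[d] - expected[d]) for d in range(1, 10)) / 9


# Hypothetical check amounts, for illustration only.
amounts = [1023.50, 1100.00, 187.20, 45.10, 2310.00, 96.75, 1450.00, 312.40]
actual = first_digit_proportions(amounts)
expected = benford_expected()
for d in range(1, 10):
    print(d, round(actual[d], 3), round(expected[d], 3))
print("MAD:", mean_absolute_deviation(actual, expected))
```

With so few observations the result means little; the sample serves only to show the mechanics of the comparison.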

7 It is not to be expected that the procedure will always find accounts with fraud since sometimes it will signal
that an account is deviant when there is no fraud and other times will not signal fraud present because there are
too few fraudulent entries.
8 The numerator shows that 75 percent of the time the test will successfully signal fraud given that fraud is in 3
percent of the transaction sets. The denominator is the base rate (3 percent) multiplied by the success rate (75
percent) added to the expectation that some accounts will be signaled (25 percent) when no fraud exists (97
percent). The correct signals divided by the total signals indicating fraud provides a measure of the usefulness of
the test in this case.
9 For the analysis shown here, we used the free download software found at http://www.nigrini.com/, under
DATAS software for Windows.
10 A simple test of the conformity of the nine digits, given in Drake and Nigrini (2000), is called the Mean Absolute
Deviation (the sum of the absolute values of the differences between the actual and expected percentages, divided
by nine, the number of digits). This test proved to be within suggested limits.

Figure 1. Office Supplies Disbursement Check Amounts: first digit distribution.
[Chart: proportion (0.00 to 0.35) by first digit (1 to 9), actual versus Benford's law.]

In the second Benford analysis, the insurance refund account reflected a distribution of the
first digits shown in Figure 2. All digits, except the digit 2, were significantly different than
expected by the Benford distribution.

Figure 2. Insurance Refund Check Amounts: first digit distribution.
[Chart: proportion (0.00 to 0.50) by first digit (1 to 9), actual versus Benford's law.]



When the details of the account were inspected, it was apparent that many more refund
checks of just over $1,000 had been written than in the previous period. In fact, most of the
previous period's refund checks were in amounts of less than $100.00. When queried, the
financial officer of the medical center responded that she had decided to accumulate refunds
for large insurers in an attempt to write fewer checks.

A subsequent detailed examination of the account, however, uncovered that the financial
officer had created bogus shell insurance companies in her own name and was writing large
refund checks to those shell companies. While the investigation into the fraud is ongoing,
it appears that approximately $80,000 had been diverted to the shell insurance companies.
In this instance, digital analysis was useful in identifying a suspect account. However, it
required looking beyond the easy explanation to find the fraud.

CONCLUSION

We conclude that Benford analysis, when used correctly, is a useful tool for identifying
suspect accounts for further analysis. Because of its usefulness, digital analysis tools based
on Benford's law are now being included in many popular software packages (e.g., ACL and
CaseWare 2002) and are being touted in the popular press. CaseWare 2002 says of this new
application that it "...can identify possible errors, potential fraud or other irregularities." The
goal of this paper has been to help auditors more appropriately apply Benford's law-based
analysis to increase their ability to detect fraud. SAS No. 99 instructs auditors to use
analytical tests in the planning stages of their audit. Benford analysis is a particularly useful
analytical tool because it does not use aggregated data; rather, it is conducted on specific
accounts using all the data available. It can be very useful in identifying specific accounts
for further analysis and investigation.

Because the potential cost of undetected fraud is high, an auditor using this technique must
take care not to overstate the reliability of such tests. While such tests have many
advantages, certain limitations must also be considered. Specifically, (1) care must be exercised
in interpreting the statistical results of the test, (2) Benford analysis should only be applied
to accounts which conform to the Benford distribution, and (3) the auditor must be
cognizant of the fact that certain types of frauds will not be found with this analysis.

While Benford analysis by itself might not be a surefire way to catch fraud, it can be a
useful tool to help identify some accounts for further testing and therefore should assist
auditors in their quest to detect fraud in financial statements.

REFERENCES
ACL for Windows. 2001. Version 7, Workbook. Vancouver: ACL Services, Ltd.
American Institute of Certified Public Accountants. 2002. Statement on Auditing Standards No. 99,
Consideration of Fraud in a Financial Statement Audit. New York, New York.
Benford, F. 1938. The law of anomalous numbers. Proceedings of the American Philosophical
Society. 78(4):551-572.
Bologna, G. and R. Lindquist. 1995. Fraud Auditing and Forensic Accounting: New Tools and
Techniques. 2nd ed. New York, NY: John Wiley & Sons.
Boyle, J. 1994. An application of Fourier series to the most significant digit problem. American
Mathematical Monthly. 101(9):879-886.
Canadian Business. 68 (September 1995):21.
Carr, E. 2000. Days are numbered for Scarborough's cheats. The Financial Times (London) August
16:12.
Carslaw, C. A. P. N. 1988. Anomalies in income numbers: Evidence of goal oriented behavior. The
Accounting Review. LXIII(2):321-327.
Caseware-idea.com. 2003. WWW.
Coderre, D. 2000. Computer assisted fraud detection. Internal Auditor. August:25-27.
Diaconis, P. and D. Freedman. 1979. On rounding percentages. Journal of the American Statistical
Association. 74(June):359-364.
Drake, P. D. and M. J. Nigrini. 2000. Computer assisted analytical procedures using Benford's law.
Journal of Accounting Education. 18:127-46.
Ettredge, M. L. and R. P. Srivastava. 1999. Using digital analysis to enhance data integrity. Issues
in Accounting Education. 14(4):675-690.
Furlan, L. V. 1948. Das Harmoniegesetz der Statistik: Eine Untersuchung über die metrische
Interdependenz der sozialen Erscheinungen, Basel, Switzerland: Verlag für Recht und
Gesellschaft A. -G xiii:504.
Goudsmit, S. A. and W. H. Furry. 1944. Significant figures of numbers in statistical tables. Nature.
154:800-801.
Hesman, T. 1999. Cheaters tripped up by random numbers law. Dallas Morning News. August 22,
1999 Sunday. 6H.
Hill, T. P. 1995. A statistical derivation of the significant digit law. Statistical Science. 10(4):354-
363.
Hill, T. P. 1998. The first digit phenomenon. American Scientist. 86(4):358-363.
Lanza, R. B. 2000. Using digital analysis to detect fraud: Review of the DATAS statistical
analysis tool. Journal of Forensic Accounting. 1:291-296.
Newcomb, S. 1881. Note on the frequency of use of the different digits in natural numbers.
American Journal of Mathematics. 4:39-40.

Nigrini, M. J. 1996. Taxpayer compliance application of Benford's law. Journal of the American
Taxation Association. 18(1):72-92.
Nigrini, M. J. 1999. Adding value with digital analysis. The Internal Auditor. 56(1):21-23.
Nigrini, M. J. and L. J. Mittermaier. 1997. The use of Benford's law as an aid in analytical
procedures. Auditing: A Journal of Practice & Theory. 16(2):52-67.
Raimi, R. A. 1976. The first digit problem. American Mathematical Monthly. 83(Aug-Sept):521-
538.
Thomas, J. K. 1989. Unusual patterns in reported earnings. The Accounting Review. LXIV(4):773-
787.
Varian, H. R. 1972. Benford's law. The American Statistician. 26:65-66.
Wallace, W. A. 2002. Assessing the quality of data used for benchmarking and decision-making.
The Journal of Government Financial Management. (Fall) 51 (3):16-22.
York, D. 2000. Auditing technique Benford's law. Accountancy. (July) 126, Issue 1283:126.
