Expert Systems with Applications 36 (2009) 5256–5263
A binary classification method for bankruptcy prediction
Jae H. Min *, Chulwoo Jeong
Sogang University, Graduate School of Business, #1, Shinsu-dong, Mapo-gu, Seoul 121-742, Republic of Korea
Article info
Keywords:
Bankruptcy prediction
Binary classification
Genetic algorithm
Abstract
The purpose of this paper is to propose a new binary classification method for predicting corporate failure
based on genetic algorithm, and to validate its prediction power through empirical analysis. Establishing
virtual companies representing bankrupt companies and non-bankrupt ones, respectively, the proposed
method measures the similarity between the virtual companies and the subject for prediction, and classifies the subject into either bankrupt or non-bankrupt one. The values of the classification variables of
the virtual companies and the weights of the variables are determined by the proper model to maximize
the hit ratio of training data set using genetic algorithm. In order to test the validity of the proposed
method, we compare its prediction accuracy with those of other existing methods such as multi-discriminant analysis, logistic regression, decision tree, and artificial neural network, and it is shown that the binary classification method we propose in this paper can serve as a promising alternative to the existing
methods for bankruptcy prediction.
© 2008 Elsevier Ltd. All rights reserved.
1. Introduction
In the area of risk management, corporate failure prediction
plays a key role in examining credit loan applications, since it enables banks to protect themselves in advance from insolvency caused by bad
loans and helps them sustain profitability through
sound lending practices. In addition, by predicting corporate bankruptcy properly, a bank, as a financial agency, can contribute to its community by supplying prospective companies with
funds appropriate to their respective financial soundness.
Moreover, the implementation of the Basel II Accord induces more severe competition among banks, since it sets up more rigorous risk
and capital management requirements to ensure that a bank holds
capital reserves appropriate to the risk to which it is exposed through
its lending and investment practices. For these reasons, among
others, banks now place great emphasis on identifying more accurately the
risks they may face in the future.
Due to the rapidly changing corporate environment in recent years,
however, many complicated causes of corporate bankruptcy are newly emerging. Therefore, despite many existing
methodologies for predicting corporate failure, it is worthwhile
for academia as well as practitioners to continuously develop
state-of-the-art methods reflecting various symptoms of corporate
failure that may not be explained by the existing ones.
In this paper, we develop a new method for bankruptcy prediction employing genetic algorithm, an artificial intelligence technique. To validate our model, we empirically compare the
performance of the proposed method with those of existing ones
such as multi-discriminant analysis, logistic regression, decision
tree, and neural network, and show its potential as a promising
alternative to the existing methods.
* Corresponding author: J.H. Min. Tel.: +82 2 705 8545; fax: +82 2 715 8505.
doi:10.1016/j.eswa.2008.06.073
2. Background of the study
The bankruptcy prediction models can be divided into two main
streams. The first one is based on statistical methods. It was pioneered by Beaver (1966), followed by Altman (1968) who applied
multi-discriminant analysis, and has been developed to stochastic
models such as logit (Ohlson, 1980) and probit (Zmijewski, 1984).
The second one employs artificial intelligence (AI) methods,
and a number of studies have applied them to the bankruptcy prediction
problem since the 1990s. AI methods include decision tree (Frydman,
Altman, & Kao, 1985; Marais, Patel, & Wolfson, 1984), fuzzy set theory (Zimmermann, 1996), case-based reasoning (Bryant, 1997; Jo,
Han, & Lee, 1997; Park & Han, 2002), genetic algorithm (Shin &
Lee, 2002; Varetto, 1998), support vector machine (Min & Lee,
2005), data envelopment analysis (Cielen & Vanhoof, 2004), rough
sets theory (Dimitras, Slowinski, Susmaga, & Zopounidis, 1999;
McKee, 2000, 2003), and several kinds of neural networks such as
BPNN (back propagation trained neural network) (Atiya, 2001; Bell,
1997; Lam, 2004; Leshno & Spector, 1996; Salchenberger, Mine, &
Lash, 1992; Swicegood & Clark, 2001; Tam, 1991; Wilson & Sharda,
1994), PNN (probabilistic neural networks) (Yang, Platt, & Platt,
1999), SOM (self-organizing map) (Kaski, Sinkkonen, & Peltonen,
2001; Lee, Booth, & Alam, 2005), and Cascor (cascade correlation neural
network) (Lacher, Coats, Sharma, & Fant, 1995).
In this paper, we propose a new classification method following
the second stream, which applies a genetic algorithm to the problem. Genetic algorithm is a search technique embodying evolutionary mechanisms such as selection, crossover, and mutation. It
searches wide and complicated spaces to find the optimal values
of the parameters of optimization problems with various constraints, and has been applied in a wide range of areas including
bankruptcy prediction. Varetto (1998) and Shin and Lee (2002)
are good examples of studies using genetic algorithm in the area of bankruptcy prediction.
Varetto (1998) suggested two different models based on genetic
algorithm, one of which is a linear model estimating the constant
and the variable coefficients of the discriminant function to maximize its discriminant power using genetic algorithm. The other one
is a rule based model, which classifies firms according to their
respective discriminant scores called GSR (genetic score by rules)
using genetic algorithm. Genetic algorithm can also be used to produce a set of rules based on the tests deriving the signs and the cutoff values of selected ratios, and in this regard, Shin and Lee (2002)
suggested a rule inducing model to maximize its prediction power
using genetic algorithm.
In previous literature as described in the above examples, genetic algorithm has played a role complementing the existing statistical methods and AI methods rather than a standalone method
for bankruptcy prediction. In this study, genetic algorithm is also
used as a medium of designing a new binary classification method.
Specifically, it is used to estimate the weights and the values of the
classification variables of the firms representing bankrupt firms
and non-bankrupt ones, respectively.
Now, we describe the advantage of the proposed method over
some of the existing ones, which also provides the background of
our approach to bankruptcy prediction.
First, the proposed method is somewhat similar to cluster analysis as both methods classify the subjects into several clusters. But
the proposed method differentiates itself from cluster analysis in
the following aspects. The method classifies the subjects into clusters to maximize the prediction accuracy, the matching rate of the
subject’s bankruptcy status and the representative firm’s one, with
prior knowledge of whether each subject is bankrupt or not, while
cluster analysis does its classifying job without prior knowledge of the status of each cluster. Also, cluster analysis does not consider differences in the weights of the classification variables,
while our model estimates the weights of the classification variables and reflects them in the prediction. In addition, by employing
genetic algorithm, our method can overcome the problem of k-means cluster analysis that the clustering results can differ
depending on the initial value setting.
Second, at a glance, the proposed method seems to be similar to
CBR (case-based reasoning) as both methods classify the companies
according to the commonality of the subject for prediction and the
existing companies in data set; however, the model in this paper is
different from CBR in the following aspects. Above all, our model
takes a global learning approach while CBR takes a local learning approach. In other words, CBR measures the distance between each subject for prediction and all existing companies in the data set, identifies
the nearest k companies, and uses them to classify each subject as
good or bad. By contrast, our model employs genetic algorithm
to search for the optimal values of the weights and the classification
variables of the virtual firms representing bankrupt firms and non-bankrupt ones (the representative firms) in a wider space in order
to maximize its prediction power, calculates the distance between
each subject for prediction and the representative firms, and classifies
the subject as either bankrupt or non-bankrupt according to the
status of the representative firm nearest to the subject.
In addition, our method can provide more meaningful
information than CBR does. While CBR provides information
about the bankruptcy status of the subject for prediction
and shows the firm most similar to each subject, it
is very difficult to identify which characteristics of the reference firm are similar to those of the firm for prediction. By contrast, our method provides all the information CBR does, and at the
same time, it helps us easily identify what characteristics the
reference firm representing the subject for prediction has in
comparison with other representative firms, so that we can
logically infer that a subject classified into the same
cluster as a particular representative firm would also have
characteristics similar to those of that representative firm.
3. Mathematical model
We now describe mathematically the binary classification model we propose. The distance between a representative firm and an observation in the data set, which indicates the
similarity or dissimilarity between them, can be expressed by
the following equation:
\[
d_{ki} = \frac{\sum_{j} w_j \left| X_{kj} - X_{ij} \right|}{\sum_{j} w_j} \tag{1}
\]
where $d_{ki}$ is the distance between representative firm $k$ and observation $i$
($i = 1, 2, \ldots, N$); $X_{kj}$ is representative firm $k$'s value of the $j$th variable
($j = 1, 2, \ldots, M$); $X_{ij}$ is observation $i$'s value of the $j$th variable; and $w_j$ is the weight
of the $j$th variable.
In Eq. (1), for each observation $i$, $d_{ki}$ is calculated once for each
representative firm. An observation can then be
classified as either bankrupt or non-bankrupt according to the
status of the representative firm having the minimum value among the
$d_{ki}$'s (denote it by $d_{k^*i}$). If that representative firm is bankrupt, observation $i$ is classified as a bankrupt company. Otherwise, the
observation is classified as a non-bankrupt one.
If all the observations in data set are classified in the manner
described above, the hit ratio, the proportion of correct prediction,
can be calculated. In this paper, we employ genetic algorithm to
search for the weights and the values of the classification variables
of representative firms so as to maximize the hit ratio. This process
can be expressed in the following equation
\[
\begin{aligned}
\max \quad & H = \frac{1}{N} \sum_{i=1}^{N} C_i \\
\text{subject to} \quad & C_i = \begin{cases} 1 & \text{if } B(i) = B(k^*) \\ 0 & \text{otherwise} \end{cases} \qquad \forall i \in \{1, 2, \ldots, N\}, \\
& d_{k^*i} = \min_{k} d_{ki}
\end{aligned} \tag{2}
\]
where $H$ is the hit ratio (the number of correct predictions over the number of all predictions for $N$ observations); $N$ is the number of observations
in the data set; $C_i$ indicates whether the prediction for observation $i$ is correct ($C_i = 1$) or
not ($C_i = 0$) ($i = 1, 2, \ldots, N$); $B(i)$ indicates whether observation $i$ is bankrupt
($B(i) = 1$) or not ($B(i) = 0$); $B(k^*)$ indicates whether representative firm $k^*$ is bankrupt
($B(k^*) = 1$) or not ($B(k^*) = 0$); $k^*$ is the representative firm corresponding to $d_{k^*i}$; and $d_{k^*i}$ is the minimum distance between the representative
firms and observation $i$.
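As an illustration of Eqs. (1) and (2), the following is a minimal sketch in Python with NumPy, not the software used in this study. The function names and the toy data (two representative firms, four observations, two variables) are hypothetical and chosen only so that the arithmetic can be checked by hand.

```python
import numpy as np

def distances(reps, obs, weights):
    """Eq. (1): weighted absolute distance d_ki, with weights normalized to sum to 1.

    reps: (K, M) values of the representative firms
    obs:  (N, M) values of the observations
    Returns an array of shape (K, N).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    diff = np.abs(reps[:, None, :] - obs[None, :, :])  # shape (K, N, M)
    return diff @ w

def classify(reps, rep_status, obs, weights):
    """Assign each observation the status of its nearest representative firm k*."""
    k_star = distances(reps, obs, weights).argmin(axis=0)
    return rep_status[k_star]

def hit_ratio(reps, rep_status, obs, obs_status, weights):
    """Eq. (2): proportion of observations whose predicted status is correct."""
    predicted = classify(reps, rep_status, obs, weights)
    return float(np.mean(predicted == obs_status))

# Hypothetical toy data: firm 0 is non-bankrupt (0), firm 1 is bankrupt (1).
reps = np.array([[1.0, 1.0], [-1.0, -1.0]])
rep_status = np.array([0, 1])
obs = np.array([[0.8, 1.2], [-0.5, -2.0], [0.1, -0.3], [1.5, 0.2]])
obs_status = np.array([0, 1, 0, 0])
weights = np.array([0.2, 0.8])

toy_predictions = classify(reps, rep_status, obs, weights)
toy_hit_ratio = hit_ratio(reps, rep_status, obs, obs_status, weights)
print(toy_predictions, toy_hit_ratio)
```

In this toy case the third observation lies closer to the bankrupt representative firm although it is non-bankrupt, so three of the four predictions are correct.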
4. Empirical analysis
4.1. Data
The final data set for the analysis contains financial ratios of
2542 externally audited small and medium-sized manufacturing
firms. Among them, 1271 firms filed for bankruptcy and the
other 1271 firms remained non-bankrupt during the period from
2001 through 2004.
The data selection process is described as follows. First, we initially gathered 27 financial ratios (see Table 2) of 2814 firms (1407
bankrupt and 1407 non-bankrupt ones) to conduct an empirical
analysis. For bankrupt firms, the financial ratios as of one year before their going into bankruptcy were gathered. Second, we used
the means and standard deviations of the financial ratios of the 2814
firms to standardize them as Z-values, and the observations (the
firms) whose Z-values were beyond the range of [-3, 3] were considered outliers and excluded from the data set. The numbers of
remaining bankrupt firms and non-bankrupt ones then became
1271 and 1316, respectively. Third, in order to equalize the numbers of bankrupt firms and non-bankrupt ones, we excluded 45 randomly selected non-bankrupt firms from the data set. Table 1
shows how the number of data points in the sample is successively reduced by the process described above.
The entire data is split into three subsets which are training set,
test set, and validation set with respective proportions of 60%, 20%,
and 20% using stratified sampling method. So, the numbers of data
points in training set, test set, and validation set turn out to be
1526, 508, and 508, respectively.
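The outlier screening and the stratified 60/20/20 split described above can be sketched as follows. This is only an illustration under stated assumptions: a firm is dropped if any of its ratios has a Z-value outside [-3, 3], the population standard deviation is used, and the per-class subset sizes are obtained by rounding; the text does not specify these details, and all function names are hypothetical.

```python
import numpy as np

def drop_outliers(X, limit=3.0):
    """Standardize each column to Z-values and keep rows with all |Z| <= limit.

    Assumption: one out-of-range ratio is enough to exclude a firm.
    """
    z = (X - X.mean(axis=0)) / X.std(axis=0)
    keep = np.all(np.abs(z) <= limit, axis=1)
    return X[keep], keep

def split_sizes(n_per_class, fractions=(0.6, 0.2, 0.2)):
    """Sizes of the training, test, and validation parts for one class."""
    n_train = round(n_per_class * fractions[0])
    n_test = round(n_per_class * fractions[1])
    return n_train, n_test, n_per_class - n_train - n_test

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split indices so that each class keeps the same proportions in every subset."""
    rng = np.random.default_rng(seed)
    train, test, valid = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train, n_test, _ = split_sizes(len(idx), fractions)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:n_train + n_test])
        valid.extend(idx[n_train + n_test:])
    return np.array(sorted(train)), np.array(sorted(test)), np.array(sorted(valid))

# 1271 bankrupt (1) and 1271 non-bankrupt (0) firms, as in the final data set.
labels = np.array([1] * 1271 + [0] * 1271)
train_idx, test_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(test_idx), len(valid_idx))
```

With 1271 firms per class, rounding gives 763, 254, and 254 firms per class, which reproduces the subset sizes of 1526, 508, and 508 reported above.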
Table 2 shows the definitions of the 27 financial ratios we selected. In order to select the variables for the analysis, we concurrently employed several statistical methods including independent
sample t-test, discriminant analysis, logistic regression, decision
tree, and factor analysis.
Now, Table 3 presents the variables selected by each statistical
method we just mentioned. The variables selected by independent
sample t-test are those showing significant differences in their
means between bankrupt companies and non-bankrupt ones. In discriminant analysis and logistic regression, we employ stepwise
selection method, where we set an entry value of 0.01 and an excluding value of 0.05 as the levels of significance for F statistic. In decision
tree, we use Chi-square test as the splitting criterion where we set
the significance level to be 0.02, the maximum depth of root to be
6, and the minimum number of observations in a leaf to be 5.
Next, we conducted factor analysis for the selected variables,
and could derive 8 factors as shown in Table 4.
Considering all the results from applying the various methods
described above, we selected the final variables for the analysis.
The variable selection process can be summarized as follows. First,
among the variables under each factor derived by factor analysis,
we only selected the variables that were also chosen by discriminant analysis, logistic regression, or decision tree. Second, among
the variables selected in the previous step, we excluded the variables that turned out to be insignificant by the independent sample
t-test. Third, among the variables remaining under each factor, we
tried to choose one variable by excluding the variables whose
meanings are similar to the chosen one. Table 5 summarizes the
variables chosen according to the above mentioned variable selection process.
In Table 5, we notice that under factor 5, for example, there are
two variables, X15 and X21, in the second step; however, only X15
is selected in the third step since the meaning of X15 and that of
X21 are considered similar to each other. On the contrary, we see
that both variables X12 and X13 under factor 4 are selected in
the third step as well as in the second step because they are believed to have different impacts on firms' bankruptcy status. Likewise, under factor 6, variables X20, X23, and X25 are also
considered as variables of different characteristics. Finally,
the variables selected for the analysis are reduced to 9 variables:
X2, X3, X12, X13, X15, X20, X23, X24, and X25.

Table 1
The number of data points in data selection process

Step                                                                  Non-bankrupt   Bankrupt   Total
Raw data                                                              1407           1407       2814
First step: excluding outliers                                        1316           1271       2587
Second step: equalizing the number of data points in each category    1271           1271       2542

Table 2
Definition of variables

Variable   Definition
X1         Gross value added to sales
X2         Gross value added to total assets
X3         Growth rate of total assets
X4         Ordinary income to sales
X5         Net income to sales
X6         Operating income to sales
X7         Costs of sales to sales
X8         Net interest expenses to sales
X9         Ordinary income to total assets
X10        Rate of earnings on total capital
X11        Net working capital to total assets
X12        Current liabilities to total assets
X13        Stockholders' equity to total assets
X14        Total borrowings and bonds payable to total assets
X15        Total assets turnover
X16        Ordinary income to total assets
X17        Net working capital to sales
X18        Stockholders' equity to sales
X19        Ordinary income to total assets
X20        Depreciation expenses
X21        Operating assets turnover
X22        Interest expenses to total expenses
X23        Net interest expenses
X24        Break-even point ratio
X25        Employment costs
X26        Interest expenses and net income to total assets
X27        Earnings before interest and taxes to sales

Table 3
Selected variables by each analyzing method

Analyzing method            Selected variables
Independent sample t-test   X2, X3, X12, X13, X15, X20, X21, X23, X24, X25
Discriminant analysis       X2, X3, X9, X12, X15, X20, X21, X22, X24, X25
Logistic regression         X2, X3, X7, X9, X12, X20, X21, X22, X25
Decision tree               X2, X3, X4, X10, X15, X16, X20, X23, X24, X25
4.2. Model design
In this study, we initially examine five models, denoted by
[Model 1] through [Model 5], according to the numbers of the
respective representative firms of bankruptcy and non-bankruptcy,
which are set to be simultaneously increased one by one from 1 to
5. That is, [Model 1] is the model that has one representative for
bankrupt firms and one for non-bankrupt ones. [Model 2] is the
one that has two representatives for bankrupt firms and also two
representatives for non-bankrupt ones, and so on.
We increase the number of representative firms
in this manner because, if the number of representative firms is
too small, the model may be too simplified, while there is a risk of over-fitting if the number of
representative firms is too large. As we increase the number of representative firms one by one for both bankrupt firms and non-bankrupt ones, we can observe the trend of hit ratios across these five
models.
Also, we simultaneously increase the numbers
of representative firms for both bankrupt firms and non-bankrupt firms one by one because doing so saves time in finding the optimal
model. Should we consider all the alternatives (e.g., one representative for bankrupt firms and two representatives for non-bankrupt
firms), we would have to spend too much computational time
searching for the optimal model. Therefore, we first calculate the hit ratios of the five models above, each of which has
an equal number of representatives for bankrupt firms and non-bankrupt ones, and try to find a sub-optimal model.
Then, we search for the optimal model, in which the numbers
of representatives for bankrupt and non-bankrupt firms may
differ from each other. This additional model is dealt with
in the latter part of this paper.

Table 4
Rotated factor matrix for financial variables

Variable   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5   Factor 6   Factor 7   Factor 8
X6         0.988      0.090      0.064      0.052      0.009      0.003      0.002      0.002
X4         0.985      0.092      0.117      0.060      0.023      0.005      0.015      0.000
X27        0.982      0.094      0.145      0.055      0.006      0.001      0.009      0.004
X5         0.980      0.086      0.122      0.066      0.025      0.007      0.032      0.003
X1         0.968      0.086      0.073      0.053      -0.034     0.063      0.159      0.031
X10        0.111      0.940      0.006      0.172      0.117      0.057      0.028      0.064
X26        0.101      0.940      0.010      0.193      0.122      0.057      0.005      0.033
X19        0.105      0.939      0.009      0.182      0.147      0.044      0.004      0.044
X9         0.114      0.937      0.003      0.169      0.136      0.042      0.027      0.066
X2         0.068      0.521      0.014      0.039      0.311      0.255      0.450      0.062
X17        0.066      0.003      0.979      0.031      0.075      0.011      0.009      0.033
X18        0.066      0.003      0.979      0.031      0.075      0.011      0.009      0.033
X8         0.191      0.010      0.907      0.023      0.064      0.023      0.016      0.014
X12        0.089      0.169      0.003      0.856      0.070      0.020      0.072      0.014
X13        0.089      0.249      0.037      0.834      0.224      0.020      0.051      0.080
X11        0.089      0.173      0.049      0.806      0.024      0.121      0.149      0.097
X22        0.047      0.078      0.105      0.174      0.777      0.141      0.111      0.111
X15        0.015      0.402      0.012      0.103      0.675      0.069      0.216      0.187
X21        0.003      0.375      0.005      0.125      0.621      0.049      0.199      0.138
X14        0.062      0.090      0.022      0.428      0.537      0.034      0.123      0.123
X25        0.028      0.060      0.015      0.034      0.024      0.855      0.101      0.020
X20        0.005      0.047      0.001      0.024      0.047      0.792      0.088      0.026
X23        0.024      0.008      0.050      0.167      0.382      0.642      0.311      0.004
X24        0.107      0.041      0.028      0.026      0.121      0.059      0.760      0.057
X7         0.198      0.087      0.031      0.043      0.158      0.114      0.747      0.074
X3         0.038      0.117      0.041      0.101      0.112      0.011      0.154      0.751
X16        0.049      0.052      0.069      0.067      0.049      0.017      0.117      0.652

Table 5
Variable selection process

Factor   First step           Second step     Third step
1        X4                   –               –
2        X2, X9, X10          X2              X2
3        –                    –               –
4        X11, X12, X13, X22   X12, X13        X12, X13
5        X15, X21             X15, X21        X15
6        X20, X23, X25        X20, X23, X25   X20, X23, X25
7        X7, X24              X24             X24
8        X3, X16              X3              X3
With respect to the application of genetic algorithm, the chromosome structure of the model is described as follows. Each chromosome is composed of 9 genes for the weights and, for the representative firms' classification variables, 9 genes times the number of
representative firms. The genes for the weights are real numbers in the range [0, 1], and the genes for the representative firms'
variables are real numbers in the range [-3, 3]. For initial values, random numbers in [0, 1] are used for the genes for the
weights, and random variates from the standard normal distribution
are used for the genes for the representative firms' variables.
To search for the optimal values, we set the population size to 100, the mutation rates to 0.1 for both $w_j$ and $X_{kj}$, and
the crossover rates to 0.5 for both $w_j$ and $X_{kj}$. The search is
terminated when the total number of trials reaches 20,000. For the
experiment, we use Evolver 4.0, a genetic algorithm software package.
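The chromosome layout described above can be sketched in Python with NumPy as follows. This is a hypothetical illustration rather than the Evolver 4.0 setup: the function names are invented, clipping the initial standard normal draws to [-3, 3] is an assumption made to respect the stated gene range, and the observations used to evaluate the fitness are random placeholders.

```python
import numpy as np

N_VARS = 9  # number of classification variables selected in Section 4.1

def chromosome_length(n_reps):
    """9 weight genes plus 9 genes for each representative firm."""
    return N_VARS + n_reps * N_VARS

def random_chromosome(n_reps, rng):
    """Initial chromosome: weights ~ U[0, 1], firm values ~ N(0, 1).

    Assumption: normal draws are clipped to the gene range [-3, 3].
    """
    weights = rng.uniform(0.0, 1.0, size=N_VARS)
    values = np.clip(rng.standard_normal(n_reps * N_VARS), -3.0, 3.0)
    return np.concatenate([weights, values])

def decode(chromosome, n_reps):
    """Split a chromosome into the weight vector and the (n_reps, 9) firm values."""
    weights = chromosome[:N_VARS]
    reps = chromosome[N_VARS:].reshape(n_reps, N_VARS)
    return weights, reps

def fitness(chromosome, n_reps, rep_status, obs, obs_status):
    """Hit ratio obtained when the chromosome is used as the classifier."""
    weights, reps = decode(chromosome, n_reps)
    w = weights / weights.sum()
    d = np.abs(reps[:, None, :] - obs[None, :, :]) @ w  # distances, shape (n_reps, N)
    predicted = rep_status[d.argmin(axis=0)]
    return float(np.mean(predicted == obs_status))

rng = np.random.default_rng(0)
# [Model 2]: two non-bankrupt (0) and two bankrupt (1) representative firms.
rep_status = np.array([0, 0, 1, 1])
population = [random_chromosome(4, rng) for _ in range(100)]  # population size 100

# Placeholder observations, used only to show that the fitness can be evaluated.
obs = rng.standard_normal((20, N_VARS))
obs_status = rng.integers(0, 2, size=20)
demo_fitness = fitness(population[0], 4, rep_status, obs, obs_status)
print(chromosome_length(4), demo_fitness)
```

Under this layout, [Model 1] through [Model 5] have 27, 45, 63, 81, and 99 genes, respectively, so the search space grows linearly with the number of representative firms.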
4.3. Analysis results and discussion
Table 6 reports, for each model, the total number of iterations and the valid
number of iterations until the optimal values are found.
Table 7 shows the hit ratio of each model. In Table 7, we notice
that as the number of representative firms gets larger (in other
words, as the model number increases from 1 to 5), the hit ratio for the
training data increases. For the test data, however, the hit ratio
peaks when the number of representative firms is 4, and tends to decline after
that point. This trend
can be interpreted as an over-fitting problem: while
the fit of the model to the training data improves
as the number of representative firms increases, the hit ratio for the
test data tends to fall once the number of representative firms exceeds a certain number.
For example, in the case of [Model 4], we notice that there exists an
over-fitting problem as the hit ratio for the test data (76.8%) turns out to
be lower than that for the training data (77.9%). Similarly, this problem occurs in the other models except [Model 2]. Therefore, we select
[Model 2] as the best one among the 5 models since not only does it not
cause the over-fitting problem, but its hit ratio for the test data
is also the highest. The hit ratio of [Model 2] for the validation data
(78.4%) is also the highest among those of the 5 models.

Table 6
The number of iterations to search for the optimal values by each model

Model   Total number of iterations   Valid number of iterations
1       20,000                       2444
2       20,000                       19,719
3       20,000                       17,525
4       20,000                       15,567
5       20,000                       11,472

Table 7
Hit ratio of each model

Model   # of representative firms   Training data (%)   Test data (%)   Validation data (%)
1       2                           76.4                76.2            75.8
2       4                           77.0                77.0            78.4
3       6                           77.1                76.2            77.8
4       8                           77.9                76.8            75.8
5       10                          78.2                76.4            70.5

Now it is worthwhile to analyze [Model 2] further. Table 8
shows the normalized weight of each variable of [Model 2]. The
weights of X3, X12, X13, and X25 are relatively high
while those of X2, X15, X20, X23, and X24 are relatively low. This result implies that variables X3, X12, X13, and
X25 have more impact on classifying the observations than the
other variables do.
Table 9 shows the optimal values of each representative firm's
classification variables in [Model 2]. In Table 9, it is noted that representative firms 1 and 2 are non-bankrupt, and representative
firms 3 and 4 are bankrupt.
The results of classifying the observations in the data set using
the searched weights and values of the classification variables are
summarized in Table 10. As we see in Table 10, the observations
are classified under only three of the four representative firms
of [Model 2]. Based on this result, we suspect that an additional
model, which has two representatives for non-bankrupt firms
and one representative for bankrupt firms, may not only find
the optimal values in a shorter time than [Model 2] does, but
may also show a hit ratio similar to that of [Model 2].

Table 10
[Model 2]'s classification result for observations

Data set     Bankruptcy   Representative 1   Representative 2   Representative 3   Representative 4   Total
Training     0            685                14                 –                  64                 763
             1            273                67                 –                  423                763
             Total        958                81                 –                  487                1526
Test         0            231                3                  –                  20                 254
             1            94                 26                 –                  134                254
             Total        325                29                 –                  154                508
Validation   0            230                5                  –                  19                 254
             1            86                 26                 –                  142                254
             Total        316                31                 –                  161                508

Experimenting with this additional model, we found the optimal values in 3183 trials out of 20,000 trials, which is consistent
with our presumption that the additional model would find
the optimal values in a shorter time. The further results from the
analysis of the additional model are summarized in Tables 11
through 14.
Table 11 shows the optimal weight of each variable of the additional model. Comparing this result with that of [Model 2] (see Table 8), we see that in [Model 2], X3, X12, X13, and X25 are the
variables of relatively high weights, while in the additional model,
those variables are X3, X13, X20, X24, and X25.
Table 12 shows the values of the classification variables of each
representative firm in the additional model, which are different
from those in [Model 2] (see Table 9). For example, in [Model 2],
the values of X13 of representative firms 1 and 2 are relatively high,
and that of representative firm 4 is low, while in the additional
model, the value of X13 of representative firm 3 lies between the
respective values of representative firms 1 and 2. Similar
differences can be seen in other variables as well.
This result subsequently causes a difference in hit ratios between the additional model and [Model 2]. The hit ratios of the
additional model turn out to be 77.3% for the training data and 78.0%
for the test data, which are higher than those of [Model 2], while its hit ratio
for the validation data is 77.4%, which is lower than that of [Model 2].
In [Model 2], we set the numbers of representatives for
bankrupt firms and non-bankrupt ones to be equally 2, with the result that no observation was classified under representative firm
3. In the additional model, however, we set the number of representatives for non-bankrupt firms to 2 and the number of
representatives for bankrupt ones to 1, so that we could find the
optimal values of the weights and the classification variables of
each representative firm in a shorter time, and also obtain a higher
hit ratio for the training data than in [Model 2].
While the additional model showed a higher hit ratio for the
training data than [Model 2], we notice that its hit ratio
for the validation data turned out to be lower than that of [Model 2].
In this case, however, we do not consider that we encounter an over-fitting
problem, because the additional model's hit ratio for the validation
data (77.4%) is still higher than its hit ratio for the training data
(77.3%), although it is lower than that of [Model 2] (78.4%). Also,
the result that the hit ratio of [Model 2] for the validation data
(78.4%) is higher than that of the additional model (77.4%) does
not imply that [Model 2] is better than the additional model.
Rather, we claim that the additional model is better than [Model
2] since not only does it not cause an over-fitting problem, but it also
finds the optimal values in a shorter time and yields hit ratios that are more stable and consistent across the training, test, and validation data than [Model 2] does.
Table 8
The weights of variables of [Model 2]

Weights           X2       X3       X12      X13      X15      X20      X23      X24      X25
w_j / Σ_j w_j     0.0448   0.1776   0.1403   0.1395   0.0501   0.0909   0.0715   0.0717   0.2136
w_j               0.1694   0.6720   0.5310   0.5281   0.1896   0.3441   0.2707   0.2712   0.8084
Table 9
The values of the classification variables in [Model 2]

Representative firm   X2       X3       X12      X13      X15      X20      X23      X24      X25
1 (non-bankrupt)      2.4921   0.4066   1.5520   0.3040   1.7466   1.0213   1.1974   2.9719   0.8361
2 (non-bankrupt)      2.3118   2.4352   2.7850   0.6320   0.2582   2.3915   0.4086   0.7399   1.9994
3 (bankrupt)          0.1427   2.4112   0.7815   0.7192   2.0606   2.4126   1.5321   2.1588   0.6153
4 (bankrupt)          2.0558   1.2644   1.8007   0.2825   2.3154   2.0725   0.8260   2.9237   0.5044
Table 11
The weights of the additional model's classification variables

Weight            X2       X3       X12      X13      X15      X20      X23      X24      X25
w_j / Σ_j w_j     0.0319   0.2183   0.0378   0.1674   0.0196   0.1167   0.0276   0.1749   0.2057
w_j               0.1334   0.9127   0.1580   0.7000   0.0819   0.4877   0.1154   0.7313   0.8601
Table 12
The values of the classification variables in the additional model

Representative firm   X2       X3       X12      X13      X15      X20      X23      X24      X25
1 (non-bankrupt)      2.2508   0.0855   1.3176   1.4516   2.2511   2.9119   0.6878   1.4422   2.6507
2 (non-bankrupt)      2.2057   0.1220   0.1508   2.9111   0.1874   1.8721   0.7571   0.3975   0.0664
3 (bankrupt)          0.6379   0.2129   2.3068   1.7754   1.6116   1.7152   0.2517   0.0458   1.6165
Table 13
The result of classifying the observations of the additional model

Data set     Bankruptcy   Representative 1   Representative 2   Representative 3   Total
Training     0            15                 666                82                 763
             1            3                  261                499                763
             Total        18                 927                581                1526
Test         0            5                  228                21                 254
             1            0                  91                 163                254
             Total        5                  319                184                508
Validation   0            6                  428                102                536
             1            1                  160                519                680
             Total        7                  588                621                1216
The results of classifying the observations of the additional model
are summarized in Table 13.
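As a check on how the hit ratio follows from such a classification table, the short sketch below recomputes it from the training and test counts of Table 13. It assumes, as stated for the additional model, that representative firms 1 and 2 are non-bankrupt and representative firm 3 is bankrupt; the function name is hypothetical.

```python
import numpy as np

# Status of representative firms 1..3 in the additional model (0 = non-bankrupt, 1 = bankrupt).
rep_status = np.array([0, 0, 1])

# Counts from Table 13: rows = actual status (0, 1), columns = representative firms 1..3.
training = np.array([[15, 666, 82],
                     [3, 261, 499]])
test = np.array([[5, 228, 21],
                 [0, 91, 163]])

def hit_ratio_from_counts(counts, rep_status):
    """Share of observations whose nearest representative firm has their own status."""
    correct = sum(counts[s, rep_status == s].sum() for s in (0, 1))
    return correct / counts.sum()

training_hit = hit_ratio_from_counts(training, rep_status)
test_hit = hit_ratio_from_counts(test, rep_status)
print(round(100 * training_hit, 1), round(100 * test_hit, 1))
```

The training counts give 1180 correct predictions out of 1526 and the test counts give 396 out of 508, which correspond to the hit ratios of 77.3% and 78.0% reported for the additional model.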
Now, Table 14 and Fig. 1 show the weighted values of each representative firm’s classification variables, which are calculated by
multiplying the normalized weights by the values of each representative firm’s variables.
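The calculation of the weighted values can be sketched as follows, using the normalized weights and the representative firms' values as they appear in Tables 11 and 12 of this text. The array names are hypothetical, and small differences from Table 14 in the fourth decimal place are to be expected because the inputs are already rounded.

```python
import numpy as np

variables = ["X2", "X3", "X12", "X13", "X15", "X20", "X23", "X24", "X25"]

# Normalized weights w_j / sum_j w_j of the additional model (Table 11).
normalized_weights = np.array(
    [0.0319, 0.2183, 0.0378, 0.1674, 0.0196, 0.1167, 0.0276, 0.1749, 0.2057]
)

# Values of the classification variables (Table 12), one row per representative firm.
values = np.array([
    [2.2508, 0.0855, 1.3176, 1.4516, 2.2511, 2.9119, 0.6878, 1.4422, 2.6507],  # firm 1
    [2.2057, 0.1220, 0.1508, 2.9111, 0.1874, 1.8721, 0.7571, 0.3975, 0.0664],  # firm 2
    [0.6379, 0.2129, 2.3068, 1.7754, 1.6116, 1.7152, 0.2517, 0.0458, 1.6165],  # firm 3
])

# Weighted value = normalized weight of the variable times the firm's value.
weighted = values * normalized_weights

for firm, row in enumerate(weighted, start=1):
    print(firm, dict(zip(variables, np.round(row, 4))))
```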
As seen in Fig. 1, representative firm 3, the bankrupt one, may be
characterized by a value of X12 (Current Liabilities to Total
Assets) higher than those of non-bankrupt representative firms 1 and 2, and by values of X24 (Break-Even Point Ratio) and
X25 (Employment Costs) lower than those of representative
firms 1 and 2. This sort of information can be useful for credit loan
evaluations. For example, consider a firm requesting a
credit loan from a bank. If its ratio of Current Liabilities to Total Assets is relatively high, and its Break-Even Point Ratio and Employment
Costs are relatively low compared with those of other firms in the same industry, we can suspect that the firm is more likely to face bankruptcy.
To be a sound credit loan institution, therefore, a bank should be
able to measure the distance between a firm requesting a credit
loan and each representative firm in order to check whether the
firm is likely to belong to the group with a high potential for bankruptcy,
and it must reflect the result in its credit loan decisions.
Fig. 1. The weighted values of the classification variables of each representative firm.
Table 15
Hit ratios of classification methods

| Classification methods | Training (%) | Validation (%) |
|---|---|---|
| Model in this study | 77.3 | 77.4 |
| Discriminant analysis | 70.2 | 69.1 |
| Logistic regression | 71.2 | 70.7 |
| Decision tree | 76.5 | 76.8 |
| Neural network | 78.1 | 76.4 |
Table 14
The weighted values of the classification variables of each representative firm

| Variable | 1 (non-bankrupt) | 2 (non-bankrupt) | 3 (bankrupt) |
|---|---|---|---|
| X2 | 0.0718 | 0.0704 | 0.0204 |
| X3 | 0.0187 | 0.0266 | 0.0465 |
| X12 | 0.0498 | 0.0057 | 0.0872 |
| X13 | 0.2431 | 0.4874 | 0.2973 |
| X15 | 0.0441 | 0.0037 | 0.0316 |
| X20 | 0.3397 | 0.2184 | 0.2001 |
| X23 | 0.0190 | 0.0209 | 0.0069 |
| X24 | 0.2523 | 0.0695 | 0.0080 |
| X25 | 0.5454 | 0.0137 | 0.3326 |

4.4. Measuring performance
To validate the predictive power of the proposed model, we compared its performance with those of other existing classification methods. To make a fair comparison, we applied the other methods to the same nine variables used in this paper. We also used the same data, divided into the same three subsets (training, test, and validation) as in this study. In the decision tree analysis, the splitting criterion is based on the Chi-square test with a significance level of 0.02, the maximum depth of the tree is set to 6, and the minimum number of observations is set to 5. For the neural network, a four-layer perceptron with an input layer, an output layer, and two hidden layers is used, and the number of nodes in each hidden layer is set to 9, which equals the number of input variables.
Table 15 summarizes the hit ratios of the existing classification methods and of the proposed model in this study. Based on the hit ratio for the validation data, our model shows the best performance, while its hit ratio for the training data is slightly behind that of the neural network. We should note here that this result is derived from a specific data set and may not generalize; however, from this result, we can claim that our method is at least not inferior to the existing ones for bankruptcy prediction, and that it can serve as a promising alternative with methodological advantages over existing methods such as cluster analysis and CBR, as described earlier.
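The comparison drawn from Table 15 can be reproduced with a few lines of arithmetic. The sketch below ranks the methods by validation hit ratio and reports the gap between training and validation hit ratios; the variable names are illustrative, and the gap is only an informal indicator of stability computed here, not a statistic reported in the paper.

```python
# Hit ratios (%) from Table 15: method -> (training, validation).
HIT_RATIOS = {
    "Model in this study": (77.3, 77.4),
    "Discriminant analysis": (70.2, 69.1),
    "Logistic regression": (71.2, 70.7),
    "Decision tree": (76.5, 76.8),
    "Neural network": (78.1, 76.4),
}

# Rank methods by validation hit ratio, highest first.
by_validation = sorted(HIT_RATIOS, key=lambda m: HIT_RATIOS[m][1], reverse=True)
best_training = max(HIT_RATIOS, key=lambda m: HIT_RATIOS[m][0])

for method in by_validation:
    training, validation = HIT_RATIOS[method]
    gap = training - validation  # positive gap: training accuracy exceeds validation accuracy
    print(f"{method:<22} training {training:.1f}  validation {validation:.1f}  gap {gap:+.1f}")

print(f"best on validation: {by_validation[0]}")
print(f"best on training:   {best_training}")
```

The differences among the top three methods on the validation data are about one percentage point or less, which is why the claim above is limited to "not inferior" rather than "superior".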
5. Conclusion
In this study, we have proposed a new binary classification method and empirically validated its predictive power. We expect the classification method proposed in this paper to contribute to both academia and practice in the following respects.
First, the binary classification method in this paper can contribute to the credit risk management of financial institutions such as banks. Ever-increasing competition among banks makes bankruptcy prediction an unavoidable item on their agenda for systematic credit risk management, and their ability to predict corporate failures accurately plays an extremely important role in their surviving in the market and sustaining growth through profits from their lending practices. At a macro level, the method is also expected to contribute to the economic development of a nation by helping banks allocate their funds efficiently and effectively to the prospective firms requesting credit loans according to their respective financial soundness.
Second, this study aimed to develop a new classification method as a promising alternative to existing methods for predicting corporate failures. In this respect, it served its end. In addition, the method suggested in this study is flexible in its range of application: it can be applied in other areas such as product purchase prediction and project risk management, among others.
Despite these contributions, this study has some limitations that we would like to point out, which may in turn suggest directions for further study.
First, the proposed method permits subjective human judgment in selecting the optimal model. In other words, our method does not provide objective standards for preventing the over-fitting problem in the course of searching for the optimal model. Rather, we select as the optimal model the one demonstrating relatively stable prediction accuracy across the training, test, and validation data, to the best of our knowledge and experience. We suggest a future study that derives objective guidelines to complement the subjective judgment.
Second, this study takes a naive approach to determining the number of representative firms: we simultaneously increase the number of representative firms for both bankrupt and non-bankrupt firms one by one, build the model corresponding to each number, and compare the prediction results of the models, which requires a lot of computational time and effort. Although the current computing environment alleviates much of the computational burden, a further study would be needed to incorporate into the model a method for determining the optimal number of representative firms.
Third, we did not conduct hypothesis tests to ensure that the proposed method statistically outperforms the existing methods in terms of prediction accuracy. Rather, by comparing prediction performance among the methods using sample data, we intended to show that our method is at least not inferior to the other existing methods, and hence claim that it can serve as a promising alternative, armed with its methodological merits, to the existing methods. We admit, however, that statistical tests based on rigorous experimental design remain to be conducted in the future to generalize the difference in prediction accuracy among the classification methods.
Acknowledgement
This work was supported by the Sogang University Research Grant of 2007 (200701059).
References
Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of
corporate bankruptcy. Journal of Finance, 23(4), 589–609.
Atiya, A. F. (2001). Bankruptcy prediction for credit risk using neural networks: A
survey and new results. IEEE Transactions on Neural Networks, 12(4), 929–935.
Beaver, W. H. (1966). Financial ratios and predictions of failure. Journal of Accounting
Research, 4(Suppl.), 71–111.
Bell, T. B. (1997). Neural nets or the logit model. A comparison of each model’s
ability to predict commercial bank failures. International Journal of Intelligent
Systems in Accounting, Finance and Management, 6(3), 249–264.
Bryant, S. M. (1997). A case-based reasoning approach to bankruptcy prediction
modeling. International Journal of Intelligent Systems in Accounting, Finance and
Management, 6(3), 195–214.
Cielen, A. P., & Vanhoof, K. (2004). Bankruptcy prediction using a data envelopment
analysis. European Journal of Operational Research, 154(2), 526–532.
Dimitras, A. I., Slowinski, R., Susmaga, R., & Zopounidis, C. (1999). Business failure
prediction using rough sets. European Journal of Operational Research, 114(2),
263–280.
Frydman, H., Altman, E. I., & Kao, D. (1985). Introducing recursive partitioning for
financial classification: The case of financial distress. Journal of Finance, 40(1),
269–291.
Jo, H., Han, I., & Lee, H. (1997). Bankruptcy prediction using case-based reasoning,
neural network and discriminant analysis for bankruptcy prediction. Expert
Systems with Applications, 13(2), 97–108.
Kaski, S., Sinkkonen, J., & Peltonen, J. (2001). Bankruptcy analysis with self-organizing maps in learning metrics. IEEE Transactions on Neural Networks, 12(4),
936–947.
Lacher, R. C., Coats, P. K., Sharma, S. C., & Fant, L. F. (1995). A neural network for
classifying the financial health of a firm. European Journal of Operational
Research, 85(1), 53–65.
Lam, M. (2004). Neural networks techniques for financial performance prediction:
Integrating fundamental and technical analysis. Decision Support Systems, 34(4),
567–581.
Lee, K., Booth, D., & Alam, P. (2005). A comparison of supervised and unsupervised
neural networks in predicting bankruptcy of Korean firms. Expert Systems with
Applications, 29(1), 1–16.
Leshno, M., & Spector, Y. (1996). Neural network prediction analysis: The
bankruptcy case. Neurocomputing, 10(2), 125–147.
Marais, M. L., Patel, J., & Wolfson, M. (1984). The experimental design of
classification models: An application of recursive partitioning and
bootstrapping to commercial bank loan classifications. Journal of Accounting
Research, 22(Suppl.), 87–114.
McKee, T. E. (2000). Developing a bankruptcy prediction model via rough sets
theory. International Journal of Intelligent Systems in Accounting, Finance and
Management, 9(3), 159–173.
McKee, T. E. (2003). Rough sets bankruptcy prediction models versus auditor
signaling rates. Journal of Forecasting, 22(8), 569–589.
Min, J. H., & Lee, Y. C. (2005). Bankruptcy prediction using support vector machine
with optimal choice of kernel function parameters. Expert Systems with
Applications, 28(4), 603–614.
Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy.
Journal of Accounting Research, 18(1), 109–131.
Park, C. S., & Han, I. (2002). A case-based reasoning with the feature weights derived
by analytic hierarchy process for bankruptcy prediction. Expert Systems with
Applications, 23(3), 255–264.
Salchenberger, L., Mine, C., & Lash, N. (1992). Neural networks: A tool for predicting
thrift failures. Decision Sciences, 23(4), 899–916.
Shin, K. S., & Lee, Y. J. (2002). A genetic algorithm application in bankruptcy
prediction modeling. Expert Systems with Applications, 23(3), 321–328.
Swicegood, P., & Clark, J. A. (2001). Off-site monitoring systems for predicting bank
underperformance: A comparison of neural networks, discriminant analysis and
professional human judgment. International Journal of Intelligent Systems in
Accounting, Finance and Management, 10(3), 169–186.
Tam, K. Y. (1991). Neural network models and the prediction of bank bankruptcy.
Omega, 19(5), 429–445.
Varetto, F. (1998). Genetic algorithm applications in the analysis of insolvency risk.
Journal of Banking and Finance, 22(10-11), 1421–1439.
Wilson, R. L., & Sharda, R. (1994). Bankruptcy prediction using neural networks.
Decision Support Systems, 11(5), 545–557.
Yang, Z. R., Platt, M. B., & Platt, H. D. (1999). Probability neural network in
bankruptcy prediction. Journal of Business Research, 44(2), 67–74.
Zimmermann, H. J. (1996). Fuzzy set theory and its applications. London: Kluwer
Academic Publishers.
Zmijewski, M. E. (1984). Methodological issues related to the estimation of financial
distress prediction models. Journal of Accounting Research, 22(Suppl.), 59–82.