A binary classification method for bankruptcy prediction

2009, Expert Systems With Applications

The purpose of this paper is to propose a new binary classification method for predicting corporate failure based on genetic algorithm, and to validate its prediction power through empirical analysis. Establishing virtual companies representing bankrupt companies and non-bankrupt ones, respectively, the proposed method measures the similarity between the virtual companies and the subject for prediction, and classifies the subject into either bankrupt or non-bankrupt one. The values of the classification variables of the virtual companies and the weights of the variables are determined by the proper model to maximize the hit ratio of training data set using genetic algorithm. In order to test the validity of the proposed method, we compare its prediction accuracy with those of other existing methods such as multi-discriminant analysis, logistic regression, decision tree, and artificial neural network, and it is shown that the binary classification method we propose in this paper can serve as a promising alternative to the existing methods for bankruptcy prediction.

Expert Systems with Applications 36 (2009) 5256–5263

Jae H. Min *, Chulwoo Jeong
Sogang University, Graduate School of Business, #1, Shinsu-dong, Mapo-gu, Seoul 121-742, Republic of Korea

Keywords: Bankruptcy prediction; Binary classification; Genetic algorithm

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

In the area of risk management, corporate failure prediction plays a key role in examining credit loan applications, since it enables banks to protect themselves in advance from insolvency due to bad loans and helps them sustain profitability through proper lending practices.
In addition, by predicting corporate bankruptcy properly, a bank, as a financial agency, can contribute to its community by supplying promising companies with funds that correspond to their respective financial soundness. Moreover, the implementation of the Basel II Accord induces more severe competition among banks, since it sets more rigorous risk and capital management requirements to ensure that a bank holds capital reserves appropriate to the risk to which it is exposed through its lending and investment practices. For these reasons, among others, banks now place great emphasis on identifying more accurately the risks they may face in the future.

Due to the rapidly changing corporate environment in recent years, however, many complicated new reasons behind corporate bankruptcy are emerging. Therefore, despite the many existing methodologies for predicting corporate failure, it is worthwhile for academia as well as practitioners to keep developing state-of-the-art methods that reflect symptoms of corporate failure which the existing ones may not explain.

In this paper, we develop a new method for bankruptcy prediction employing a genetic algorithm, an artificial intelligence technique. To validate our model, we empirically compare the performance of the proposed method with those of existing ones such as multi-discriminant analysis, logistic regression, decision tree, and neural network, and show its potential as a promising alternative to the existing methods.

* Corresponding author. Tel.: +82 2 705 8545; fax: +82 2 715 8505. E-mail address: [email protected] (J.H. Min).
doi:10.1016/j.eswa.2008.06.073

2. Background of the study

Bankruptcy prediction models can be divided into two main streams. The first one is based on statistical methods.
It was pioneered by Beaver (1966), followed by Altman (1968), who applied multi-discriminant analysis, and has since been developed into stochastic models such as logit (Ohlson, 1980) and probit (Zmijewski, 1984). The second stream employs artificial intelligence (AI) methods, and a number of studies have applied them to the bankruptcy prediction problem since the 1990s. AI methods include decision tree (Frydman, Altman, & Kao, 1985; Marais, Patel, & Wolfson, 1984), fuzzy set theory (Zimmermann, 1996), case-based reasoning (Bryant, 1997; Jo, Han, & Lee, 1997; Park & Han, 2002), genetic algorithm (Shin & Lee, 2002; Varetto, 1998), support vector machine (Min & Lee, 2005), data envelopment analysis (Cielen & Vanhoof, 2004), rough sets theory (Dimitras, Slowinski, Susmaga, & Zopounidis, 1999; McKee, 2000, 2003), and several kinds of neural networks such as BPNN (back propagation trained neural network) (Atiya, 2001; Bell, 1997; Lam, 2004; Leshno & Spector, 1996; Salchenberger, Mine, & Lash, 1992; Swicegood & Clark, 2001; Tam, 1991; Wilson & Sharda, 1994), PNN (probabilistic neural networks) (Yang, Platt, & Platt, 1999), SOM (self-organizing map) (Kaski, Sinkkonen, & Peltonen, 2001; Lee, Booth, & Alam, 2005), and Cascor (cascade correlation neural network) (Lacher, Coats, Sharma, & Fantc, 1995).

In this paper, we propose a new classification method following the second stream, which applies a genetic algorithm to the problem. A genetic algorithm is a search technique embodying evolutionary mechanisms such as selection, crossover, and mutation. It searches wide and complicated spaces to find the optimal values of the parameters of optimization problems with various constraints, and has been applied in a wide range of areas including bankruptcy prediction. Varetto (1998) and Shin and Lee (2002) are good examples of studies using genetic algorithms in the area of bankruptcy prediction.
Varetto (1998) suggested two different models based on genetic algorithm, one of which is a linear model estimating the constant and the variable coefficients of the discriminant function to maximize its discriminant power using genetic algorithm. The other one is a rule based model, which classifies firms according to their respective discriminant scores called GSR (genetic score by rules) using genetic algorithm. Genetic algorithm can also be used to produce a set of rules based on the tests deriving the signs and the cutoff values of selected ratios, and in this regard, Shin and Lee (2002) suggested a rule inducing model to maximize its prediction power using genetic algorithm. In previous literature as described in the above examples, genetic algorithm has played a role complementing the existing statistical methods and AI methods rather than a standalone method for bankruptcy prediction. In this study, genetic algorithm is also used as a medium of designing a new binary classification method. Specifically, it is used to estimate the weights and the values of the classification variables of the firms representing bankrupt firms and non-bankrupt ones, respectively. Now, we describe the advantage of the proposed method over some of the existing ones, which also provides the background of our approach to bankruptcy prediction. First, the proposed method is somewhat similar to cluster analysis as both methods classify the subjects into several clusters. But the proposed method differentiates itself from cluster analysis in the following aspects. The method classifies the subjects into clusters to maximize the prediction accuracy, the matching rate of the subject’s bankruptcy status and the representative firm’s one, with prior knowledge of whether each subject is bankrupt or not, while cluster analysis does its classifying job without the prior knowledge of the status of each cluster. 
Also, cluster analysis does not consider differences in the weights of the classification variables, while our model estimates the weights of the classification variables and reflects them in the prediction. In addition, by employing a genetic algorithm, our method can overcome the problem of k-means cluster analysis that the clustering results can differ with the initial value setting.

Second, at a glance, the proposed method seems similar to CBR (case-based reasoning), as both methods classify companies according to the commonality between the subject for prediction and the existing companies in the data set; however, the model in this paper differs from CBR in the following aspects. Above all, our model takes a global learning approach while CBR takes a local learning approach. In other words, CBR measures the distance between each subject for prediction and all existing companies in the data set, identifies the nearest k companies, and uses them to classify each subject as good or bad. By contrast, our model employs a genetic algorithm to search a wider space for the optimal values of the weights and the classification variables of the virtual firms representing bankrupt and non-bankrupt firms (the representative firms) in order to maximize its prediction power, calculates the distance between each subject for prediction and the representative firms, and classifies the subject as either bankrupt or non-bankrupt according to the status of the representative firm nearest to the subject.

In addition, our method can provide more meaningful information than CBR does. While CBR provides information about the bankruptcy status of the subject for prediction and shows the firm most similar to each subject, it is very difficult to identify which characteristics of the reference firm are similar to those of the firm for prediction.
On the contrary, our method provides all the information CBR does and, at the same time, helps us easily identify what characteristics the representative firm nearest to the subject for prediction has in comparison with the other representative firms, so that we can logically infer that a subject classified into the same cluster as a particular representative firm would have characteristics similar to those of that representative firm.

3. Mathematical model

We now describe the proposed binary classification model mathematically. The distance between a representative firm and an observation in the data set, which indicates how similar or dissimilar they are, can be expressed as

$$d_{ki} = \sum_{j} \left( \frac{w_j}{\sum_{j} w_j} \right) \left| X_{kj} - X_{ij} \right| \tag{1}$$

where $d_{ki}$ is the distance between representative firm $k$ and observation $i$ ($i = 1, 2, \ldots, N$); $X_{kj}$ is representative firm $k$'s value of the $j$th variable ($j = 1, 2, \ldots, M$); $X_{ij}$ is observation $i$'s value of the $j$th variable; and $w_j$ is the weight of the $j$th variable.

In Eq. (1), for each observation $i$, $d_{ki}$ is calculated once for each representative firm. An observation can then be classified as either bankrupt or non-bankrupt according to the status of the representative firm having the minimum value among the $d_{ki}$'s (denoted by $d_{k^{*}i}$). If that representative firm is bankrupt, observation $i$ is classified as a bankrupt company. Otherwise, the observation is classified as a non-bankrupt one. Once all the observations in the data set are classified in this manner, the hit ratio, the proportion of correct predictions, can be calculated. In this paper, we employ a genetic algorithm to search for the weights and the values of the classification variables of the representative firms so as to maximize the hit ratio. This process can be expressed as

$$\max \; H = \frac{1}{N} \sum_{i=1}^{N} C_i \quad \text{subject to} \quad C_i = \begin{cases} 1 & \text{if } B(i) = B(k^{*}) \\ 0 & \text{otherwise} \end{cases} \;\; \forall i \in \{1, 2, \ldots, N\}, \qquad d_{k^{*}i} = \min_{k} d_{ki} \tag{2}$$

where $H$ is the hit ratio (the number of correct predictions over the number of all predictions for $N$ observations); $N$ is the number of observations in the data set; $C_i$ indicates whether the prediction for observation $i$ is correct ($C_i = 1$) or not ($C_i = 0$) ($i = 1, 2, \ldots, N$); $B(i)$ indicates whether observation $i$ is bankrupt ($B(i) = 1$) or not ($B(i) = 0$); $B(k^{*})$ indicates whether representative firm $k^{*}$ is bankrupt ($B(k^{*}) = 1$) or not ($B(k^{*}) = 0$); $k^{*}$ is the representative firm corresponding to $d_{k^{*}i}$; and $d_{k^{*}i}$ is the minimum distance between a representative firm $k$ and observation $i$.

4. Empirical analysis

4.1. Data

The final data set for the analysis contains financial ratios of 2542 externally audited small and medium-sized manufacturing firms. Among them, 1271 firms filed for bankruptcy and the other 1271 firms are non-bankrupt ones, over the period from 2001 through 2004.

The data selection process is as follows. First, we gathered 27 financial ratios (see Table 2) of 2814 firms (1407 bankrupt and 1407 non-bankrupt ones) to conduct an empirical analysis. For bankrupt firms, the financial ratios as of one year before bankruptcy were gathered. Second, we used the means and standard deviations of the financial ratios of the 2814 firms to standardize them as Z-values, and the observations (the firms) whose Z-values were outside the range [−3, 3] were considered outliers and excluded from the data set. The numbers of remaining bankrupt and non-bankrupt firms then become 1271 and 1316, respectively. Third, in order to equalize the numbers of bankrupt and non-bankrupt firms, we excluded 45 randomly selected non-bankrupt firms from the data set. Table 1 shows how the number of data points in the sample is reduced step by step according to the process described above.
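The outlier exclusion and class balancing steps above can be illustrated with a short sketch. This is only a minimal illustration in Python with NumPy, not the authors' procedure: the function names `inlier_mask` and `balance_classes` are hypothetical, and it assumes that a firm is dropped when any one of its standardized ratios falls outside [−3, 3] and that the population standard deviation is used, neither of which the text specifies.

```python
import numpy as np

def inlier_mask(ratios, limit=3.0):
    """Boolean mask of firms whose Z-values all lie within [-limit, limit].

    Assumption: a firm counts as an outlier if ANY of its ratios is
    outside the range; Z-values use the column means and (population)
    standard deviations of the full sample.
    """
    z = (ratios - ratios.mean(axis=0)) / ratios.std(axis=0)
    return np.all(np.abs(z) <= limit, axis=1)

def balance_classes(labels, rng):
    """Randomly drop firms from the larger class so both classes are
    equally sized; returns the sorted indices of the firms to keep."""
    idx0 = np.flatnonzero(labels == 0)  # non-bankrupt
    idx1 = np.flatnonzero(labels == 1)  # bankrupt
    n = min(len(idx0), len(idx1))
    keep0 = rng.choice(idx0, size=n, replace=False)
    keep1 = rng.choice(idx1, size=n, replace=False)
    return np.sort(np.concatenate([keep0, keep1]))
```

Applied in this order to a ratio matrix and a 0/1 bankruptcy label vector, the two helpers mirror the second and third steps summarized in Table 1.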
The entire data is split into three subsets which are training set, test set, and validation set with respective proportions of 60%, 20%, and 20% using stratified sampling method. So, the numbers of data points in training set, test set, and validation set turn out to be 1526, 508, and 508, respectively. Table 2 shows the definitions of the 27 financial ratios we selected. In order to select the variables for the analysis, we concurrently employed several statistical methods including independent sample t-test, discriminant analysis, logistic regression, decision tree, and factor analysis. Now, Table 3 presents the variables selected by each statistical method we just mentioned. The variables selected by independent sample t-test are those showing significant differences in their means between bankrupt companies and non-bankrupt ones. In discriminant analysis and logistic regression, we employ stepwise selection method, where we set an entry value of 0.01 and an excluding value of 0.05 as the levels of significance for F statistic. In decision tree, we use Chi-square test as the splitting criterion where we set the significance level to be 0.02, the maximum depth of root to be 6, and the minimum number of observations in a leaf to be 5. Next, we conducted factor analysis for the selected variables, and could derive 8 factors as shown in Table 4. Considering all the results from applying the various methods described above, we selected the final variables for the analysis. The variable selection process can be summarized as follows. First, among the variables under each factor derived by factor analysis, we only selected the variables that were also chosen by discriminant analysis, logistic regression, or decision tree. Second, among the variables selected in the previous step, we excluded the variables that turned out to be insignificant by the independent sample t-test. 
Third, among the variables remaining under each factor, we tried to choose one variable by excluding the variables whose meanings are similar to that of the chosen one. Table 5 summarizes the variables chosen according to this variable selection process. In Table 5, we notice that under factor 5, for example, there are two variables, X15 and X21, in the second step; however, only X15 is selected in the third step, since the meanings of X15 and X21 are considered similar to each other. In contrast, both X12 and X13 under factor 4 are selected in the third step as well as in the second step, because they are believed to have different impacts on firms' bankruptcy status. Likewise, under factor 6, variables X20, X23, and X25 are considered to be variables of different characteristics. Finally, the variables selected for the analysis are reduced to 9 variables: X2, X3, X12, X13, X15, X20, X23, X24, and X25.

Table 1
The number of data points in the data selection process

| | Raw data | First step: excluding outliers | Second step: equalizing the number of data points in each category |
|---|---|---|---|
| Non-bankrupt | 1407 | 1316 | 1271 |
| Bankrupt | 1407 | 1271 | 1271 |
| Total | 2814 | 2587 | 2542 |

Table 2
Definition of variables

| Variable | Definition |
|---|---|
| X1 | Gross value added to sales |
| X2 | Gross value added to total assets |
| X3 | Growth rate of total assets |
| X4 | Ordinary income to sales |
| X5 | Net income to sales |
| X6 | Operating income to sales |
| X7 | Costs of sales to sales |
| X8 | Net interest expenses to sales |
| X9 | Ordinary income to total assets |
| X10 | Rate of earnings on total capital |
| X11 | Net working capital to total assets |
| X12 | Current liabilities to total assets |
| X13 | Stockholders' equity to total assets |
| X14 | Total borrowings and bonds payable to total assets |
| X15 | Total assets turnover |
| X16 | Ordinary income to total assets |
| X17 | Net working capital to sales |
| X18 | Stockholders' equity to sales |
| X19 | Ordinary income to total assets |
| X20 | Depreciation expenses |
| X21 | Operating assets turnover |
| X22 | Interest expenses to total expenses |
| X23 | Net interest expenses |
| X24 | Break-even point ratio |
| X25 | Employment costs |
| X26 | Interest expenses and net income to total assets |
| X27 | Earnings before interest and taxes to sales |

Table 3
Selected variables by each analyzing method

| Analyzing method | Selected variables |
|---|---|
| Independent sample t-test | X2, X3, X12, X13, X15, X20, X21, X23, X24, X25 |
| Discriminant analysis | X2, X3, X9, X12, X15, X20, X21, X22, X24, X25 |
| Logistic regression | X2, X3, X7, X9, X12, X20, X21, X22, X25 |
| Decision tree | X2, X3, X4, X10, X15, X16, X20, X23, X24, X25 |

Table 4
Rotated factor matrix for financial variables

| Variable | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 | Factor 7 | Factor 8 |
|---|---|---|---|---|---|---|---|---|
| X6 | 0.988 | 0.090 | 0.064 | 0.052 | 0.009 | 0.003 | 0.002 | 0.002 |
| X4 | 0.985 | 0.092 | 0.117 | 0.060 | 0.023 | 0.005 | 0.015 | 0.000 |
| X27 | 0.982 | 0.094 | 0.145 | 0.055 | 0.006 | 0.001 | 0.009 | 0.004 |
| X5 | 0.980 | 0.086 | 0.122 | 0.066 | 0.025 | 0.007 | 0.032 | 0.003 |
| X1 | 0.968 | 0.086 | 0.073 | 0.053 | -0.034 | 0.063 | 0.159 | 0.031 |
| X10 | 0.111 | 0.940 | 0.006 | 0.172 | 0.117 | 0.057 | 0.028 | 0.064 |
| X26 | 0.101 | 0.940 | 0.010 | 0.193 | 0.122 | 0.057 | 0.005 | 0.033 |
| X19 | 0.105 | 0.939 | 0.009 | 0.182 | 0.147 | 0.044 | 0.004 | 0.044 |
| X9 | 0.114 | 0.937 | 0.003 | 0.169 | 0.136 | 0.042 | 0.027 | 0.066 |
| X2 | 0.068 | 0.521 | 0.014 | 0.039 | 0.311 | 0.255 | 0.450 | 0.062 |
| X17 | 0.066 | 0.003 | 0.979 | 0.031 | 0.075 | 0.011 | 0.009 | 0.033 |
| X18 | 0.066 | 0.003 | 0.979 | 0.031 | 0.075 | 0.011 | 0.009 | 0.033 |
| X8 | 0.191 | 0.010 | 0.907 | 0.023 | 0.064 | 0.023 | 0.016 | 0.014 |
| X12 | 0.089 | 0.169 | 0.003 | 0.856 | 0.070 | 0.020 | 0.072 | 0.014 |
| X13 | 0.089 | 0.249 | 0.037 | 0.834 | 0.224 | 0.020 | 0.051 | 0.080 |
| X11 | 0.089 | 0.173 | 0.049 | 0.806 | 0.024 | 0.121 | 0.149 | 0.097 |
| X22 | 0.047 | 0.078 | 0.105 | 0.174 | 0.777 | 0.141 | 0.111 | 0.111 |
| X15 | 0.015 | 0.402 | 0.012 | 0.103 | 0.675 | 0.069 | 0.216 | 0.187 |
| X21 | 0.003 | 0.375 | 0.005 | 0.125 | 0.621 | 0.049 | 0.199 | 0.138 |
| X14 | 0.062 | 0.090 | 0.022 | 0.428 | 0.537 | 0.034 | 0.123 | 0.123 |
| X25 | 0.028 | 0.060 | 0.015 | 0.034 | 0.024 | 0.855 | 0.101 | 0.020 |
| X20 | 0.005 | 0.047 | 0.001 | 0.024 | 0.047 | 0.792 | 0.088 | 0.026 |
| X23 | 0.024 | 0.008 | 0.050 | 0.167 | 0.382 | 0.642 | 0.311 | 0.004 |
| X24 | 0.107 | 0.041 | 0.028 | 0.026 | 0.121 | 0.059 | 0.760 | 0.057 |
| X7 | 0.198 | 0.087 | 0.031 | 0.043 | 0.158 | 0.114 | 0.747 | 0.074 |
| X3 | 0.038 | 0.117 | 0.041 | 0.101 | 0.112 | 0.011 | 0.154 | 0.751 |
| X16 | 0.049 | 0.052 | 0.069 | 0.067 | 0.049 | 0.017 | 0.117 | 0.652 |

Table 5
Variable selection process

| Factor | First step | Second step | Third step |
|---|---|---|---|
| 1 | X4 | - | - |
| 2 | X2, X9, X10 | X2 | X2 |
| 3 | - | - | - |
| 4 | X11, X12, X13, X22 | X12, X13 | X12, X13 |
| 5 | X15, X21 | X15, X21 | X15 |
| 6 | X20, X23, X25 | X20, X23, X25 | X20, X23, X25 |
| 7 | X7, X24 | X24 | X24 |
| 8 | X3, X16 | X3 | X3 |

4.2. Model design

In this study, we initially examine five models, denoted by [Model 1] through [Model 5], according to the numbers of representative firms for bankruptcy and non-bankruptcy, which are increased simultaneously, one at a time, from 1 to 5. That is, [Model 1] has one representative for bankrupt firms and one for non-bankrupt ones, [Model 2] has two representatives for bankrupt firms and two for non-bankrupt ones, and so on. We increase the number of representative firms in this manner because, if the number of representative firms is too small, the model may be too simplified, while there is a risk of over-fitting if the number is too large. As we increase the number of representative firms one by one for both bankrupt and non-bankrupt firms, we can observe the trend of the hit ratios across these five models.

We also increase the numbers of representative firms for bankrupt and non-bankrupt firms simultaneously because this saves time in finding the optimal model. If we considered all the alternatives (e.g., one representative for bankrupt firms and two representatives for non-bankrupt firms), we would have to spend too much computational time searching for the optimal model. Therefore, we first calculate the hit ratios of the five models above, each of which has an equal number of representatives for bankrupt and non-bankrupt firms, and try to find a sub-optimal model.
And then, we search for the optimal model, in which the numbers of representatives for bankrupt and non-bankrupt firms may differ from each other. This additional model is dealt with in the latter part of this paper.

With respect to the application of the genetic algorithm, the chromosome structure of the model is as follows. Each chromosome is composed of 9 genes for the weights plus 9 genes per representative firm for the representative firms' classification variables. The genes for the weights are real numbers in the range [0, 1], and the genes for the representative firms' variables are real numbers in the range [−3, 3]. For initial values, random numbers in [0, 1] are used for the weight genes, and random variates from the standard normal distribution are used for the genes of the representative firms' variables. To search for the optimal values, we set the population size to 100, the mutation rate to 0.1 for both $w_j$ and $X_{kj}$, and the crossover rate to 0.5 for both $w_j$ and $X_{kj}$. The search is terminated when the total number of trials reaches 20,000. For the experiment, we use Evolver 4.0, a genetic algorithm software package.

4.3. Analysis results and discussion

Table 6 reports, for each model, the total number of iterations and the valid number of iterations until the optimal values are found. Table 7 shows the hit ratio of each model. In Table 7, we notice that as the number of representative firms gets larger (in other words, as the model number increases from 1 to 5), the hit ratio for the training data increases. For the test data, however, the hit ratio peaks when the number of representative firms is 4, and after that point it tends to decline.
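The chromosome layout of Section 4.2 and the hit ratio that the genetic algorithm maximizes (the quantity reported per model in Table 7) can be sketched as follows. This is a hedged illustration in Python with NumPy, not the Evolver 4.0 setup used in the study: all function names are hypothetical, the gene order (weights first, then one block of 9 values per representative firm) is an assumption, and clipping the initial standard normal draws to [−3, 3] is an assumption made to respect the stated gene range.

```python
import numpy as np

M = 9  # number of classification variables used in the study

def decode(chromosome, n_reps):
    """Split a flat chromosome into M weights and an (n_reps, M) matrix
    of representative-firm values (assumed gene order)."""
    chromosome = np.asarray(chromosome, dtype=float)
    weights = chromosome[:M]
    reps = chromosome[M:].reshape(n_reps, M)
    return weights, reps

def distances(X, weights, reps):
    """Eq. (1): normalized-weight sum of absolute differences between
    each observation and each representative firm, shape (N, n_reps)."""
    w = weights / weights.sum()
    return (np.abs(X[:, None, :] - reps[None, :, :]) * w).sum(axis=2)

def hit_ratio(chromosome, X, y, rep_status):
    """Eq. (2): share of observations whose nearest representative firm
    has the same bankruptcy status (1 = bankrupt, 0 = non-bankrupt)."""
    rep_status = np.asarray(rep_status)
    weights, reps = decode(chromosome, len(rep_status))
    nearest = distances(X, weights, reps).argmin(axis=1)
    return float(np.mean(rep_status[nearest] == y))

def random_chromosome(n_reps, rng):
    """Initial genes: weights uniform on [0, 1]; representative values
    standard normal, clipped to [-3, 3] (clipping is an assumption)."""
    weights = rng.uniform(0.0, 1.0, size=M)
    reps = np.clip(rng.standard_normal(n_reps * M), -3.0, 3.0)
    return np.concatenate([weights, reps])
```

Under these assumptions, [Model 2] corresponds to `n_reps = 4` with `rep_status = [0, 0, 1, 1]`, giving a chromosome of 9 + 4 × 9 = 45 genes; a genetic algorithm would use `hit_ratio` on the training set as its fitness function.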
Such a trend can be interpreted as an over-fitting problem: while the fit of the model to the training data improves as the number of representative firms increases, the hit ratio for the test data tends to fall once the number of representative firms exceeds a certain number. For example, in the case of [Model 4], an over-fitting problem exists, as the hit ratio for the test data (76.8%) turns out to be lower than that for the training data (77.9%). Similarly, this problem occurs in the other models except [Model 2]. Therefore, we select [Model 2] as the best of the 5 models, since it not only avoids the over-fitting problem, but its hit ratio for the test data is also the highest. The hit ratio of [Model 2] for the validation data (78.4%) is also the highest among the 5 models.

Table 6
The number of iterations to search for the optimal values by each model

| Model | Total number of iterations | Valid number of iterations |
|---|---|---|
| 1 | 20,000 | 2444 |
| 2 | 20,000 | 19,719 |
| 3 | 20,000 | 17,525 |
| 4 | 20,000 | 15,567 |
| 5 | 20,000 | 11,472 |

Table 7
Hit ratio of each model

| Model | # of representative firms | Training data (%) | Test data (%) | Validation data (%) |
|---|---|---|---|---|
| 1 | 2 | 76.4 | 76.2 | 75.8 |
| 2 | 4 | 77.0 | 77.0 | 78.4 |
| 3 | 6 | 77.1 | 76.2 | 77.8 |
| 4 | 8 | 77.9 | 76.8 | 75.8 |
| 5 | 10 | 78.2 | 76.4 | 70.5 |

Now it is worthwhile to analyze [Model 2] further. Table 8 shows the normalized weight of each variable of [Model 2]. The weights of X3, X12, X13, and X25 are relatively high, while those of X2, X15, X20, X23, and X24 are relatively low. This result implies that X3, X12, X13, and X25 have more impact on classifying the observations than the other variables do. Table 9 shows the optimal values of each representative firm's classification variables in [Model 2]. In Table 9, representative firms 1 and 2 are non-bankrupt, and representative firms 3 and 4 are bankrupt. The results of classifying the observations in the data set using the searched weights and values of the classification variables are summarized in Table 10. As we see in Table 10, the observations are classified under only three of the four representative firms of [Model 2].

Table 10
[Model 2]'s classification result for observations

| Data set | Bankruptcy | Representative 1 | Representative 2 | Representative 3 | Representative 4 | Total |
|---|---|---|---|---|---|---|
| Training | 0 | 685 | 14 | – | 64 | 763 |
| Training | 1 | 273 | 67 | – | 423 | 763 |
| Training | Total | 958 | 81 | – | 487 | 1526 |
| Test | 0 | 231 | 3 | – | 20 | 254 |
| Test | 1 | 94 | 26 | – | 134 | 254 |
| Test | Total | 325 | 29 | – | 154 | 508 |
| Validation | 0 | 230 | 5 | – | 19 | 254 |
| Validation | 1 | 86 | 26 | – | 142 | 254 |
| Validation | Total | 316 | 31 | – | 161 | 508 |

Based on this result, we suspect that an additional model, which has two representatives for non-bankrupt firms and one representative for bankrupt firms, may not only find the optimal values in a shorter time than [Model 2] does, but may also show a hit ratio similar to that of [Model 2]. Experimenting with this additional model, we found the optimal values in 3183 trials out of 20,000 trials, which is consistent with our presumption that the additional model would find the optimal values in a shorter time. The further results from the analysis of the additional model are summarized in Tables 11 through 14.

Table 11 shows the optimal weight of each variable of the additional model. Comparing this result with that of [Model 2] (see Table 8), we see that in [Model 2], X3, X12, X13, and X25 are the variables with relatively high weights, while in the additional model, those variables are X3, X13, X20, X24, and X25. Table 12 shows the values of the classification variables of each representative firm in the additional model, which are different from those in [Model 2] (see Table 9). For example, in [Model 2], the values of X13 of representative firms 1 and 2 are relatively high and that of representative firm 4 is low, while in the additional model, the value of X13 of representative firm 3 lies between the respective values of representative firms 1 and 2. Similar differences can be seen in other variables as well. This result in turn causes a difference in hit ratios between the additional model and [Model 2]. The hit ratios of the additional model turn out to be 77.3% for the training data and 78.0% for the test data, which are higher than those of [Model 2], while its hit ratio for the validation data is 77.4%, which is lower than that of [Model 2].

In [Model 2], we set the numbers of representatives for bankrupt and non-bankrupt firms both to 2, with the result that no observation was classified under representative firm 3. In the additional model, however, we set the number of representatives for non-bankrupt firms to 2 and the number of representatives for bankrupt ones to 1, so that we could find the optimal values of the weights and the classification variables of each representative firm in a shorter time, and also obtain a higher hit ratio for the training data than in [Model 2]. While the additional model showed a higher hit ratio for the training data than [Model 2], its hit ratio for the validation data turned out to be lower than that of [Model 2]. In this case, however, we do not say that we encounter an over-fitting problem, because the additional model's hit ratio for the validation data (77.4%) is still higher than its hit ratio for the training data (77.3%), although it is lower than that of [Model 2] (78.4%).
Also, the result that the hit ratio of [Model 2] for the validation data (78.4%) is higher than that of the additional model (77.4%) does not imply that [Model 2] is better than the additional model. Rather, we claim that the additional model is better than [Model 2], since it not only avoids an over-fitting problem, but also finds the optimal values in a shorter time and behaves in a more stable and consistent manner than [Model 2] does with regard to its hit ratios across the training, test, and validation data.

Table 8
The weights of variables of [Model 2]

| Variable | Normalized weight $w_j / \sum_j w_j$ | Weight $w_j$ |
|---|---|---|
| X2 | 0.0448 | 0.1694 |
| X3 | 0.1776 | 0.6720 |
| X12 | 0.1403 | 0.5310 |
| X13 | 0.1395 | 0.5281 |
| X15 | 0.0501 | 0.1896 |
| X20 | 0.0909 | 0.3441 |
| X23 | 0.0715 | 0.2707 |
| X24 | 0.0717 | 0.2712 |
| X25 | 0.2136 | 0.8084 |

Table 9
The values of the classification variables in [Model 2]

| Variable | Representative firm 1 (non-bankrupt) | Representative firm 2 (non-bankrupt) | Representative firm 3 (bankrupt) | Representative firm 4 (bankrupt) |
|---|---|---|---|---|
| X2 | 2.4921 | 2.3118 | 0.1427 | 2.0558 |
| X3 | 0.4066 | 2.4352 | 2.4112 | 1.2644 |
| X12 | 1.5520 | 2.7850 | 0.7815 | 1.8007 |
| X13 | 0.3040 | 0.6320 | 0.7192 | 0.2825 |
| X15 | 1.7466 | 0.2582 | 2.0606 | 2.3154 |
| X20 | 1.0213 | 2.3915 | 2.4126 | 2.0725 |
| X23 | 1.1974 | 0.4086 | 1.5321 | 0.8260 |
| X24 | 2.9719 | 0.7399 | 2.1588 | 2.9237 |
| X25 | 0.8361 | 1.9994 | 0.6153 | 0.5044 |

Table 11
The weights of the additional model's classification variables

| Variable | Normalized weight $w_j / \sum_j w_j$ | Weight $w_j$ |
|---|---|---|
| X2 | 0.0319 | 0.1334 |
| X3 | 0.2183 | 0.9127 |
| X12 | 0.0378 | 0.1580 |
| X13 | 0.1674 | 0.7000 |
| X15 | 0.0196 | 0.0819 |
| X20 | 0.1167 | 0.4877 |
| X23 | 0.0276 | 0.1154 |
| X24 | 0.1749 | 0.7313 |
| X25 | 0.2057 | 0.8601 |

Table 12
The values of the classification variables in the additional model

| Variable | Representative firm 1 (non-bankrupt) | Representative firm 2 (non-bankrupt) | Representative firm 3 (bankrupt) |
|---|---|---|---|
| X2 | 2.2508 | 2.2057 | 0.6379 |
| X3 | 0.0855 | 0.1220 | 0.2129 |
| X12 | 1.3176 | 0.1508 | 2.3068 |
| X13 | 1.4516 | 2.9111 | 1.7754 |
| X15 | 2.2511 | 0.1874 | 1.6116 |
| X20 | 2.9119 | 1.8721 | 1.7152 |
| X23 | 0.6878 | 0.7571 | 0.2517 |
| X24 | 1.4422 | 0.3975 | 0.0458 |
| X25 | 2.6507 | 0.0664 | 1.6165 |

Table 13
The result of classifying the observations of the additional model

| Data set | Bankruptcy | Representative 1 | Representative 2 | Representative 3 | Total |
|---|---|---|---|---|---|
| Training | 0 | 15 | 666 | 82 | 763 |
| Training | 1 | 3 | 261 | 499 | 763 |
| Training | Total | 18 | 927 | 581 | 1526 |
| Test | 0 | 5 | 228 | 21 | 254 |
| Test | 1 | 0 | 91 | 163 | 254 |
| Test | Total | 5 | 319 | 184 | 508 |
| Validation | 0 | 6 | 428 | 102 | 536 |
| Validation | 1 | 1 | 160 | 519 | 680 |
| Validation | Total | 7 | 588 | 621 | 1216 |

The results of classifying the observations of the additional model are summarized in Table 13. Table 14 and Fig. 1 show the weighted values of each representative firm's classification variables, which are calculated by multiplying the normalized weights by the values of each representative firm's variables. As seen in Fig. 1, representative firm 3, the bankrupt one, may be characterized by a value of X12 (Current Liabilities to Total Assets) that is higher than those of the non-bankrupt representative firms 1 and 2, while its values of X24 (Break-Even Point Ratio) and X25 (Employment Costs) are lower than those of representative firms 1 and 2. This sort of information can be useful for credit loan evaluations. For example, consider a firm requesting a credit loan from a bank.
If its ratio of Current Liabilities to Total Assets is relatively high, and its Break-Even Point Ratio and Employment Costs are relatively low compared with those of other firms in the same industry, we can suspect that the firm is more likely to face bankruptcy. To be a sound credit loan institution, therefore, a bank should be able to measure the distance between a firm requesting a credit loan and each representative firm in order to check whether the firm is likely to belong to the group with a high potential for bankruptcy, and it must reflect the result in its credit loan decisions.

Fig. 1. The weighted values of the classification variables of each representative firm.

4.4. Measuring performance

Table 15 Hit ratios of classification methods

Classification methods    Training (%)  Validation (%)
Model in this study       77.3          77.4
Discriminant analysis     70.2          69.1
Logistic regression       71.2          70.7
Decision tree             76.5          76.8
Neural network            78.1          76.4

To validate the predictive power of the proposed model, we compared its performance with those of other existing classification methods. To make a fair comparison, we applied the same 9 variables used in this paper to the other methods. We also used the same data, divided into the three subsets of training, test, and validation, as in this study. In the decision tree analysis, the splitting criterion is based on the Chi-square test with a significance level of 0.02, the maximum depth of the tree is set to 6, and the minimum number of observations is set to 5.
For the neural network, a four-layer perceptron with an input layer, an output layer, and two hidden layers is used, and the number of nodes in each hidden layer is set to 9, which is equal to the number of input variables. Table 15 summarizes the comparison of the hit ratios of the existing classification methods with that of the model proposed in this study. Based on the hit ratio for the validation data, our model shows the best performance, while its hit ratio for the training data is slightly behind that of the neural network. We should note here that this result is derived from specific data, so it may not generalize; however, from this result, we can claim that our method is at least not inferior to the existing methods for bankruptcy prediction, and that it can serve as a promising alternative with methodological advantages over existing methods such as cluster analysis and CBR, as described earlier.

Table 14 The weighted values of the classification variables of each representative firm

            Representative firm
Variable    1 (non-bankrupt)  2 (non-bankrupt)  3 (bankrupt)
X2          0.0718            0.0704            0.0204
X3          0.0187            0.0266            0.0465
X12         0.0498            0.0057            0.0872
X13         0.2431            0.4874            0.2973
X15         0.0441            0.0037            0.0316
X20         0.3397            0.2184            0.2001
X23         0.0190            0.0209            0.0069
X24         0.2523            0.0695            0.0080
X25         0.5454            0.0137            0.3326

5. Conclusion

In this study, we have proposed a new binary classification method and empirically validated its predictive power. We expect that the classification method proposed in this paper may contribute to academia as well as to practitioners in the following respects.

First, the binary classification method in this paper can contribute to the credit risk management of financial institutions such as banks. Ever-increasing competition among banks makes bankruptcy prediction an unavoidable part of systematic credit risk management, and the ability to predict corporate failures accurately plays an extremely important role in their survival in the market and in sustaining growth through profits from their lending practices. Also, at a macro level, the method is expected to contribute to the economic development of a nation by helping banks allocate their funds efficiently and effectively to prospective firms requesting credit loans, according to their respective financial soundness.

Second, this study set out to develop a new classification method as a promising alternative to existing methods for predicting corporate failures, and in this respect it served its end. In addition, the method suggested in this study is flexible in its range of application: it can be applied in other areas such as product purchase prediction and project risk management, among others.

Despite these contributions, this study has some limitations that we would like to point out, which may in turn suggest directions for further study.

First, the proposed method permits subjective human judgment in selecting the optimal model. In other words, our method does not provide objective standards for preventing the over-fitting problem in the course of searching for the optimal model. Rather, we select as the optimal model the one demonstrating relatively stable prediction accuracy across the training, test, and validation data, to the best of our knowledge and experience. We suggest a future study that derives objective guidelines to complement this subjective judgment.

Second, this study takes a naive approach to determining the number of representative firms: we simultaneously increase the number of representative firms for both bankrupt and non-bankrupt firms one by one, build the model corresponding to each number, and compare the prediction results of the models, which requires a great deal of computational time and effort. Although the current computing environment alleviates much of the computational burden, a further study would be needed to incorporate into the model a method for determining the optimal number of representative firms.

Third, we did not conduct hypothesis tests to ensure that the proposed method statistically outperforms the existing methods in terms of prediction accuracy. Rather, by comparing prediction performance among the methods using sample data, we intended to show that our method is at least not inferior to the other existing methods, and hence claim that it can serve as a promising alternative, armed with its methodological merits, to the existing methods. We admit, however, that statistical tests based on rigorous experimental design remain to be conducted in the future to generalize the difference in prediction accuracy among the classification methods.

Acknowledgement

This work was supported by the Sogang University Research Grant of 2007 (200701059).