Hybrid Mining Approach in The Design of Credit Scoring Models


Expert Systems with Applications 28 (2005) 655–665

www.elsevier.com/locate/eswa

Hybrid mining approach in the design of credit scoring models


Nan-Chen Hsieh*
Department of Information Management, National Taipei College of Nursing, No. 365, Min-Ten Road 11257, Taipei, Taiwan, ROC

Abstract

Unrepresentative data samples are likely to reduce the utility of data classifiers in practical application. This study presents a hybrid mining approach in the design of an effective credit scoring model, based on clustering and neural network techniques. We used clustering techniques to preprocess the input samples with the objective of grouping unrepresentative samples into isolated and inconsistent clusters, and used neural networks to construct the credit scoring model. The clustering stage involved a class-wise classification process. A self-organizing map clustering algorithm was used to automatically determine the number of clusters and the starting points of each cluster. Then, the K-means clustering algorithm was used to generate clusters of samples belonging to new classes and eliminate the unrepresentative samples from each class. In the neural network stage, samples with new class labels were used in the design of the credit scoring model. Experiments on two real world credit data sets demonstrate that the hybrid mining approach can be used to build effective credit scoring models.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Data mining; Credit scoring model; Clustering; Class-wise classification; Neural network

* Tel./fax: +886 2 2822 7101 2200.
E-mail address: [email protected]
0957-4174/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2004.12.022

1. Introduction

In response to the recent growth of the credit industry and the management of large loan portfolios, the industry is actively developing credit and behavioral scoring models. Credit scoring models help to decide whether to grant credit to new applicants using customer characteristics such as age, income and marital status (Chen & Huang, 2003). Behavioral scoring models help to analyze the purchasing behavior of existing customers (Setiono, Thong, & Yap, 1998). This study utilizes a hybrid mining approach in the design of credit scoring models to support credit approval decisions.

A simple parametric statistical model, linear discriminant analysis, was the first model employed for credit scoring. However, analysts have questioned the appropriateness of linear discriminant analysis for credit scoring because of the categorical nature of the credit data and the fact that the covariance matrices of the good and bad credit classes are not likely to be equal. Researchers are now investigating more sophisticated models to overcome some of the deficiencies of the linear discriminant analysis model. One such effort is the investigation of non-parametric statistical methods, that is, neural networks, for scoring applications.

Swales and Yoon (1992) used neural networks to differentiate stocks. They found that the neural networks performed significantly better than the linear multiple discriminant models. Tam and Kiang (1992) compared a linear classifier model, a logistic regression model, a neural network model, and a decision tree model for predicting bank failures. They demonstrated that neural networks are more accurate, adaptive, and robust than the other methods. Zhang et al. (1999) similarly showed that neural networks are significantly better than logistic regression models in bankruptcy prediction. Desai, Crook, and Overstreet (1996) performed an experiment showing that the performance of discriminant analysis is comparable to that of back-propagation neural networks in classifying loan applicants into good and bad credit. They pointed out that more customized architectures might be necessary for building effective generic models to classify consumer loan applications in the credit union environment. West (2000) investigated the credit scoring accuracy of five neural network models, and reported that a hybrid architecture of neural network models should be considered for credit scoring applications. That is, non-parametric and hybrid design architectures will be very useful in developing effective credit scoring models. The meaning of an effective credit scoring model is two-fold, relating to accuracy and easy interpretation of classified results.

Generally, hybrid design architectures have been proposed for creating more accurate classifiers that use neural networks guided by clustering algorithms (Gopalakrishnan, Sridhar, & Krishnamurthy, 1995; Sung, 1998) or genetic algorithms (Kim & Street, 2004). Other studies focus on integrating multivariate analysis and neural networks to increase clustering accuracy. Punj and Steward (1983) proposed a two-stage method that combines Ward’s minimum variance method and the K-means method. Balakrishnan et al. (1996) integrated unsupervised FSCL neural networks with the K-means method. Kuo, Ho, and Hu (2002) proposed a two-stage method, which uses the self-organizing map to determine the number of clusters and then employs the K-means algorithm to classify samples into clusters. This study builds on that benefit by investigating a simple but effective hybrid use of clustering and neural network techniques in the design of a credit scoring model.

As shown in Fig. 1, the proposed hybrid scoring model has two processing stages. In the clustering stage, samples are grouped into homogeneous clusters. In particular, unrepresentative samples are identified as ‘isolated’ and ‘inconsistent’ clusters. Herein, isolated clusters are thinly populated clusters; inconsistent clusters are those clusters with inconsistent class values. To improve the performance of the credit scoring model, isolated clusters should be eliminated. Since inconsistent clusters may be important and eliminating them may cause the loss of valuable information, investigating inconsistent clusters in detail will be very helpful in understanding customers. In the neural network stage, the neural network used samples with new class labels to create models for predicting consumer loans.

Fig. 1. A hybrid mining credit scoring system.

For a better understanding of our study, Section 2 begins with an overview of credit scoring models in general. Section 3 offers a hybrid credit scoring model and considers why it should perform better than other credit scoring models. Following this discussion, Section 4 empirically tests the hybrid credit scoring model using two real world credit data sets. Finally, Section 5 discusses the findings of the experiment and offers observations about practical applications and directions for future research.

2. Description of the analysis methodology

2.1. Credit and behavioral scoring models

Credit and behavioral scoring models (Thomas, 2000) are among the most successful applications of statistical and operational research modelling in finance and banking, and the number of scoring analysts in the industry is constantly increasing. The main objective of both credit and behavioral scoring models is to classify samples into homogeneous groups (Lancher, Coates, Shanker, & Fant, 1995). Hence, scoring problems are related to classification by statistical analysis (Hand, 1981; Hsieh, 2004; Johnson & Wichern, 1998; Morrison, 1990), especially classification by neural networks in the field of data mining (Lancher et al., 1995). Applied to credit approval databases, classification analysis for credit scoring is used to categorize a new applicant as either accepted or rejected with respect to characteristics such as age, income and marital status (Chen & Huang, 2003). On the other hand, classification analysis for behavioral scoring is used to describe the behavior of existing customers by using behavioral scoring variables and also to predict future purchasing behavior or credit status of existing customers (Setiono et al., 1998).
Until recently, the building of both scoring models has always been based on a pragmatic approach; therefore, a best and most standard scoring model for every unique circumstance undoubtedly does not exist. Most previous studies have focused on building more accurate credit or behavioral scoring models and increasing the accuracy of the classification model with various statistical techniques. However, creating a well-performing model for customer credit management is difficult because unrepresentative samples exist. Therefore, even when building highly accurate scoring models, misclassification patterns appear frequently (Kim & Sohn, 2004).

This study intends to draw significantly from data mining perspectives, providing a general model which integrates clustering and neural network techniques for applicant approval analysis, including necessary preprocessing of the credit data sets. Finding isolated and inconsistent clusters to support a standard model building process will be very useful. The proposed hybrid credit scoring model can serve as a tool to validate the effect of clustering techniques in practical credit scoring analysis applications.

2.2. Unrepresentative samples limit the utility of credit scoring models

To date, many studies have proposed various credit scoring models to find accurate boundaries to classify the samples. Since there are practical reasons that prevent perfect classification, an absolutely accurate credit scoring model does not exist. The learning of credit scoring models is based on historical samples with known classes, which are usually two classes: ‘good credit’ and ‘bad credit’. In theory, every new sample has a fixed class membership to the known classes, but this theory is unachievable in real world application, since samples with known classes are quite limited, and the collected samples are often unrepresentative of the population to be analyzed.

Fig. 2 shows how unrepresentative samples prevent accurate predictions when the classes are determined by a two-dimensional feature space. The ideal classification boundary is a simple line. The sample points with circles are limited and unrepresentative, and cause credit scoring models to find an inaccurate classification boundary. However, if samples are further classified into more classes, the accuracy of credit scoring models increases. In any case, correct classification requires that the feature values of samples conform to a specified class membership. In many situations, the correct class membership is not absolutely known. The uncertainty of class membership of the samples will increase with the learning complexity of the credit scoring models. Clustering is a technique that uses the samples’ feature values to classify samples into clusters with shared characteristics. Preprocessing samples with a clustering technique is useful in defining proper class membership.

Fig. 2. Unrepresentative samples prevent accurate predictions.

2.3. Assessing hybrid architecture for credit scoring analysis

For credit scoring analysis, many studies have shown that neural networks perform significantly better than statistical techniques such as linear discriminant analysis, multiple discriminant analysis and logistic regression analysis (Desai et al., 1996; Lancher et al., 1995; Malhotra & Malhotra, 2003). The application of neural networks to credit scoring analysis is a promising research area and is a challenge for a variety of marketers (Kim & Sohn, 2004; Malhotra & Malhotra, 2003; Sharda & Wilson, 1996; Vellido, Lisboa, & Vaughan, 1999; Zhang, Hu, Patuwo, & Indro, 1999). Lee, Chiu, Lu, and Chen (2002) explored the performance of credit scoring by integrating back-propagation neural networks with traditional discriminant analysis. Hush and Horne (1993) presented the application of clustering techniques to build more accurate classifiers in the context of RBF-based neural networks. Hruschka and Natter (1999) suggested that feedforward neural networks obtained better solutions when compared with those of partitional clustering techniques, such as K-means. Gopalakrishnan et al. (1995) showed that by employing clustering techniques, the training effort of neural networks may be drastically reduced, and clustering techniques can be used to build more robust classifiers. Natter (1999) found that a feedforward neural network incorporated both clustering and discriminant analysis, producing more homogeneous solutions.

Importantly, these authors noted that weaknesses in the built models are due to the model’s architecture, rather than a failure of neural networks in general. They thus infer that neural networks can effectively be used in credit scoring, but the architecture design of the credit scoring model determines the degree of success observed. Along these lines, this study investigates a hybrid credit scoring model based on clustering and neural networks for building more effective classifiers. Various issues, such as the number and nature of training samples, and the network architecture, can be resolved with the help of clustering techniques.

3. Offering a hybrid credit scoring model

Typically, the network architecture affects the ability of a neural network. The design parameters of a neural network include network topology, number of input nodes, number of nodes in each hidden layer, number of output nodes and the activation function selected. However, the quality of the training samples also matters: unrepresentative samples cause misclassification and drastically reduce the accuracy of the neural network classifier. A way of identifying the unrepresentative samples is to look for isolated and inconsistent clusters.

Isolated clusters are thinly populated clusters; inconsistent clusters are clusters with inconsistent class values. A fundamental assumption made here is that the isolated clusters are far away from the core clusters. To improve the quality of classification, isolated clusters can be eliminated from training samples. However, inconsistent clusters may be important and eliminating them may cause the loss of valuable information. Handling inconsistent clusters in detail and investigating their properties will be very helpful for understanding customers. A way of achieving this goal is to employ clustering techniques to preprocess the training samples. Finally, keeping the feature values of training samples more homogeneous allows an efficient classifier to be built. The training samples filtered by clustering are then used to train the credit scoring model.

The proposed credit scoring model is a hybrid approach using clustering techniques to preprocess samples into homogeneous clusters, and neural network techniques to build classifiers. The clustering process depends on the samples’ feature values and optionally makes use of the original class label to generate clusters. The cluster labels are thus added as new class labels for each sample. This is a so-called class-wise classification process, that is, the clustering process generates clusters of samples belonging to each class and eliminates the unrepresentative samples from each class.

Since the quality of clustering is not easy to control, Punj and Steward (1983) suggest that a two-stage clustering method consisting of a hierarchical model, such as Ward’s minimum variance method, followed by a non-hierarchical method, such as the K-means method, can provide a better solution. The key is that hierarchical methods can determine the candidate number of clusters and the starting point of each cluster, and non-hierarchical methods can provide better clustering results with the specified information. This study has chosen to replace the hierarchical method with a computational intelligence technique, the self-organizing map. The reason is that in the hierarchical method, once an observation has been assigned to a cluster, it cannot be moved. However, a self-organizing map can continually reassign an observation to the closest cluster even after initial conditions have been assigned. Therefore, the candidate number of clusters and the starting point of each cluster can easily be determined. The second stage uses the non-hierarchical K-means method to determine the final solution due to its efficiency. Next, an intelligent search method is used to find the best neural network architecture; this neural network then applies the clustering results to predict consumer loans. In Section 4, two experiments were conducted to show how this hybrid credit scoring model proceeded, and to verify that the proposed model performs better than the other approaches.

4. Empirical analysis

4.1. Real world credit data sets

Two real world credit data sets from the UCI Repository of Machine Learning Databases (http://www.niaad.liacc.up.pt/statlog/datasets.html) were used to test the predictive accuracy of the credit scoring models. The German credit data set consisted of a set of loans given to a total of 1000 applicants, 700 samples of creditworthy applicants and 300 samples where credit should not be extended. For each applicant, 20 variables described credit history, account balances, loan purpose, loan amount, employment status, and personal information. The Australian credit data set is a similar data set with 690 samples, in which 468 samples were accepted and maintained good credit and 222 samples were accepted, but became delinquent. To protect the confidentiality of these data sets, sensitive variables of these two credit sets were transformed to symbolic data. This study employed all the variables as inputs, whereas some previous studies used summary variables (Kuo et al., 2002), or ranking importance variables (Kim & Sohn, 2004; Lee et al., 2002; Sung, 1998) to build the credit scoring models.

4.2. A two-stage clustering technique to preprocess the credit data sets

The goal of developing a neural network is to find an appropriate topology for the network and then train the network so that the network gradually learns the desired input–output functionality. A network can be trained in two ways, by supervised and unsupervised learning. In supervised learning, the network is presented with training samples of known input–output relationships. The trained network can be used to mimic the presented input–output behavior, and then the network is tested to verify whether it is able to produce correct output. Several supervised neural network models exist. West (2000) has investigated the credit scoring accuracy of five neural network models, for both the German and Australian credit data sets. However, due to the quality of samples, the improvement of the correct classification rate is limited.
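The two-stage clustering idea from Section 3 (a self-organizing map proposes the number of clusters and their starting points, then K-means refines them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the 1-D map, the linear decay schedules, the choice of four map nodes, and the rule "keep every node that wins at least one sample" are all hypothetical, as are the function names.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point (first index wins ties)."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, seed=0):
    """Train a tiny 1-D self-organizing map and return its node weights.

    Assumes len(data) >= n_nodes; all schedules here are illustrative.
    """
    rng = random.Random(seed)
    nodes = [list(x) for x in rng.sample(data, n_nodes)]  # distinct samples as initial weights
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac                   # learning rate decays linearly
        radius = int(n_nodes / 2 * frac)  # neighbourhood shrinks to the winner only
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bmu = nearest(x, nodes)       # best-matching unit for this sample
            for j in range(n_nodes):
                if abs(j - bmu) <= radius:
                    nodes[j] = [w + lr * (xi - w) for w, xi in zip(nodes[j], x)]
    return nodes

def kmeans(data, centers, max_iter=100):
    """Plain K-means started from the given centers; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        labels = [nearest(x, centers) for x in data]
        new_centers = []
        for k in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep the old center for an empty cluster
        if new_centers == centers:              # assignments have stopped changing
            break
        centers = new_centers
    return labels, centers

def two_stage_clustering(data, n_nodes=4, seed=0):
    """Stage 1: SOM proposes seeds. Stage 2: K-means finds the final clusters."""
    nodes = train_som(data, n_nodes=n_nodes, seed=seed)
    winners = sorted({nearest(x, nodes) for x in data})  # nodes that attract samples
    return kmeans(data, [nodes[i] for i in winners])

data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = two_stage_clustering(data, n_nodes=4)
print(labels, centers)
```

In this sketch the map size only bounds the number of clusters from above; the study itself reports tuning the clustering parameters by trial and error.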
In unsupervised learning, the output data is not available and usually not even known beforehand. Therefore, the network tries to find similarities between input samples. Similar samples form clusters that automatically constitute the output of the neural network. As reported in Sung (1998), clustering samples before training helps the network achieve a higher accuracy. Here, the two credit data sets have predefined labels describing their loan status. However, credit data sets might contain unrepresentative samples, which will affect the accuracy and representation of credit scoring models. For this reason, this study utilizes a two-stage learning method (Kuo et al., 2002), which employs a self-organizing map to determine the number of clusters and the starting points of each cluster and then employs the K-means algorithm to find the final solution. Since no rule exists for determining the best training parameters, the training parameters of both clustering methods are obtained by trial and error. After repeated testing, the best number of segments for the two credit data sets was found to be four.

Suppose that the input samples are representative; then most of the applicants should be classified clearly in terms of credit status, whether good or bad. However, some applicants may be misclassified by those credit scoring models, mainly due to unrepresentative samples, or possibly an uncertain definition of class membership of the samples. Notably, the behavioral trend of these unrepresentative applicants might change in the near future (Kim & Sohn, 2004). For the applicants in the inconsistent cluster, the good credit applicants would have more of an opportunity to delay future payments than other good credit applicants, and the bad credit applicants would have more of an opportunity to repay the delayed payments than the other bad credit applicants. However, the applicants with the same credit status but distributed over different clusters may have different degrees of uncertainty to their credit status.

Table 1 shows that neither credit data set has significant isolated clusters. For easy understanding of the variations of credit status, Figs. 3 and 4 depict the applicant distribution maps of the two credit data sets. In the German credit data set, cluster-1 has 235 samples of good credit and 3 samples of bad credit. Since the proportion of samples having bad credit is small in cluster-1, these samples can be ignored at this time. Cluster-2 is an inconsistent cluster, which contains samples of both good and bad credit. The behavioral trend of these applicants might change in the near future. To indicate this behavioral trend in the built credit scoring model, cluster-2 was further divided into cluster-21 and cluster-22 according to their original credit status. Moreover, cluster-1 and cluster-4 are clusters with different degrees of uncertainty to good credit status. Finally, the German credit set has five clusters. The original class labels were ignored and each sample was given a new class label by the cluster identifiers. On the other hand, the sample quality of the Australian credit set was relatively good compared to the German credit set. Each sample was given a new class label using the cluster identifiers.

Table 1
The clustering results of German and Australian credit sets

German credit data: good credit 700 (class 1), bad credit 300 (class 2)
Cluster ID   Number of samples   Good credit (1)   Bad credit (2)
1            238                 235               3
2            211                 105               106
3            191                 0                 191
4            360                 360               0

Australian credit data: good credit 468 (class 1), bad credit 222 (class 0)
Cluster ID   Number of samples   Good credit (1)   Bad credit (0)
1            278                 278               0
2            190                 190               0
3            106                 0                 106
4            116                 0                 116

Fig. 3. Applicant distribution map of German credit set.

Fig. 4. Applicant distribution map of Australian credit set.

4.3. Building the credit scoring model

To provide a reliable estimate and minimize the impact of data dependency in developing credit scoring models, k-fold cross-validation was used to generate random partitions of the credit data sets (West, 2000). In this procedure, the credit data set was divided into k independent groups. A neural network was trained using the first k−1 groups of samples and the trained neural network was tested using the kth group. This procedure was repeated until each of the groups had been used as a test set once. The overall scoring accuracy was reported as an average across all k groups. A merit of cross-validation is that the credit scoring model is developed with a large proportion of the available data and that all the data is used to test the resulting models. In this experiment, the value of k was set to 10, giving 10-fold cross-validation. An estimate from 10-fold cross-validation is likely to be more reliable than an estimate from the common practice of using a single holdout set.

The selection of the neural network is another important factor in building a credit scoring model. This study used supervised feedforward MLP (multi-layer perceptron) neural networks trained by back-propagation and gradient descent, in which the output of the network acts as feedback and re-enters the network as an input; such networks afford more sensitive modeling of real data by iteratively adjusting the weights until an optimal solution is found. Network architecture decisions included the number of hidden layers and the number of neurons in each layer. For the neural input layer, each input variable with a small number of discrete integer values was recoded as a nominal variable, and the categorical variables were encoded using the ‘One-to-N’ encoding method. The numeric variables were scaled by ‘Minimax’ encoding method in the range (Chen & Huang, 2003) before being fed to the network input, according to the logistic activation function property. Finally, the number of neurons in the input layer of the neural networks was the number of variables in the input samples. One-to-N encoding was used for the neural output layer with an output neuron dedicated to each of the credit decision outcomes.

The number of hidden layers and the number of neurons in each hidden layer are more difficult to define. As recommended by Hornik, Stinchcombe, and White (1989), a network with one hidden layer is sufficient to model a complex system with any desired accuracy; therefore the employed neural network has just one hidden layer. For the size of the hidden layer, a relatively large hidden layer creates a more flexible scoring model and errors tend to have a low bias component with a large variance caused by the tendency for the model to over-fit the training samples, while a relatively small hidden layer results in a model with a higher error bias and a lower variance. Therefore, the design of the hidden layer involves a tradeoff between error components.

For each of the two credit data sets, an intelligent search process was used to determine the network architecture in the feedforward MLP neural network. The search process started with one hidden layer, and set the number of hidden neurons equal to the number of inputs divided by 2. Then neurons were gradually added to the hidden layer one at a time, retraining each network at least three times with different initial weight randomizations and saving the best for comparison with other architectures. The search process stopped when there was no further improvement in network performance. In this experiment, the maximum number of hidden neurons was rarely required to exceed the number of inputs by more than two times.

Tables 2 and 3 summarize the results of the neural network models. To compare results, the 10-fold cross-validation technique was employed, using the confusion matrix to observe the error percentage and then evaluate the result. For example, the good credit error is the proportion of creditworthy applicants who are classified as bad credit. Conversely, the bad credit error is the proportion of applicants who are not creditworthy but are incorrectly identified as creditworthy. The performance measure indicates the proportion of samples which are correctly classified. Each network was trained for at least 10 repetitions. The measures in each repetition were the average in proportion to the samples of training and testing sets, while the final results in each group were an average of the 3 best from 10 repetitions. After being preprocessed by clustering, the accuracy of neural network models increased significantly.

The German credit set was then used to interpret the credit scoring results. Appendix A lists the distribution of the relative importance for each input variable using the neural network. The error sensitivity indicates the performance of the network if the corresponding input variable is unavailable. In general, important input variables have a high error, indicating that the network performance deteriorates badly if they are not present. The ratio sensitivity presents the ratio between the error and the baseline error. A ratio sensitivity close to one indicates that removing the input variable barely changes the error; the further the ratio exceeds one, the more the variable contributes to the performance of the network. Finally, the rank summarizes the input variables in order of importance.

Herein, the sensitivity analysis of the neural network and the order of most significant input variables indicate variables that are worth looking at in detail. Leaving aside the variable ‘Telephone’, which is not relevant to differentiating applicants, ‘Status of existing checking account’, ‘Present employment since’ and ‘Personal status and sex’ were the three most differentiating variables. On the other hand, ‘Duration in month’, ‘Foreign worker’ and ‘Age in years’ were the three least differentiating variables. Appendix B depicts the statistical summarized data for each cluster to verify the proposed method. Only the 12 most significant variables for each cluster were presented in the order of relative importance.

Statistical summarized data was observed for all input variables to determine the differences between the input variables of each group. The first group (cluster-1, cluster-4, cluster-21) consisted of clusters tending toward good credit status. The second group (cluster-22, cluster-3) consisted of clusters tending toward bad credit status. The third group (cluster-21, cluster-22) consisted of clusters tending toward changed credit status in the near future. In the first group, nine variables differed between cluster-1 and cluster-21, and four variables differed between cluster-4 and cluster-21 at the 20% level. Therefore, cluster-21 was more similar to cluster-4 than to cluster-1. Moreover, when we compared the input variables of cluster-1 and cluster-4, the characteristics of ‘Status of existing checking account’, ‘Present employment since’, ‘Savings account/bonds’, ‘Personal status and sex’, and ‘Present residence since’ of cluster-1 were better than those in cluster-4. Moreover, cluster-21 and cluster-22 were very similar, and the three most differentiating variables, ‘job’, ‘credit amount’ and ‘age in years’, were relatively unimportant. Therefore, we can conclude that cluster-1’s overall applicant credit status can be judged better than that of cluster-4, and the behavioral trend of applicants in cluster-21 and cluster-22 might change in the near future. Also, comparing cluster-22 and cluster-3, the applicants in cluster-3 were younger than those in cluster-22, and the average applicant’s ‘installment rate in percentage of disposable’ in cluster-3 is higher than in cluster-22. So the credit status of applicants in cluster-3 might be worse than that of the applicants in cluster-22.

Table 2
Classification of applicants (German credit data)

            Raw data                                      Preprocessed by clustering
            Good credit   Bad credit                      Good credit   Bad credit
            error         error         Performance       error         error         Performance
Group 1     0.1396        0.1444        0.7740            0.0087        0.0173        0.9906
Group 2     0.1285        0.1335        0.7850            0.0092        0.0487        0.9809
Group 3     0.1650        0.1948        0.8236            0.0109        0.0319        0.9869
Group 4     0.1261        0.1711        0.7988            0.0078        0.0350        0.9839
Group 5     0.1542        0.1392        0.7645            0.0361        0.0874        0.9598
Group 6     0.2475        0.2530        0.7585            0.0069        0.0445        0.9829
Group 7     0.1939        0.2609        0.7844            0.0304        0.0586        0.9933
Group 8     0.1849        0.2011        0.8012            0.0041        0.0298        0.9889
Group 9     0.1472        0.1369        0.8233            0.0379        0.0923        0.9866
Group 10    0.1960        0.1704        0.7855            0.0353        0.0696        0.9922

Network architecture (input–hidden–output): raw data 20-15-1; preprocessed by clustering 20-15-5
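The three measures reported in Tables 2 and 3 follow directly from the definitions in Section 4.3. A minimal sketch for the two-class case is given below; the function name and the 'good'/'bad' label strings are assumptions made for illustration, and how the study maps the cluster-based class labels back to good and bad credit is not stated in the text.

```python
def credit_error_measures(actual, predicted, good="good", bad="bad"):
    """Good credit error, bad credit error and performance from paired labels."""
    pairs = list(zip(actual, predicted))
    n_good = sum(1 for a, _ in pairs if a == good)
    n_bad = sum(1 for a, _ in pairs if a == bad)
    # Creditworthy applicants wrongly classified as bad credit.
    good_as_bad = sum(1 for a, p in pairs if a == good and p == bad)
    # Non-creditworthy applicants wrongly identified as creditworthy.
    bad_as_good = sum(1 for a, p in pairs if a == bad and p == good)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "good_credit_error": good_as_bad / n_good if n_good else 0.0,
        "bad_credit_error": bad_as_good / n_bad if n_bad else 0.0,
        "performance": correct / len(pairs) if pairs else 0.0,
    }

actual = ["good", "good", "good", "good", "bad", "bad"]
predicted = ["good", "good", "good", "bad", "good", "bad"]
measures = credit_error_measures(actual, predicted)
print(measures)
```

In this toy example one of four good applicants and one of two bad applicants are misclassified, so the two error rates are 0.25 and 0.5, and four of six samples are classified correctly.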

Table 3
Classification of applicants (Australian credit data)

                        Raw data                                          Preprocessed by clustering
Group                   Good credit error  Bad credit error  Performance  Good credit error  Bad credit error  Performance
Group 1                 0.0945             0.0961            0.9043       0.0545             0.0044            0.9797
Group 2                 0.1377             0.1312            0.8980       0.0143             0.0787            0.9916
Group 3                 0.1520             0.1502            0.8783       0.0417             0.1123            0.9733
Group 4                 0.1654             0.1608            0.8666       0.0323             0.0794            0.9850
Group 5                 0.1177             0.1223            0.9133       0.3208             0.0923            0.9866
Group 6                 0.1217             0.1142            0.9000       0.0369             0.1092            0.9716
Group 7                 0.1416             0.13128           0.8880       0.0370             0.1344            0.9750
Group 8                 0.1295             0.1229            0.9050       0.0336             0.0996            0.9800
Group 9                 0.1435             0.1386            0.8850       0.0338             0.0731            0.9860
Group 10                0.1431             0.1289            0.8950       0.0360             0.1512            0.9700

Network architecture (input–hidden–output): 13-15-1 (raw data); 13-20-4 (preprocessed by clustering)
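
As a reading aid for Table 3, the following Python sketch averages the Performance column over the ten groups for the raw and the clustering-preprocessed data. The numbers are copied from the table; the variable names and the choice of an unweighted mean are assumptions made for illustration and are not part of the original analysis.

```python
from statistics import mean

# Performance column of Table 3, Groups 1-10 (values copied from the table).
performance_raw = [0.9043, 0.8980, 0.8783, 0.8666, 0.9133,
                   0.9000, 0.8880, 0.9050, 0.8850, 0.8950]
performance_clustered = [0.9797, 0.9916, 0.9733, 0.9850, 0.9866,
                         0.9716, 0.9750, 0.9800, 0.9860, 0.9700]

# Unweighted averages over the ten groups (an illustrative summary only).
mean_raw = mean(performance_raw)
mean_clustered = mean(performance_clustered)
gain = mean_clustered - mean_raw

# Check whether preprocessing helped in every single group, not just on average.
improved_everywhere = all(
    c > r for r, c in zip(performance_raw, performance_clustered)
)

print(f"mean performance, raw data:        {mean_raw:.4f}")
print(f"mean performance, after clustering: {mean_clustered:.4f}")
print(f"average gain:                       {gain:.4f}")
print(f"improved in every group:            {improved_everywhere}")
```

Comparing group by group, rather than only on the averages, shows whether the improvement is consistent across the ten groups.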

5. Conclusions

This study has presented a hybrid mining approach to the design of credit scoring models. Our experience of constructing neural networks for the credit scoring problem indicates that clustering is valuable in building highly effective networks. Beyond supporting the design of an accurate classifier, clustering techniques were successfully employed to preprocess the input samples in order to identify unrepresentative samples. The clustering results consisted of three kinds of clusters: normal, isolated and inconsistent. Dividing samples in this way is an effective means of understanding the behavioral trends of each group of applicants and of building effective credit scoring models. The experimental results demonstrate that such a hybrid approach is simple but efficient in credit market applications. How to use experimental tools, such as sensitivity analysis or genetic algorithms, to identify important input variables, and soft-computing methods, such as fuzzy sets, to encode inputs into more informative data values, in conjunction with domain experts’ knowledge to interpret the results, are issues worthy of further investigation.

Appendix A. Sensitivity analysis of the relative importance of input variables (German credit data set)

Input variables                                           Rank  Error   Ratio   Comments
Telephone                                                 1     0.4150  8.7968  A191: none; A192: yes, registered under the customer’s name
Status of existing checking account                       2     0.2362  5.0070  A11: x < 0 DM; A12: 0 ≤ x < 200 DM; A13: x ≥ 200 DM/salary assignments for at least 1 year; A14: no checking account
Present employment since                                  3     0.2111  4.4737  A71: unemployed; A72: x < 1 year; A73: 1 ≤ x < 4 years; A74: 4 ≤ x < 7 years; A75: x ≥ 7 years
Personal status and sex                                   4     0.1988  4.2136  A91: male, divorced/separated; A92: female, divorced/separated/married; A93: male, single; A94: male, married/widowed; A95: female, single
Savings account/bonds                                     5     0.1950  4.1327  A61: x < 100 DM; A62: 100 ≤ x < 500 DM; A63: 500 ≤ x < 1000 DM; A64: x ≥ 1000 DM; A65: unknown/without savings account
Credit history                                            6     0.1862  3.9473  A30: no credits taken/all credits paid back duly; A31: all credits at this bank paid back duly; A32: existing credits paid back duly till now; A33: delay in paying off in the past; A34: critical account/other credits existing (not at this bank)
Property                                                  7     0.1861  3.9443  A121: real estate; A122: if not A121, building society savings agreement/life insurance; A123: if not A121/A122, car or other; A124: unknown/no property
Purpose                                                   8     0.1792  3.7984  A40: car (new); A41: car (used); A42: furniture/equipment; A43: radio/television; A44: domestic appliances; A45: repairs; A46: education; A47: (vacation-missing); A48: retraining; A49: business; A410: others
Installment rate in percentage of disposable income       9     0.1719  3.6440
Present residence since                                   10    0.1596  3.3837
Job                                                       11    0.1539  3.2631  A171: unemployed/unskilled non-resident; A172: unskilled-resident; A173: skilled employee/official; A174: management/self-employed/highly qualified employee/officer
Other installment plans                                   12    0.1385  2.9364  A141: bank; A142: stores; A143: none
Housing                                                   13    0.1371  2.9060  A151: rent; A152: own; A153: for free
Credit amount                                             14    0.0979  2.0740  Amount of credit
Number of existing credits at this bank                   15    0.0890  1.8859
Other debtors/guarantors                                  16    0.0869  1.8419  A101: none; A102: co-applicant; A103: guarantor
Number of people being liable to provide maintenance for  17    0.0817  1.7306
Duration in month                                         18    0.0807  1.7099
Foreign worker                                            19    0.0771  1.6346  A201: yes; A202: no
Age in years                                              20    0.0554  1.1747  Applicant’s age
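
The Error and Ratio columns of Appendix A appear to be linked by a single baseline error: dividing each Error by its Ratio gives nearly the same value in every row. The Python sketch below checks this on a few rows copied from the table. The interpretation that Ratio equals Error divided by the error of the full network is an assumption inferred from the numbers, not a definition given in the appendix, and the function name is hypothetical.

```python
# Rows copied from Appendix A: (input variable, error, ratio).
rows = [
    ("Telephone", 0.4150, 8.7968),
    ("Status of existing checking account", 0.2362, 5.0070),
    ("Credit amount", 0.0979, 2.0740),
    ("Foreign worker", 0.0771, 1.6346),
    ("Age in years", 0.0554, 1.1747),
]


def implied_baselines(table_rows):
    """Return error / ratio for each row.

    ASSUMPTION: ratio = error / baseline error, so error / ratio
    recovers the baseline error of the network with all inputs present.
    """
    return {name: error / ratio for name, error, ratio in table_rows}


baselines = implied_baselines(rows)
spread = max(baselines.values()) - min(baselines.values())

for name, value in baselines.items():
    print(f"{name:38s} implied baseline error = {value:.4f}")
print(f"spread across rows = {spread:.5f}")
```

If the assumption holds, the implied baselines should agree up to the rounding of the tabulated values, and a ratio close to 1 (as for ‘Age in years’) indicates a variable whose contribution barely changes the error.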

Appendix B. Statistical summarized data



References Chen, M. C., & Huang, S. H. (2003). Credit scoring and rejected instances
reassigning through evolutionary computation techniques. Expert
Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S., & Lewis, P. A. (1996). Systems with Applications, 24, 433–441.
Comparative performance of the FSCL neural net and K-means Desai, V. S., Crook, J. N., & Overstreet, G. A. (1996). A comparison of neural
algorithm for market segmentation. European Journal of Operational networks and linear scoring models in the credit union environment.
Research, 93, 346–357. European Journal of Operational Research, 95(1), 24–37.
N.-C. Hsieh / Expert Systems with Applications 28 (2005) 655–665 665

Gopalakrishnan, M., Sridhar, V., & Krishnamurthy, H. (1995). Some Morrison, D. F. (1990). Multivariate statistical methods. New York, NY:
applications of clustering in the design of neural netwroks. Pattern McGraw-Hill.
Recognition Letters, 16, 59–65. Natter, M. (1999). Conditional market segmentation by neural networks: A
Hand, D. J. (1981). Discrimination and classification. New York: Monte-Carlo study. Journal of Retailing and Consumer Services, 6,
Wiley. 237–248.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward Punj, G., & Steward, D. W. (1983). Cluster analysis in marketing research:
networks are universal approximations. Neural Networks, 2, 336–359. Review and suggestions for applications. Journal of Marketing
Hruschka, H., & Natter, M. (1999). Comparing performance of feedforward Research, 20, 134–148.
neural nets and K-means for cluster-based market segmentation. Setiono, R., Thong, J. Y. L., & Yap, C. S. (1998). Symbolic rule extraction
European Journal of Operational Research, 114, 346–353. from neural networks—An application to identifying organizations
Hsieh, N. C. (2004). An integrated data mining and behavioral scoring adopting IT. Information and Management, 34(2), 91–101.
model for analyzing bank customers. Expert Systems With Applications, Sharda, R., & Wilson, R. (1996). Neural network experiments in business
27(4), 623–633. failures predication: A review of predictive performance issues.
Hush, D. R., & Horne, B. G. (1993). Progress in supervised neural International Journal of Computational Intelligence and Organiz-
networks. IEEE Signal Processing Magazine, 10(1), 8–39. ations, 1(2), 107–117.
Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical Sung, A. H. (1998). Ranking importance of input parameters of neural
analysis (4th ed.). Upper Saddle River, NJ: Prentice-Hall. networks. Expert Systems with Applications, 15, 405–411.
Kim, Y. S., & Sohn, S. Y. (2004). Managing loan customers using Swales, G., & Yoon, Y. (1992). Applying artificial neural networks to
misclassification patterns of credit scoring model. Expert Systems with investment analysis. Financial Analysis Journal, 48, 78–80.
Applications, 26, 567–573. Tam, K. Y., & Kiang, M. (1992). Managerial applications of neural
Kim, Y. S., & Street, W. N. (2004). An intelligent system for customer networks: The case of bank failure predications. Management Science,
targeting: A data mining approach. Decision Support Systems, 37(2), 38(7), 926–947.
215–228. Thomas, L. C. (2000). A survey of credit and behavioural scoring:
Kuo, R. J., Ho, L. M., & Hu, C. M. (2002). Integration of self-organizing Forecasting financial risk of lending to consumers. International
feature map and K-means algorithm for market segmentation. Journal of Forecasting, 16, 149–172.
Computers and Operations Research, 29, 1475–1493. Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in
Lancher, R. C., Coats, P. K., Shanker, C. S., & Fant, L. F. (1995). A neural business: A survey of applications (1992w1998). Expert Systems with
network for classifying the financial health of a firm. European Journal Applications, 17, 51–70.
of Operational Research, 85(1), 53–65. West, D. (2000). Neural network credit scoring models. Computers and
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using Operations Research, 27, 1131–1152.
the hybrid neural discriminate technique. Expert Systems with Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural
Applications, 23, 245–254. networks in bankruptcy prediction: General framework and cross-
Malhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using validation analysis. European Journal of Operational Research, 116,
neural networks. Omega, 31(2), 83–96. 16–32.
