International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014 292

ISSN 2229-5518

Analysis of Software Fault and Defect Prediction by Fuzzy C-Means Clustering and Adaptive Neuro Fuzzy C-Means Clustering
Pushpavathi T.P, Suma V, Ramaswamy V

Abstract— Faults are related to failures, and they say little about whether a system offers higher quality than the baseline that end-users expect. System faults are the defects that reside in executable files. Conventional approaches employ experts to navigate directly to the errors in the source code. However, growth in system size has increased the complexity of this task exponentially and created scope for new methods of fault classification. Experimental studies have shown that small bugs are a cause of faults, and that only a small portion of software modules cause the faults in a software system. In a system of considerable size, modules are labeled as faulty or non-faulty during the modular phase. This paper presents adaptive neuro-fuzzy c-means clustering for fault classification, built on fuzzy c-means clustering. The NASA PC1 database is used for the experiments, and the results of this approach improve on previous clustering-based approaches.

Index Terms— Adaptive-neuro fuzzy c-means clustering, Data Classification, Defect Prediction, Fault Prediction, Fault proneness, Fuzzy
clustering, Software Project Success.

1 INTRODUCTION

SOFTWARE reliability is an essential parameter in deciding the classification of software standards [1]. Software fault prediction techniques characterize the performance of software in its field of application. Program attributes are scaled by a quantitative representation derived from software metrics, which play a crucial role in assessing software quality against evaluation parameters [2]. Perlis et al. [2] found experimentally that the relationship between attributes modified by faults and complexity metrics is direct and holds true in test and validation [3]. In the late 20th century, much research in this field was oriented towards predicting the relationship between detected faults and complexity metrics. Finding the necessary metrics depends on multiple-variable models [4] in addition to fault size.

A software system is a composition of a number of modules that depend on each other. Any module with faults in its functionality adversely affects the output and lowers the system's reliability. In this scenario, detecting faulty modules at an early stage (the development stage) is mandatory to minimize faults in the operation phase. Hence, modules are classified into two categories, faulty and non-faulty, in the testing phase. This classification directs the focus towards neutralizing faulty sections to achieve high reliability and accuracy.

A software fault or error is the reason for failure at the execution stage. The error message at each stage of executing the program indicates the fault in programming. Generally speaking, these errors are logical errors in the software program. The prediction models of software fault-proneness techniques estimate the number of faulty modules in a program. Software metrics are attributes of the process, execution and product of the software system. Various other attributes, such as defect density, normalized work, fault proneness, maintainability and reusability, determine the quality of software.

Software metrics data report specific attributes computed for the modules or functions of the whole software. These attributes are the inputs to a self-learning model when correlated with weighted error or defect data. A learning model is a system that uses the results of previous performance measurements to update itself so as to improve on its previous performance. The learning system works with two categories of data: the training data set and the testing data set. Some predictor functions in software fault-proneness systems use the Multilayer Perceptron and Decision Tree algorithms for training and for evaluating the effects with respect to the testing data set.

• Pushpavathi T.P, JAIN University, Bangalore, India. E-mail: [email protected]
• Suma V, Dayanandsagar College of Engineering, Bangalore, India. E-mail: [email protected]
• Ramaswamy V, SASTRA University, Srinivasa Ramanujan Centre, Kumbakonam, Tamilnadu, India. E-mail: [email protected]

2 RELATED REVIEW
In software systems, failures, faults and defects are terms that are correlated with each other but visibly differ in their definitions [5]. For researchers, according to the IEEE Software Glossary for Terminologies of Software Engineering, a distinction between these terms is mandatory [7]. An error is considered to be a human mistake (with reference to the Glossary) that generates undesired results. The error prevents the software or its component from producing the desired output within its specified requirements. A fault [5] is the defect of the system proposed in terms of system engineering
and hardware [7], and parallels the error responsible for glitches in simulations on the corresponding platform. In this research a fault in the system is associated with mistakes that are diagnosed either in the development phase or by testing at unit and system level. In software systems, the term "software failure" is conventional for systems with inconsistent output, but in this paper the terms "fault" and "faulty modules" are used because the anomalies responsible for failure are tracked in the coding phase itself. The faulty modules of the dataset considered in this paper are pre-release faults [6].

Software systems are applied across most technological and non-technological advances of society. Creating software with 100% efficiency is unrealistic, hence software fault prediction has received attention from a variety of researchers. Software quality is verified by distinct means such as formal verification, inspection, fault tolerance and testing. Automated methods employ software metrics [8] [9] of previous versions to develop a prediction model. Fault prediction is the union of two steps: training and prediction. The training phase is based on a metrics model (method-level or class-level metrics) developed from the evaluation parameters of earlier software versions. These metrics predict the fault-proneness of modules that are liable to be faulty in upgraded versions of the software. The methods described in the literature survey are supervised methods that rely on classified data of faulty modules. Today, software fault detection systems are designed with 85% accuracy in their analysis, subject to the dataset considered and the evaluation parameters taken into account. This level of accuracy is acceptable in the economics of the software industry.

Fault prediction techniques are driven by historical data. Research suggests that a system under development is prone to faults if its software metrics measure properties similar to those of faulty modules previously developed and observed in the same environment [10]. Conventional applications provide a platform for fault proneness and a methodology for techniques of fault prediction and mitigation. The supervised learning methods stated in the literature combine the study of fault data and software metrics and implement different learning algorithms. In the late 1990s and the early years of the 21st century many techniques surfaced as solutions to this problem. Neural networks [11, 12, 13] and Genetic Algorithms [14] were developed for large datasets to generate a generalized relation. Dempster-Shafer Networks [15] treat data as faulty depending on combined evidence from different sources. Naïve Bayes [16, 17] considers data faulty, irrespective of its nature, if its parameters match the predefined threshold for faulty systems. Decision trees [18] map observations to predict the possible outcomes. Artificial Immune Systems [19] are valued because, once trained, they define the cognitive patterns used to update data. Support Vector Machines [20, 21] employ associated learning algorithms used in classification to recognize and analyze datasets. Case-based Reasoning [22] keeps track of solutions and detects problems based on these results. Ant-Colony Optimization [23] is a probabilistic approach that reduces computational problems by a graph-based method. Fuzzy logic [24] is a clustering approach that collects fault data with high accuracy. Basili et al. 1996 [25] proposed logistic regression that employs domain-specific knowledge to determine the relation between software metrics (input) and software fault-proneness (output).

The algorithms are evaluated for their strengths and weaknesses by evaluation parameters in experimental setups. Recently the Metrics Data Program of the NASA IV&V facility made its data available for public research. Yue Jiang [26] in 2008 studied the model evaluation techniques used in software engineering studies to understand the strengths and weaknesses of performance evaluation techniques. The authors of [27] stated that the comparison of fault-proneness models is a multi-dimensional problem. No single model or modeling technique proves to be the best for all possible uses in software quality assessment. Supervised learning approaches are constrained by the need for knowledge of faulty data. For cases where the fault data is undetermined, new methods of classification have surfaced. The Ripple Down Rule (RIDOR) algorithm [27] [28] generates a default rule and then generates exceptions in a tree-like structure until all training instances are classified correctly according to the rule set. Hassan Najadat and Izzat Alsmadi [29] modified this algorithm in terms of accuracy and effectiveness (i.e. generating a smaller number of rules). The enhanced RIDOR algorithm imports the benefits of the CLIPER algorithm and RIDOR. The attributes are encoded as symbols and are compacted or merged to only two. The method proved to be efficient on a number of datasets considered by the authors. Yue Jiang, Bojan Cukic and Tim Menzies [30] experimented to find the suitability of early-development metrics for finding fault-prone software modules. The authors also demonstrated a predictive model that uses metrics to characterize textual requirements. The model was tested on 5 datasets. Early lifecycle metrics play a significant role in project management, but requirement metrics alone are unable to predict the data.

3 METHODOLOGY
Fuzzy C-Means clustering is a conventional approach to the classification of faults in software fault prediction. The Fuzzy C-Means clustering method [31] is the reference for the adaptive method, which improves the performance index of fault classification for software systems. The enhancement of this method combines a feed-forward neural network [33] with fuzzy c-means to improve the assignment of data to a cluster in terms of mean deviation and absolute error, minimizing the distance used for fault prediction.

The PC1 (NASA) data is the input to the system. Clustering the faulty data with Fuzzy C-Means requires pre-processing of the data to minimize time consumption. The output of C-Means clustering is stored for comparison. The output of C-Means is fed to the adaptive Neuro-Fuzzy C-Means clustering algorithm. The algorithm, tuned by a neural network, trains on the data and improves the performance index. The output of the algorithm is compared with the output of Fuzzy C-Means in terms of accuracy, reliability, RMSE and MAE.

TABLE 1
PC1 DATASET (SOURCE: NASA)
              Faulty Information   Faultless Information
PC1 Dataset   23                   77

3.1 Data Pre-processing
This paper takes the PC1 dataset as input. The dataset is the matrix $X_{ij}$ ($A \times B$) with distinguishing features, where
$X_{ij}$ is the initial PC1 data matrix,
$A$ is the size of the dataset, and
$B$ is the number of distinguishing features.

$$s_i = \sqrt{\frac{1}{B}\sum_{j=1}^{B}\left(X_{ij} - t_i\right)^2} \qquad (1)$$

where $B$ is the number of elements in sample $i$, $X_{ij}$ is the $j$-th value of sample $i$, taken from the finite data set $X_{i1}, \dots, X_{iB}$, and $s_i$ is the standard deviation.

$$t_i = \frac{1}{B}\sum_{j=1}^{B} X_{ij} \qquad (2)$$

where $t_i$ is the mean value of the distinguishing features.

From eq. (1) and eq. (2) the new preprocessed dataset (in matrix representation) is

$$X' = \begin{bmatrix} s_1 & t_1 \\ \vdots & \vdots \\ s_A & t_A \end{bmatrix} \qquad (3)$$

where $s$ is the standard deviation and $t$ is the mean of the distinguishing features.

Data → Data Preprocessing → Calculation of Objective Function → Calculation of cluster centers and membership functions → Update the objective function

Fig. 1. Flow Diagram of FCM

3.2 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering (FCM), also called ISODATA, is a clustering algorithm that determines the degree to which each point belongs to a cluster. Generally, a restrictive constraint limits the performance of hard clustering algorithms when they partition the original data into distinct cluster groups. Bezdek [32] proposed Fuzzy C-Means clustering to improve on Hard C-Means clustering (HCM) by allowing overlap between different groups.

The reference Fuzzy C-Means clustering considered here is studied in the work of Zhiwei Gao et al. [31]. In this method the output vectors of pre-processing (eq. 3), taken as input $x_j$ ($j = 1, 2, \dots, n$), are classified into $c$ clusters and each clustering center is calculated. The primary difference between HCM and FCM is the fuzzy partition. This partition determines the degree to which every data point belongs to every group, using a membership function in the range 0 to 1. Each element of the membership matrix $U$ is allowed to vary between 0 and 1, which yields the fuzzy partition. Under the normalization rule, the memberships of each data point sum to 1 [31]:

$$\sum_{i=1}^{c} u_{ij} = 1, \quad \forall j = 1, \dots, n \qquad (4)$$

where $c$ is the number of clusters.

The objective function (or cost function) of FCM is the generalized form

$$J(U, c_1, \dots, c_c) = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2} \qquad (5)$$

where $u_{ij}$ is between 0 and 1, $c_i$ is the clustering center of fuzzy group $i$, $d_{ij} = \lVert c_i - x_j \rVert$ is the Euclidean distance between the $i$-th clustering center and the $j$-th data point, $x_j$ is the $j$-th data point, and $m$ is the weighted index.

To obtain the conditions under which eq. (5) reaches its minimum, a new objective function is constructed:

$$\bar{J}(U, c_1, \dots, c_c, \lambda_1, \dots, \lambda_n) = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2} + \sum_{j=1}^{n} \lambda_j \left(\sum_{i=1}^{c} u_{ij} - 1\right) \qquad (6)$$

where $\lambda_j$, $j = 1, \dots, n$, are the Lagrange multipliers of the $n$ constraints described in eq. (4). The values that minimize eq. (5) are as follows:

$$c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \qquad (7)$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}} \qquad (8)$$

By virtue of eqs. (7) and (8), Fuzzy C-Means becomes a simple iterative process. The steps below [31] calculate the centers $c_i$ and the membership matrix $U$.

Step 1: Initialize the membership matrix U with random values between 0 and 1 such that the constraint in eq. (4) is satisfied.

Step 2: Calculate the clustering centers $c_i$ from eq. (7).

IJSER © 2014
http://www.ijser.org
International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014 295
ISSN 2229-5518
Step 3: Calculate the cost function according to eq. (5). The iteration terminates once the cost is below a certain threshold, or once its change relative to the previous cost function value is below a certain threshold.

Step 4: Calculate the new matrix U (based on eq. 8) and return to Step 2.

In summary, the FCM algorithm requires two parameters: the weighted index m and the number of clusters c. m controls the flexibility of the algorithm: if it is too large the clustering effect is poor, and if it is too small the algorithm approaches Hard C-Means Clustering (HCM). The number of clusters must satisfy c > 1 and should be far smaller than the total number of samples.

4 ADAPTIVE NEURO FUZZY INFERENCE
The adaptive neuro fuzzy inference system is a fuzzy reasoning system that is trained by a neural network to compute the membership function parameters. The method models the input-output data as a non-linear relation with inputs x, y and output f.

Figure 2: Adaptive Neuro Fuzzy Inference System Architecture [33] (Layers 1 to 5: inputs X and Y, membership nodes A and B, product nodes Π, normalization nodes N, summation node Σ)

The knowledge base is developed by training the system for a number of epochs. The fuzzy inference system is defined by the following layers for minimization of the error rate.

Layer 1: In this layer the degree of membership is updated with reference to the parameters of the fuzzy sets [34] (eqs. 9 and 10).

Layer 2: In this layer the fuzzy value of the inputs is calculated. The membership value of the fuzzy set is determined in the range [0, 1] [34] (eq. 11).

Layer 3: This layer normalizes the firing strength [34] (eq. 12).

Layer 4: The input of this layer is the fuzzy output of layer 2 while the output is a single number. The parameters are defined by eq. 13 [35].

Layer 5: The overall output is computed by summation of each incoming signal [34] (eq. 14).

Data Set of NASA (100*21) values → Preprocessing (100*2) by Mean and Standard Deviation → FCM (Formation of 25 Centers) → FIS generation of Data from FCM → Train FIS (Training of data using Feed forward Neural Network) → Trained FIS (This FIS takes the input and provides Result)

Fig 3. Flow Diagram of Adaptive Neuro Fuzzy C-Means Clustering

5 EXPERIMENTAL SETUP
Software fault detection by Fuzzy C-Means Clustering:
The data matrix has dimension (100x21), i.e. 100 cases of faulty/non-faulty modules with 21 properties each. Pre-processing (calculation of the standard deviation and mean) is done for each software stream. This reduces the data matrix to (100x2), i.e. 100 cases with 2 properties each.
In the original dataset the class value (defects) is discrete. Class distribution:

% false: 77 = 6.94%
% true: 1032 = 93.05%

The pre-processed data is fed to Fuzzy C-Means clustering to obtain 25 clusters. The first cluster of data is labeled as non-faulty while the remaining 24 indicate faulty data. Fuzzy C-Means finds the location of each cluster center and assigns each stream to a cluster center. For testing, the standard deviation
and mean of the stream are calculated, and the stream is assigned to the cluster at minimum distance from it. The stream is classified as faulty if that cluster is faulty, and as non-faulty otherwise.
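The pre-processing and Fuzzy C-Means steps described above can be sketched in Python with NumPy. This is a minimal illustration, not the authors' MATLAB implementation: it assumes the standard FCM update rules for the centers and memberships, a population standard deviation in the pre-processing, and illustrative defaults for `m`, `tol` and `max_iter`. The function names `preprocess`, `fcm` and `nearest_cluster` are hypothetical.

```python
import numpy as np

def preprocess(X):
    """Reduce each row (one module) to two features: standard deviation and mean."""
    return np.column_stack([X.std(axis=1), X.mean(axis=1)])

def fcm(X, c, m=2.0, tol=1e-6, max_iter=300, seed=0):
    """Fuzzy C-Means on X of shape (n, d). Returns (centers, U), U of shape (c, n)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 1: random memberships, normalized so each column sums to 1 (eq. 4).
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    prev_cost = np.inf
    for _ in range(max_iter):
        Um = U ** m
        # Step 2: cluster centers as membership-weighted means (eq. 7).
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Euclidean distance between every center and every data point.
        d = np.linalg.norm(centers[:, None, :] - X[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Step 3: cost function (eq. 5); stop when it no longer changes.
        cost = np.sum(Um * d ** 2)
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
        # Step 4: membership update (eq. 8), written in normalized form.
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
    return centers, U

def nearest_cluster(sample, centers):
    """Index of the cluster center at minimum distance from a pre-processed sample."""
    return int(np.argmin(np.linalg.norm(centers - sample, axis=1)))
```

A test stream would be pre-processed to its (standard deviation, mean) pair, passed to `nearest_cluster`, and given the faulty or non-faulty label of the returned cluster; with the setup described here, `c` would be 25.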

Software Fault Detection Using Adaptive Neuro-Fuzzy C-Means Clustering:
The dimension of the data matrix is again (100x21), i.e. 100 cases with 21 properties each. Pre-processing reduces the matrix to order (100x2), i.e. 100 cases with 2 properties each. On this basis a fuzzy inference system is generated using FCM. Up to this point both systems share the same level of accuracy. This inference system is then fed to a feed-forward neural network together with the training data. In the training stage the neural network modifies the structure of the fuzzy inference system to obtain a higher level of accuracy.

Fig.6. Membership Function Plot of Fuzzy Inference System
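As an illustration of how a five-layer inference system of this kind produces an output from two inputs, the sketch below implements one forward pass. It assumes the standard first-order Sugeno formulation with Gaussian membership functions, which may differ from the exact equations used in the paper; the function names and every parameter value are hypothetical, and training of the parameters is not shown.

```python
import numpy as np

def gaussian_mf(v, center, sigma):
    """Gaussian membership degree of v in a fuzzy set (assumed shape)."""
    return np.exp(-0.5 * ((v - center) / sigma) ** 2)

def anfis_forward(x, y, mf_x, mf_y, consequents):
    """One forward pass. Rule i pairs fuzzy set A_i (on x) with B_i (on y).

    mf_x, mf_y: lists of (center, sigma) tuples, one per rule.
    consequents: list of (p, q, r) tuples, one per rule.
    """
    # Layer 1: membership degrees of the inputs.
    mu_a = np.array([gaussian_mf(x, c, s) for c, s in mf_x])
    mu_b = np.array([gaussian_mf(y, c, s) for c, s in mf_y])
    # Layer 2: firing strength of each rule (product nodes).
    w = mu_a * mu_b
    # Layer 3: normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4: first-order consequent of each rule, f_i = p*x + q*y + r.
    f = np.array([p * x + q * y + r for p, q, r in consequents])
    # Layer 5: overall output as the weighted sum of rule outputs.
    return float(np.sum(w_bar * f))

# Hypothetical three-rule system over (standard deviation, mean) inputs.
mf_x = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
mf_y = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
consequents = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.5), (0.0, 0.0, 1.0)]
score = anfis_forward(0.8, 1.2, mf_x, mf_y, consequents)
```

In a trained system the membership parameters and the consequent coefficients would be fitted to the training data rather than set by hand.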

6 RESULTS AND CONCLUSION


[Plot: Data Input (PC1 database with attributes)]

Fig. 4. Graphical representation of PC1 data distribution of faulty and non-faulty modules

In the training dataset taken as input, the class value (defects) is discrete in nature. Class distribution:
% data with positive attribute: 23 = 23%
% data with negative attribute: 77 = 77%

Fig.7. Membership Function Plot of Adaptive Neuro Fuzzy Inference System

TABLE 2
COMPARATIVE TABLE BASED ON SIMULATION FACTORS FOR FUZZY C-MEANS AND ADAPTIVE NEURO FUZZY C-MEANS CLUSTERING
              Fuzzy C-Means   Adaptive Neuro Fuzzy C-Means
Accuracy      77%             91%
Reliability   72.98%          73.98%
RMSE          0.068           0.09
MAE           0.23            0.13
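The accuracy, RMSE and MAE figures in such a comparison can be computed from the true labels and the predicted scores. The sketch below assumes the standard definitions of these three measures and a 0.5 decision threshold; how the reported values were actually computed is not stated here, and the reliability measure is omitted because its definition is not given. The function name `evaluate` is hypothetical.

```python
import numpy as np

def evaluate(y_true, y_score, threshold=0.5):
    """Accuracy, RMSE and MAE for binary fault labels (1 = faulty, 0 = non-faulty)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    err = y_score - y_true
    # Scores at or above the (assumed) threshold count as a "faulty" prediction.
    y_pred = (y_score >= threshold).astype(float)
    return {
        "accuracy": float(np.mean(y_pred == y_true)),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
    }

# Toy example: four modules, one misclassified.
result = evaluate([1, 0, 1, 0], [1.0, 0.0, 0.0, 0.0])
```

For the toy example, three of four predictions are correct, so the accuracy is 0.75, the RMSE is 0.5 and the MAE is 0.25.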


Fig. 5. Adaptive Neuro-Fuzzy Inference Model for PC1 Dataset with 3 Rules.

This paper empirically evaluates and compares the performance of the Fuzzy C-Means clustering technique and Adaptive Neuro-Fuzzy C-Means clustering for software fault prediction. The platform for testing is MATLAB 2010A on the PC1 testing database. The proposed Adaptive Neuro-Fuzzy C-Means clustering based prediction technique shows 91% accuracy. We observe that this implementation achieved better performance than Fuzzy C-Means clustering for the prediction of fault-prone classes. It can be further concluded
that the Adaptive Neuro-Fuzzy C-Means clustering is satisfactory in finding the faulty/fault-free area.

REFERENCES
[1] M. R. Lyu, Handbook of Software Reliability Engineering, IEEE Computer Society Press, McGraw Hill, 1996.
[2] F. G. Sayward, A. J. Perlis and M. Shaw, Software Metrics: An Analysis and Evaluation, MIT Press, Cambridge, MA, 1981.
[3] V. Y. Shen, T. Yu, S. M. Thebaut, and L. R. Paulsen, "Identifying error-prone software - an empirical study," IEEE Trans. on Software Engineering, vol. SE-11, pp. 317–323, April 1985.
[4] S. G. Crawford, A. A. McIntosh, and D. Pregibon, "An analysis of static metrics and faults in C software," J. Syst. Software, vol. 5, pp. 27–48, 1985.
[5] Bose, S.K., Presenting a Novel Neural Network Architecture for Membrane Protein Prediction, Intelligent Engineering Systems, 2006. INES '06. Proceedings. International Conference
[6] Attarzadeh, I., Proposing an Enhanced Artificial Neural Network Prediction Model to Improve the Accuracy in Software Effort Estimation, Computational Intelligence, Communication Systems and Networks (CICSyN), 2012 Fourth International Conference
[7] Yuan Chen, Research on software defect prediction based on data mining, Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on (Volume: 1)
[8] N. E. Fenton and M. Neil. A critique of software defect prediction models. IEEE Transactions on Software Engineering, 25(5):675–689, 1999.
[9] N. E. Fenton and N. Ohlsson. Quantitative analysis of faults and failures in a complex software system. IEEE Transactions on Software Engineering, 26(8):797–814, 2000
[10] Khoshgoftaar TM, Allen EB, Ross FD, Munikoti R, Goel N, Nandi A (1997) Predicting fault-prone modules with case-based reasoning. The Eighth International Symposium on Software Engineering (ISSRE '07). IEEE Computer Society, pp 27–35
[11] Thwin, M. M.; Quah, T. Application of Neural Networks for Software Quality Prediction using Object-oriented Metrics. // ICSM 2003. / Amsterdam, The Netherlands, 2003, pp. 113-122.
[12] Quah, T. S. Estimating Software Readiness using Predictive Models. // Information Sciences, 179, 4(2009), pp. 430-445.
[13] Kanmani S, Uthariaraj V. R., Sankaranarayanan V, Thambidurai, P. Object-oriented Software Fault Prediction using Neural Networks. Information and Software Technology, 49, 5(2007), pp. 483-492.
[14] Evett, M.; Khoshgoftaar, T.; Chien, P.; Allen, E. GP-based Software Quality Prediction. // Proceedings of the 3rd Annual Genetic Programming Conference / San Francisco, CA, 1998, pp. 60-65.
[15] Guo, L.; Cukic, B.; Singh, H. Predicting Fault Prone Modules by the Dempster-Shafer Belief Networks. // Proceedings of the 18th IEEE Int'l Conf. on Automated Software Eng. / Montreal, Canada, 2003, pp. 249-252.
[16] Menzies, T.; Greenwald, J.; Frank, A. Data Mining Static Code Attributes to Learn Defect Predictors. // IEEE Transactions on Software Engineering, 33, 1(2007), pp. 2-13.
[17] Zhang, M. L.; Peña, J. M.; Robles, V. Feature Selection for Multi-label Naive Bayes Classification. // Information Sciences, 179, 19(2009), pp. 3218-3229.
[18] Khoshgoftaar, T. M.; Seliya, N. Software Quality Classification Modeling using the SPRINT Decision Tree Algorithm. // Proc. of the 4th IEEE Int'l Conf. on Tools with AI. / Washington, DC, 2002, pp. 365-374.
[19] Catal, C.; Diri, B. Investigating the Effect of Dataset Size, Metrics Sets, and Feature Selection Techniques on Software Fault Prediction Problem. // Information Sciences, 179, 3(2009), pp. 1040-1058.
[20] Elish, K. O.; Elish, M. O. Predicting Defect-Prone Software Modules using Support Vector Machines. // Journal of Systems and Software, 81, 5(2008), pp. 649-660.
[21] Gondra, I. Applying Machine Learning to Software Fault-proneness Prediction. // Journal of Systems and Software, 81, 2(2008), pp. 186-195.
[22] El Emam, K.; Benlarbi, S.; Goel, N.; Rai, S. Comparing Case-based Reasoning Classifiers for Predicting High Risk Software Components. // Journal of Systems and Software, 55, 3(2001), pp. 301-320.
[23] Vandecruys, O.; Martens, D.; Baesens, B.; Mues, C.; De Backer, M.; Haesen, R. Mining Software Repositories for Comprehensible Software Fault Prediction Models. // Journal of Systems and Software, 81, 5(2008), pp. 823-839.
[24] Yuan, X.; Khoshgoftaar, T. M.; Allen, E. B.; Ganesan, K. An Application of Fuzzy Clustering to Software Quality Prediction. // Proc. of the 3rd IEEE Symp. on Application-Specific Systems and Software Eng. Technology, / Washington, DC, 2000, pp. 85-90.
[25] Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761. doi:10.1109/32.544352
[26] Yue Jiang, Bojan Cukic, Tim Menzies, "Fault Prediction using Early Lifecycle Data". Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
[27] Witten, I.H. and Frank, E., Data Mining: Practical Machine Learning Tools and Techniques (San Francisco, CA: Morgan Kaufmann), 2nd edition (2005)
[28] Gaines, B.R., Compton, P.: Induction of Ripple-Down Rules Applied to Modeling Large Databases. J. Intell. Inf. Syst. 5(3), 211–228 (1995)
[29] Hassan Najadat and Izzat Alsmadi, Enhance Rule Based Detection for Software Fault Prone Modules. International Journal of Software Engineering and Its Applications Vol. 6, No. 1, January, 2012
[30] Zhiwei Guo, Chengqing Yuan, Peng Liu, "Study on Identification Model of Cylinder Liner-Piston Ring Using Vibration Analysis Based on Fuzzy C-means Clustering" The Open Mechanical Engineering Journal, 2012, 6, (Suppl 2: M2) 126-132 1874-155X
[31] James C. Bezdek, Robert Ehrlich, William Full, "FCM: The Fuzzy C-Means Clustering Algorithm" Computers & Geosciences Vol. 10, No. 2-3, pp. 191-203, 1984.
[32] Pejman Tahmasebi, Ardeshir Hezarkhani, "Application of Adaptive Neuro-Fuzzy Inference System for Grade Estimation; Case Study, Sarcheshmeh Porphyry Copper Deposit, Kerman, Iran" Australian Journal of Basic and Applied Sciences, 4(3): 408-420, 2010
[33] Ahmed A. M. Emam, Eisa Bashier M. Tayeb, A. Taifour Al, Ammar Hassan Habiballh, "ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM IDENTIFICATION OF AN INDUCTION MOTOR" European Journal of Science and Engineering Vol. 1, Issue 1, 2013.
[34] M. H. Olyaee, H. Abasi, M. Yaghoobi, "Using Hierarchical Adaptive Neuro Fuzzy Systems And Design Two New Edge Detectors In Noisy Images" Volume 2013, Year 2013 Article ID jsca-00030, 10 Pages doi:10.5899/2013/jsca-00030.
