Text Recognition Past, Present and Future

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169
Volume: 4 Issue: 6 447 - 453

______________________________________________________________________________________________________
Development of a Prototype for Critical Disease Predictions using Data Mining
Mohammad Taha Khan
Research Scholar, Suresh Gyan Vihar University
Mahal Jagatpura, Jaipur, Rajasthan
e-mail: [email protected]

Professor Dr. Shamimul Qamar
Department of Computer Networks and Communication Engg.
College of Computer Science, King Khalid University, Abha, KSA
e-mail: [email protected]

Dr. Ripu Ranjan Sinha
Associate Dean Research, Suresh Gyan Vihar University
Jaipur, India
e-mail: [email protected]

Abstract - The goal of this paper is to present a prototype model for breast cancer prediction, along with the prediction of heart disease, by employing data mining techniques. The data used in the study were retrieved from Public-Use Data, which is available online. The data comprised 699 and 909 records for breast cancer and heart disease, respectively. For prediction and mining, two decision tree algorithms, C4.5 and C5.0, were applied to the data, and the results of both algorithms on both data sets were compared. The paper also outlines the significance of evidence based medicine (EBM), a novel approach to the healthcare decision making process [5]. Clinical decisions need to be supported by scientific evidence, which ensures that they are sound and effective. The paper also describes the importance of data mining in modern healthcare.
Keywords - Health care prediction, data mining, EBM

__________________________________________________*****_________________________________________________
IJRITCC | June 2016, Available @ http://www.ijritcc.org

I. INTRODUCTION
In the healthcare and medical sector, patients need accurate and precise diagnosis and treatment so that their requirements are met and they receive high quality, affordable care. Literature suggests that the quality of the service provided to patients in a healthcare organization refers to accurate diagnosis and to treatments that are instrumental in treating the condition efficiently [8]. Furthermore, hospitals and healthcare organizations also focus on reducing their costs, and therefore aim at reducing the costs associated with clinical testing by using computer based systems and decision support systems. The majority of hospitals have adopted hospital information systems to store, record and manage patient data [7]. These systems hold large amounts of data, stored in the form of images, text, charts and numbers. Such data contains hidden information, which can be extracted from this large pool of information to help clinical decision making. The goal of this research is to investigate how data can be turned into useful information that helps healthcare practitioners in their decision making.

An efficient and effective prediction system designed specifically for cancer and heart disease can be instrumental in improving the effectiveness of healthcare organizations and can help clinicians take sound decisions. It is essential that clinicians and patients are aware of the dangers of fatal diseases such as cancer and heart disease, which require efficient treatment. Consequently, modern healthcare organizations have adopted data mining techniques. Data mining techniques improve the efficiency of classification and prediction systems and can therefore aid medical practitioners in their decision making process. This can improve the quality of care for patients while reducing operational costs, and can lay the foundation for further clinical studies.

The goal of this research is to apply the C4.5 and C5.0 classification algorithms to publicly available datasets on heart disease and breast cancer for prediction and analysis of the data, and to compare the results of both algorithms.

The paper is organized as follows: Section II discusses the significance of data mining in the healthcare sector, Section III discusses the C4.5 and C5.0 classification algorithms, Section IV presents the data mining case studies of breast cancer and heart disease prediction together with the working prediction model, Section V discusses the importance of open source software in healthcare, Section VI presents the Care2x based pathology management system, and Section VII provides the conclusion and directions for future research.

II. IMPORTANCE OF DATA MINING IN HEALTHCARE
The use of information and communication technology in the healthcare industry has increased significantly. Consequently, the use of medical databases in healthcare organizations has increased, so that patient data and information can be managed efficiently. The use of technology is the primary motivator for researchers and professionals to adopt information technology and decision support systems. As the volume of stored data grows, data mining techniques can be used to extract hidden information and knowledge efficiently, which can improve the quality of care provided to patients while reducing costs. Data mining techniques can be used to address several questions [6]:
- How can patient treatment be improved through analysis of patient data?
- How can patient records be used to make sound clinical decisions?
- How can patient data and records be used to treat cancer patients? Should the treatment comprise chemotherapy or radiation therapy, alone or together?

The goal of data mining is pattern recognition: it aims at discovering patterns in the data that cannot be detected using conventional statistical methods. Data mining techniques and procedures are based on several concepts such as statistical analysis, machine learning and visualization. Data mining algorithms apply the model that best fits the characteristics of the data. Data
mining models are classified into two categories [6], which are discussed as follows:

Descriptive Models are used to identify patterns in the data with the help of association rules, visualization, clustering techniques and pattern recognition.

Predictive Models are used for prediction. For instance, when diagnosing a disease, predictive modelling can be used to determine the symptoms that are present in other patients, along with the viable treatment options. Prediction uses classification and regression analysis of the data, along with time series analysis. Research suggests that classification is the fundamental aspect of predictive modelling. Fig.1 shows the importance of data mining in modern healthcare practice.

Fig.1. Use of data mining in better health delivery [4]

III. CLASSIFICATION TECHNIQUES IN DATA MINING
Classification is an important aspect of predictive modelling; it aims at finding a model from a given set of data. It is a mapping strategy that maps each data item into one of a set of predefined classes. The model follows a set of rules, which depend on the characteristics of the data under review. These rules are also applicable to the classification of data items that are not yet known and will be encountered in the future. Classification is one of the most important techniques in data mining, and it can be used efficiently in medical diagnosis. For instance, a new patient can be diagnosed from his or her symptoms by applying a classification model built from the already known cases.

Classification needs to address the following:
Function $f: D \rightarrow C$, where each $t_i \in D$ is mapped to $f(t_i)$ belonging to some class $C_j$ [3].
Where:
1. $D$ is a database of patients with tuples $(x_1, x_2, \ldots, x_n)$.
2. $x_1, x_2, \ldots, x_n$ are the values of attributes $A_1, A_2, \ldots, A_n$ relevant to a particular disease.
3. $C = \{C_1, C_2, \ldots, C_n\}$ is the set of classes of the disease, depending on its severity.

A classification model can be built using a decision tree, which is an important tool for data exploration and knowledge discovery. The decision tree is a predictive model that classifies the data in the form of a tree. Trees that predict a class label are called classification trees, and trees that predict a numeric value are called regression trees. Decision trees require training and testing data. The training data, a subset of the available pool of data, is used to construct the tree. The testing data is used to determine the accuracy of the tree, that is, to check whether its cases are misclassified by the decision tree. Consequently, it is associated with the reliability and validity of the model.

A. Classification Algorithms C4.5 and C5.0

C4.5 algorithm: [14] C4.5 is a classification algorithm that builds decision trees from training data. The construction of the tree is based on information entropy. At each node, C4.5 selects the attribute that most effectively splits the set of samples into subsets, each enriched in a particular class. The splitting criterion is the normalized information gain: the attribute with the highest normalized information gain is chosen to make the decision. After the decision is made, the algorithm recurses on the smaller sublists. The C4.5 algorithm has the following base cases:
1. All the samples in the list belong to the same class. In this case, a leaf node is created for the decision tree, which selects that class.
2. None of the features provide any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class.
3. An instance of a previously unseen class is encountered. Again, C4.5 creates a decision node higher up the tree using the expected value.

i). Tree Generation:

Entropy and Gain are used in creating the tree. Entropy is:

$$Entropy(T) = -\sum_{i} p_i \log_2 p_i$$

Where $p_i$ is the proportion of instances in the dataset that take the $i$th value of the target attribute.
Gain is:

$$Gain(T, X) = Entropy(T) - \sum_{i} \frac{|T_i|}{|T|} \, Entropy(T_i)$$
Where $i$ is a value of $X$, $T_i$ is the subset of instances of $T$ where $X$ takes the value $i$, $|T_i|$ is its size, and $|T|$ is the number of instances in $T$.

ii). Pruning Trees:
A pruning algorithm is used to reduce the error, improve accuracy and avoid overfitting. Pruning a tree refers to substituting an entire subtree by a leaf. The replacement takes place if the expected error rate of the subtree is higher than that of the single leaf. In this research, we create the classification tree and use pruning to simplify it.

B. C5.0 Algorithm

The C5.0 and C4.5 algorithms have identical pseudo code, but they differ in several respects. The improvements made in C5.0 as compared to C4.5 are listed as follows:

1. Speed - C5.0 is faster than C4.5 by orders of magnitude.

2. Memory usage - C5.0 is more efficient than C4.5 in terms of memory.

3. Smaller decision trees - C5.0 obtains results similar to those of C4.5 with considerably smaller decision trees.

4. Support for boosting - C5.0 supports boosting, which increases its accuracy and efficiency.

5. Weighting - C5.0 has the ability to weight the data based on different attributes.

6. Winnowing - C5.0 has the ability to winnow the attributes, discarding those that appear unhelpful, which C4.5 does not offer.

IV. CASE STUDIES OF CANCER AND HEART DISEASE PREDICTION
This section focuses on the two case studies used: prediction of breast cancer and prediction of heart disease.

A. Case Study I: Breast Cancer Prediction

According to experts, breast cancer is the leading cause of death among women all over the world [19]. In developing countries such as India, Pakistan and Bangladesh, the incidence of breast cancer among women is on the rise and it is considered to be the primary cause of death among women. According to the data compiled by the Indian Council of Medical Research (ICMR), breast cancer is a serious problem in India. Reports suggest that it prevails in both urban and rural dwellings. Research suggests that one in twenty-two women is likely to suffer from breast cancer [12], whereas in America the figure is one in eight. The University of Wisconsin Hospitals, Madison (Dr. William H. Wolberg) [16] have made a breast cancer dataset available online. This dataset is used for the breast cancer prediction case study.

i). Breast Cancer data Attributes:
Total Cases: 599

Attribute                         Domain
1. Sample code number             id number
2. Clump Thickness                1 - 10
3. Uniformity of Cell Size        1 - 10
4. Uniformity of Cell Shape       1 - 10
5. Marginal Adhesion              1 - 10
6. Single Epithelial Cell Size    1 - 10
7. Bare Nuclei                    1 - 10
8. Bland Chromatin                1 - 10
9. Normal Nucleoli                1 - 10
10. Mitoses                       1 - 10
11. Class                         2 for benign, 4 for malignant

ii). Specification of Attributes:
The target attribute:           Class
Sample code number:             ignore
Clump Thickness:                continuous
Uniformity of Cell Size:        continuous
Uniformity of Cell Shape:       continuous
Marginal Adhesion:              continuous
Single Epithelial Cell Size:    continuous
Bare Nuclei:                    continuous
Bland Chromatin:                continuous
Normal Nucleoli:                continuous
Mitoses:                        continuous

The target attribute is Class, which can take two values: 2 (benign) or 4 (malignant). Malignant is cancerous: malignant tumors can invade and destroy nearby tissue and spread to other parts of the body. Benign is not cancerous: benign tumors may grow larger but do not spread to other parts of the body. The class attribute is given the values 2 and 4 to avoid conflict with the values of the other attributes. The remaining attributes listed above can take values from 1 to 10. The C4.5 and C5.0 programs support three types of files: the names file provides names for the classes, attributes and attribute values; the data file describes the training cases used for generating the decision tree and/or rules; and the test file is used to evaluate the produced classifier.

iii). Decision Tree and Rules Generated:
Fig.2 depicts the tree generated using the C4.5 algorithm. The tree size is 29, with 5 train errors. Here, 5 train errors
means that, after running the 400 records through C4.5, errors were noted in five cases.

Fig.2. Tree generated before pruning using C4.5

Pruning a tree replaces a whole subtree by a leaf, which reduces the size of the tree. Fig.3 depicts the tree after pruning. The tree size is 17.

Fig.3. Tree generated after pruning using C4.5

Fig.4 shows the output generated after running C5.0, which reads 400 cases with 10 attributes.

Fig.4. Rules generated using C5.0

B. Case Study II: Heart Disease Prediction
Heart disease is also one of the deadliest diseases. Because of present-day lifestyles, heart disease is becoming very common. Prior knowledge of the chances of getting a heart disease is very helpful for patients as well as clinicians in planning better and more effective treatment. This case study concerns the prediction of heart disease using the heart disease data set. The algorithms used are again C5.0 and C4.5. The purpose is to predict the presence or absence of heart disease given the results of various medical tests carried out on a patient.

We have used a total of 909 records with 75 medical attributes. This dataset is taken from the Cleveland Heart Disease database [17]. We have split the records into two categories: a training dataset (455 records) and a testing dataset (454 records). The records for each category are selected randomly. The Diagnosis attribute is the target (predictable) attribute: value 1 denotes patients with heart disease and value 0 denotes patients with no heart disease. PatientID is used as the key; the rest are input attributes. It is assumed that problems such as missing data, inconsistent data and duplicate data have all been resolved.

i). Attribute Information:
------------------------
1. Age (age in years)
2. Sex (1 = male; 0 = female)
3. Chest pain type (4 values)
   -- Value 1: typical angina
   -- Value 2: atypical angina
   -- Value 3: non-anginal pain
   -- Value 4: asymptomatic
4. Resting blood pressure
5. Serum cholesterol in mg/dl
6. Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
7. Resting electrocardiography results (values 0, 1, 2)
   -- Value 0: normal
   -- Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
   -- Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
8. Maximum heart rate achieved
9. Exercise induced angina (1 = yes; 0 = no)
10. Old peak = ST depression induced by exercise relative to rest
11. The slope of the peak exercise ST segment
   -- Value 1: upsloping
   -- Value 2: flat
   -- Value 3: downsloping
12. Number of major vessels (0-3) colored by fluoroscopy
13. Thal: 3 = normal; 6 = fixed defect; 7 = reversible defect

ATTRIBUTE TYPES
------------------------
Real: 1, 4, 5, 8, 10, 12
Ordered: 11
Binary: 2, 6, 9
Nominal: 3, 7, 13

Variable to be predicted
------------------------
Absence (1) or presence (2) of heart disease

ii). Decision Tree Rules Generated By C5.0
See5 [Release 2.06]    Sat Nov 21 19:36:52 2013
Read 150 cases (13 attributes) from heartdisease.data

DECISION TREE:

Thal > 6:
:...ChestPain > 3: 2 (32/2)
:   ChestPain <= 3:
:   :...STSlope <= 1: 1 (8/2)
:       STSlope > 1: 2 (12/3)
Thal <= 6:
:...OldPeak > 2.8: 2 (6)
    OldPeak <= 2.8:
    :...ChestPain <= 3: 1 (60/6)
        ChestPain > 3:
        :...Vessels <= 0: 1 (23/6)
            Vessels > 0: 2 (9/1)

RULES:

Rule 1: (60/6, lift 1.6)
    ChestPain <= 3
    OldPeak <= 2.8
    Thal <= 6
    -> class 1 [0.887]

Rule 2: (51/5, lift 1.6)
    ChestPain <= 3
    STSlope <= 1
    -> class 1 [0.887]

Rule 3: (65/9, lift 1.5)
    OldPeak <= 2.8
    Vessels <= 0
    Thal <= 6
    -> class 1 [0.851]

Rule 4: (27/1, lift 2.1)
    ChestPain > 3
    Vessels > 0
    -> class 2 [0.931]

Rule 5: (32/2, lift 2.0)
    ChestPain > 3
    Thal > 6
    -> class 2 [0.912]

Rule 6: (31/3, lift 2.0)
    STSlope > 1
    Thal > 6
    -> class 2 [0.879]

Rule 7: (6, lift 2.0)
    OldPeak > 2.8
    Thal <= 6
    -> class 2 [0.875]

Default class: 1

Evaluation on training data (150 cases):

            Rules
      ----------------
        No      Errors
         7   20(13.3%)   <<

       (a)   (b)    <-classified as
      ----  ----
        77     6    (a): class 1
        14    53    (b): class 2

C. Working Prediction Model for Cancer
As part of our project we have designed a working model for cancer/heart disease prediction. This model predicts the breast cancer or heart disease class based on the rules created by the C4.5 and C5.0 algorithms. Fig.5 shows the interface for input, which takes the medical profile of a patient, such as age, sex, blood pressure and blood sugar, as input and predicts the presence or absence of cancer/heart disease.

Fig.5. Interface for input

Fig.6 below shows a particular case belonging to class 1 for heart disease.

Fig.6. Interface for output
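The way a prediction model could apply the seven C5.0 rules listed in the See5 output of Section IV.B can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `RULES`, `matches` and `classify` are hypothetical, and the conflict resolution used here (each matching rule votes for its class with its confidence value, and the default class is used when no rule matches) is an assumption about how competing rules are combined.

```python
# Sketch only: the seven rules are copied from the See5 output in Section IV.B.
# Each entry is (conditions, predicted class, confidence).
RULES = [
    ({"ChestPain": ("<=", 3), "OldPeak": ("<=", 2.8), "Thal": ("<=", 6)}, 1, 0.887),
    ({"ChestPain": ("<=", 3), "STSlope": ("<=", 1)}, 1, 0.887),
    ({"OldPeak": ("<=", 2.8), "Vessels": ("<=", 0), "Thal": ("<=", 6)}, 1, 0.851),
    ({"ChestPain": (">", 3), "Vessels": (">", 0)}, 2, 0.931),
    ({"ChestPain": (">", 3), "Thal": (">", 6)}, 2, 0.912),
    ({"STSlope": (">", 1), "Thal": (">", 6)}, 2, 0.879),
    ({"OldPeak": (">", 2.8), "Thal": ("<=", 6)}, 2, 0.875),
]
DEFAULT_CLASS = 1  # "Default class: 1" in the See5 output


def matches(conditions, patient):
    """Return True if the patient record satisfies every condition of a rule."""
    for attribute, (operator, threshold) in conditions.items():
        value = patient[attribute]
        if operator == "<=" and not value <= threshold:
            return False
        if operator == ">" and not value > threshold:
            return False
    return True


def classify(patient):
    """Assumed scheme: confidence-weighted vote over all matching rules."""
    votes = {}
    for conditions, predicted_class, confidence in RULES:
        if matches(conditions, patient):
            votes[predicted_class] = votes.get(predicted_class, 0.0) + confidence
    if not votes:
        return DEFAULT_CLASS
    return max(votes, key=votes.get)


# Hypothetical patient records, used only to exercise the rules.
patient_a = {"ChestPain": 2, "OldPeak": 1.0, "Thal": 3, "STSlope": 1, "Vessels": 0}
patient_b = {"ChestPain": 4, "OldPeak": 3.0, "Thal": 7, "STSlope": 2, "Vessels": 2}
print(classify(patient_a), classify(patient_b))
```

For `patient_a` only rules 1 to 3 match, so the sketch returns class 1; for `patient_b` rules 4 to 6 match, so it returns class 2.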
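The error figure in the See5 evaluation on the training data of Section IV.B can be checked directly from the reported confusion matrix. The short sketch below recomputes it; the variable names are illustrative, and the additional measures (accuracy, and recall and precision for class 2) are not reported in the paper and are included only as an example of what the same table supports.

```python
# Confusion matrix from the See5 evaluation on 150 training cases.
# Keys are (actual class, class assigned by the rules).
confusion = {
    (1, 1): 77, (1, 2): 6,
    (2, 1): 14, (2, 2): 53,
}

total = sum(confusion.values())
errors = confusion[(1, 2)] + confusion[(2, 1)]  # off-diagonal cells
error_rate = errors / total
accuracy = 1 - error_rate

# Class 2 measures: how many actual class 2 cases were found (recall)
# and how many class 2 predictions were right (precision).
recall_class2 = confusion[(2, 2)] / (confusion[(2, 1)] + confusion[(2, 2)])
precision_class2 = confusion[(2, 2)] / (confusion[(1, 2)] + confusion[(2, 2)])

print(f"cases={total} errors={errors} error rate={error_rate:.1%}")
print(f"accuracy={accuracy:.1%} recall(2)={recall_class2:.3f} precision(2)={precision_class2:.3f}")
```

The recomputed 20 errors out of 150 cases agree with the "20(13.3%)" line of the See5 output.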
V. IMPORTANCE OF OPEN SOURCE SOFTWARE IN HEALTHCARE
Providing quality services at affordable prices is a major challenge for healthcare organizations (hospitals and medical centres). In terms of ICT infrastructure, hardware and software are capital goods for an organization. A lower software price means that ICT is available at lower cost. This helps an organization add to its resources and improve its processes.

Open Source software is an important and growing class of software. Open Source software is distinguished not by programming language, operating environment, nor application domain, but rather by the license(s) that governs the use, distribution, and, most importantly, the rights to access and modify the software's source code [21].

The philosophy of open source permits users to use, change, and improve the software, and to redistribute it in modified or unmodified forms. Together, software source code, licensing, and community have dramatically changed many conventional assumptions about software and the software industry itself.

The acceptability of open source software is increasing day by day. Some of the reasons for using open source software include low total cost of ownership, lack of software piracy issues, availability of source code leading to a high degree of customizability and scalability, and extensive support freely available on the Internet. When the source code of a program is available, anyone can contribute by improving the code, adding new features, correcting errors, etc.

Healthcare is one of the important sectors for the economy of any developing country; low cost ICT solutions for healthcare are very beneficial for economic growth. Open source software has the potential to be a key player in low cost, quality healthcare delivery. Care2x, OpenVista and OpenEMR are some of the free and open source healthcare software packages used worldwide.

VI. MODIFIED CARE2X BASED ADVANCED PATHOLOGY MANAGEMENT SYSTEM (APMS)
We have developed an Advanced Pathology Management System based on Care2x for the pathology department of UrgentCare Hospital. UrgentCare is one of the premier hospitals in India, with 160 beds. UrgentCare Pathology reports 100-150 cases per day. To improve the work process of this department, an advanced pathology management system was required, and keeping the cost low was our primary goal. For this purpose we chose an existing open source software package, Care2x, and customised it to our requirements. CARE2X is an open source Web based Integrated Healthcare Environment (IHE) [22] under GNU/GPL. The project was started in May 2002, and the development team has since grown to over 100 members from over 20 countries. Its source code is freely distributed and available to the general public.

CARE2X [22] HIS is built upon other open-source projects: the Apache web server from the Apache Foundation, the script language PHP [23] and the relational database management system mySQL [24]. CARE2X is modular and highly scalable, so it is easy to scale the application as required. CARE2X is currently composed of four major components. Each of these components can also function individually. These components are HIS - Hospital/Health service Information System, PM - Practice (GP) management, CDS - Central Data Server, and HXP - Health Xchange Protocol [22]. The advanced pathology management system provides features such as Grossing, Sectioning, Reporting and Sample tracking with decision support.

The rules generated are used in this system to help the clinician in decision making.

Fig.7. Grossing option in APMS

Fig.8. Sample tracking in APMS

At the time of reporting, the system prompts suggestions based on the rules. The suggestions are populated based on the sample symptoms. This system is a step towards evidence based medicine.

VII. CONCLUSIONS AND FUTURE WORK
Comparing the performance of the two algorithms, C5.0 handles missing values easily, whereas C4.5 shows some errors due to missing values. On running the breast cancer dataset of 400 records, C4.5 shows 5 train errors whereas C5.0 shows only
3 train errors. C5.0 produces rules in a very easily readable form, whereas C4.5 generates the rule set in the form of a decision tree.

Data mining techniques play an important role in finding patterns and extracting knowledge from large volumes of data. They are very helpful in providing better patient care and effective diagnostic capabilities. Evidence Based Medicine (EBM) is a new direction in modern healthcare.

EBM is an important approach to making clinical decisions about the care of individual patients. Such a decision is based on the best available evidence. Its task is to prevent, diagnose and medicate diseases using medical evidence. It is all about providing the best evidence, at the right time and in the right manner, to the clinician. External evidence-based knowledge cannot be applied directly to the patient without adjusting it to the patient's health condition. If the rules generated by this system are approved by medical experts, they can be used as evidence for further use.

CARE2X is a flexible, generic, multi-language open-source project. CARE2X is a very feature rich HIS, fully configurable for any clinical structure. After customization, it has the potential to become functional software supporting the workflows of Indian hospitals. Efforts were made to explore the possibility of providing a low cost solution to Indian hospitals.

REFERENCES

[1] Jaree Thongkam, Guandong Xu, Yanchun Zhang and Fuchun Huang, "Breast Cancer Survivability via AdaBoost Algorithms", HDKM 2008, Wollongong, Australia.
[2] Diana Dumitru, "Prediction of recurrent events in breast cancer using the Naive Bayesian classification", Annals of University of Craiova, Math. Comp. Sci. Ser., Volume 36(2), 2009, Pages 92-96, ISSN: 1223-6934.
[3] Kaur, H., Wasan, S. K., "Empirical Study on Applications of Data Mining Techniques in Healthcare", Journal of Computer Science 2(2), 194-200, 2006.
[4] Nevena Stolba and A Min Tjoa, "The relevance of data warehousing and data mining in the field of evidence-based medicine to support healthcare decision making", December 24, 2005.
[5] Wu, R., Peters, W., Morgan, M.W., "The Next Generation Clinical Decision Support: Linking Evidence to Best Practice", J Healthcare Information Management 16(4), 50-55, 2002.
[6] Siri Krishan Wasan, Vasudha Bhatnagar and Harleen Kaur, "The impact of data mining techniques on medical diagnostics", Data Science Journal, Volume 5, 19 October 2006.
[7] Herbert Diamond, Michael P. Johnson, Rema Padman, Kai Zheng, "Clinical Reminder System: A Relational Database Application for Evidence-Based Medicine Practice", INFORMS Spring National Conference, Salt Lake City, Utah, April 26, 2004.
[8] Sellappan Palaniappan, Rafiah Awang, "Web-Based Heart Disease Decision Support System using Data Mining Classification Modeling Techniques", Proceedings of iiWAS2007.
[9] Daniel Zeng, Hsinchun Chen, Cecil Lynch, Millicent Eidson, and Ivan Gotham, "Infectious Disease Informatics and Outbreak Detection".
[10] Burke W. Mamlin, M.D. and Paul G. Biondich, M.D., M.S., "AMPATH Medical Record System (AMRS): Collaborating toward An EMR for Developing Countries", Regenstrief Institute, Inc. and Indiana University School of Medicine, Indianapolis, IN.
[11] Ricardo Jorge Santos and Jorge Bernardino, "Global Epidemiological Outbreak Surveillance System Architecture", CISUC - Centre of Informatics and Systems of the University of Coimbra; ISEC - Engineering Institute of Coimbra, Polytechnic Institute of Coimbra, Portugal.
[12] http://www.medindia.net/news/view_news_main.asp?x=7279
[13] Vili Podgorelec, Luka Pavlic, "Managing Diagnostic Process Data Using Semantic Web", Institute of Informatics, FERI, University of Maribor, Slovenia. Twentieth IEEE International Symposium on Computer-Based Medical Systems (CBMS'07), 0-7695-2905-4/07.
[14] http://en.wikipedia.org/wiki/C4.5_algorithm
[15] Arihito Endo, Takeo Shibata, Hiroshi Tanaka, "Comparison of Seven Algorithms to Predict Breast Cancer Survival", Biomedical Soft Computing and Human Sciences, Vol.13, No.2, pp.11-16 (2008).
[16] http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Prognostic) (Breast cancer dataset).
[17] http://archive.ics.uci.edu/ml/datasets/Heart+Disease (Heart Disease dataset).
[18] DMS Tutorial: http://dms.irb.hr/tutorial/tut_dtrees.php
[19] ICMR Bulletin: http://www.icmr.nic.in/bufebruary03.pdf
[20] Tipawan Silwattananusarn and Dr. Kulthida Tuamsuk, "Data Mining and Its Applications for Knowledge Management: A Literature Review from 2007 to 2012", International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol.2, No.5, September 2012.
[21] Michael Tiemann (President, Open Source Initiative; Vice President Open Source Affairs, Red Hat), "How Open Source Software Can Save the ICT Industry One Trillion Dollars per Year", November 1, 2009.
[22] CARE2X; an Open Source Project. http://www.CARE2X.org
[23] PHP, an Open Source widely used language for web development, http://www.php.org
[24] MySQL, the largest Open Source database, used by many renowned leading organizations, http://www.mysql.com