Expert Systems With Applications: Wen-Hsiang Wu, Jia-You Liu, Hen-Hong Chang
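$$\hat{\alpha}_k = \frac{\sum_{i=1}^{n} \hat{\mu}_k^{m}(x_i)}{\sum_{k=1}^{C} \sum_{i=1}^{n} \hat{\mu}_k^{m}(x_i)}, \quad k = 1, 2, \ldots, C; \tag{1}$$

$\alpha_k$ denotes the proportion of class $k$ in the population, where $\alpha_k \in (0, 1)$ and $\sum_{k=1}^{C} \alpha_k = 1$;

$$\hat{\theta}_{kjl} = \frac{\sum_{i=1}^{n} x_{ijl}\, \hat{\mu}_k^{m}(x_i)}{\sum_{i=1}^{n} \hat{\mu}_k^{m}(x_i)}, \quad k = 1, 2, \ldots, C;\ j = 1, 2, \ldots, J;\ l = 1, 2, \ldots, L_j; \tag{2}$$

$\theta_{kjl}$ represents the probability of response level $l$, $l = 1, \ldots, L_j$, for the $j$th manifest variable, $j = 1, \ldots, J$, in the $k$th class, and $\sum_{l=1}^{L_j} \theta_{kjl} = 1$;

and

$$\hat{\mu}_k(x_i) = \frac{\left\{ -\sum_{j=1}^{J} \sum_{l=1}^{L_j} \ln \hat{\theta}_{kjl}^{\,x_{ijl}} - w \ln \hat{\alpha}_k \right\}^{-1/(m-1)}}{\sum_{k=1}^{C} \left\{ -\sum_{j=1}^{J} \sum_{l=1}^{L_j} \ln \hat{\theta}_{kjl}^{\,x_{ijl}} - w \ln \hat{\alpha}_k \right\}^{-1/(m-1)}}, \quad k = 1, 2, \ldots, C;\ i = 1, \ldots, n. \tag{3}$$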
Table 1
B-code representing an example involving an SLE patient.

B-Code     Causes of diseases   Viscera or bowels   Level or body parts   Pathomechanism or patterns
B-Code 1   4                    2                   1                     1
B-Code 2   7                    5                   2                     3
B-Code 3   0                    0                   3                     4
B-Code 4   0                    0                   0                     5
Table 2
Transformed data for B-code of pathomechanism or patterns.

Pathomechanism/patterns   Vacuity   Depression   Impediment   Stasis   Water-rheum   Others   Complex
Yes (1)/No (0)            1         0            1            1        1             0        0
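The transformation from Table 1 to Table 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_indicators` is hypothetical, and the mapping of codes 1 to 7 onto the seven column names is an assumption inferred from the two tables (codes 1, 3, 4, 5 in Table 1 correspond to the four columns set to 1 in Table 2).

```python
# Category order assumed from Table 2; code 0 in Table 1 is read as "no entry".
PATTERNS = ["Vacuity", "Depression", "Impediment", "Stasis",
            "Water-rheum", "Others", "Complex"]


def to_indicators(codes, categories=PATTERNS):
    """Turn a list of 1-based category codes into a 0/1 indicator vector."""
    present = {code for code in codes if code != 0}
    return [1 if idx in present else 0
            for idx in range(1, len(categories) + 1)]


# Pathomechanism column of Table 1 for the example patient.
row = to_indicators([1, 3, 4, 5])
print(dict(zip(PATTERNS, row)))
```

Each manifest variable in the latent class model is then one of these Yes/No indicators.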
W.-H. Wu et al. / Expert Systems with Applications 38 (2011) 281–287
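Minimize

$$\Gamma_{m,w}(\mu, \alpha, \Theta) = -\sum_{k=1}^{C} \sum_{i=1}^{n} \mu_k^{m}(x_i) \left\{ \ln f_k(x_i; \theta_k) + w \ln \alpha_k \right\} = -\sum_{k=1}^{C} \sum_{i=1}^{n} \mu_k^{m}(x_i) \sum_{j=1}^{J} \sum_{l=1}^{L_j} \ln \theta_{kjl}^{\,x_{ijl}} - w \sum_{k=1}^{C} \sum_{i=1}^{n} \mu_k^{m}(x_i) \ln \alpha_k \tag{4}$$

subject to

$$\sum_{k=1}^{C} \mu_k(x_i) = 1, \quad \sum_{k=1}^{C} \alpha_k = 1 \quad \text{and} \quad \sum_{l=1}^{L_j} \theta_{kjl} = 1,$$

where $m > 1$ and $w \ge 0$ are fixed constants.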
The difference between the EM and fuzzy clustering algorithms is that the fuzzy clustering algorithm permits the power (weighting exponent) of $\mu_k(x_i)$ to be increased to $\mu_k^{m}(x_i)$ and adds a weight, $w$, for $\ln \alpha_k$ to give $w \ln \alpha_k$, where $w \ge 0$ is a fixed constant.
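As a worked step, the membership update of Eq. (3) can be recovered from the objective in Eq. (4) with a Lagrange multiplier for the constraint that the memberships of each $x_i$ sum to one. This is a sketch under the notation above; the shorthand $d_{ik}$ and the multiplier $\lambda_i$ are introduced here for illustration only and do not appear in the original.

```latex
\begin{align*}
d_{ik} &= -\ln f_k(x_i;\theta_k) - w \ln \alpha_k \;>\; 0, \\
\mathcal{L}_i &= \sum_{k=1}^{C} \mu_k^{m}(x_i)\, d_{ik}
   - \lambda_i \Big( \sum_{k=1}^{C} \mu_k(x_i) - 1 \Big), \\
\frac{\partial \mathcal{L}_i}{\partial \mu_k(x_i)}
   &= m\, \mu_k^{m-1}(x_i)\, d_{ik} - \lambda_i = 0
   \;\Longrightarrow\;
   \mu_k(x_i) = \Big( \frac{\lambda_i}{m\, d_{ik}} \Big)^{1/(m-1)}, \\
\sum_{k=1}^{C} \mu_k(x_i) = 1
   &\;\Longrightarrow\;
   \mu_k(x_i) = \frac{d_{ik}^{-1/(m-1)}}{\sum_{c=1}^{C} d_{ic}^{-1/(m-1)}} .
\end{align*}
```

The last line has the same form as Eq. (3): a smaller $d_{ik}$ (a better fit of class $k$ to $x_i$) gives a larger membership, and the exponent $-1/(m-1)$ sharpens the memberships as $m$ approaches 1.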
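Given the necessary conditions of Eqs. (1)–(3), the fuzzy clustering algorithm can be summarized as follows:
Step 1: Fix $2 \le C \le n$ and fix any $\varepsilon_1, \varepsilon_2, \varepsilon_3 > 0$, given an initial value of $\hat{\mu}_k^{(0)}(x_i)$.
Step 2: Calculate $\hat{\alpha}_k^{(t)}$ with $\hat{\mu}_k^{(t-1)}(x_i)$ using Eq. (1).
Step 3: Calculate $\hat{\theta}_{kjl}^{(t)}$ with $\hat{\alpha}_k^{(t)}$ and $\hat{\mu}_k^{(t-1)}(x_i)$ using Eq. (2).
Step 4: Calculate $\hat{\mu}_k^{(t)}(x_i)$ with $\hat{\alpha}_k^{(t)}$ and $\hat{\theta}_{kjl}^{(t)}$ using Eq. (3).
Step 5: Compare $\hat{\alpha}_k^{(t)}$ with $\hat{\alpha}_k^{(t-1)}$, $\hat{\theta}_{kjl}^{(t)}$ with $\hat{\theta}_{kjl}^{(t-1)}$ and $\hat{\mu}_k^{(t)}(x_i)$ with $\hat{\mu}_k^{(t-1)}(x_i)$. Repeat Steps 2–5 until the convergence criterion is satisfied: if, for all $k$, $\lVert \hat{\alpha}_k^{(t)} - \hat{\alpha}_k^{(t-1)} \rVert < \varepsilon_1$, $\lVert \hat{\theta}_{kjl}^{(t)} - \hat{\theta}_{kjl}^{(t-1)} \rVert < \varepsilon_2$ and $\lVert \hat{\mu}_k^{(t)}(x_i) - \hat{\mu}_k^{(t-1)}(x_i) \rVert < \varepsilon_3$, then stop; otherwise set $t = t + 1$ and return to Step 2.
Step 6: Finally, $x_i$ must be classified using the decision rule: if $\hat{\mu}_s(x_i) = \max_{k = 1, \ldots, C} \hat{\mu}_k(x_i)$, $k = 1, \ldots, s, \ldots, C$, then $x_i$ is a member of Class $s$.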
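Steps 1 to 6 can be sketched in Python as below. This is an illustrative sketch rather than the authors' implementation, and several choices are assumptions: all manifest variables are binary (two response levels per variable), the initial memberships are drawn at random, a single tolerance `tol` stands in for the three tolerances, the maximum absolute change is used as the norm, and probabilities are clipped away from 0 and 1 to keep the logarithms finite. The function name `fuzzy_lca` is hypothetical.

```python
import numpy as np


def fuzzy_lca(X, C, m=1.2, w=1.0, tol=1e-6, max_iter=500, seed=0):
    """Fuzzy clustering for a latent class model with binary items.

    X is an (n, J) array of 0/1 indicators. Returns (alpha, theta, mu),
    where theta[k, j] is the probability that item j equals 1 in class k.
    """
    rng = np.random.default_rng(seed)
    n, J = X.shape
    eps = 1e-10  # assumption: numerical guard, not part of the paper
    # Step 1: random initial memberships, each row summing to 1.
    mu = rng.dirichlet(np.ones(C), size=n)
    alpha = np.full(C, 1.0 / C)
    theta = np.full((C, J), 0.5)
    for _ in range(max_iter):
        mu_m = mu ** m
        # Step 2, Eq. (1): class proportions from the powered memberships.
        alpha_new = mu_m.sum(axis=0) / mu_m.sum()
        # Step 3, Eq. (2): response probabilities per class and item.
        theta_new = (mu_m.T @ X) / mu_m.sum(axis=0)[:, None]
        theta_new = np.clip(theta_new, eps, 1 - eps)
        # Step 4, Eq. (3): d[i, k] = -ln f_k(x_i) - w * ln(alpha_k).
        log_f = X @ np.log(theta_new).T + (1 - X) @ np.log(1 - theta_new).T
        d = -(log_f + w * np.log(np.clip(alpha_new, eps, None)))
        d = np.maximum(d, eps)
        inv = d ** (-1.0 / (m - 1.0))
        mu_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 5: stop once all three parameter sets have stabilised.
        done = (np.abs(alpha_new - alpha).max() < tol
                and np.abs(theta_new - theta).max() < tol
                and np.abs(mu_new - mu).max() < tol)
        alpha, theta, mu = alpha_new, theta_new, mu_new
        if done:
            break
    return alpha, theta, mu


def classify(mu):
    """Step 6: assign each observation to the class of largest membership."""
    return mu.argmax(axis=1)
```

A call such as `alpha, theta, mu = fuzzy_lca(X, C=3)` followed by `classify(mu)` returns one class index per record. The defaults `m=1.2` and `w=1.0` follow the values the paper reports using.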
The Akaike information criterion (AIC) is a model-selection criterion that can be used to choose the number of components in a mixture, and this system uses it to select the most suitable number of clusters. Fig. 3 shows that three to five clusters are the most suitable, because they have the lowest AIC values and give the same solution for different initial values.
According to Lin et al. (2004), when m → 1 and w = 1, all parameters of the fuzzy clustering algorithm approach the results of the EM algorithm. Therefore, a stable cluster would be similar using both algorithms. This system applies m = 1.2 and w = 1 in the fuzzy clustering algorithm to identify the most stable of the various different clusters.
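For reference, the AIC of a fitted latent class model can be computed as sketched below. The formula AIC = 2p − 2 ln L is the standard definition; the parameter count used here is the usual one for a latent class model, and the function names and the example inputs (nine binary variables, a log-likelihood of −100) are hypothetical, not values from this study.

```python
def lca_num_params(num_classes, levels):
    """Free parameters: (C - 1) class proportions plus, for each class,
    (L_j - 1) response probabilities for every manifest variable j."""
    return (num_classes - 1) + num_classes * sum(L - 1 for L in levels)


def aic(log_likelihood, num_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical example: 3 classes, 9 binary variables, log-likelihood -100.
p = lca_num_params(3, [2] * 9)
print(p, aic(-100.0, p))
```

Comparing this value across candidate numbers of clusters, and across several starting values for each, mirrors the selection described above.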
Next, the system compares the classification results for different numbers of clusters and counts the discordant cases between the EM algorithm and the fuzzy clustering algorithm (FCA). The most suitable number of clusters is the one with the fewest discordant cases and the lowest discordant ratio. From Table 4, the lowest discordant case
Table 3
Frequency table for SLE disease pattern variables.

Symptoms/phenomena   Heat          Yin vacuity         Dampness     Impediment   Blood vacuity
Frequency            2102          2092                1899         1002         899

Symptoms/phenomena   Qi vacuity    Stasis              Liver        Kidney       Reversal
Frequency            758           665                 660          491          418

Symptoms/phenomena   Water-rheum   Wind                Depression   Lungs        Stomach
Frequency            307           295                 258          207          188

Symptoms/phenomena   Spleen        Unrestrained yang   Skin         Heart        Large Intestine
Frequency            166           151                 81           55           46
Fig. 2. Bar chart representing SLE disease pattern variables.
numbers appear with three clusters. Consequently, this system selects the three-cluster solution for classifying the disease patterns.
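Counting discordant cases between two clusterings requires matching the cluster labels first, because the two algorithms may number the same cluster differently. The sketch below tries every relabelling, which is practical for three to five clusters. It illustrates the idea and is not the authors' procedure; the function name `discordant_cases` is hypothetical. The last lines check the three-cluster ratio in Table 4 against the 2047 valid records.

```python
from itertools import permutations

import numpy as np


def discordant_cases(labels_a, labels_b, num_clusters):
    """Smallest number of disagreements over all relabellings of labels_b."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    best = len(a)
    for perm in permutations(range(num_clusters)):
        relabelled = np.array(perm)[b]  # map label k in b to perm[k]
        best = min(best, int((a != relabelled).sum()))
    return best


# Toy example with six records and three clusters.
em_labels = [0, 0, 1, 1, 2, 2]
fca_labels = [1, 1, 2, 2, 0, 1]
print(discordant_cases(em_labels, fca_labels, 3))

# Ratio for the three-cluster solution reported in Table 4.
ratio_three_clusters = round(100 * 286 / 2047, 2)
```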
3. Result
Because Heat and Yin vacuity occurred in almost all the valid records, they are reinserted among the main disease patterns in an additional table (Table 5). The proposed system selects the disease patterns with probability exceeding 0.5 as the major disease patterns, and then selects those with probability below 0.5 and 300 times higher than other clusters as the minor disease patterns.
The latent class model analysis yields three groups and their respective frequencies, as listed in Table 6:
1. Cluster 1: contains 887 observations and includes the major disease patterns Heat, Yin Vacuity, Qi vacuity, Dampness, and Blood vacuity, as well as the minor disease patterns Kidney and Water-rheum.
2. Cluster 2: contains 483 observations and includes the major disease patterns Heat, Yin Vacuity, Liver, Blood vacuity, Dampness, Kidney, and Impediment.
3. Cluster 3: contains 677 observations and includes the major disease patterns Heat, Yin Vacuity, Dampness, and Impediment.
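The major-pattern rule (probability exceeding 0.5) can be checked directly against the probabilities in Table 5. The sketch below is illustrative only; the names `PROBS` and `major_patterns` are hypothetical. Heat and Yin vacuity are not in Table 5, so they are not produced here, and the minor-pattern rule is not reproduced because its exact comparison between clusters is not fully specified.

```python
PATTERNS = ["Dampness", "Impediment", "Blood vacuity", "Qi vacuity", "Stasis",
            "Liver", "Kidney", "Reversal", "Water-rheum"]

# Probabilities per cluster, copied from Table 5.
PROBS = {
    1: [0.8892, 0.2923, 0.5499, 0.7101, 0.3703, 0.2814, 0.1615, 0.2111, 0.2656],
    2: [0.7754, 0.5435, 0.5455, 0.0447, 0.3147, 0.6380, 0.5910, 0.0020, 0.0003],
    3: [0.9998, 0.6683, 0.0789, 0.0010, 0.2441, 0.0492, 0.0005, 0.3256, 0.0501],
}


def major_patterns(cluster, threshold=0.5):
    """Patterns whose probability in the given cluster exceeds the threshold."""
    return [name for name, prob in zip(PATTERNS, PROBS[cluster])
            if prob > threshold]


for cluster in PROBS:
    print(cluster, major_patterns(cluster))
```

The output agrees with the major patterns listed for the three clusters once Heat and Yin vacuity are added back.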
This study assesses the accuracy of this system by comparing the results of clustering with the experience of TCM experts. The expert was asked to complete a questionnaire on the clustering results calculated by the expert system (see Appendix A). The accuracy is estimated as the ratio of the scores assigned by the TCM expert to the maximum possible score. The accuracy of each main cluster is listed in Table 6 and the overall accuracy is 77.47%.
This system can calculate the probability that each SLE patient belongs to each cluster, select the cluster with the highest probability as the main cluster for that patient, and then suggest appropriate herbal treatments. For example, if a patient displays the disease patterns Heat, Yin Vacuity, Qi vacuity, Dampness and Water-rheum, the most likely cluster calculated by this system is cluster 1 and the treatment for cluster 1 is presented, as illustrated in Fig. 4. Based on this suggestion, TCM physicians can modify the prescription dosage and herbs.
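The example patient can be worked through with the values in Tables 5 and 6. The sketch below assumes the standard latent class posterior (Bayes' rule with conditionally independent indicators) and treats the "Probability (%)" column of Table 6 as the class proportions; both are assumptions, as is the function name `cluster_posterior`. Heat and Yin vacuity are left out because Table 5 does not report their probabilities.

```python
import numpy as np

PATTERNS = ["Dampness", "Impediment", "Blood vacuity", "Qi vacuity", "Stasis",
            "Liver", "Kidney", "Reversal", "Water-rheum"]
THETA = np.array([  # rows: clusters 1-3, values from Table 5
    [0.8892, 0.2923, 0.5499, 0.7101, 0.3703, 0.2814, 0.1615, 0.2111, 0.2656],
    [0.7754, 0.5435, 0.5455, 0.0447, 0.3147, 0.6380, 0.5910, 0.0020, 0.0003],
    [0.9998, 0.6683, 0.0789, 0.0010, 0.2441, 0.0492, 0.0005, 0.3256, 0.0501],
])
ALPHA = np.array([0.4672, 0.2325, 0.3003])  # class proportions from Table 6


def cluster_posterior(present):
    """Posterior probability of each cluster given the patterns present."""
    x = np.array([1.0 if name in present else 0.0 for name in PATTERNS])
    log_post = (np.log(ALPHA)
                + x @ np.log(THETA).T
                + (1 - x) @ np.log(1 - THETA).T)
    post = np.exp(log_post - log_post.max())  # subtract max for stability
    return post / post.sum()


posterior = cluster_posterior({"Dampness", "Qi vacuity", "Water-rheum"})
best_cluster = int(posterior.argmax()) + 1
print(best_cluster, posterior.round(4))
```

Under these assumptions the patient is assigned to cluster 1, in line with the example in the text, mainly because Qi vacuity and Water-rheum are very unlikely in clusters 2 and 3.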
4. Discussion
In the proposed diagnostic system, the changing manifestations of SLE are summarized into three main disease patterns, which simplifies disease pattern complexity and helps TCM physicians indicate concordant treatments. Good accuracy (77.47%) is achieved in diagnosing SLE compared to the experience of TCM physicians. More importantly, the proposed system can mine the implications of the clinical database, achieving something that even TCM experts have not proposed. In SLE, Qi vacuity and Water-rheum are two key clues for differentiating disease pattern clusters. Both of these clues are infrequent disease patterns but, once they appear, indicate a critical point of the disease progression. From the results, main cluster 1 comprises patients who are affected by more serious conditions and who require immediate consultation and intervention.
Based on the experience of TCM experts, this study collected commonly used herbal formulas and herbs and recommended them for the different main clusters, as listed in Table 7. This latent class model-
Fig. 3. Akaike's information criterion (AIC) in different initial iteration values and different clusters.
Table 4
Discordant case numbers and ratio of the classification results between the EM and FCA algorithms in different clusters.

EM algorithm versus FCA     5 Clusters   4 Clusters   3 Clusters
Discordant case numbers     933          660          286
Discordant case ratio (%)   45.58        32.24        13.97
Table 5
The probability of the selected B-codes in main clusters.

Cluster   Dampness   Impediment   Blood vacuity   Qi vacuity   Stasis   Liver     Kidney    Reversal   Water-rheum
1         0.8892+    0.2923       0.5499+         0.7101+      0.3703   0.2814    0.1615−   0.2111     0.2656−
2         0.7754+    0.5435+      0.5455+         0.0447       0.3147   0.6380+   0.5910+   0.0020     0.0003
3         0.9998+    0.6683+      0.0789          0.0010       0.2441   0.0492    0.0005    0.3256     0.0501

Note: major disease pattern is marked as + and minor disease pattern is marked as −.
Table 6
Results of the main clusters in the B-code using the latent class model.

Cluster   Probability (%)   Major patterns                                                              Minor patterns
1         46.72             Heat, Yin Vacuity, Dampness, Qi vacuity, and Blood vacuity                  Kidney, Water-rheum
2         23.25             Heat, Yin Vacuity, Kidney, Liver, Dampness, Impediment, and Blood vacuity
3         30.03             Heat, Yin Vacuity, Dampness, and Impediment
based system performs well in diagnosing patients with SLE, and
may also provide treatment suggestions for the various clusters.
Little consensus exists among practitioners regarding TCM diagnosis and treatment for certain diseases, particularly those with variable manifestations, such as rheumatoid arthritis (Zhang, Bausell, Lao, & Lee, 2004). Using B-code can help integrate different clinical databases involving different experts.
As an interface, the B-code combines all the TCM diagnostic attributes and transforms the subjective clinical descriptions into quantifiable data. Although this study uses a data set from a single expert as an example, the methodology can be applied to integrate clinical databases from different TCM experts. After merging these clinical databases, it is possible to establish a more comprehensive diagnostic expert system.
Regarding other expert systems, the Bayesian network is another data-driven method for extracting expert knowledge, but it cannot disclose the thinking process and diagnostic logic as the system presented here can. Furthermore, the system presented here can deal with infrequent attributes, which are hard to manage in a Bayesian network (Wang et al., 2004). Sometimes, those infrequent attributes are important clues in differentiating disease patterns and determining therapeutic strategies. Consequently, the proposed system selects attributes with probabilities exceeding certain thresholds as minor disease patterns (a probability five times higher than in the other clusters in this study). However, problems of high dimensionality occur because of the excessive number of disease patterns. In the proposed system, variables with frequencies of less than 295 were eliminated to reduce the dimensionality.
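The frequency cut-off can be illustrated with the counts in Table 3. This sketch applies the rule as literally stated (variables with a frequency below 295 are dropped); the function name `keep_frequent` is hypothetical. Read this way, Wind, with a frequency of exactly 295, is retained.

```python
# Frequencies copied from Table 3.
FREQUENCIES = {
    "Heat": 2102, "Yin vacuity": 2092, "Dampness": 1899, "Impediment": 1002,
    "Blood vacuity": 899, "Qi vacuity": 758, "Stasis": 665, "Liver": 660,
    "Kidney": 491, "Reversal": 418, "Water-rheum": 307, "Wind": 295,
    "Depression": 258, "Lungs": 207, "Stomach": 188, "Spleen": 166,
    "Unrestrained yang": 151, "Skin": 81, "Heart": 55, "Large Intestine": 46,
}


def keep_frequent(freqs, minimum=295):
    """Keep variables whose frequency is at least `minimum`."""
    return [name for name, count in freqs.items() if count >= minimum]


kept = keep_frequent(FREQUENCIES)
print(len(kept), kept)
```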
Fuzzy neural network (FNN) is another way of constructing an expert system, but it is also unable to construct expert knowledge. An FNN classifier must be based on rules that reflect how the experts work, and thus its application is limited to the original data set (Xu, Meng, Wang, Lu, & Li, 2009). In contrast, this system can extract more expert knowledge as more databases accumulate. Compared with FNN, this system did not achieve sufficiently good accuracy, which may result from the limited completeness of the doctor's data and the loss of infrequent but important variables.

Fig. 4. Traditional Chinese medicine diagnostic system for patients with SLE.

Table 7
The recommended herbal treatments of the expert system.

Cluster 1
  Major disease patterns: Heat, Yin Vacuity, Dampness, Qi vacuity, and Blood vacuity
  Minor disease patterns: Kidney, Water-rheum
  Herbal formulas: Polyporus decoction; Sweet Dew Beverage; Anemarrhena, Phellodendron, and Rehmannia Pill
  Herbs: Miltiorrhizae Radix; Codonopsitis Radix

Cluster 2
  Major disease patterns: Heat, Yin Vacuity, Kidney, Liver, Dampness, Impediment, and Blood vacuity
  Herbal formulas: Large Gentian and Turtle Shell Powder; Sweet Dew Beverage; Lycium Berry, Chrysanthemum, and Rehmannia Pill
  Herbs: Miltiorrhizae Radix; Millettiae Radix et Caulis

Cluster 3
  Major disease patterns: Heat, Yin Vacuity, Dampness, and Impediment
  Herbal formulas: Sweet Dew Beverage; Anemarrhena, Phellodendron, and Rehmannia Pill
Although the proposed system can identify the main cluster and propose herbal treatments for SLE patients, it lacks decision rules between disease patterns and symptoms. Because the definitions of symptoms in TCM remain incomplete, it is difficult to create a database of symptoms.
This system may assist TCM physicians in identifying the main
clusters of SLE patients. This method can be used to interpret the
decision rules used in clustering the TCM disease pattern, as well
as for future construction of a clinical decision-support system.
The system also has potential to serve as a teaching system for
TCM students to help them in learning clinical experience from
experts.
To summarize, this expert system gathered 2047 valid records and classified three clusters of key disease patterns. Compared with the experience of the TCM expert, the accuracy rate is 77.47%. This diagnostic system helps determine the disease patterns of SLE and may help TCM physicians in making clinical suggestions.
Acknowledgements
The authors thank the Committee of Chinese Medicine and Pharmacy, Department of Health, Executive Yuan of Taiwan, ROC for supporting this research under Contract Nos. CCMP-97-RD-026, DOH96-TD-I-111-TM and CCMP95-RD-044. The authors also thank Professor Tzung-Yan Lee for his advice and kind help in this study.
Appendix A. The questionnaire of accuracy of the clustering
results of latent class model
This questionnaire aims to determine whether SLE patients can be divided into different classical disease pattern clusters to serve as a reference for clinical treatment. Based on your clinical experience, if SLE patients having the B-codes in this table are suitable for assignment to the disease pattern cluster in this list, please mark a check in the right column (1 stands for Yes and 0 stands for No).
References
Bernatsky, S., Boivin, J. F., Joseph, L., Manzi, S., Ginzler, E., Gladman, D. D., et al. (2006). Mortality in systemic lupus erythematosus. Arthritis and Rheumatism, 54(8), 2550–2557.
Chang, H. H., Wu, W. H., Chen, J. C., Lo, L. C., & Ma, C. C. (2000). Disease coding system for traditional Chinese medicine. Journal of Chinese Medicine, 11(3), 123–128.
Chang, H. H., Wu, W. H., Chen, J. C., & Lin, C. T. (2008). Determining the disease patterns in B-code data from SLE patients. Biomedical Soft Computing and Human Sciences, 13(1), 63–68.
Curtis, J. R., Westfall, A. O., Allison, J., Bijsma, J. W., Freeman, A., George, V., et al. (2006). Population-based assessment of adverse events associated with long-term glucocorticoid use. Arthritis Care and Research, 55(3), 420–426.
Danchenko, N., Satia, J. A., & Anthony, M. S. (2006). Epidemiology of systemic lupus erythematosus: A comparison of worldwide disease burden. Lupus, 15, 308–318.
Henderson, J., Granell, R., Heron, J., Sherriff, A., Simpson, A., Woodcock, A., et al. (2008). Associations of wheezing phenotypes in the first 6 years of life with atopy, lung function and airway responsiveness in mid-childhood. Thorax, 63, 974–980.
Hesketh, S. R., & Skrondal, A. (2008). Classical latent variable models for medical research. Statistical Methods in Medical Research, 17, 5–32.
Hochberg, M. C. (1997). Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis and Rheumatism, 40(9), 1725.
Lin, C. T., Chen, C. B., & Wu, W. H. (2004). Fuzzy clustering algorithm for latent class model. Statistics and Computing, 14, 299–310.
Maddison, P. J. (2002). Is it SLE? Best Practice & Research Clinical Rheumatology, 16, 167–180.
Mcelhone, K., Abbott, J., & Teh, L. S. (2006). A review of health related quality of life in systemic lupus erythematosus. Lupus, 15, 633–643.
Parker, B. J., & Bruce, I. N. (2007). High dose methylprednisolone therapy for the treatment of severe SLE. Lupus, 16, 387–393.
Rahman, A., & Isenberg, D. A. (2008). Mechanisms of disease: Systemic lupus erythematosus. The New England Journal of Medicine, 358, 929–939.
Shevlin, M., Murphy, J., Dorahy, M. J., & Adamson, G. (2007). The distribution of positive psychosis-like symptoms in the population: A latent class analysis of the National Comorbidity Survey. Schizophrenia Research, 89, 101–109.
Wang, X. W., Qu, H. B., Liu, P., & Cheng, Y. Y. (2004). A self-learning expert system for diagnosis in traditional Chinese medicine. Expert Systems with Applications, 26, 557–566.
Xu, L. S., Meng, M. Q. H., Wang, K. Q., Lu, W., & Li, N. M. (2009). Pulse images recognition using fuzzy neural network. Expert Systems with Applications, 36(2), 3805–3811.
Zhang, G. G., Bausell, B., Lao, L., & Lee, W. L. (2004). The variability of TCM pattern diagnosis and herbal prescription on rheumatoid arthritis patients. Alternative Therapies in Health and Medicine, 10(1), 58–63.
Zhu, F. S. (2001). Classification of clinical types of systemic lupus erythematosus. New Journal of Traditional Chinese Medicine, 33(7), 14–15.
Disease pattern cluster: Heat, Yin Vacuity, Dampness, Blood vacuity, and Qi vacuity

Yin Vacuity   Heat   Dampness   Impediment   Blood vacuity   Qi vacuity   Stasis   Liver   Kidney   Reversal   Water-rheum   Very suitable   Suitable   Acceptable   Unsuitable   Very unsuitable
1             1      1          0            1               1            1        0       1        0          1
1             1      1          0            0               0            1        0       1        0          0
1             1      1          0            1               1            1        1       0        0          0
1             1      1          0            1               0            1        0       0        0          0
1             1      1          0            1               1            1        0       0        0          0
1             1      1          0            0               1            1        1       0        0          0
1             1      1          0            0               0            1        1       0        0          0
1             1      1          0            1               1            1        0       0        0          1

Note: this table is a part of the original questionnaire as an example.