
University of Nebraska - Lincoln


DigitalCommons@University of Nebraska - Lincoln

Biological Systems Engineering: Papers and Publications

Biological Systems Engineering

2008

Rule-based Mamdani-type fuzzy modeling of skin permeability


Deepak R. Keshwani
University of Nebraska-Lincoln, [email protected]

David D. Jones
University of Nebraska-Lincoln, [email protected]

George E. Meyer
University of Nebraska-Lincoln, [email protected]

Rhonda M. Brand
Feinberg School of Medicine at Northwestern University

Follow this and additional works at: https://digitalcommons.unl.edu/biosysengfacpub

Part of the Biological Engineering Commons

Keshwani, Deepak R.; Jones, David D.; Meyer, George E.; and Brand, Rhonda M., "Rule-based Mamdani-type
fuzzy modeling of skin permeability" (2008). Biological Systems Engineering: Papers and Publications. 80.
https://digitalcommons.unl.edu/biosysengfacpub/80

This Article is brought to you for free and open access by the Biological Systems Engineering at
DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Biological Systems
Engineering: Papers and Publications by an authorized administrator of DigitalCommons@University of Nebraska -
Lincoln.
Published in Applied Soft Computing 8 (2008), pp. 285–294; doi: 10.1016/j.asoc.2007.01.007
Copyright © 2007 Elsevier B.V. Used by permission. http://www.elsevier.com/locate/asoc
Submitted March 21, 2006; revised January 9, 2007; accepted January 31, 2007; published online February 7, 2007.

Rule-based Mamdani-type fuzzy modeling of skin permeability


Deepak R. Keshwani,1 David D. Jones,2 George E. Meyer,2 Rhonda M. Brand 3
1 Department of Biological & Agricultural Engineering,
North Carolina State University, Raleigh, NC 27695
2 Department of Biological Systems Engineering, University of Nebraska–Lincoln,
215 L. W. Chase Hall, East Campus, Lincoln, NE 68583-0726
3 Department of Internal Medicine, Evanston Northwestern Healthcare,
Feinberg School of Medicine, Evanston, IL 60201
Corresponding author — D. D. Jones, tel 402 472-6716, fax 402 472-6338, email [email protected]

Abstract
Two Mamdani-type fuzzy models (three inputs–one output and two inputs–one output) were developed to predict the permeability of compounds through human skin. The models were derived from multiple data sources including laboratory data, published databases, published statistical models, and expert opinion. The inputs to the models include information about the compound (molecular weight and octanol–H2O partition coefficient) and the application temperature. One model included all three parameters as inputs and the other model included only information about the compound. The values for molecular weight ranged from 30 to 600 Da. The values for the log of the octanol–H2O partition coefficient ranged from –3.1 to 4.34. The values for the application temperature ranged from 22 to 39 °C. The predicted values of the log of the permeability coefficient ranged from –5.5 to –0.08. Each model was a collection of rules that express the relationship of each input to the permeability of the compound through human skin. The quality of each model was determined by comparing predicted and actual fuzzy classifications and by defuzzifying the predicted outputs to get crisp values for correlating estimates with published values. A modified form of the Hamming distance measure is proposed to compare predicted and actual fuzzy classifications. An entropy measure is used to describe the ambiguity associated with the predicted fuzzy outputs. The three input model predicted over 70% of the test data within one-half of a fuzzy class of the published data. The two input model predicted over 40% of the test data within one-half of a fuzzy class of the published data. Comparison of the models shows that the three input model exhibited less entropy than the two input model.
Keywords: Mamdani fuzzy modeling, Hamming distance, skin permeability

1. Introduction

Determination of skin permeability is an important issue in the areas of transdermal drug delivery and environmental toxicity. Transdermal delivery offers a less invasive means to administer drugs. In addition, the concentration of the drugs can be maintained at a steady state. Identifying a compound's potential to be toxic via a transdermal route is critical for certain high-risk occupations such as chemical manufacturers and painters.

In the area of skin permeability, a common modeling approach is to develop empirical models from experimentally derived databases [7,9,13,14]. However, skin permeability databases are typically small in size and numerous inconsistencies exist within them. Vecchia and Bunge provide a fully validated skin permeability database where each data point met a set of defined criteria for inclusion [17]. Despite enforcing the validation criteria, the size of the database remains small and the range of the predictors is limited. Analytical approaches have been proposed by Edwards and Langer [2] to model skin permeability. However, with this approach, assumptions are made on the behavior of the system. These assumptions are difficult to validate and the resulting description of the system is often oversimplified.

The functional nature of the skin as a barrier is complex. This complexity results in uncertainty that cannot exclusively be described by random measures. Hence, predicting skin permeability can be deemed an ambiguous endeavor and fuzzy modeling provides a means to account for this ambiguity. Estimating the skin permeability coefficients of compounds is vital to determining potential for toxic exposure and transdermal drug delivery.

Pannier et al. [11] and Keshwani et al. [6] have shown that rule-based fuzzy modeling of skin permeability is a


promising approach. However, the rules for these models were strictly data driven and examination of the results revealed inconsistencies that can be attributed to sparse data in some regions. This paper presents a Mamdani fuzzy modeling scheme where rules are derived from multiple knowledge sources such as previously published databases and models, existing literature, intuition and solicitation of expert opinion to verify the gathered information.

The output or consequence of a Mamdani-type model is represented by a fuzzy set. To assess model performance, a crisp estimate of the consequence is usually made by defuzzification methods such as the centroid, weighted average, maximum membership principle and mean membership principle [15]. The crisp values can be compared to the actual values from the data set and a correlation coefficient can be determined. Depending on the shape of the output fuzzy set, defuzzification methods do not effectively characterize the output with the corresponding ambiguity associated with the prediction. The nature of the ambiguity in the prediction might be of interest to researchers in the area of skin permeability. An alternative strategy could be implemented such that the predicted values of the output infer an ordinal set representing a three point fuzzy classification (low, medium and high) that could be compared to the actual fuzzy classification using distance measures. In addition, the ambiguity associated with the predicted fuzzy sets can be quantified by calculating entropy [4].

The purpose of this study was to develop generalized rule-based fuzzy models from multiple knowledge sources to predict skin permeability and subsequently test their performance by comparing defuzzified outputs to actual values from test data and comparing predicted and actual fuzzy classifications. The overall approach followed in this study is illustrated in Figure 1. The process begins with knowledge acquisition, continues to model building and finally tests the model performance. In the context of skin permeability, this approach is not common in that it combines information from multiple sources for model development. In the context of fuzzy modeling, the proposed approach of converting the predicted fuzzy output and the actual crisp value into fuzzy classification sets is not well defined in literature.

2. Theory

2.1. Mamdani-type fuzzy modeling

As the complexity of a system increases, the utility of fuzzy logic as a modeling tool increases. For very complex systems, few numerical data may exist and only ambiguous and imprecise information and knowledge is available. Oduguwa et al. [10] recognized and attempted to capture qualitative aspects of the engineering design process.

Figure 1. Overall approach to develop skin permeability Mamdani models.



Figure 2. Example of a Mamdani type fuzzy inference system.

Fuzzy logic allows approximate interpolation between input and output situations [15]. Two main types of fuzzy modeling schemes are the Takagi–Sugeno model and the fuzzy relational model. The Takagi–Sugeno scheme is a data driven approach where membership functions and rules are developed using a training data set. The parameters for the membership functions and rules are subsequently optimized to reduce training error. The relationship in each rule is represented by a localized linear function [1]. The final output is a weighted average of a set of crisp values. The Mamdani scheme is a type of fuzzy relational model where each rule is represented by an IF–THEN relationship. It is also called a linguistic model because both the antecedent and the consequent are fuzzy propositions [1]. The model structure is manually developed and the final model is neither trained nor optimized. The output from a Mamdani model is a fuzzy membership function based on the rules created. Since this approach is not exclusively reliant on a data set, with sufficient expertise on the system involved, a generalized model for effective future predictions can be obtained.

Consider a simple two input–one output Mamdani type fuzzy model. The rule structure is represented in Figure 2. Each row of membership functions constitutes an IF–THEN rule, also defined by the user. Depending on the values used, the input membership functions are activated to a certain degree. The contributed output from each rule reflects this degree of activation. The final output is a fuzzy set created by the superposition of individual rule actions (Figure 2).

2.2. Defuzzification methods

The fuzzy output is obtained from aggregating the outputs from the firing of the rules. Subsequent defuzzification methods on the fuzzy output produce a crisp value. Two common techniques for defuzzification are the maxima methods and area-based methods, which are briefly explained. Several such methods are explained by Ross [15].

2.2.1. Maxima methods

The maxima methods identify the locations where maximum membership occurs. Either one such point is selected as the defuzzified value (Figure 3A) or an average of all points with maximum membership is selected as the crisp value (Figure 3B). The advantages of the maxima methods are their simplicity and speed [12]. The major disadvantage is loss of information as only rules of maximum activation are considered.

Figure 3. Different defuzzification methods: (A) max-membership principle; (B) mean-max-membership principle; (C) centroid principle. Note: x* is the defuzzified value.

2.2.2. Area-based methods

A popular area-based defuzzification procedure is the centroid method. As the term implies, the center of gravity of the area under the output membership function is selected as the crisp value (Figure 3C). This method, however, does not work well when the output membership function has non-convex properties.

Depending on the shape of the membership function of the output, defuzzification routines may not produce effective values for the predicted output. For example, in Figure 4A, the predicted output indicates a high degree of ambiguity. However, the defuzzified value obtained using the mean-max membership principle does not convey this ambiguity. The centroid method has drawbacks when the output membership function is non-convex (Figure 4B). The defuzzified value is at a point that has low membership. In an effort to compensate for these drawbacks, an alternative approach to model validation is proposed that uses a distance measure to compare actual and predicted fuzzy classifications consisting of three point ordinal sets.

2.3. Distance measures between fuzzy sets

For two fuzzy sets A and B in the same universe, the Hamming distance [16] is an ordinal measure of dissimilarity.
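As a concrete illustration of comparing two three-point classification sets, the sketch below computes the Hamming distance as the sum of absolute membership differences, together with one class-aware variant based on running membership totals. The variant and the name `cumulative_distance` are assumptions made for illustration: it reproduces the D column of Table 1 for the cases checked here, but it is not claimed to be the paper's Equation (3).

```python
def hamming_distance(actual, predicted):
    """Sum of absolute membership differences between two fuzzy sets."""
    return sum(abs(a - p) for a, p in zip(actual, predicted))


def cumulative_distance(actual, predicted):
    """Class-aware variant (an assumption, not the paper's Equation (3)).

    Sums the absolute differences between running membership totals,
    so membership moved two classes away counts twice as much as
    membership moved one class away.
    """
    total = 0.0
    running_actual = 0.0
    running_predicted = 0.0
    for a, p in zip(actual, predicted):
        running_actual += a
        running_predicted += p
        total += abs(running_actual - running_predicted)
    return total


# Actual classification (low, medium, high) and four predicted sets from Table 1.
actual = (1.0, 0.0, 0.0)
cases = {
    "d": (0.8, 0.1, 0.1),
    "i": (0.0, 1.0, 0.0),
    "j": (0.0, 0.2, 0.8),
    "l": (0.0, 0.0, 1.0),
}
for name, predicted in cases.items():
    d = cumulative_distance(actual, predicted)
    hd = hamming_distance(actual, predicted)
    print(name, round(d, 2), round(hd, 2))
```

For cases i, j and l the Hamming distance is 2.0 in every case, while the class-aware variant separates them, which is the behavior the table is meant to show.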

Table 1. Sample calculations to compare the Hamming distance and the proposed modified Hamming distance

        Actual fuzzy classification    Predicted fuzzy classification
Case    Low     Medium  High           Low     Medium  High           D^a     HD^b
a       1.0     0.0     0.0            1.0     0.0     0.0            0.0     0.0
b       1.0     0.0     0.0            0.9     0.1     0.0            0.1     0.2
c       1.0     0.0     0.0            0.8     0.2     0.0            0.2     0.4
d       1.0     0.0     0.0            0.8     0.1     0.1            0.3     0.4
e       1.0     0.0     0.0            0.6     0.4     0.0            0.4     0.8
f       1.0     0.0     0.0            0.5     0.5     0.0            0.5     1.0
g       1.0     0.0     0.0            0.6     0.2     0.2            0.6     0.8
h       1.0     0.0     0.0            0.5     0.3     0.2            0.7     1.0
i       1.0     0.0     0.0            0.0     1.0     0.0            1.0     2.0
j       1.0     0.0     0.0            0.0     0.2     0.8            1.8     2.0
k       1.0     0.0     0.0            0.0     0.1     0.9            1.9     2.0
l       1.0     0.0     0.0            0.0     0.0     1.0            2.0     2.0

a Modified Hamming distance calculated using Equation (3).
b Hamming distance calculated using Equation (1).

Figure 4. Problems with defuzzification methods: (A) drawback of maxima method; (B) drawback of centroid method. Note: x* is the defuzzified value.
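The drawbacks of defuzzification can be made concrete with a small sketch. The code below applies the max-membership, mean-max-membership and centroid principles to a sampled two-peaked fuzzy output. The sample points and membership values are hypothetical, chosen only to show the effect; they are not taken from the study.

```python
def max_membership(xs, mu):
    """Return the first point at which membership is maximal."""
    peak = max(mu)
    return xs[mu.index(peak)]


def mean_of_maxima(xs, mu):
    """Return the average of all points that share the maximum membership."""
    peak = max(mu)
    points = [x for x, m in zip(xs, mu) if m == peak]
    return sum(points) / len(points)


def centroid(xs, mu):
    """Return the center of gravity of a sampled membership function."""
    return sum(x * m for x, m in zip(xs, mu)) / sum(mu)


# Hypothetical two-peaked output sampled on a log Kp axis from -8 to 0.
xs = list(range(-8, 1))
mu = [0.0, 0.8, 0.8, 0.0, 0.0, 0.0, 0.8, 0.8, 0.0]

print("max membership:", max_membership(xs, mu))
print("mean of maxima:", mean_of_maxima(xs, mu))
print("centroid:", centroid(xs, mu))
print("membership at -4:", mu[xs.index(-4)])
```

In this example both the mean-max and centroid values fall at about –4, a point whose membership in the output set is zero, and none of the three crisp values signals that the prediction is split between two regions.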

The Hamming distance (HD) is defined as:

$$\mathrm{HD}(A, B) = \sum_{i=1}^{n} \left| \mu_A(x_i) - \mu_B(x_i) \right| \qquad (1)$$

where n is the number of points that define the fuzzy sets A and B, μA(xi) is the membership of point xi in A and μB(xi) is the membership of point xi in B. The Hamming distance is smaller for fuzzy sets that are more alike than those that are less similar. In comparing an actual fuzzy set to the predicted fuzzy set, a small Hamming distance is ideal. In our study, the model-testing phase involved comparison of predicted and actual fuzzy classifications (low, medium and high). For example, if the actual value was classified low and the predicted value was classified medium, then the prediction is off by one class. If the actual value was classified low and the predicted value was classified high, then the prediction is off by two classes. In this study, however, the classifications for actual and predicted values are fuzzy (for example, 0.60 low, 0.35 medium, 0.05 high). A modified form of the Hamming distance measure is proposed in the methods section. This new measure was developed in light of certain drawbacks with the Hamming distance.

Consider the example classification sets in Table 1. For an actual classification set of (low = 1, medium = 0, high = 0), the distance formula was applied to evaluate the degree of misclassification for a number of possible predicted sets. An exact match would result in a distance of 0. When the prediction is off by one class, the distance is 1 and when the prediction is off by two classes, the distance is 2. The Hamming distance is also calculated in each case. From the results in Table 1, the proposed distance measure is better than the Hamming distance at distinguishing between different levels of classification. In cases i, j, k and l, the Hamming distance (HD) gave the same value for different predicted fuzzy classifications. The proposed modified Hamming distance gave different values that effectively distinguish between these cases.

2.4. Entropy of a fuzzy set

Entropy is a measure of fuzziness associated with a fuzzy set. The degree of fuzziness can be described in terms of a lack of distinction between a fuzzy set and its complement. For a fuzzy set A, entropy [7] is calculated as:

(2)

where n is the number of points that define A, and μA(xi) is the membership of point xi in A. In this study, the concept of entropy was used to quantify the ambiguity associated with the predicted fuzzy outputs. In the absence of actual values, entropy values are essentially a measure of confidence in outputs predicted by a fuzzy model.

3. Methods

3.1. Knowledge acquisition phase

A Mamdani-type fuzzy model involves developing membership functions and defining the subsequent rules. Three main knowledge sources were used to obtain information in this regard. These sources, and examples of the information acquired from each, are described below.

3.1.1. Skin permeability database

A fully validated database from Vecchia and Bunge [17] was used as a guide during model development. This database is one of the most comprehensive available, where

each included case has to meet predefined criteria to validate its inclusion. The octanol–water partition coefficient (log Kow), molecular weight (MW), temperature (T), and experimental skin permeability coefficient (log Kp) are some of the parameters included for each point in the database. log Kow ranged from –3 to 5, MW ranged from 30 to 600, temperature ranged from 22 to 39 °C and log Kp ranged from –6 to 0. The database was helpful in determining the number of membership functions needed for each parameter included in the models and their properties. For example, prior work that involved developing a data driven fuzzy model using this database indicated that 25, 32 and 37 °C were suitable positions for the centers of membership functions in the temperature domain [6].

3.1.2. Skin permeability literature and previous models

Discussion on the theory of skin permeability and the barrier nature of the skin is provided by Flynn [3]. Published models by Potts and Guy [13,14], Moody et al. [9], Kirchner et al. [7], Pannier et al. [11], Keshwani et al. [6] and Magnusson et al. [8] provide an understanding of the influence of certain input parameters on skin permeability and the corresponding impact on assigning membership functions. For example, a review of literature indicates that hydrophilic and lipophilic compounds may follow different pathways in penetrating the skin [14]. This information is reflected in the discontinuity between the membership functions for compounds that are hydrophilic and lipophilic (seen in Figure 7).

3.1.3. Expert opinion

The database, literature and models can only guide the development of preliminary membership functions and rules. For data driven fuzzy models, optimization routines modify the membership functions for a training data set. For Mamdani models, solicitation of expert opinion can be considered a pseudo-optimization step. The main information solicited from the expert was regarding the nature of the input and output membership functions and the subsequent rules. For example, it was suggested that the effect of molecular weight levels off at both low and high extremes. This information is reflected in the shape of the low and high membership functions on the molecular weight domain (seen in Figure 8).

3.2. Model development phase

Two Mamdani models were created. The inputs used in the first model were log Kow, molecular weight, and temperature, and the predicted output was the skin permeability coefficient (log Kp). The inputs in the second model were log Kow and molecular weight, predicting log Kp. The fuzzy logic toolbox in MATLAB [16] was used to build the fuzzy inference systems. Based on the information collected from the various sources, membership functions were created for each input and the output and subsequent rules were developed for each model. The fuzzy inference system was then presented to the expert for suitable modifications to the membership functions and the rules.

3.3. Proposed distance measure

As indicated in the theory section, a modified form of the Hamming distance is proposed which enables better distinction between different levels of classification (see Table 1). The proposed distance measure D(A, P) is defined as:

(3)

where A is the actual fuzzy classification, P the predicted fuzzy classification, n the number of classes that define A and P, μA(xi) is the membership of point xi in A and μP(xk) is the membership of point xk in P.

3.4. Model testing phase

Test data consisting of three inputs (log Kow, MW, T) and two inputs (log Kow, MW) was obtained from the Vecchia and Bunge database [17]. The models were tested in two ways.

Figure 5. Obtaining fuzzy classification set for actual value.

3.4.1. Comparing fuzzy classifications

The three output membership functions created in both models are categorized as low, medium and high. The actual value from the test data was evaluated using the parameters of these membership functions to produce a fuzzy set represented by three points (Figure 5). This fuzzy set represents the degree of belongingness (μ) to each of the three categories (low, medium and high). The predicted

output from the Mamdani model is a fuzzy set represented by 101 points. Based on the relative contributions from each output membership function (high, medium and low), the predicted fuzzy set of 101 points was reduced to a fuzzy set of three points (Figure 6). The relative contributions from each output membership function were estimated by integrating the predicted fuzzy set over the range of the membership function. Equations (4)–(6) were used to develop the predicted fuzzy classification:

(4)

(5)

(6)

In the above equations, μL(P), μM(P), and μH(P) constitute the predicted fuzzy classification, μi(P) is the membership of each point in the predicted fuzzy set and a–f are the ranges of the output membership functions defined in Figure 6.

Figure 6. Obtaining fuzzy classification set for predicted fuzzy output.

For each test case, an actual fuzzy classification and a predicted fuzzy classification were obtained. The modified Hamming distance measure (3) was used to determine the similarity between the two fuzzy sets. Apart from a comparison to actual values, the ambiguity associated with each predicted value was quantified using an entropy measure (2) as defined in the theory section.

3.4.2. Defuzzifying the predicted output

The centroid method was used to defuzzify the output of the Mamdani models. The crisp predictions were compared to the actual values from the test data and an R2 estimate of correlation was calculated. This is a common form of comparison utilized for most modeling strategies. However, defuzzifying the output results in a loss of information regarding the ambiguity of the prediction. In the absence of actual values, the confidence in the prediction can be determined based on the degree of ambiguity.

4. Results

4.1. Membership functions and rules

The first Mamdani model was developed with three inputs (log Kow, MW, and temperature) to predict log Kp as an output. The second model was developed with two inputs (log Kow and MW). Four membership functions were developed for log Kow (Figure 7) to linguistically represent hydrophilic to highly lipophilic compounds. The range of log Kow used in the model is –4 to 8. There is a discontinuity between the hydrophilic and the lipophilic membership functions. This stems from the hypothesis that hydrophilic compounds may penetrate the skin in a manner different from lipophilic compounds [14]. Hence, there is a lack of knowledge and information available on the compounds with a log Kow that occurs between the hydrophilic and lipophilic membership functions. A Mamdani modeling scheme enables representation of this lack of knowledge in the model structure.

Three membership functions were developed for molecular weight (Figure 8) representing low, medium, and high linguistic classes. The range for MW used in the model is from 10 to 1000. The high molecular weight membership function is important as data for most existing models does not contain information for such heavy compounds.

Figure 7. Membership functions for log Kow.

Figure 8. Membership functions for molecular weight.
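The inference and classification steps described in Sections 2.1 and 3.4.1 can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the triangular output membership functions, the antecedent degrees, and the names `tri`, `infer` and `classify` are all hypothetical, and the normalization used in `classify` is one plausible reading of "relative contributions", not a reproduction of Equations (4)–(6) or of the MATLAB toolbox.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical output classes on a log Kp domain of [-8, 0]; not the paper's parameters.
OUTPUT_MFS = {
    "low": (-10.0, -8.0, -4.0),
    "medium": (-8.0, -4.0, 0.0),
    "high": (-4.0, 0.0, 2.0),
}


def infer(fired_rules, n_points=101):
    """Clip each consequent at its firing strength (min), then aggregate with max."""
    xs = [-8.0 + 8.0 * i / (n_points - 1) for i in range(n_points)]
    mu = [
        max(min(strength, tri(x, *OUTPUT_MFS[label])) for label, strength in fired_rules)
        for x in xs
    ]
    return xs, mu


def classify(xs, mu):
    """Share of the predicted set falling in each class's range (assumed normalization)."""
    areas = {
        label: sum(m for x, m in zip(xs, mu) if a <= x <= c)
        for label, (a, _, c) in OUTPUT_MFS.items()
    }
    total = sum(areas.values())
    return {label: area / total for label, area in areas.items()}


# Firing strength of a rule = min of its antecedent memberships (fuzzy AND).
fired_rules = [
    ("medium", min(0.7, 0.6, 0.9)),  # hypothetical degrees for log Kow, MW, T
    ("high", min(0.3, 0.6, 0.9)),
]
xs, mu = infer(fired_rules)
print(classify(xs, mu))
```

The 101-point output mirrors the resolution mentioned in the text. Because the class ranges overlap, other normalizations are equally defensible, and the shares printed here should be read only as an illustration of reducing a sampled fuzzy set to a three-point classification.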

Three membership functions were developed for temperature (Figure 9) representing room, skin, and core body temperature. The range for temperature was 20–40 °C.

Three membership functions were developed for log Kp (Figure 10) representing low, medium, and high permeability. The range of the output was from –8 to 0 with least permeability occurring at –8.

Based on gathered information and expert opinion, 36 rules (Table 2) were developed to map the input membership functions to the output membership functions for the three input model. Similarly, 21 rules were developed for the two input model. The two input model contains multiple rules where the same antecedents result in a different consequence. This stems from the fact that absence of temperature as an input adds more ambiguity to the prediction of log Kp. The output (log Kp) is predicted as low, medium, or high, based on the combination of the input membership functions (Table 3).

Figure 9. Membership functions for temperature.

Figure 10. Membership functions for log Kp.

Table 2. Rules developed for three input model

IF log Kow              AND MW    AND T    THEN log Kp
Hydrophilic             Low       Room     Medium
Hydrophilic             Low       Skin     Medium
Hydrophilic             Low       Core     High
Hydrophilic             Medium    Room     Low
Hydrophilic             Medium    Skin     Medium
Hydrophilic             Medium    Core     Medium
Hydrophilic             High      Room     Medium
Hydrophilic             High      Skin     Low
Hydrophilic             High      Core     Low
Low lipophilicity       Low       Room     Medium
Low lipophilicity       Low       Skin     Medium
Low lipophilicity       Low       Core     Medium
Low lipophilicity       Medium    Room     Low
Low lipophilicity       Medium    Skin     Low
Low lipophilicity       Medium    Core     Medium
Low lipophilicity       High      Room     Low
Low lipophilicity       High      Skin     Low
Low lipophilicity       High      Core     Low
Medium lipophilicity    Low       Room     High
Medium lipophilicity    Low       Skin     High
Medium lipophilicity    Low       Core     High
Medium lipophilicity    Medium    Room     Medium
Medium lipophilicity    Medium    Skin     High
Medium lipophilicity    Medium    Core     High
Medium lipophilicity    High      Room     High
Medium lipophilicity    High      Skin     Medium
Medium lipophilicity    High      Core     High
High lipophilicity      Low       Room     Low
High lipophilicity      Low       Skin     Medium
High lipophilicity      Low       Core     Medium
High lipophilicity      Medium    Room     Medium
High lipophilicity      Medium    Skin     Low
High lipophilicity      Medium    Core     Low
High lipophilicity      High      Room     Low
High lipophilicity      High      Skin     Low
High lipophilicity      High      Core     Low

Figure 11. Calculated distance measures, using Equation (3), for three input model test data.

Figure 12. Calculated distance measures, using Equation (3), for two input model test data.

4.2. Comparing predicted and actual fuzzy classification

During the testing of each model, fuzzy classifications were created for the predicted and actual values using defined output membership functions (Figures 5 and 6). Each fuzzy classification set was represented by three membership values: high, medium, and low. The proposed distance formula was applied in each test case and an estimate of classification was obtained. The distribution of the calculated distances for both models is provided in Figures 11 and 12. Referring back to Table 1, a distance measure of one implies that the model prediction was one fuzzy class away from the actual value. A distance measure of two implies

that the model prediction was two fuzzy classes from the actual value. Results for the three input model (Figure 11) indicate that 71% of the test data were predicted within half a fuzzy class of the actual value. For the two input model (Figure 12), 47% of the test data were predicted within half a fuzzy class of the actual value. In both models, all the test data were predicted within one fuzzy class of the actual value. However, the performance of the three input model does appear to be significantly better.

Table 3. Rules developed for two input model

IF log Kow              AND MW    THEN log Kp
Hydrophilic             Low       Medium
Hydrophilic             Low       High
Hydrophilic             Medium    Low
Hydrophilic             Medium    Medium
Hydrophilic             High      Medium
Hydrophilic             High      Low
Low lipophilicity       Low       Medium
Low lipophilicity       Medium    Low
Low lipophilicity       Medium    Medium
Low lipophilicity       High      Low
Medium lipophilicity    Low       High
Medium lipophilicity    Medium    Medium
Medium lipophilicity    Medium    High
Medium lipophilicity    High      High
Medium lipophilicity    High      Medium
High lipophilicity      Low       Low
High lipophilicity      Low       Medium
High lipophilicity      Medium    Medium
High lipophilicity      Medium    Low
High lipophilicity      High      Low

4.3. Comparing predicted defuzzified values to actual values

The fuzzy outputs from both models were defuzzified using the centroid principle (9). The crisp predictions were then compared to the actual values from the test data and estimates of RMSE and correlation were calculated. The three input model had an R2 of 0.61. The correlation between actual and defuzzified predicted values for the three input model is shown in Figure 13. The two input model had an R2 of 0.45. The correlation between actual and defuzzified predicted values for the two input model is shown in Figure 14. Based on R2 values, the three input model has a better performance in predicting crisp log Kp values. In both models, the performance appears to be better for compounds with higher permeability values.

4.4. Ambiguity for each prediction

The entropy measure (2) was used to quantify the ambiguity or confidence associated with each test prediction in both models. Figure 15 shows the distribution of this measured entropy for all test cases in the three input model.

Figure 13. Actual log Kp vs. predicted log Kp for three input model test data.

Figure 14. Actual log Kp vs. predicted log Kp for two input model test data.

Figure 15. Calculated entropy (using Equation (2)) for three input model test data.

Figure 16. Calculated entropy (using Equation (2)) for two input model test data.
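The idea of entropy as a lack of distinction between a fuzzy set and its complement can be illustrated with a common index of fuzziness. The sketch below is an assumption made for illustration only: the function name `fuzziness_index` and the specific formula (one minus the mean absolute difference between each membership and its complement) are not taken from the paper, and Equation (2) may differ from it.

```python
def fuzziness_index(mu):
    """Index of fuzziness for a sampled fuzzy set (illustrative assumption).

    Each term |2m - 1| is the distance between a membership m and its
    complement 1 - m. The index is 0 for a crisp set, where every
    membership is 0 or 1, and 1 when every membership equals 0.5.
    """
    n = len(mu)
    return 1.0 - sum(abs(2.0 * m - 1.0) for m in mu) / n


crisp = [1.0, 0.0, 0.0]          # unambiguous classification
ambiguous = [0.5, 0.5, 0.5]      # maximally ambiguous memberships
mixed = [0.60, 0.35, 0.05]       # example classification quoted in Section 2.3

for name, values in [("crisp", crisp), ("ambiguous", ambiguous), ("mixed", mixed)]:
    print(name, round(fuzziness_index(values), 3))
```

Under this index a lower value indicates a less ambiguous set, which matches the way the entropy values are interpreted in the text, although the numerical values reported in the paper should not be expected to match this sketch.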

Table 4. Comparison of performance of three input and two


input models
Mean % of test
distance data within
R2 of test half fuzzy Mean
Model Inputs used value data class entropy

Three input log Kow, MW, T 0.61 0.32 71 0.41


Two input log Kow, MW 0.45 0.38 47 0.49

Figure 17. Comparing entropy for predicted fuzzy outputs of two test cases.

Figure 16 shows the distribution of entropy values for test cases used in the two input model. Figure 17 compares the predicted fuzzy output from two sample test cases. Case 1 had a calculated entropy of 0.097, and case 2 had a calculated entropy of 0.655. From the shape of the membership function, there is more confidence in the prediction for case 1 than case 2. Hence, the calculated entropy measures quantify the ambiguity based on shape assessment.

5. Discussion

Analysis of the developed Mamdani models involved comparison of actual and predicted fuzzy classifications, correlation between actual and defuzzified crisp values, and calculating entropy to quantify ambiguity. Table 4 compares the performance of the three input model versus the two input model. In every category, the performance of the three input model was better. The R² value obtained for the three input Mamdani model is comparable to results from data driven models by Keshwani et al. [6] using the test data from the same database. Magnusson et al. [8] developed crisp rule-based models to classify compounds based on skin permeability. While a direct comparison between the models is not feasible, the degree of classification of the Mamdani models developed in this study is comparable to results presented by Magnusson et al. [8]. The key difference is that the predictions in the models presented by Magnusson et al. [8] were crisp and not fuzzy as is the case in this study. Taking a fuzzy approach enables the representation of ambiguity associated with each prediction.

The R² values obtained for the models are lower than those of some previously published models [6,7,9,11,13,14]. However, most previously published models were entirely data driven and optimized for a specific data set. The Mamdani-type model developed is not optimized for a specific data set, and hence it is reasonable to obtain a lower R² value. With more thorough knowledge acquisition and selection of the most significant inputs, the Mamdani-type model is expected to perform better. Using multiple knowledge sources and moving away from fitted models can yield a more generalized model for future predictions with new data.

The entropy measures calculated describe the ambiguity associated with the fuzzy prediction. From Figure 17, it is clear that the prediction for case 1 is much better than case 2. For future predictions, when the actual value is not known, the entropy measure provides an estimate of confidence for the prediction (fuzzy or defuzzified). Data-driven models provide a crisp estimate for future predictions. But other than referring to past performance with test data, there is no clear estimate on how good the prediction is for the new data point. Using entropy measures to quantify ambiguity addresses this issue.

6. Conclusion

Two Mamdani-type models were developed to predict skin permeability coefficients using octanol–water partition coefficient, molecular weight, and temperature as inputs. Using multiple knowledge sources, membership functions and rules were developed to provide generalized models not optimized for a specific data set.

Apart from correlation estimates of actual and defuzzified predictions, an alternative analysis was performed involving comparison of actual and predicted fuzzy classifications. A distance measure was used to compare actual and fuzzy classifications. The proposed measure is a modification of the Hamming distance often used to compare distances between fuzzy sets. One of the drawbacks of the proposed distance measure is that it does not take into account the direction of misclassification. The entropy measure used also appears to have a drawback: it does not clearly distinguish between unimodal and slightly bimodal fuzzy outputs.

The Mamdani model developed is a knowledge-driven predictive model that is not common in skin permeability literature. A major advantage of this modeling approach is that it enables the use of entropy measures to quantify ambiguity associated with future predictions. This provides a measure of confidence for predicting log Kp for compounds when the actual value is unknown. Potential uses of the presented models include rapid assessment of skin permeability of compounds to identify candidates for transdermal drug delivery and estimate toxicity risks.

References

[1] R. Babuska, Fuzzy Modeling for Control, Kluwer Academic Publishers, Massachusetts, 1998.
[2] D. A. Edwards, R. Langer, A linear theory of transdermal transport phenomena, J. Pharm. Sci. 83 (1994) 1315–1334.
[3] G. L. Flynn, Physicochemical determinants of skin absorption, in: T. R. Gerrity, C. J. Henry (Eds.), Principles of Route-to-Route Extrapolation for Risk Assessment, Elsevier, New York, 1990, pp. 673–715.
[4] W. Hung, A note on entropy of intuitionistic fuzzy sets, Int. J. Uncert. Fuzz. Knowledge-Based Syst. 11 (2003) 627–633.
[6] D. R. Keshwani, D. D. Jones, R. M. Brand, Takagi-Sugeno fuzzy modeling of skin permeability, Cutan. Ocular Toxicol. 24 (2005) 149–163.
[7] L. A. Kirchner, R. P. Moody, E. Doyle, R. Bose, J. Jeffrey, I. Chu, The prediction of skin permeability by using physicochemical data, Altern. Lab. Anim. 25 (1997) 359–370.
[8] B. M. Magnusson, W. J. Pugh, M. S. Roberts, Simple rules defining the potential of compounds for transdermal delivery or toxicity, Pharm. Res. 21 (2004) 1047–1054.
[9] R. P. Moody, H. MacPherson, Determination of dermal absorption QSAR/QSPRs by brute force regression: multiparameter model development using MOLSUITE, 2000, J. Toxicol. Environ. Health-Part A 66 (2003) 1927–1942.
[10] V. Oduguwa, R. Roy, D. Farrugia, Development of a soft computing-based framework for engineering design optimization with quantitative and qualitative search spaces, Appl. Soft Comput. 7 (2007) 166–188.
[11] A. K. Pannier, R. M. Brand, D. D. Jones, Fuzzy modeling of skin permeability coefficients, Pharm. Res. 20 (2003) 143–148.
[12] D. T. Pham, M. Castellani, Action aggregation and defuzzification in Mamdani-type fuzzy systems, in: Proceedings of the Institution of Mechanical Engineers Part C, J. Mech. Eng. Sci. 216 (2002) 747–759.
[13] R. O. Potts, R. H. Guy, Predicting skin permeability, Pharm. Res. 9 (1992) 663–669.
[14] R. O. Potts, R. H. Guy, A predictive algorithm for skin permeability: the effects of molecular size and hydrogen bonding activity, Pharm. Res. 12 (1995) 1628–1633.
[15] T. J. Ross, Fuzzy Logic with Engineering Applications, McGraw-Hill Inc., New York, 1995.
[16] E. Szmidt, J. Kacprzyk, Distances between intuitionistic fuzzy sets, Fuzzy Sets Syst. 114 (2000) 505–518.
[17] B. E. Vecchia, A. L. Bunge, Skin absorption databases and predictive equations, in: J. Hadgraft, R. H. Guy (Eds.), Transdermal Drug Delivery Systems, vol. 123, Marcel Dekker, New York, 2003, pp. 57–141.
