Ontology and CBR-based Dynamic Enterprise Knowledge Repository Construction
In this way, not only the title part but also the case description part can be represented by the expression $D_i = (d_{i1}, d_{i2}, \ldots, d_{im})^T$. Particularly, in the title part, as the value of $t_{ij} \in \{0, 1\}$, $D_i = (d_{i1}, d_{i2}, \ldots, d_{im})^T$ is the same as $T_i = (t_{i1}, t_{i2}, \ldots, t_{im})^T$.
Set the user's query vector as the standard; the query vector space is then

$$C^* = W^T \cdot I = (w_1, w_2, \ldots, w_j, \ldots, w_m) \qquad (2)$$

where $I$ represents a unit matrix and $W^T$ is set with the scoring method. Only the user understands his own demand most clearly; therefore, each component $w_j$ of $W^T$ is calculated from the user's keyword preferences. We divide the user's keyword preference level into the following fuzzy set: {very important, important, common, unimportant, very unimportant}. For quantitative analysis, the fuzzy set can be mapped to the vector {5, 4, 3, 2, 1}. If the user's preference vector for the $m$ keywords is $(x_1, x_2, \ldots, x_m)$, where each $x_j$ is an integer with $1 \le x_j \le 5$, then

$$w_j = \frac{x_j}{\sum_{k=1}^{m} x_k} \qquad (3)$$
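The normalization in (3) can be illustrated with a minimal Python sketch. The function name `keyword_weights` and the dictionary `PREFERENCE_SCORES` are hypothetical names introduced here for illustration; the preference levels and their scores follow the fuzzy set described above.

```python
from fractions import Fraction

# Fuzzy preference levels mapped to integer scores, as described in the text.
PREFERENCE_SCORES = {
    "very important": 5,
    "important": 4,
    "common": 3,
    "unimportant": 2,
    "very unimportant": 1,
}


def keyword_weights(preferences):
    """Normalize preference scores x_j into weights w_j = x_j / sum(x)."""
    scores = [PREFERENCE_SCORES[level] for level in preferences]
    total = sum(scores)
    # Fractions keep the weights exact, e.g. 4/16 is reduced to 1/4.
    return [Fraction(score, total) for score in scores]


weights = keyword_weights(
    ["important", "very important", "very important", "unimportant"]
)
print([str(w) for w in weights])  # ['1/4', '5/16', '5/16', '1/8']
```

Exact fractions are used only so that the weights print in the same form as in the text; floating-point division would serve equally well in practice.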
Therefore, the similarity between the query information and the source case information in the title part and the description part is calculated with (4) and (5), respectively.
$$SIM_{TI}(C^*, C_i) = \cos(C^*, T_i) = \sum_{j=1}^{m} w_j t_{ij} \qquad (4)$$
$$SIM_{IN}(C^*, C_i) = \cos(C^*, D_i) = \frac{C^* \cdot D_i}{|C^*|\,|D_i|} = \frac{\sum_{j=1}^{m} w_j d_{ij}}{\sqrt{\sum_{j=1}^{m} w_j^2}\,\sqrt{\sum_{j=1}^{m} d_{ij}^2}} \qquad (5)$$
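As an illustration of the cosine measure used for the description part, the following is a minimal Python sketch. The function name `cosine_similarity`, the zero-vector convention, and the example description vector are assumptions made for this sketch and are not taken from the prototype system.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention assumed here: a zero vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


query = [1 / 4, 5 / 16, 5 / 16, 1 / 8]  # weight vector used as the query C*
description = [0, 1 / 3, 1, 0]          # illustrative description vector D_i
print(round(cosine_similarity(query, description), 4))  # 0.7559
```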
While computing the similarity between the query vector $C^*$ and the semantic vectors $C_0, C_1, \ldots, C_n$, record the largest similarity as $SIM_{\max}(C^*, C_i)$. If the value is larger than or equal to the given threshold $\varepsilon$, the case is added to the search results. Finally, the search results are sorted by similarity and sent to the user.
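The filtering and ranking step described above can be sketched as follows. This is a minimal illustration; the function name `rank_matches` and the example scores (other than the threshold test itself) are hypothetical.

```python
def rank_matches(similarities, threshold):
    """Keep cases whose similarity is >= threshold, best match first."""
    hits = [
        (case_id, sim)
        for case_id, sim in similarities.items()
        if sim >= threshold
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical similarity scores keyed by case identifier.
scores = {"case 3": 0.42, "case 7": 0.9024, "case 12": 0.61, "case 20": 0.5}
print(rank_matches(scores, threshold=0.5))
# [('case 7', 0.9024), ('case 12', 0.61), ('case 20', 0.5)]
```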
D. Case Reuse and Learning
Generally speaking, the initial case base is too small to satisfy all the different needs of customers, so a dynamic management mechanism becomes necessary. It mainly covers case reuse, modification, and case learning on the basis of case retrieval. Fig. 5 shows the flow chart of this dynamic mechanism.
In the process of case reuse, two main problems should be considered. One is the difference between the new problem and the retrieval results. The other is which part of the result can be reused. We define the cases that meet the user's demand as the matching set $G$, that is, $G = \{C_i \mid SIM(C^*, C_i) \ge \varepsilon\}$.
If $G = \emptyset$, decompose the query vector level by level in accordance with the ontology and calculate the similarity to search for a matching set that is not $\emptyset$. Mark this level as the starting point, and combine the cases from the bottom up with the ontology rules, which coordinate the constraints and standards between different cases during combination. Two methods of combination are appropriate: exhaustive search and a genetic algorithm. When the number of case combination choices is small, all possible combinations can be listed and the feasible solutions found; we take the combination with the largest similarity to the new case as the optimal solution. Otherwise, a genetic algorithm is more efficient. Then, the process of case adjustment begins.
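The exhaustive variant of case combination can be sketched in Python as below. This is only an illustration under stated assumptions: the merge rule in `combine` (element-wise maximum of the case vectors) is hypothetical, the ontology rules that constrain which cases may be combined are not modeled, and the genetic algorithm alternative is not shown.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combine(vectors):
    """Hypothetical merge rule: element-wise maximum of the case vectors."""
    return [max(column) for column in zip(*vectors)]


def best_combination(query, cases, max_size=3):
    """Enumerate every combination of up to max_size cases and return
    (similarity, case ids) for the combination closest to the query."""
    best = (0.0, ())
    for size in range(1, max_size + 1):
        for ids in combinations(sorted(cases), size):
            merged = combine([cases[case_id] for case_id in ids])
            sim = cosine(query, merged)
            if sim > best[0]:
                best = (sim, ids)
    return best


query = [1 / 4, 5 / 16, 5 / 16, 1 / 8]
# Hypothetical partial matches found after decomposing the query vector.
cases = {"A": [1, 0, 0, 0], "B": [0, 1, 1, 0], "C": [0, 0, 0, 1]}
similarity, chosen = best_combination(query, cases)
print(chosen, round(similarity, 4))  # ('A', 'B') 0.9661
```

The number of combinations grows quickly with the number of candidate cases, which is why enumeration is only practical for small candidate sets.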
Figure 5. Flow chart of dynamic management
If $G \neq \emptyset$, the case combination step is skipped and the process moves directly to case adjustment. As case adjustment also needs to consider information other than the case similarity, the case should be adjusted according to the case description, solutions, and other affiliated information with the help of people, in a manual or semi-automatic way. We define the case after adjustment as the target case, which is closer to the new problem than the source case.
The case is evaluated in practice, and a new case that meets our requirements is then learned. To avoid redundant information in the knowledge repository, the similarity between the new case and the source case is also calculated to determine whether the new case will be stored or processed further. Set a certain threshold $\varepsilon$, and let $S$ represent the source case and $N$ the new case; if $SIM(S, N) \le \varepsilon$, store the new case; otherwise, merge them together.
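The store-or-merge decision can be sketched as follows, assuming that a new case is stored when its similarity to the source case does not exceed the threshold and is merged otherwise. The function `learn_case`, the representation of a case as a set of keywords, the merge rule (set union), and the identifier scheme are all hypothetical choices made for this sketch.

```python
def learn_case(repository, source_id, new_case, similarity, threshold):
    """Store a sufficiently different new case, otherwise merge it
    into the source case. Returns the id of the affected case."""
    if similarity <= threshold:
        # Simplistic id scheme for illustration only.
        new_id = f"case-{len(repository) + 1}"
        repository[new_id] = set(new_case)
        return new_id
    # Hypothetical merge rule: union of the keyword sets.
    repository[source_id] = repository[source_id] | set(new_case)
    return source_id


repository = {"case-1": {"finance", "video conferencing"}}
print(learn_case(repository, "case-1", {"finance", "audit"}, 0.9, 0.5))  # case-1
print(learn_case(repository, "case-1", {"logistics"}, 0.1, 0.5))         # case-2
```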
In this process, the ontology provides the semantic
support for the case retrieval, matching, combination,
adjustment and case learning.
E. Validation and Analysis
To verify the retrieval effectiveness, a simple prototype system for the consulting industry was developed on the basis of the research in this paper, using an SQL database and the C# language.
It is difficult to make the case database rich at the very beginning. Therefore, we mainly collected relevant data from the enterprise's original relational database system, intranet, and project materials, and then sorted them into 40
1216 JOURNAL OF SOFTWARE, VOL. 7, NO. 6, JUNE 2012
2012 ACADEMY PUBLISHER
information system project consulting cases with analysis. These cases were deposited into the knowledge repository using the description and organization scheme illustrated above, which serves as preparation for the operation.
Suppose that a consultant who is mainly responsible for e-government projects needs to retrieve information on past projects that provided project management services for video conferencing systems in the finance field. It would be even better if the Ministry of Finance is the main project stakeholder. So, the key retrieval vector is extracted as Industry, Project Types, Services, and Main Project Stakeholders. The consultant enters the following vector value:
$T_{ij} = [\text{Finance}, \text{Video conferencing system}, \text{Project Management}, \text{Ministry of Finance}]^T$
The case retrieval interface is shown in Fig. 6.
Figure 6. User query interface
His preferences for these keywords are defined as important, very important, very important, and unimportant, respectively, as shown in Fig. 7. The scores can thus be described as $(4, 5, 5, 2)$, which gives $W^T = [1/4, 5/16, 5/16, 1/8]^T$ with (3). According to the domain ontology above and the user's information, the retrieval vector can be expressed more precisely with semantic and pragmatic expansion, as shown in detail in Table I. In particular, the dimension of finance is expressed as government rather than financial industry, because the consultant's information shows that he is responsible for e-government services and the major project stakeholder is the Ministry of Finance.
Figure 7. Weights determination interface
In this paper, we set $\alpha = 0.6$, $\beta = 0.4$, and the threshold value of similarity $\varepsilon = 0.5$; the consultant's retrieval vector is then $C^* = W^T \cdot I = (1/4, 5/16, 5/16, 1/8)^T$. Computing with the retrieval algorithm and the expansion vector shown above, the cases that satisfy $SIM(C^*, C_i) \ge \varepsilon$ are delivered as shown in Fig. 8.
TABLE I.
RETRIEVAL FEATURE VECTOR AND SEMANTIC EXPANSION INFORMATION

| $t_{ij}$ | $x_j$ | $w_j$ | Vector Value | Synonymy | Pragmatics | Hyponymy |
|---|---|---|---|---|---|---|
| Industry | 4 | 1/4 | Finance | | Government (Y); Financial Industry (N) | Government |
| Project Types | 5 | 5/16 | Video conferencing system | Video session system | | Multi-media |
| Services | 5 | 5/16 | Project Management | PM; Supervisor | | Management |
| Main Project Stakeholders | 2 | 1/8 | Ministry of Finance | Treasury Department | Finance Bureau | Person; Organization |
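The semantic and pragmatic expansion summarized in Table I can be sketched as a simple lookup. The dictionaries below are a hypothetical stand-in for the domain ontology: the terms are taken from Table I, while the data structures, the function `expand_keyword`, and the use of a user domain label as the pragmatic context are assumptions made for this illustration.

```python
# Synonym expansion (semantic), using terms listed in Table I.
SYNONYMS = {
    "Video conferencing system": ["Video session system"],
    "Project Management": ["PM", "Supervisor"],
}

# Pragmatic disambiguation: the sense chosen for a keyword depends on
# what is known about the user (here, a hypothetical domain label).
PRAGMATIC_SENSES = {
    ("Finance", "E-government"): "Government",
}


def expand_keyword(keyword, user_domain):
    """Return the keyword followed by its synonyms and, if known,
    the sense selected by the user's context."""
    terms = [keyword]
    terms.extend(SYNONYMS.get(keyword, []))
    sense = PRAGMATIC_SENSES.get((keyword, user_domain))
    if sense is not None:
        terms.append(sense)
    return terms


print(expand_keyword("Finance", "E-government"))
# ['Finance', 'Government']
print(expand_keyword("Project Management", "E-government"))
# ['Project Management', 'PM', 'Supervisor']
```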
Figure 8. The matching cases and similarities
Take the optimal matching case, case 7, as an example. The process is described as follows.
$$C^* = (1/4, 5/16, 5/16, 1/8)^T, \qquad D_7 = (0, 1/3, 1, 0)^T$$

$$SIM_{TI}(C^*, C_7) = 1$$

$$SIM_{IN}(C^*, C_7) = \cos(C^*, D_7) = \frac{C^* \cdot D_7}{|C^*|\,|D_7|} = 0.7559$$

$$SIM(C^*, C_7) = 0.6 \times 1 + 0.4 \times 0.7559 = 0.9024$$
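The final step of this worked example, the weighted combination of the title and description similarities, can be checked with a short Python sketch. The function name `overall_similarity` is hypothetical; the weights 0.6 and 0.4 and the two partial similarities are the values used in the example.

```python
def overall_similarity(sim_title, sim_description, alpha=0.6, beta=0.4):
    """Weighted combination of title and description similarity."""
    return alpha * sim_title + beta * sim_description


# Values from the worked example for case 7.
print(round(overall_similarity(1.0, 0.7559), 4))  # 0.9024
```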
The traditional way of information retrieval is based on keywords, so the computer cannot understand the user's latent semantics and personalized query demand. Therefore, we conduct semantic and pragmatic retrieval research based on the domain ontology and the user's information. The analysis shows that, among the traditional retrieval method, the semantic method, and the semantic and pragmatic method, semantic and pragmatic retrieval achieves the best precision and recall of the matching results, as shown in Table II.
TABLE II.
THE COMPARISON OF THE RETRIEVAL METHODS

| Retrieval method | Retrieval Results | Retrieval Matching Cases | Actual Case Number | Precision | Recall |
|---|---|---|---|---|---|
| Traditional searching | 3 | 2 | 8 | 0.6667 | 0.375 |
| Semantic retrieval | 5 | 4 | 8 | 0.8 | 0.5 |
| Semantic and pragmatic retrieval | 8 | 7 | 8 | 0.875 | 0.875 |
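The precision and recall figures can be recomputed from the counts in Table II with the sketch below, assuming the standard definitions (precision = matching cases / retrieved cases, recall = matching cases / actual relevant cases). The function name `precision_recall` is hypothetical. Under these definitions the recall of the traditional search comes out as 2/8 = 0.25; the listed value of 0.375 would instead correspond to 3/8.

```python
def precision_recall(retrieved, matching, relevant):
    """Return (precision, recall) from raw counts."""
    return matching / retrieved, matching / relevant


# (retrieved, matching, actual relevant) counts from Table II.
rows = {
    "Traditional searching": (3, 2, 8),
    "Semantic retrieval": (5, 4, 8),
    "Semantic and pragmatic retrieval": (8, 7, 8),
}

for method, counts in rows.items():
    precision, recall = precision_recall(*counts)
    print(f"{method}: precision={precision:.4f}, recall={recall:.4f}")
# Traditional searching: precision=0.6667, recall=0.2500
# Semantic retrieval: precision=0.8000, recall=0.5000
# Semantic and pragmatic retrieval: precision=0.8750, recall=0.8750
```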
VI. CONCLUSION
Dynamic knowledge repository construction is currently a hot issue in the field of knowledge engineering. This paper focuses on a construction method for knowledge repositories based on ontology and CBR. An integrated framework is put forward and the operation processes of the dynamic knowledge repository system are analyzed. With the construction of the ontology, the user's query can be expanded with more semantic information and the case database can be formed more precisely. A retrieval algorithm is designed and the dynamic mechanism of knowledge management is illustrated. Moreover, taking a consulting company as the case study background, the retrieval process is verified, which demonstrates the high accuracy and completeness of the retrieval algorithm.
As the initial case samples are small and centralized, further work should conduct quantitative analysis based on a larger and more widely distributed set of case samples.
ACKNOWLEDGMENT
This work was supported in part by the National
Natural Science Foundation of China under Grant
71102111 and Beijing Institute of Technology under
Grant 3210050320908.
Dr. Huiying Gao, associate professor, was born in Shandong Province, China, in 1976. She received her doctoral degree in management science and engineering from Beijing Institute of Technology in 2003. She is now an associate professor in the School of Management and Economics, Beijing Institute of Technology. She carried out Ph.D. research work at the Technical University of Berlin, Germany, from 2002 to 2003, and visited Karlsruhe Institute of Technology, Germany, for half a year between 2008 and 2009. Her current research interests include theory and methods of information systems, content and knowledge management, semantic retrieval, and intelligent information systems. Ph: +86 (10) 68918830.
Xiuxiu Chen was born in Shandong Province, China, in 1988. She is a Ph.D. candidate in the School of Management and Economics, Beijing Institute of Technology. Her research covers information management and knowledge management. Ph: +86 (10) 68918830.