Vol-1, Issue-4

Download as pdf or txt
Download as pdf or txt
You are on page 1of 5

International Journal of Research in Engineering and Innovation Vol-1, Issue-4 (2017), 25-29

___________________________________________________________________________________________________________________________________

International Journal of Research in Engineering and Innovation


(IJREI)
journal home page: http://www.ijrei.com
ISSN (Online): 2456-6934

_______________________________________________________________________________________

A new approach for component retrieval from reuse repository using R-tool
Dr. Jasmine K.S1, Ravi Teja2
1Department of MCA, R.V.College of Engineering, Bangalore, Karnataka, India
2 Software Engineer, Aricent, Gurgaon, Haryana, India
_______________________________________________________________________________________

Abstract
A continuing challenge for software designers is to develop efficient and cost-effective software implementations. Many see software
reuse as a potential solution; however, the cost of reuse tends to outweigh the potential benefits. The costs of software reuse include
establishing and maintaining a library of reusable components, searching for applicable components to be reused in a design, as well
as adapting components toward a proper implementation. The study of storage and retrieval methods of software assets in software
libraries gives rise to a number of paradoxes: While this subject has been under investigation for nearly two decades, it still remains
an active area of research in software reuse and software engineering. The new technologies such as the internet, the world wide web
and object oriented programming keep opening new opportunities for better asset packaging, better library organizations, and larger
scale libraries – thereby posing new technical challenges. Also, while many sophisticated solutions have been proposed to this
problem, the state of the practice in software reuse is characterized by the use of adhoc approaches. This can be explained by the
observation that most existing solutions are either too ineffective to be useful or too intractable to be usable. Finally, it is difficult to
imagine a successful software reuse program without a sophisticated, systematic procedure for software component storage and
retrieval. In this context, a new method is suggested here for component classification and retrieval which makes use of R-Tool, an
open source tool for statistical computing and graphics. © 2017 ijrei.com. All rights reserved
Keywords: Software reuse, Component Retrieval, R-Tool
_______________________________________________________________________________________

1. Introduction which exploits semantic connections between documents and


parts thereof, and semantic connections between queries and
Many software organizations realized that developing the documents. The paper is structured as follows. Section 1 starts
software using reusable components could dramatically reduce with a discussion of what is meant by software reuse and a
development effort, cost and accelerate delivery. But the non- reusable component. Section 2 talks about present scenario in
existence of a standard searching technique for finding the the component retrieval process. Section 3 presents
suitable component and also the lack of appropriate tool in this methodology adopted. Section 4 mentions some concluding
field contributed towards the large-scale failures in their remarks and directions for future research in this area.
approach. From the past studies on this field, it is found that
researchers are tried with different approaches to improve the 1.1. What is Software Reuse?
adaptability of the component but very few studies had taken
place in improving the efficiency of component retrieval. Software reuse at its most basic level consists of making use of
Fuzzy linguistic approach is familiar in the information any existing information, component or product when
retrieval process. Vector Space model approach is also widely designing and implementing a new system or product.
accepted for document retrieval process [5]. Due to the There are differing opinions as to which activities constitute
semantic sensitivity and poor representation of large genuine software reuse. Replication of an entire software
documents, it was not a promising approach to achieve its program does not count as reuse. Reuse of assets is dependent
intended goal. In order to overcome these limitations, R-tool is upon both similarities and differences between the applications
used by making use of its statistical and analytical capability in which the component is being used [2].

Corresponding Author: Jasmine K.S 25


Email Id: [email protected]
Jasmine K.S et al/ International journal of research in engineering and innovation (IJREI), vol 1, issue 4 (2017), 25-29

Many organizations already practice a limited form of reuse, thoroughly in the information retrieval literature. Lexical
for example, most developers have libraries of components descriptor-based encoding approach also suffers from a
that they have developed in previous projects, or they use number of problems about developing and using classification
standard libraries, which are available with many vocabulary [11]. Software specific challenges include the fact
programming languages [5]. About 30% of the cases, it is a that one-word or one-phrase abstractions are hard to come by
very ad-hoc method of reuse, and it will work very well on a in the software domain [12]. From the user’s point of view,
small scale and it will not be suitable for entire organizations lack of familiarity with the vocabulary is also pointed out as
[6]. Instead, businesses need to implement a systematic reuse draw back in using a component retrieval system effectively
program in order to gain the full advantages of reuse. [14]. Vector space Model is also not a promising solution for
component retrieval process due to its semantic sensitivity [7].
1.2 What Can Be Reused? In this context, content based indexing schemata and ranking
based on frequency of occurrence of component search with
The definition of a reusable component is “any component that keywords will be a promising approach for establishing
is specifically developed to be used, and is actually used, in semantic connections between queries and documents.
more than one context” [4]. This does not just include code;
other products from the system lifecycle can also be reused, 3. Methods used
such as specifications and designs, and even requirements on
occasion [3]. ‘Components’ in this case can be taken to include 3.1. Proposed Document Retrieval Approach
all potentially reusable products of the system lifecycle,
including code, documentation, design, requirements etc. R is the powerful open-source implementation of the language
There are various criteria that should be satisfied in order for S. R is a very effective statistical tool and well worth the effort
an asset to be successfully reusable. These are grouped into to learn. R is polymorphic, which means that the same function
general, functional and technical requirements [14]. General can be applied to different types of objects, with results tailored
requirements focus on aspects such as compliance with to the different object types [1].
relevant standards, completeness, modularity and simplicity. In the proposed approach, the outstanding statistical and
All components should conform to the general requirements. graphical capabilities of R is used to make a frequency count
Functional requirements include such concerns as which of the word search (based on the software component) and a
business processes it will simulate or automate, and how well visualization mechanism is followed which facilitates easy
it does this. Functional requirements mainly concern Vertical understanding of the method adopted.
or Domain-specific assets and tend to be very specific to each
information domain. Lastly, technical requirements refer to
criteria such as interoperability, portability, communication,
security etc [2].
There are different levels of reuse, which can be considered
[3]. At the highest level, entire applications can be reused on
different platforms provided they are portable. Sub-systems
can be reused within different applications, possibly within
different domains; for example, a login system could be used
in a database application as well as a control application. At a
lower level modules or objects can be reused, and at the very
lowest level single functions can be reused. This is also known
as classification of the granularity of components. Fine grained
is used to describe those smaller and more generic components,
for example file access functions, or I/O functions. Course
grained is used for the more complex components, for example
user-interface packages [5]. Reusable assets can be also being Figure 1: R visualization with simple data
built in-house, retrieved from legacy systems or can be bought
from an external source [4]. R can import data from CSV files, Excel, SAS and produces
the output in pdf, jpg, png formats and also table output.
2. Present scenario in the component retrieval process
Steps Adopted
Existing approaches to software component retrieval process
cover a wide spectrum of component encoding methods and Step: 1
search or matching algorithms. The encoding methods differ
with respect to their soundness, completeness, and the extent Consider three files as input and stored as .csv files namely
to which they support an estimate of the effort it takes to cc1.csv, cc2.csv, and cc3.csv (Assumption: searching for a
modify a component. Text-based encoding and retrieval is document which addresses the ‘Cloud Computing’ concept in
neither sound nor complete. Its disadvantages have been the reuse repository)

26
Jasmine K.S et al/ International journal of research in engineering and innovation (IJREI), vol 1, issue 4 (2017), 25-29

Step: 2

Plot the graphs which shows the frequency of occurrence of all


words for the three files separately, (cc1Rplot, cc2Rplot,
cc3rplot)

Step: 3

Consider also the complete word 'Cloud Computing ' and find
the frequency of it.
(Actually, the words are arranged in ascending order from a to
z).

Step: 4

Consider the output of the Search (the file which contains


maximum count of the word ‘Cloud computing’) as the best
result.
Figure 4: Tool generated Word frequency table from cc2.csv file

Figure 5: Histogram of datadet2


Figure 2: Tool generated Word frequency table from cc1.csv file

Figure 6: CC2RPlot
Figure 3: Histogram of dataset1

27
Jasmine K.S et al/ International journal of research in engineering and innovation (IJREI), vol 1, issue 4 (2017), 25-29

Figure 7: Tool Generated Word frequenncy table cc3.csv Figure 10: Tool Generated ‘Cloud Computing’ word count
table and tail words

4. Conclusion

In this paper, we have used the Statistical & graphical approach


using R-tool for document retrieval. In this approach, If there
are more than one documents with the same representation,
every such document is retrieved based on the ranking of the
word count [14]. The approach makes use of content based
indexing which is a promising solution for Component
retrieval from reuse repository. Future research can be aimed
in this direction.

References

[1] Richard Cotton, “Learning R”, O’Rielly, United Statesof


Figure 8: Histogram of dataset3 America, 2013, pp.3-7.
[2] González-Calero,P.A,.”Applying Knowledge Modeling
and Case-Based Reasoning to Software Reuse”. IEEE
Proceedings – Software, vol. 147, no. 5,n2000.
[3] Sathish Kumar Soora,” A Framework for Software Reuse
and Research Challenges”, International Journal of
Advanced Research in Computer Science and Software
Engineering, Volume 4, Issue 10, October 2014,pp441-
448.
[4] Lucrédio, D.; Almeida, E. S.; Prado, A. F., ‘A Survey on
Software Components Search and Retrieval’, Proceedings
of 30th IEEE EUROMICRO Conference, Component-
Based Software Engineering Track,2004.
[5] Lucrédio, D.; Gavioli, A.; Prado, A. F.; Biajiz, M.;
Yamamoto, C. H.; Almeida, E.S.; Meira, S. R. L.
“Component Retrieval using Metric Indexing”,
proceedings of IEEE International Conference on
Information Reuse and Integration (IRI), Las Vegas,
Nevada, USA,2004.
[6] Sharma, WS and Rao, Vijay.” A rough–fuzzy approach
Figure 9: Tool Generated ‘Cloud Computing’ word frequency for retrieval off candidate components for software reuse”,
table Pattern Recognition Letters 24(6), 2003,pp. 875-886.

28
Jasmine K.S et al/ International journal of research in engineering and innovation (IJREI), vol 1, issue 4 (2017), 25-29

[7] Zhang D.; Tsai J. J. P” Machine Learning and Software


Engineering”, Software Quality Journal, Issue 11,2003,
pp.87-119.
[8] W.B.Frakes and B.A Nejmeh,”An information system for
software reuse, Software Reuse: Emerging Technology”,
IEEE, CS Press,1990, pp.142-151.
[9] Y.S. Yoelle S.Maarek, D.M.Berry, and G.E. Kaiser,”An
information retrieval approach for automatically
constructing software libraries”, IEEE Trans. Software
Engineering, vol.17, no.8, 1991,pp.800-813.
[10] Y. Yang, T. Ault, T. Pierce, and C. W. Lattimer,
“Improving text categorization methods for event
tracking”, Proceedings of ACM SIGIR Conference on
Research and Development in Information Retrieval
(SIGIR'00),2000, pp65-72.
[11] Naohiro Ishii, Tsuyoshi Murai, Takahiro Yamada,
Yongguang Bao,”Text Classification by Combining
Grouping, LSA and kNN, icis-comsar”, 5th IEEE/ACIS
International Conference on Computer and Information
Science and 1st IEEE/ACIS International Workshop on
Component-Based Software Engineering, Software
Architecture and Reuse (ICIS-COMSAR'06),2006,
pp.148-154.
[12] Dhillon I. S., Fan J., and Guan Y, ”Efficient Clustering of
Very Large Document Collections”, Data Mining for
Scientific and Engineering Applications, Kluwer,2001.
[13] Ravi Teja, Dr. Jasmine K.S, “ Data mining tools - An
Analytical approach”, International Journal of
Engineering Research & Technology (IJERT),Vol.4,
No.27,May2016,pp.4-8, ESRSA Publication.
[14] Jasmine K.S, Dr.R.Vasantha,”A Mixed Method Approach
for Efficient Component Retrieval from a Component
Repository”, International Journal of Software
Engineering and Applications, Journal of Scientific
Research, ISSN:1945-3116,Vol.4,No.7,July 2011, pp.
442-445

29

You might also like