Title: Data Mining & Warehousing
Mailing address:
Ms. Jayamma.R,
Ms. Archana.B
Email: [email protected]
[email protected]
Abstract
Information Overload
What to do?
In our opinion, corporate nirvana in the use of data mining tools
and data warehousing will only be achieved when companies link the
concept of data mining to equally sophisticated information retrieval tools.
These tools will combine machine and human intervention in more
intelligent ways than today's information retrieval tools offer.
Corporations will need to run two complementary data/information
retrieval processes. One process will literally mine data and allow
software to detect hidden patterns. The other will query information by
posing specific questions and securing targeted answers.
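The two complementary processes can be illustrated with a minimal sketch. The basket data, the function names, and the support threshold below are all hypothetical, chosen only for illustration; real mining tools use far more elaborate algorithms. The first function lets the software surface a pattern nobody asked about, while the second answers one specific question.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "warehouse": each record is one customer's basket.
TRANSACTIONS = [
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "coffee"},
    {"bread", "butter", "coffee"},
]

def mine_frequent_pairs(transactions, min_support):
    """Mining: count item pairs and keep those seen at least min_support times."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

def query_baskets(transactions, item):
    """Retrieval: answer one specific question posed by a person."""
    return [basket for basket in transactions if item in basket]

frequent_pairs = mine_frequent_pairs(TRANSACTIONS, min_support=3)
coffee_baskets = query_baskets(TRANSACTIONS, "coffee")
print(frequent_pairs)
print(len(coffee_baskets))
```

In this toy data the mining step reports that bread and butter are bought together, a pattern the user never asked for, while the query step returns only the baskets that match the question posed.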
Ever since Gibson, we have been familiar with applying a spatial metaphor to the
Internet. Documents on the Web are distributed in a "cyberspace." By
extension of the metaphor, one "navigates" across the Web, visits sites,
and so on.
Part of the excitement of the Web flows directly from this
paradigm shift in how we view what information is, where it is, and how
one piece of information is related to another. In the near future we will
see the reality of the Web adhere even more strongly to this metaphor. We will be able to
navigate cyberspace with the aid of good maps, maps which tell us where
we are and what is nearby. Cyberspace will acquire the textures of real
space, with landmarks both personal and official. We will be able to mark
our trails like a cyber-Hansel and cyber-Gretel. We will be able to measure
distances in any number of useful ways, effectively warping that space to
our specification. All this has very interesting implications for progress in
achieving greater accuracy in information retrieval.
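As a rough illustration of measuring distances "in any number of useful ways," the sketch below compares two documents under two common measures. The function names and sample texts are assumptions made for this example, and whitespace tokenization is a deliberate simplification. The point is only that the choice of measure changes how far apart the same two documents appear.

```python
import math
from collections import Counter

def cosine_distance(text_a, text_b):
    """1 minus the cosine similarity of the two term-count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[term] * counts_b[term] for term in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty document is maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def jaccard_distance(text_a, text_b):
    """1 minus the share of distinct words the two documents have in common."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    if not union:
        return 1.0  # same assumption as above for empty documents
    return 1.0 - len(words_a & words_b) / len(union)

doc_a = "data mining tools"
doc_b = "data warehousing tools"
print(cosine_distance(doc_a, doc_b))
print(jaccard_distance(doc_a, doc_b))
```

The two measures disagree on how close these documents are, which is one concrete sense in which a chosen metric "warps" the space being mapped.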
Mapping Cyberspace
Personal agents are entering our lives more and more, and over the
next few years they will become even more important in the area of
information retrieval.
Coming from many sources, notably artificial life, agents have
been gaining popularity as a way of conceptualizing software design. The
agent has a certain autonomy, and inherent rules of behavior. There may
be many agents in a system, each responsible for one or many tasks, and
able to cooperate with other agents. For example, in the "Chiliad
publishing system," an agent may be responsible for the maintenance of a
particular document, or subject area.
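The idea of several agents, each with its own rules and its own subject area, might be sketched as follows. This is not the design of the publishing system mentioned above, whose internals are not described here; the class name, the keyword rule, and the dispatch function are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SubjectAgent:
    """Hypothetical agent that maintains the documents of one subject area."""
    subject: str
    keywords: set
    documents: list = field(default_factory=list)

    def handle(self, text):
        # Inherent rule of behavior: take the document only if it
        # mentions at least one of this agent's keywords.
        if set(text.lower().split()) & self.keywords:
            self.documents.append(text)
            return True
        return False

def dispatch(agents, text):
    """Offer a document to every agent; return the subjects that accepted it."""
    return [agent.subject for agent in agents if agent.handle(text)]

agents = [
    SubjectAgent("mining", {"mining", "patterns"}),
    SubjectAgent("warehousing", {"warehouse", "warehousing"}),
]
accepted_by = dispatch(agents, "hidden patterns in sales data")
print(accepted_by)
```

Each agent decides independently, so adding a new subject area means adding one more agent rather than changing the existing ones.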
Some of these ideas are extensions of the already familiar
personalized newspaper, such as PointCast. However, by having many
agents, and having them interact in sophisticated ways and under
expert control, and then having the results presented in a visual map, we
create a system with both quantitative and qualitative advantages. The
multiplicity of agents contributes to the robustness of the system, since
imperfections in a given agent need not propagate. It also contributes to
its speed, since an agent-based system is naturally scalable.
Conclusion
If data mining and data warehousing are to become corporate
nirvana for the 21st Century, then they must be built as complex adaptive
systems with the business end user firmly in mind. As data mining
companies add complementary information retrieval processes and tools
to their present suites of offerings, the data mining industry will adapt
far more quickly to the broader needs of the business end user. When the
worlds of data mining and knowledge extraction expand to include
information retrieval from alternative media formats and text-based data,
then "data mining" and "data warehousing" will become the hottest
buzzwords for businesses in the information age.