Data Mining Concepts and Applications: Six Factors Behind The Sudden Rise in Popularity of Data Mining
Classification
Supervised induction used to analyze the historical
data stored in a database and to automatically
generate a model that can predict future behavior
Common tools used for classification are:
Neural networks
Decision trees
If-then-else rules
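As an illustration of supervised induction, the sketch below learns a single if-then-else rule from historical records and uses it to predict a new case. It uses the simple 1R approach (one attribute, majority outcome per value), not a full decision tree or neural network. The loan data, attribute names, and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical historical records: (attribute values, known outcome).
HISTORY = [
    ({"income": "high", "owns_home": "yes"}, "repaid"),
    ({"income": "high", "owns_home": "no"}, "repaid"),
    ({"income": "low", "owns_home": "yes"}, "repaid"),
    ({"income": "low", "owns_home": "no"}, "defaulted"),
    ({"income": "low", "owns_home": "no"}, "defaulted"),
    ({"income": "high", "owns_home": "no"}, "repaid"),
]


def learn_one_rule(records):
    """1R: keep the attribute whose value -> majority-outcome rules make the fewest errors."""
    best = None
    for attr in sorted(records[0][0]):
        # Count outcomes for each value of this attribute.
        counts = defaultdict(Counter)
        for features, label in records:
            counts[features[attr]][label] += 1
        # One if-then rule per attribute value: predict the majority outcome.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for f, label in records if rules[f[attr]] != label)
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


def predict(model, features, default="repaid"):
    """Apply the learned rule; fall back to a default for unseen values."""
    attr, rules, _ = model
    return rules.get(features[attr], default)


MODEL = learn_one_rule(HISTORY)
print(MODEL[0], MODEL[1])
print(predict(MODEL, {"income": "low", "owns_home": "no"}))
```

The learned model reads as "if income is low then defaulted, else repaid", which is the kind of rule a classification tool generates automatically from the stored history.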
Clustering
Partitioning a database into segments in which the
members of a segment share similar qualities
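One common way to partition records into segments is k-means. The sketch below is a minimal version that assumes numeric attributes, squared Euclidean distance, and starting centroids supplied by the caller. The sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty segment's centroid in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical records with two numeric attributes each.
POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
CENTROIDS, CLUSTERS = kmeans(POINTS, [(1, 1), (8, 8)])
print(CLUSTERS)
```

Unlike classification, no outcome label is given: the segments emerge from the similarity of the records alone.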
Association
A category of data mining algorithms that
establish relationships among items that occur
together in a given record
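Association rules are usually scored by support (how often the items occur together) and confidence (how often the right-hand item appears when the left-hand item does). The sketch below computes both for item pairs only. The baskets and thresholds are hypothetical, and it is not a full Apriori implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each set is the items found together in one transaction.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(pair for b in baskets for pair in combinations(sorted(b), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of records containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


RULES = pair_rules(BASKETS)
for lhs, rhs, support, confidence in RULES:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```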
Sequence discovery
The identification of associations over time
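Sequence discovery differs from association in that order matters. The sketch below counts, for each ordered pair of items, how many customers bought the first item at some point before the second. The purchase histories are hypothetical and assumed to be already sorted by time.

```python
from collections import Counter

# Hypothetical time-ordered purchase histories, one list per customer.
HISTORIES = {
    "c1": ["phone", "case", "charger"],
    "c2": ["phone", "charger"],
    "c3": ["case", "phone", "charger"],
    "c4": ["laptop", "mouse"],
}


def sequential_pairs(histories):
    """Count customers in whose history `first` occurs before `later`."""
    counts = Counter()
    for events in histories.values():
        seen = set()  # count each ordered pair once per customer
        for i, first in enumerate(events):
            for later in events[i + 1:]:
                if first != later:
                    seen.add((first, later))
        counts.update(seen)
    return counts


PAIRS = sequential_pairs(HISTORIES)
print(PAIRS.most_common(2))
```

Here "phone then charger" is found for three customers, while "charger then phone" is never observed, which an unordered association count would not reveal.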
Explore - this stage consists of exploring the data by searching for
unanticipated trends and anomalies in order to gain understanding
and ideas;
Model - this stage consists of modeling the data by allowing the software to
search automatically for a combination of data that reliably predicts a
desired outcome;
Assess - this stage consists of assessing the data by evaluating the usefulness
and reliability of the findings from the DM process and estimating how well it
performs.
The SEMMA process is easy to understand and allows organized and adequate
development and maintenance of DM projects.
Knowledge Discovery in Databases
KDD process
1. Selection
2. Preprocessing
3. Transformation
4. Data mining
5. Interpretation/evaluation
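The five KDD steps can be read as a pipeline in which each step feeds the next. The sketch below is one possible toy reading of that pipeline. The records, field names, age bands, and function names are all hypothetical, and the "data mining" step is reduced to a simple group average.

```python
# Hypothetical raw records; None marks a missing value.
RAW = [
    {"customer": "a", "age": 25, "spend": 120.0, "region": "north"},
    {"customer": "b", "age": 41, "spend": None, "region": "south"},
    {"customer": "c", "age": 37, "spend": 300.0, "region": "north"},
    {"customer": "d", "age": 52, "spend": 480.0, "region": "south"},
]


def select(rows, fields):
    """1. Selection: keep only the fields relevant to the question."""
    return [{f: r[f] for f in fields} for r in rows]


def preprocess(rows):
    """2. Preprocessing: drop records with missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def transform(rows, field):
    """3. Transformation: min-max scale one field (assumes the values are not all equal)."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    return [{**r, field: (r[field] - lo) / (hi - lo)} for r in rows]


def mine(rows):
    """4. Data mining: average scaled spend per age band."""
    bands = {}
    for r in rows:
        band = "under_40" if r["age"] < 40 else "40_plus"
        bands.setdefault(band, []).append(r["spend"])
    return {band: sum(v) / len(v) for band, v in bands.items()}


def interpret(pattern):
    """5. Interpretation/evaluation: report the band with the highest average."""
    return max(pattern, key=pattern.get)


PATTERN = mine(transform(preprocess(select(RAW, ["age", "spend"])), "spend"))
TOP_BAND = interpret(PATTERN)
print(PATTERN, TOP_BAND)
```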
Text Mining
Text mining
Application of data mining to non-structured or less
structured text files. It entails the generation of
meaningful numerical indices from the
unstructured text and then processing these indices
using various data mining algorithms
Text mining helps organizations:
Find the “hidden” content of documents, including
additional useful relationships
Relate documents across previously unnoticed divisions
Group documents by common themes
Applications of text mining
Automatic detection of e-mail spam or phishing
through analysis of the document content
Automatic processing of messages or e-mails to route
a message to the most appropriate party to process that
message
Analysis of warranty claims, help desk calls/reports,
and so on to identify the most common problems and
relevant responses
Analysis of related scientific publications in journals
to create an automated summary view of a particular
discipline
Creation of a “relationship view” of a document
collection
Qualitative analysis of documents to detect deception
How to mine text
1. Eliminate commonly used words (stop-words)
2. Replace words with their stems or roots (stemming
algorithms)
3. Consider synonyms and phrases
4. Calculate the weights of the remaining terms
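The four steps above can be sketched as follows. This is a minimal illustration under several assumptions: the stop-word list and synonym table are hypothetical, the stemmer is a naive suffix stripper standing in for a real stemming algorithm such as Porter's, and the term weights use TF-IDF, which is one common weighting scheme among several.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "is", "are", "to", "of", "and", "my", "for"}  # hypothetical list
SYNONYMS = {"e-mail": "email", "mail": "email"}  # hypothetical synonym table


def stem(word):
    """Naive suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    words = re.findall(r"[a-z][a-z-]*", text.lower())
    words = [w for w in words if w not in STOP_WORDS]  # 1. eliminate stop-words
    words = [stem(w) for w in words]                   # 2. replace words with stems
    return [SYNONYMS.get(w, w) for w in words]         # 3. map synonyms to one term


def tfidf(docs):
    """4. Weight each remaining term: term count * log(N / documents containing the term)."""
    tokenized = [Counter(terms(d)) for d in docs]
    df = Counter(t for counts in tokenized for t in counts)
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in counts.items()} for counts in tokenized]


# Hypothetical help desk messages.
DOCS = [
    "The printer is failing and the printer stops printing",
    "My e-mail messages are bouncing",
    "Mail bounced and printing failed",
]
WEIGHTS = tfidf(DOCS)
for doc, weights in zip(DOCS, WEIGHTS):
    print(doc, "->", max(weights, key=weights.get))
```

The resulting weights are the "meaningful numerical indices" that ordinary data mining algorithms, such as clustering or classification, can then process.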
Web Mining
Web mining
The discovery and analysis of interesting and
useful information from the Web, about the Web,
and usually through Web-based tools
Web content mining
The extraction of useful information from Web pages
Web structure mining
The development of useful information from the
links included in the Web documents
Web usage mining
The extraction of useful information from the data
being generated through webpage visits, transactions,
etc.
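To make the distinction concrete, the sketch below applies structure mining and usage mining to a tiny hypothetical site. Structure mining counts how many pages link to each page, and usage mining counts page views from a visit log. The pages, visit log, and helper names are assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical site: URL -> page HTML.
PAGES = {
    "/home": '<a href="/products">Products</a> <a href="/about">About</a>',
    "/products": '<a href="/home">Home</a> <a href="/checkout">Buy</a>',
    "/about": '<a href="/products">See products</a>',
}

# Hypothetical visit log: (user, page visited).
VISITS = [("u1", "/home"), ("u1", "/products"), ("u2", "/products"),
          ("u2", "/checkout"), ("u3", "/home")]


def inbound_link_counts(pages):
    """Web structure mining: number of pages linking to each URL."""
    counts = Counter()
    for html in pages.values():
        parser = LinkExtractor()
        parser.feed(html)
        counts.update(set(parser.links))  # count each linking page once
    return counts


INBOUND = inbound_link_counts(PAGES)
PAGE_VIEWS = Counter(page for _, page in VISITS)  # Web usage mining
print(INBOUND.most_common(1), PAGE_VIEWS.most_common(2))
```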
Uses for Web mining:
Determine the lifetime value of clients
Design cross-marketing strategies across products
Evaluate promotional campaigns
Target electronic ads and coupons at user groups
Predict user behavior
Present dynamic information to users
For more on data mining techniques, see:
http://www.statsoft.com/textbook/data-mining-techniques/