Ebook Text Analytics Beginners Guide


Text Analytics

Beginner’s Guide
Extracting Meaning from Unstructured Data
Text Analytics | Use Cases | Terms | Trends | Scenario | Resources

Contents

Text Analytics
Use Cases
Terms
Trends
Scenario
Resources

©2013 Angoss Software Corporation. All rights reserved.



Text Analytics
Successful companies today listen to and understand what customers are
saying, and they act on that feedback by using text analytics to incorporate
the voice of the customer (VOC) into business strategies for sales, marketing
and customer service.

Powerful trends in social media, e-discovery, customer services (call center
transcriptions of voice calls, customer complaint emails and instant
messaging) and customer-centric business strategies are driving IT leaders to
consider text analytics as a powerful business tool.

The transformed information from text analytics can be combined with
structured data (e.g., sales and demographic data) and analyzed using various
business intelligence or predictive and automated discovery techniques.
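
To make the idea of combining text analytics output with structured data concrete, here is a minimal Python sketch. The customer IDs, field names and sentiment scores are all hypothetical, and the join shown is just one simple way such data might be merged before further analysis.

```python
# Hypothetical output of a text analytics step: one sentiment score per customer.
text_scores = {"C001": 0.42, "C002": -0.31, "C003": 0.10}

# Hypothetical structured data: sales and demographic fields keyed by customer.
customers = [
    {"customer_id": "C001", "region": "East", "annual_sales": 1200.0},
    {"customer_id": "C002", "region": "West", "annual_sales": 300.0},
    {"customer_id": "C004", "region": "East", "annual_sales": 950.0},
]

def combine(customers, text_scores):
    """Attach the text-derived sentiment to each structured record (left join)."""
    combined = []
    for row in customers:
        merged = dict(row)
        # None marks customers with no text feedback on file.
        merged["sentiment"] = text_scores.get(row["customer_id"])
        combined.append(merged)
    return combined

combined = combine(customers, text_scores)
```

Each combined record now carries both the structured fields and the text-derived score, so it could feed a report or a predictive model.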

What is Text Analytics?

Text analytics describes a set of linguistic, statistical and machine
learning techniques that model and structure the information content of
textual sources for business intelligence, exploratory data analysis,
research or investigation.

Text analytics is the process of analyzing unstructured text, extracting
relevant information, and transforming it into structured information that
can be leveraged in various ways.
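
As a rough illustration of "transforming unstructured text into structured information", the sketch below turns one free-text comment into a flat record. The function name, the field names and the tracked terms are assumptions made for this example, not part of any product.

```python
import re
from collections import Counter

def to_record(doc_id, text, tracked_terms):
    """Turn one free-text document into a flat, structured record."""
    # Lowercase the text and keep runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    record = {"doc_id": doc_id, "word_count": len(tokens)}
    for term in tracked_terms:
        # One column per tracked term, holding how often it occurs.
        record["n_" + term] = counts[term]
    return record

record = to_record(
    1,
    "The delivery was late. Late again, but the book is great.",
    ["late", "great"],
)
```

The resulting dictionary has fixed fields, so many such records can be stored as rows in a table and analyzed like any other structured data.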

Structured Data
Today, 80% of business information originates in unstructured data,
primarily text with no identifiable structure, although structured data
continues to be the primary source for business intelligence.

Unstructured Data

INTERNAL

• Emails
• Customer Surveys
• Documents
• Call Center Notes
• Claims Records
• Customer Forms
• Customer Letters

EXTERNAL

• Blogs
• Social Media
• Tweets
• Online Forums
• Articles / Reports
• Web


Use Cases

Text Analytics transforms unstructured data into structured data for
analysis to help...

• Monitor and analyze brand reputation
• Determine purchase behavior
• Identify product issues
• Summarize surveys, customer reviews
• Improve customer service and customer experience management
• Understand customer feedback
• Improve customer retention
• Predict and reduce churn
• Identify and reduce claims fraud
• Develop cross-sell, upsell strategies
• Design next best offer strategies


Marketing

• Voice of customer
• Social media analysis
• Churn analysis
• Market research
• Survey analysis

Business

• Competitive intelligence
• Document categorization
• Human resources
• Records retention
• Risk analysis
• Website navigation
• News feeds analysis

Industry-Specific

• Fraud detection
• E-discovery
• Warranty analysis
• Medical research


Terms

Or…given a collection of text, text analytics tells you who, where, when,
what, and how so that you can figure out ‘why’.

1. Entity: “Who, where, when” is being discussed?
2. Theme: “What” are the important words?
3. Classification: “What” are the important concepts?
4. Sentiment: “How” is the conversation going? Is it positive or negative?

Entity
“Who, where, when” is being discussed?

Sample text:

Yahoo wants to make its Web e-mail service a place you never want to – or
more importantly – have to leave to get your social fix.

The company on Wednesday is releasing an overhauled version of its Yahoo Mail
Beta client that it says is twice as fast as the previous version, while
managing to tack on new features like an integrated Twitter client, rich
media previews and a more full-featured instant messaging client.

Yahoo says this speed boost should be especially noticeable to users outside
the U.S. with latency issues, due mostly to the new version making use of the
company's cloud computing technology. This means that if you're on a spotty
connection, the app can adjust its behavior to keep pages from timing out, or
becoming unresponsive.

Besides the speed and performance increase, which Yahoo says were the top
users requests, the company has added a very robust Twitter client, which
joins the existing social-sharing tools for Facebook and Yahoo.

Entity      Type
Yahoo       Company
Twitter     Company
Facebook    Company
U.S.        Place
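
One simple way to approximate entity extraction is a dictionary (gazetteer) lookup. The Python sketch below is only an illustration under that assumption: the gazetteer is hand-built from the entities listed on this slide, the sample is a shortened excerpt of the article, and commercial tools use far richer linguistic and statistical models.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (types taken from the slide).
GAZETTEER = {
    "Yahoo": "Company",
    "Twitter": "Company",
    "Facebook": "Company",
    "U.S.": "Place",
}

def find_entities(text, gazetteer=GAZETTEER):
    """Return {entity: (type, mention_count)} for gazetteer entries found in text."""
    found = {}
    for name, etype in gazetteer.items():
        # Match the name only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        count = len(re.findall(pattern, text))
        if count:
            found[name] = (etype, count)
    return found

# Shortened excerpt of the sample article.
sample = (
    "Yahoo says this speed boost should be especially noticeable to users "
    "outside the U.S. with latency issues. The company has added a very robust "
    "Twitter client, which joins the existing social-sharing tools for "
    "Facebook and Yahoo."
)
entities = find_entities(sample)
```

A lookup like this only finds names it already knows; recognizing unseen companies or places needs a trained named entity model.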

Theme
“What” are the important words being used?

Sample text: the Yahoo Mail article quoted under Entity.

Theme                         Score
Cloud computing technology    4.11
E-mail service                2.672
Top users requests            2.669
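
Theme extraction can be approximated by counting multi-word phrases that recur across documents. The sketch below assumes a plain n-gram frequency count with a small illustrative stopword list; it will not reproduce the scores shown on this slide, which come from the vendor's own scoring method.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "or", "is", "its", "it",
             "that", "this", "for", "with", "on", "in", "from", "has"}

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def score_themes(docs, n_values=(2, 3), min_count=2):
    """Count stopword-free phrases and keep those seen at least min_count times."""
    counts = Counter()
    for doc in docs:
        tokens = re.findall(r"[a-z][a-z'-]*", doc.lower())
        for n in n_values:
            for gram in ngrams(tokens, n):
                if not any(tok in STOPWORDS for tok in gram):
                    counts[" ".join(gram)] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}

# Hypothetical short documents loosely based on the sample article.
docs = [
    "The new version makes use of cloud computing technology.",
    "Cloud computing technology keeps pages from timing out.",
    "Users asked for a faster e-mail service.",
    "The e-mail service now has a Twitter client.",
]
themes = score_themes(docs)
```

Note that shorter phrases nested inside a longer one (such as "cloud computing" inside "cloud computing technology") are counted separately here; a fuller implementation would usually merge them.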

Classification/Concepts
“What” are the important, high-level concepts?

Sample text: the Yahoo Mail article quoted under Entity.

Concept                  Score
Software and Internet    .56
Social Media             .60
Technology               .49
Business                 .72
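
A very simple stand-in for concept classification is keyword matching: each concept gets a score based on how many of its keywords appear in the text. The keyword lists below are invented for illustration, and the resulting scores are not comparable to the ones on this slide.

```python
import re

# Hypothetical concept vocabulary; the keyword lists are illustrative only.
CONCEPT_KEYWORDS = {
    "Social Media": {"twitter", "facebook", "social", "sharing"},
    "Software and Internet": {"web", "mail", "client", "app", "beta"},
    "Technology": {"cloud", "computing", "latency", "technology"},
}

def score_concepts(text, concept_keywords=CONCEPT_KEYWORDS):
    """Score each concept as the fraction of its keywords present in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {
        concept: len(keywords & tokens) / len(keywords)
        for concept, keywords in concept_keywords.items()
    }

scores = score_concepts(
    "The company has added a very robust Twitter client, which joins the "
    "existing social-sharing tools for Facebook and Yahoo."
)
```

For this sentence the "Social Media" keywords all match, one "Software and Internet" keyword matches, and no "Technology" keyword does.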

Sentiment
“How” is the conversation going? Positive or negative?

Sample text: the Yahoo Mail article quoted under Entity.

Entity                        Sentiment
Yahoo                         .534
Twitter                       .48
Facebook                      .534

Concept                       Sentiment
Software and Internet         0.0
Social Media                  .48
Technology                    .49

Theme                         Sentiment
Cloud computing technology    1.3
Mail service                  .16
Top user requests             .83
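
Sentiment per entity can be sketched with a word lexicon: score each sentence by summing word weights, then average the scores of the sentences that mention the entity. The lexicon, its weights and the sample sentences below are assumptions for illustration and do not reflect how the scores on this slide were produced.

```python
import re

# Tiny illustrative lexicon; word weights are assumptions, not product values.
LEXICON = {"fast": 1.0, "robust": 1.0, "great": 1.0, "wonderful": 1.0,
           "spotty": -1.0, "unresponsive": -1.0, "issues": -0.5}

def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_score(sentence, lexicon=LEXICON):
    """Sum the lexicon weights of the words in one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)

def entity_sentiment(text, entities, lexicon=LEXICON):
    """Average the scores of the sentences that mention each entity."""
    sentences = split_sentences(text)
    result = {}
    for entity in entities:
        scores = [sentence_score(s, lexicon)
                  for s in sentences if entity.lower() in s.lower()]
        # None means the entity was never mentioned.
        result[entity] = sum(scores) / len(scores) if scores else None
    return result

text = ("Yahoo Mail is twice as fast. The Twitter client is robust. "
        "Yahoo users on a spotty connection had issues.")
result = entity_sentiment(text, ["Yahoo", "Twitter", "Facebook"])
```

The same averaging could be applied to themes or concepts by swapping the entity names for theme phrases or concept keywords.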


Trends

1. Social media analytics adoption drives text analytics.
2. Analytics moves beyond sentiment analysis.
3. The market begins to get the connection between text and Big Data.
4. Marrying structured and unstructured data becomes more popular.
5. The cloud becomes more popular for text analytics.

Source: Text Analytics Victory Index Report, January 2013


Scenario
Book Reviews: Customer Feedback
An online book retailer tracks customer feedback by analyzing reviews
and comments from online forums and social media.

They use Angoss KnowledgeREADER™ to extract meaning from the text to
discover what is being discussed and how – the sentiment (positive or
negative), and answer:

• What are customers saying on a regional basis?
• How frequently do certain entities, themes and topics occur?
• Which themes and topics occur together, and are related?
• How is sentiment trending over time?
• What is the context of what is being discussed at the document level?

Sentiment Dashboard

The dashboard shows:

• Sentiment breakdown across all reviews
• Sentiment distribution across all documents
• Sentiment distribution for Top 10 topics, themes and entities

Comparison Analysis
The retailer can compare overall sentiment across stores, or isolate
individual topics, themes, entities and phrases to determine how those items
are discussed in different regions.

For example, you can see that the topic “Technology” is viewed more
negatively in Store 2, but it is also discussed more frequently there.

Trend Analysis
By isolating topics, themes, entities or phrases, the retailer can examine
how frequently they were mentioned.

They can also view how customer sentiment regarding these terms changed
alongside the frequency of their occurrence.
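
A trend view of this kind boils down to grouping mentions by time period and computing a count and an average sentiment per period. The sketch below assumes monthly buckets and uses made-up dates and scores.

```python
from collections import defaultdict

# Hypothetical mentions of one theme: (ISO review date, sentiment score).
mentions = [
    ("2013-01-05", 0.5), ("2013-01-20", 0.25),
    ("2013-02-02", -0.25), ("2013-02-14", -0.5), ("2013-02-27", 0.0),
]

def monthly_trend(mentions):
    """Return [(month, mention_count, mean_sentiment)] in chronological order."""
    buckets = defaultdict(list)
    for date, sentiment in mentions:
        buckets[date[:7]].append(sentiment)   # "YYYY-MM" prefix is the month
    return [(month, len(vals), sum(vals) / len(vals))
            for month, vals in sorted(buckets.items())]

trend = monthly_trend(mentions)
```

In this made-up data the theme is mentioned more often in February while its average sentiment falls, which is the kind of pattern the trend view is meant to surface.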

Association Discovery
Using the Association Map, the retailer can visually determine the frequency
with which certain terms occur, and how closely they relate to other terms
used in customer reviews.

The retailer can quickly assess how well certain subjects are received, and
how much relative interest their customers have in those subjects.
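
The numbers behind an association view can be sketched as term frequencies plus pair co-occurrence counts. The example below uses invented review terms and the Jaccard index as the association measure; that choice is an assumption, since the measure used by the Association Map is not stated here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical extracted terms, one set per customer review.
review_terms = [
    {"price", "shipping", "cover"},
    {"price", "shipping"},
    {"plot", "characters"},
    {"price", "plot"},
]

def cooccurrence(review_terms):
    """Count term frequency and how often each pair of terms shares a review."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in review_terms:
        term_counts.update(terms)
        # Sort so each pair is always stored in the same order.
        pair_counts.update(combinations(sorted(terms), 2))
    return term_counts, pair_counts

def jaccard(a, b, term_counts, pair_counts):
    """Association strength: shared reviews / reviews mentioning either term."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = term_counts[a] + term_counts[b] - both
    return both / either if either else 0.0

term_counts, pair_counts = cooccurrence(review_terms)
strength = jaccard("price", "shipping", term_counts, pair_counts)
```

Term counts would drive the size of each node on a map, and the pair strength would drive how closely two nodes are drawn together.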

Document Summary

Individual terms can be isolated, as well as the sentences and documents that
reference them, giving you a detailed look at the context used in reviews.

Each text record can be completely isolated for a full examination of the
content and sentiment contained within.

Decision Tree
[Figure: Decision Tree on Price, split by rank_1_topic]

Root node (Price)

Bracket    Count      Share
High       26,820     14.21%
Low        161,985    85.79%
Total      188,805    100.00%

The root is split on rank_1_topic into 11 child nodes, each covering a group
of genres. Price breakdown per child node, with each node's share of all
records:

High count (share)    Low count (share)    Node total (share of all)
92 (30.77%)           207 (69.23%)         299 (0.16%)
768 (39.65%)          1,169 (60.35%)       1,937 (1.03%)
551 (23.93%)          1,752 (76.07%)       2,303 (1.22%)
853 (11.83%)          6,356 (88.17%)       7,209 (3.82%)
1,862 (19.87%)        7,509 (80.13%)       9,371 (4.96%)
3,064 (21.83%)        10,971 (78.17%)      14,035 (7.43%)
8,662 (10.94%)        70,489 (89.06%)      79,151 (41.92%)
889 (15.18%)          4,968 (84.82%)       5,857 (3.10%)
5,906 (17.57%)        27,711 (82.43%)      33,617 (17.81%)
1,136 (9.16%)         11,270 (90.84%)      12,406 (6.57%)
3,037 (13.43%)        19,583 (86.57%)      22,620 (11.98%)

Genres appearing in the split: null, Advertising, Agriculture, Art,
Automotive, Aviation, Banking, Beverages, Biotechnology, Business, Crime,
Disasters, Economics, Education, Elections, Environment, Fashion, Food,
Hardware, Health, Hotels, Intellectual Property, Investing, Labor, Law,
Marriage, Mobile Devices, Politics, Popular Culture, Real Estate, Religion,
Renewable Energy, Robotics, Science, Social Media, Software and Internet,
Space, Sports, Technology, Traditional Energy, Travel, Video Games, War,
Weather.

KnowledgeREADER can be used to analyze the output of your text analysis together with structured data, and to apply data
mining and predictive analytics techniques to expand customer insights.

In this example, the retailer has created a Decision Tree that allows them to determine the price breakdown across
book genres. The Decision Tree uses ‘High’ and ‘Low’ price brackets to segment genres.

The retailer can now determine if there is a correlation between price, genre and overall sentiment. They may use
these insights to inform product inventory or pricing decisions.
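
The percentages in each Decision Tree node follow directly from the counts. The sketch below recomputes them for the root and two child nodes using counts read from the slide; the node names `child_a` and `child_b` are hypothetical labels, and the helper itself is only an illustration of the arithmetic, not the tool's algorithm for choosing splits.

```python
# Counts read from the Decision Tree slide: root node and two child nodes.
# The child labels are hypothetical; the slide groups genres into each node.
NODES = {
    "root": {"High": 26820, "Low": 161985},
    "child_a": {"High": 92, "Low": 207},
    "child_b": {"High": 8662, "Low": 70489},
}
ROOT_TOTAL = 188805

def node_summary(counts, root_total=ROOT_TOTAL):
    """Return the node total, price-bracket shares and share of all records."""
    total = sum(counts.values())
    shares = {bracket: round(100 * n / total, 2) for bracket, n in counts.items()}
    return {
        "total": total,
        "shares": shares,
        "share_of_all": round(100 * total / root_total, 2),
    }

root = node_summary(NODES["root"])
child_a = node_summary(NODES["child_a"])
child_b = node_summary(NODES["child_b"])
```

Comparing a child's "High" share with the root's 14.21% shows which genre groups skew toward higher-priced books.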

Strategy Tree
KnowledgeREADER can be used to build and deploy predictive strategies with
Strategy Trees.

Here, the retailer has identified segments based on price and genre. In
addition, they can track key metrics that drive store performance.

Combined with the text analysis output, this measures the average sentiment,
rating, sale price and the most common themes discussed in each segment.

By associating a treatment with each segment, the retailer can automatically
assign specific actions or activities to each segment.

Now, the book retailer can quickly turn insight into action.

[Figure: Strategy Tree, split on word_count, then rating, then rank_1_topic]

All records
  Total 188,805 (100.00%)
  Avg Rating 4.20; Avg Sale Price $15.05; Avg Sentiment 0.22
  Most Common Phrase: great
  Treatment: Ignore

word_count split: 17 records (0.01%) in one branch and 188,788 (99.99%) in
the other.

rating split: 112,898 records (59.80%) with rating 5 and 75,890 (40.19%) in
the other branch.

Rating 5 branch
  Price: High 16,414 (14.54%), Low 96,484 (85.46%); Total 112,898 (59.80%)
  Avg Rating 5.00; Avg Sale Price $15.27; Avg Sentiment 0.30
  Most Common Phrase: wonderful

Segments of the rating 5 branch (split on rank_1_topic):

Segment: null, Beverages, Hotels, Real Estate, Video Games, Weather
  Price: High 5,208 (10.67%), Low 43,581 (89.33%); Total 48,789 (25.84%)
  Avg Rating 5.00; Avg Sale Price $13.45; Avg Sentiment 0.34
  Most Common Phrase: wonderful
  Treatment: E-Mail BOGO

Segment: Advertising, Aviation, Business, Economics
  Price: High 1,211 (21.10%), Low 4,529 (78.90%); Total 5,740 (3.04%)
  Avg Rating 5.00; Avg Sale Price $18.65; Avg Sentiment 0.30
  Most Common Phrase: great
  Treatment: E-Mail New Hot Reads

Segment: Agriculture, Art, Crime, Disasters, Health, Space, Sports,
Traditional Energy
  Price: High 1,755 (13.67%), Low 11,079 (86.33%); Total 12,834 (6.80%)
  Avg Rating 5.00; Avg Sale Price $15.32; Avg Sentiment 0.24
  Most Common Phrase: wonderful
  Treatment: E-Mail Buy 3 Get 4th Free

Segment: Automotive, Banking, Marriage, Renewable Energy, Robotics, Travel
  Price: High 618 (9.61%), Low 5,813 (90.39%); Total 6,431 (3.41%)
  Avg Rating 5.00; Avg Sale Price $12.40; Avg Sentiment 0.26
  Most Common Phrase: wonderful
  Treatment: E-Mail BOGO

Segment: Biotechnology, Elections, Science, Technology, War
  Price: High 1,819 (22.68%), Low 6,200 (77.32%); Total 8,019 (4.25%)
  Avg Rating 5.00; Avg Sale Price $17.98; Avg Sentiment 0.23
  Most Common Phrase: wonderful
  Treatment: E-Mail 25% Off Coupon

Segment: Education, Intellectual Property, Labor, Law, Religion
  Price: High 3,725 (18.14%), Low 16,815 (81.86%); Total 20,540 (10.88%)
  Avg Rating 5.00; Avg Sale Price $17.05; Avg Sentiment 0.27
  Most Common Phrase: wonderful
  Treatment: E-Mail New Hot Reads

Segment: Environment, Hardware
  Price: High 369 (25.31%), Low 1,089 (74.69%); Total 1,458 (0.77%)
  Avg Rating 5.00; Avg Sale Price $20.97; Avg Sentiment 0.24
  Most Common Phrase: wonderful
  Treatment: E-Mail 25% Off Coupon

Segment: Fashion, Food, Investing, Mobile Devices, Politics, Popular Culture
  Price: High 1,295 (15.99%), Low 6,802 (84.01%); Total 8,097 (4.29%)
  Avg Rating 5.00; Avg Sale Price $16.66; Avg Sentiment 0.29
  Most Common Phrase: wonderful
  Treatment: E-Mail New Hot Reads

Segment: Social Media, Software and Internet
  Price: High 414 (41.82%), Low 576 (58.18%); Total 990 (0.52%)
  Avg Rating 5.00; Avg Sale Price $25.10; Avg Sentiment 0.34
  Most Common Phrase: great
  Treatment: E-Mail 25% Off Coupon
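
Associating a treatment with each segment amounts to a lookup from segment membership to an action. The sketch below uses four of the genre groups and treatments shown on the Strategy Tree slide; the function name and the use of "Ignore" as a fallback for unmatched genres are assumptions made for this example.

```python
# Genre groups and treatments read from the Strategy Tree slide (four of the
# segments are shown here to keep the example short).
SEGMENT_TREATMENTS = [
    ({"null", "Beverages", "Hotels", "Real Estate", "Video Games", "Weather"},
     "E-Mail BOGO"),
    ({"Advertising", "Aviation", "Business", "Economics"},
     "E-Mail New Hot Reads"),
    ({"Environment", "Hardware"},
     "E-Mail 25% Off Coupon"),
    ({"Social Media", "Software and Internet"},
     "E-Mail 25% Off Coupon"),
]

# Assumption: genres that match no listed segment get no action.
DEFAULT_TREATMENT = "Ignore"

def assign_treatment(genre):
    """Return the treatment of the first segment whose genre group contains genre."""
    for genres, treatment in SEGMENT_TREATMENTS:
        if genre in genres:
            return treatment
    return DEFAULT_TREATMENT

treatment = assign_treatment("Hardware")
```

Keeping the segment definitions in a data structure rather than in code means the treatments can be changed without touching the assignment logic.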

Angoss KnowledgeREADER
KnowledgeREADER is an industry-first software application that brings a new age
of integrated customer intelligence by combining visual text discovery and
sentiment analysis with the power of predictive analytics.

Now, customer intelligence professionals and marketers can easily understand and
model customer feedback without relying on data analysts.

KnowledgeREADER delivers unparalleled customer intelligence and voice of the
customer insights to support customer experience management, above and
beyond what text analytics users have come to expect.

Learn more about KnowledgeREADER


Resources
Video
Quick Tour of KnowledgeREADER

Articles
Voice of the Customer, How to Move Beyond Listening to Action
Text Analytics Categorization and Concept Topics
Text Analytics Phrase and Theme Extraction
Text Analytics Sentiment Extraction
Text Analytics Named Entity Extraction

Brochure
KnowledgeREADER

Web
KnowledgeREADER


About Angoss
Angoss Software Corporation is a global leader in delivering
business intelligence software and predictive analytics to
businesses looking to improve performance across sales, marketing
and risk. With a suite of desktop, client-server and big data software
products and Cloud solutions, Angoss delivers powerful approaches
to turn information into actionable business decisions and
competitive advantage. Angoss software products and solutions are
user-friendly and agile, making predictive analytics accessible and
easy to use.

For more information visit www.angoss.com.
