
Wang et al. EURASIP Journal on Wireless Communications and Networking (2016) 2016:253
DOI 10.1186/s13638-016-0745-7

RESEARCH Open Access

A hybrid model of sentimental entity


recognition on mobile social media
Zhibo Wang1,2, Xiaohui Cui1*, Lu Gao1, Qi Yin1, Lei Ke1 and Shurong Zhang1

Abstract
With new forms of media such as Twitter becoming increasingly popular, the Internet is now the main conduit of
individual and interpersonal messages. A considerable number of people express their personal opinions about
news-related subjects through Twitter, a popular SNS platform based on human relationships. It provides a data
source from which we can extract people's opinions, which are important for product review and public opinion
monitoring. In this paper, a hybrid sentimental entity recognition model (HSERM) has been designed. Utilizing 100
million messages collected from Twitter, the hashtag is regarded as the label for sentiment classification.
Meanwhile, features such as emoji and N-grams have been extracted, and the collected topic messages have been
classified into four different sentiment categories based on the circumplex sentimental model. Finally, machine
learning methods are used to classify the sentiment data set, achieving a precision of 89 %. Further, the entities
behind the emotions can be extracted with the help of the SENNA deep learning model.
Keywords: Feature selection, Sentiment analysis, Sentiment classification, Entity recognition

1 Introduction
The social network is highly developed nowadays, and people can access it almost anywhere. In this respect, there are many projects worth researching. Twitter allows users to log in through the web page, mobile devices, or other clients. Users can post messages of up to 140 characters to Twitter to share information, and the fans who follow these users can repost or share these messages. These messages, which express the users' opinions, thoughts, and emotions within the 140-character limit, are called tweets. By the end of March 2012, Twitter had more than 140 million active users living all over the world. Every day, Twitter's users posted about 340 million tweets, which use over 10 TB of storage, and the number is rising continuously. Because of this, Twitter is one of the top 10 most visited websites.

For these reasons, Twitter attracts a large number of natural language processing experts to research this field. The data mining and analysis of Twitter can be used in various fields, such as epidemic prediction, population migration, public opinion surveillance, and so on [1]. To be specific, comments about products in tweets are well worth mining. Sellers can get buyers' comments in real time and then update their own products to be more competitive in the marketplace; buyers can learn from others' experience through these comments to help them decide whether to buy a product.

Because tweets can be produced in real time and can spread widely and quickly, they have a large influence on the transmission of public opinion on the network [2]. It is necessary for the government to know and analyze public opinion on hot social issues, preventing the views of the public from being misled by criminals who may harm the country and society. So recognizing the emotions and entities in Twitter data has a very important reference value.

* Correspondence: [email protected]
1 International School of Software, Wuhan University, Wuhan 430079, China
Full list of author information is available at the end of the article

2 Related work
As a main form of media in the social network and the main microblog abroad, Twitter attracts more and more people. Tweets contain different tendencies and emotional characteristics, and mining these features is meaningful for public opinion monitoring, marketing, and rumor control. In general, most emotional analysis only divides the text emotion into three categories:

© The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made.

Content courtesy of Springer Nature, terms of use apply. Rights reserved.



neutral, positive, and negative. This is of limited help for hearing the real voice and emotion of society.

The study of social media is relatively new. As early as 2005, Park et al. [3] began to analyze emotion on Twitter. They labeled more than 20,000 tweets and created an emotion polarity dataset by manually assigning neutral-positive-negative emotional tags. Next, they developed emotion classifiers by using machine learning methods based on Naïve Bayes, support vector machines (SVM), and conditional random fields (CRF). Read et al. [4] reported that they used the Twitter application program interface (API) to collect a great number of emoticons and demonstrated these icons' effect on emotion classification in detail. Go et al. [5] developed three machine learning classifiers based on Naïve Bayes, maximum entropy, and SVM without manually labeled training data. They added the emoticons into the selected features, which raised the accuracy of the emotional tendency discrimination to more than 80 %. This research has been applied to many business fields such as online shopping, online film reviews, and online 4S shop messages. For instance, Fei HongChao analyzed the review text of Yahoo English Sports; through that, the attitude of investors toward the stock market could be discovered. Ghose et al. started to apply LingPipe for emotion classification. They tried to increase the accuracy of classifiers by labeling the training set manually and then recognized the emotion tendency of the original text. The amount of research about text mining based on emotion is growing, and the related research fields are extending at the same time. R. Pavitra [6] established an analysis model based on the weakly supervised joint sentiment-topic model and created a sentiment thesaurus with positive and negative lexicons to find the sentiment polarity of bigrams. Wang and Cui [7, 8] worked on group events and disease surveillance to research sentiment analysis. They also extended the data source to multimedia for research on sentiment analysis [9].

Recently, with the development of computer technology for information searching and search engines, named-entity recognition has become a hot topic in the field of natural language processing. Asahara [10] performed automatic identification of names and organizations by SVM and got good results. Tan utilized a method based on error-driven learning to derive contextual rules for place-name entities, and then used the rules to implement automatic identification of place names. According to the data test, the accuracy of this method can reach 97 %. Huang et al. [11] gathered a large amount of statistical data from vast real text data and calculated the reliability of each word construction and continuous-word construction. Finally, combining some rules, the names could be recognized automatically. Turkish scholars [12] did named-entity recognition on their domestic Twitter. In their article, a new named-entity-annotated tweet corpus was presented and various tweet-specific linguistic phenomena were analyzed. After that, Derczynski and his group [13] also worked in a similar field. Xu et al. [14] even have a patent on named-entity recognition queries.

In this paper, some relevant techniques in the data mining of tweets' fine-grained sentiment analysis will be researched, including the methods of tweet collection, tweet pre-processing, and the construction of knowledge: based on a tweets' emotional dictionary, sentiment analysis based on weighted emotional word meaning, and sentiment analysis based on multi-feature fusion. Tweet text involves a large amount of data, wide coverage, and rapid production, so it is impossible to monitor hot events and analyze the guidance of public opinion manually. For processing the huge amount of unstructured text data, machine learning and deep learning have made certain breakthroughs in the field of text processing. For sentiment analysis, we will build a circumplex sentiment model by using hashtags as the classification tags, capturing N-gram and emoji features. Then, the emotion will be classified through the processing of a SENNA model. It is possible to classify the four kinds of emotions which we described in advance.

3 Definition of question
We aim to deduce the user's emotion by analyzing their tweet text. To give a formalized definition of this question, we predefine the following:

Definition 1 (Tweet words w): Since each word in a tweet is possibly related to the user's emotion, we add up all the words in the tweet text and use a two-tuple to represent each of them, w = {t, a}, where t is the text form of w and a is the frequency of w in a tweet.

Definition 2 (Sentiment dictionary D): For each sentiment, we can design a dictionary which represents it sharply, called a sentiment dictionary. The dictionaries of different sentiments can include the same words, since a dictionary exerts its influence on the sentiment analysis as a whole. We use a two-tuple to represent each sentiment dictionary: Di = {d, v}, where d is each word in the dictionary and v is the central vector of this sentiment. The closer the user's vector model is to the central vector, the more likely the user is to hold this sentiment. The words in the dictionary can also be represented as two-tuples: d = {t, c}, where t is the text form of d and c is the relevancy of the word d to the sentiment.
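As an illustrative sketch (not the paper's implementation; the toy dictionaries and relevancy weights below are invented for the example), Definitions 1 and 2 amount to representing a tweet as word-frequency pairs and assigning the sentiment whose central vector lies closest, for instance by cosine similarity:

```python
from collections import Counter
import math

def tweet_words(text):
    """Definition 1: represent a tweet as two-tuples w = (t, a),
    where t is the word text and a its frequency in the tweet."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse vectors held as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(text, dictionaries):
    """Definition 2: each sentiment dictionary D_i carries a central
    vector v; the tweet is assigned the sentiment whose central vector
    is closest to the tweet's own vector."""
    w = tweet_words(text)
    return max(dictionaries, key=lambda s: cosine(w, dictionaries[s]))

# Toy central vectors built from word relevancies c (illustrative only)
dicts = {
    "HappyActive":   {"happy": 1.0, "excited": 0.9},
    "UnhappyActive": {"angry": 1.0, "annoyed": 0.9},
}
print(classify("so happy and excited today", dicts))  # HappyActive
```

Here the nearest-central-vector rule stands in for whatever distance the model actually uses; the point is only how the two-tuple definitions compose into a classification decision.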


4 Methods

4.1 Data collection
Twitter has played a vital role in the spread of social hot spots; therefore, comprehensive, timely, and accurate access to Twitter data is the basis and premise of our data analysis. At present, there are three main acquisition methods for Twitter: API acquisition, webpage analysis acquisition, and app capture acquisition.

Our training set uses dictionary tables as keywords, and a tweet acquisition system based on the search API has been designed. Aiming to categorize four different emotions, it contains 50 thousand tweets as training data.

The system contains the following modules:

A, the keyword dictionary module: used to construct data dictionaries and send requests to the API according to the words in the dictionary
B, the pre-process module: used to preprocess the data in real time, such as preprocessing time and extraction of useful data
C, the database storage module: based on the returned data format, to use MongoDB, a NoSQL database, and set up a collection for each of the four emotional word dictionaries

4.2 Text feature extraction
At present, mainstream research on text feature extraction focuses on feature selection algorithms and text representation models.

The basic unit of a feature is called a text feature. Text features must have the following characteristics:

(1) The ability to distinguish the target texts from the non-target texts
(2) The ability to clearly express and identify text content
(3) Easiness to implement the classification between features
(4) The number of features should not be too large

On the condition that the core content of texts is not damaged, the main purpose of feature extraction is to reduce the dimension of the feature vectors as far as possible, reducing the number of words that need to be processed. Thus, it simplifies the computational complexity and ultimately improves the efficiency and speed of text processing.

Several methods can be used to implement text feature extraction, such as the word frequency method, principal component analysis, the TF-IDF method, the N-gram method, and the emoji method. Actually, the word frequency method may filter out some words which carry useful information but occur with small frequency, and the TF-IDF method cannot weight a word differently according to its specific location; the N-gram method, however, can filter the text information according to a given threshold value, and there are lots of emoji used to express emotion in tweets nowadays. Considering the format and content of Twitter, features are extracted from the text by the N-gram and emoji methods in this project.

The N-gram model performs continuous identification over large quantities of data words. In general, these vocabulary items can be letters, words, syllables, and so on, and we use the N-gram model to implement the automatic classification of emotion.

(1) Revised N-gram model

In statistical terms, the N-gram model can be understood as follows: for any word w_i in a text, its emergence probability only has some connection with the words that appeared before it. This determines w_i's independence, and conditional probability can be used to predict the probability of w_i appearing.

If W represents a sequence of n elements, then W = w_1 w_2 ... w_n, and the appearance probability of the sequence is P(W):

P(W) = P(w_1) P(w_2 | w_1) P(w_3 | w_1 w_2) ... P(w_n | w_1 ... w_{n-1})    (1)

Fig. 1 Revised N-gram model
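As a hedged illustration of the chain rule in Eq. (1) and its bi-gram (Markov) approximation, the probabilities can be estimated from raw counts; the tiny corpus is invented, and no smoothing is applied, so unseen pairs would break this sketch in practice:

```python
from collections import Counter

def bigram_prob(tokens, corpus_tokens):
    """P(W) under the bi-gram approximation:
    P(W) ~ P(w1) * prod_i P(w_i | w_{i-1}),
    with simple count-based (maximum likelihood) estimates."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    p = unigrams[tokens[0]] / len(corpus_tokens)   # P(w1)
    for prev, cur in zip(tokens, tokens[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]  # P(w_i | w_{i-1})
    return p

corpus = "i am happy i am sad i am happy".split()
print(bigram_prob("i am happy".split(), corpus))  # 2/9 ~ 0.2222
```

With this corpus, P(i) = 3/9, P(am|i) = 3/3, and P(happy|am) = 2/3, giving 2/9.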


Table 1 Emoji classification table of Twitter

If there is a large amount of data, then according to the Markov assumption, the probability of a word's appearance is only associated with the probability of the word in front of it, and the problem becomes simple. Therefore, the uni-gram model changes to the binary model, the bi-gram:

P(W) ≈ P(w_1) P(w_2 | w_1) P(w_3 | w_2) ... P(w_n | w_{n-1})    (2)

In the same way, we can get the tri-gram, in which the probability of a word's appearance is only related to the probabilities of the two words in front of it:

P(W) ≈ P(w_1) P(w_2 | w_1) P(w_3 | w_1 w_2) ... P(w_n | w_{n-2} w_{n-1})    (3)

In the same way, we can also get the general concept of the N-gram model.

In our research, we used an improved version of the N-gram model, namely adding padding characters (generally spaces or whitespace) at the beginning of each bi-gram and tri-gram to increase the number of grams, improving the prediction accuracy of the models, as shown in Fig. 1. Sometimes a tweet contains only a few words, and the tri-gram model can extract only a few characteristics, but the feature quantity is improved significantly after adding the padding characteristic.
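A minimal sketch of the padding idea follows; it assumes space padding at the beginning of each word's grams, which is one plausible reading of the scheme in Fig. 1 (the paper's exact padding may differ):

```python
def padded_ngrams(word, n):
    """Character n-grams with n-1 padding spaces added at the beginning
    of the word, increasing the number of grams a short tweet yields."""
    padded = " " * (n - 1) + word
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(padded_ngrams("hi", 2))  # [' h', 'hi']
print(padded_ngrams("hi", 3))  # ['  h', ' hi']
```

Without padding, the two-letter word "hi" yields no tri-gram at all; with padding it still contributes features, which is exactly the effect claimed above for short tweets.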

Fig. 2 Circumplex sentimental model.


(2) Emoji

The emoji classification table of Twitter is found in Table 1.

4.3 Sentimental entity analysis
This section covers the application of sentimental models in the analysis of emotion and emotion detection in Twitter text. At present, most sentimental models are based on psychological theories and social network sentimental hypotheses.

According to the theme needed and the variety of sentiments, this paper adopts the circumplex, with its four coordinate pole edges, as the sentimental model and source of keywords. Figure 2 is an example of the circumplex sentimental model. In order to maintain the independence of the four kinds of emotion and reduce affective overlaps, sentimental words between the axes are removed. Finally, four types of emotional keywords are acquired for the dictionary. Keywords in this dictionary form the training data set used to train the classifiers. The dictionary table is shown in Table 2.

4.4 Text classifier training
Scikit-learn is a Python module for machine learning, built on the basis of SciPy under the 3-clause BSD open source license. Its main features are (1) simple operation and efficient data mining and data analysis, (2) unrestricted access that can be reused in any case, (3) a basis on NumPy, SciPy, and Matplotlib, and (4) a commercial open source BSD license.

A pipeline refers to a data process that chains multiple feature conversion phases; each feature transformation can be regarded as a new column added to an existing column. An example of the pipeline is text segmentation, which divides texts into a large number of words; TF-IDF then transforms them into a feature vector. In this process, the tag is processed for model fitting when the ID, text, and words are transferred into the conversion process.

Classification is an important task in data mining. Using a machine learning classifier on a training set aims to let the trained classifier automatically classify the content of new data, thus freeing human resources from manual classification. Therefore, it is very important to clarify which kind of classifier to use to classify data in the field of data mining.

There are four different classifiers to choose from: Naïve Bayes, logistic regression, support vector machines, and the K-Nearest Neighbor algorithm. Naïve Bayes is a generative model that classifies by calculating probabilities and can handle multiple classes. Logistic regression is simple to implement and needs few storage resources. SVM is among the best off-the-shelf classifiers, and existing implementations can be used directly without modification; it yields a low error rate and can make good classification decisions on data outside the training set. The K-Nearest Neighbor algorithm can be used not only for classification but also for regression analysis.

In order to achieve data classification, Naïve Bayes, logistic regression, SVM, and the K-Nearest Neighbor algorithm have been implemented in our classifiers, and five-fold cross validation also contributes to the evaluation of the final results. Table 3 shows the precision of the classifiers, and Table 4 shows the recall rate of the results.

In text classification, the support vector machine (SVM) is one of the best classifier algorithms. This is because a support vector machine can transform linearly non-separable text into a high-dimensional space in which it becomes separable. The kernel function is used to transform from low dimensions to high dimensions and obtain the

Table 2 Four types of sentimental dictionaries
HappyActive: #elated, #overjoyed, #enjoy, #excited, #proud, #joyful, #feelhappy, #sohappy, #veryhappy, #happy, #superhappy, #happytweet, #feelblessed, #blessed, #amazing, #wonderful, #excelent, #delighted, #enthusiastic
UnhappyActive: #nervous, #anxious, #tension, #afraid, #fearful, #angry, #annoyed, #annoying, #stress, #distressed, #distress, #stressful, #stressed, #worried, #tense, #bothered, #disturbed, #irritated, #mad, #furious
HappyInactive: #calm, #calming, #peaceful, #quiet, #silent, #serene, #convinced, #consent, #contented, #contentment, #satisfied, #relax, #relaxed, #relaxing, #sleepy, #sleepyhead, #asleep, #resting, #restful, #placid
UnhappyInactive: #sad, #ifeelsad, #feelsad, #sosad, #verysad, #sorrow, #disappointed, #supersad, #miserable, #hopeless, #depress, #depressed, #depression, #fatigued, #gloomy, #nothappy, #unhappy, #suicidal, #downhearted, #hapless, #dispirited

Table 3 Precision between the classifiers and characteristics
Characteristics         Naïve Bayes   Logistic regression   SVM    KNN
Uni-gram                86.3          85.2                  89     88.1
Uni-gram, emoji         86.4          84.2                  89.1   88.5
Uni-gram, punctuation   86.6          85.6                  88.9   88.8
All features            86.9          85.9                  89.8   89.1

Table 4 Recall rate between the classifiers and characteristics
Characteristics         Naïve Bayes   Logistic regression   SVM    KNN
Uni-gram                86.3          85.5                  89.3   88.3
Uni-gram, emoji         86.4          84.9                  89.7   88.5
Uni-gram, punctuation   86.5          85.3                  88.6   88.9
All features            87            86                    90     89
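A hedged sketch of the setup described in Section 4.4: a scikit-learn Pipeline chaining TF-IDF feature extraction with a classifier, evaluated by five-fold cross validation. The inline texts and labels are invented stand-ins for the paper's 50-thousand-tweet corpus, and LinearSVC stands in for whichever SVM variant the authors used:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Tiny illustrative corpus; integer labels follow the four categories.
texts = ["so happy today", "calm and relaxed",
         "angry and annoyed", "sad and gloomy"] * 5
labels = [1, 2, 3, 4] * 5

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 1))),  # uni-gram features
    ("svm", LinearSVC()),                            # SVM classifier
])
scores = cross_val_score(clf, texts, labels, cv=5)   # five-fold CV
print(round(scores.mean(), 3))
```

On real data, the vectorizer's `ngram_range` and extra feature columns (emoji, punctuation) would be where the paper's feature variants plug in.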


result of the high dimension. Compared with the Naïve Bayes and KNN algorithms, SVM has the advantages of high classification speed, high accuracy for this training data size, and universal applicability.

We chose to use the SVM classifier in our experiment, and the following is its principle: the SVM method maps the sample space into a high-dimensional, even infinite-dimensional, feature space (a Hilbert space) through a non-linear mapping p, so that the non-linearly separable problem in the original sample space is transformed into a linearly separable problem in the feature space.

Fig. 3 Deep learning training model

In our paper, we regard the precise rate and recall rate as the evaluation indices. Assume that

M_i^right represents the number of tweets correctly divided into class Ci,
M_i^wrong represents the number of tweets mistakenly divided into class Ci, and
M_i^all represents the number of tweets actually included in class Ci.

Then

Precise rate(i) = M_i^right / (M_i^right + M_i^wrong) × 100 %    (4)

Recall rate(i) = M_i^right / M_i^all × 100 %    (5)

And the average precise rate and average recall rate can be calculated according to the formulas:

Average precise = (Σ_{i=1}^m Precise rate(i)) / m    (6)

Average recall = (Σ_{i=1}^m Recall rate(i)) / m    (7)

Algorithm SVM: The SVM based on the analysis of Twitter users' sentiment.
Input
  D: sentimental classification dictionary
  T: all target tweets

Fig. 4 Words in high-dimensional space


  S: sentimental category table included in the system
Output: the sentiment E of the target tweet t

1. For each word w of every tweet in T do
2.   For each α1 not satisfying the KKT condition
3.     If min = distance(α1, α2)
4.       Check Σ_{i=1}^m αi · label_i = 0
5.       Calculate α1, α2
6.       Check label · (W^T X + b) ≥ 1.0
7.       Calculate b1, b2
8. Return α matrix and b matrix

Table 6 Label table of sentiment classification in Twitter
Emotion type       Label
Happy-active       1
Happy-inactive     2
Unhappy-active     3
Unhappy-inactive   4

4.5 Entity detection
Named-entity recognition, also called entity recognition, identifies the entities with specific meaning in a piece of text; it mainly refers to the names of people, places, organizations, and proper nouns. There are four main techniques for named-entity recognition:

(1) The statistics-based recognition method. The main statistical models for named-entity recognition include the hidden Markov model, the decision tree model, the support vector machine (SVM) model, the maximum entropy model, and the conditional random fields model.
(2) The rule-based recognition method. It mainly uses two pieces of information: restrictive clauses and named-entity words.
(3) The recognition method combining rules and statistics. Some mainstream named-entity recognition systems combine rules and statistics: first, they use statistical methods to recognize the entities, and then they correct and filter them by the rules.
(4) The recognition method based on machine learning. This technology is well developed for English. Classifying English words by SVM methods can achieve an accuracy of more than 99 % when places or names of people are recognized.

Deep learning is a new branch in the field of machine learning and a kind of algorithm that simulates functions of the brain. Deep learning originated from the deep belief nets, which in turn originated from the Boltzmann machine presented in the Hinton paper. The basic idea of deep learning is that, for a system S with N levels (S1, S2, ..., SN), if the input is I and the output is O, the system can be expressed as I => S1 => S2 => ... => SN => O. This system should automatically learn features that help people make decisions. By adjusting the parameters in each layer of the system, the output of a lower level becomes the input of the higher level, and by stacking the various layers, a hierarchical expression of the input information is obtained. The deep learning training model of the system is shown in Fig. 3.

Natural language is the direct language that humans use for communication, but in order for computers to perform computational identification, natural language needs to be converted into computer-usable symbols, usually called the digitization of natural language. In deep learning, word embeddings are used to represent words. The word embedding method was proposed by Bengio more than a dozen years ago. The words in the language are mapped into high-dimensional vectors with 200 to 500 dimensions. By training word vectors with deep learning, each word obtains its corresponding spatial coordinates in the high-dimensional space. A sample of the space coordinate map is shown in Fig. 4.

At the beginning of the training process of the word vectors, each word is given a random vector. For example, deep learning is used to predict whether a quintet phrase is true, such as "Benjamin likes play the basketball". If any one of the words in this sentence is replaced, such as replacing "the" with "theory", "Benjamin likes play theory basketball" is obviously not true grammatically. Using models trained by deep learning, it is possible to predict whether changed quintet phrases are true or not.

SENNA not only proposed the method for building word embeddings but also solved the natural language

Table 5 A sample table of sentiment classification in Twitter (tweet text column lost in extraction; types shown: Happy-active, Unhappy-active, Happy-active, Unhappy-inactive)

Table 7 Sample labeling of Twitter (tweet text column lost in extraction; labels shown: 1, 3, 1, 4)


Fig. 5 Precision rate and recall rate of Naïve Bayes

processing tasks (POS, chunking, NER, SRL) from the perspective of a neural network language model. In SENNA, each word can be directly found from the lookup table. The word vectors of HLBL in SENNA, which differ from each other, are used to depict different semantics and grammar usages. The word vectors of a word are eventually combined from the various forms of the word's vectors; SENNA directly pieces the vectors together to represent the words.

Then, the emotion is classified through the processing of a SENNA model. It is possible to classify the four kinds of emotions which we described in advance.

5 Experiment and analysis

5.1 Data acquisition
The data set used in the paper was collected from Twitter between Mar. 1st and May 1st of 2015 by means of the search API in the open platform of Twitter. The dictionary of keywords included in the four kinds of sentiment is used to get the training data. It takes a great number of resources to classify the tweets due to their enormous quantity. Therefore, the paper selects the hashtags in the tweets as the tags by which they are automatically classified by machine. A hashtag, as a kind of tag in the tweets, is used to record a certain topic. The paper assumes that the tweets with the label of a certain sentiment category belong to that category, so as to implement the automatic classification by the machine.

5.2 Data preprocessing
Unlike traditional news or media data, tweets are a kind of daily-expression data, so they contain a lot of errors and "noise". Before classification, the "noisy" data should be cleaned as follows:

1) Deleting usernames and URLs
Some usernames contain sentiment words and are even contained in our sentiment dictionary. However, the username has little use in sentiment analysis. In order to exclude its interference with the experimental results, the paper replaces all the labels with the "@" symbol by USERID. Meanwhile, the tweets contain a lot of URLs which are useless for text processing, so they are replaced by the word URL.
Fig. 6 Precision rate and recall rate of logistic regression


Fig. 7 Precision rate and recall rate of KNN

2) Spelling mistakes
Tweets have a large number of words with spelling mistakes, such as "asd" (sad), "Happyyyyyy!!", and "Cooool~~", some of which are unintentional misspellings and some of which are for emphasis. In order to reduce the interference, the module corrects the misspelled words by using a dictionary of common wrong words in tweets.

3) Abbreviations
There are many abbreviations of words in tweets, such as "good4you", which means "good for you". A particularly large number of words are expressed in short forms of letters and numbers; therefore, the correction of abbreviations is also a significant part of preprocessing.

4) Conflicts between sentiment categories
There are four sentiment categories, and some tweets contain hashtags from two different sentiment categories. These tweets are named conflicting tweets. In order not to affect the tweet classification accuracy, the module deletes these tweets.

5.3 Text feature extracting
During the building of the pipeline, the module adopts the N-gram text features and adds English punctuation, emotional symbols, and bag of words as features. Taking the emoji as an instance, the tweets containing the same category of emoji are classified into the same category (Table 5).

Emoji are stored as Unicode, so the first step is to classify the Unicode characters and assign them numbers (Table 6). Every tweet is scanned, and a key-value pair is added to a list when an emoji is met (Table 7).

WordNet builds a network including all synonyms and can replace all words in the same synonym network with the same word. Therefore, in the sentiment classification, the module reduces the feature dimensionality and increases the classification accuracy by using this replacement.

This module adopts a sentiment dictionary provided by Harvard University that has fewer words, so as to reduce the dimension of the feature vector space and the amount of calculation.
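The pre-processing steps of Section 5.2 and the emoji scan of Section 5.3 can be sketched as follows. This is a hedged illustration: the USERID/URL placeholder tokens follow the paper, but the misspelling dictionary and the emoji-to-category mapping are tiny invented stand-ins for the real correction dictionaries and the full Table 1 classification:

```python
import re
from collections import Counter

MISSPELLINGS = {"asd": "sad", "good4you": "good for you"}  # toy dictionary
EMOJI_CATEGORY = {"\U0001F600": 1, "\U0001F622": 4}        # toy mapping

def preprocess(tweet):
    tweet = re.sub(r"@\w+", "USERID", tweet)       # 1) usernames
    tweet = re.sub(r"https?://\S+", "URL", tweet)  # 1) links
    # 2)-3) dictionary-based correction of misspellings/abbreviations
    return " ".join(MISSPELLINGS.get(w, w) for w in tweet.split())

def emoji_features(tweet):
    """Scan the tweet and add a key-value pair whenever an emoji is met."""
    return dict(Counter(EMOJI_CATEGORY[ch]
                        for ch in tweet if ch in EMOJI_CATEGORY))

t = preprocess("@bob asd about http://t.co/x \U0001F622")
print(t)                  # USERID sad about URL plus the emoji character
print(emoji_features(t))  # {4: 1}
```

Step 4), deleting tweets whose hashtags fall into two sentiment categories, would simply filter the corpus before this function is applied.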

Fig. 8 Precision rate and recall rate of SVM


Table 8 Four kinds of emotion and the entities extracted from them
HappyActive       HappyInactive     UnhappyActive       UnhappyInactive
Ganada            Netflix           Mad Men             Pocket Full Of Gold
NDSU              youtube           LiveYourTruth       AmnestyOnline
Sensation         Levis             HolyBible           AhmedahliCom
Game              Newspaper         KingLikeAQueen      EdgarAllanPoe
Filmphotography   VallartaGV        NTUWFC              Elena
KimFCoates        yoga              backstreetboys      JonnyValleyBoy
Ft. Beyonce       CandyCrushSaga    BLOOMparenting      GinyTonicBlog
Drake             ICandlelighters   STFU Louise         Havasupai
StuartPWright     HillCountry       YL train            Bethany
Longley_Farm      SLU               Rebecca De Mulder   David Letterman
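The per-class and average rates reported in the next subsection follow Eqs. (4)-(7); as a minimal sketch with invented counts (not the paper's experimental numbers):

```python
def precise_rate(m_right, m_wrong):
    """Eq. (4): share of tweets assigned to class Ci that truly belong to it."""
    return m_right / (m_right + m_wrong) * 100

def recall_rate(m_right, m_all):
    """Eq. (5): share of tweets actually in class Ci that were recovered."""
    return m_right / m_all * 100

def averages(per_class):
    """Eqs. (6)-(7): macro averages over the m classes."""
    m = len(per_class)
    avg_p = sum(precise_rate(r, w) for r, w, a in per_class) / m
    avg_r = sum(recall_rate(r, a) for r, w, a in per_class) / m
    return avg_p, avg_r

# (M_right, M_wrong, M_all) per class; toy values for illustration
stats = [(90, 10, 100), (80, 20, 100)]
print(averages(stats))  # (85.0, 85.0)
```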

5.4 Classifier training and result analysis
From the graph, we can see that with different ways of feature extraction, the precision rate and recall rate of Naïve Bayes differ. The precision rate is around 86.5 % (86.3~86.9 %), and the recall rate is also around 86.5 % (86.3~87 %). With uni-gram, emoji, and punctuation features combined, the precision rate reaches its maximum of 86.9 % (Fig. 5).
From the graph, we can see that with different ways of feature extraction, the precision rate and recall rate of logistic regression differ. The precision rate is around 85 % (84.2~85.9 %), and the recall rate is also around 85 % (84.9~86 %). With uni-gram, emoji, and punctuation features combined, the precision rate reaches its maximum of 85.9 % (Fig. 6).
From the graph, we can see that with different ways of feature extraction, the precision rate and recall rate of KNN differ. The precision rate is around 88.5 % (88.1~89.1 %), and the recall rate is also around 88.5 % (88.3~89 %). With uni-gram, emoji, and punctuation features combined, the precision rate reaches its maximum of 89.1 % (Fig. 7).
From the graph, we can see that with different ways of feature extraction, the precision rate and recall rate of SVM differ. The precision rate is around 89 % (88.9~89.8 %), and the recall rate is also around 89 % (88.6~90 %). With uni-gram, emoji, and punctuation features combined, the precision rate reaches its maximum of 89.8 % (Fig. 8).
Comparing the data in this project, it is obvious that by using uni-gram, emoji, and punctuation as features and SVM as the emotional classifier, the classification accuracy could reach 89.8 %. SVM is the best sentimental classification method for our experiment.

5.5 Results and analysis of named-entity recognition
After the training of the emotional classifiers, an automatic classifier has been implemented to deal with 5000 new data values, and named-entity recognition has been run on each type of data. In this section, the SENNA deep learning toolkit is adopted for entity extraction for each type of data, and all of the extracted words are sorted. Table 8 shows the top 10 results. For each type of emotional entity, visual graphic displays follow (Figs. 9, 10, 11, and 12).
By using SENNA, we extracted the emotional entities from 5000 new data values. These entities are the reasons why users show these types of emotions. The result shows that Netflix could make users feel "HappyInactive". At the same time, when we read the
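The three feature families compared above (uni-grams, emoji/emoticons, and punctuation) can be sketched as a single extractor that produces sparse feature dictionaries. The emoticon list below is a small hypothetical sample, not the full lexicon used in the experiments.

```python
import re

# Hypothetical emoticon sample; the experiments use a fuller emoji/emoticon list.
EMOTICONS = [":)", ":(", ":D", ":-)", ":-("]

def extract_features(tweet):
    """Map a tweet to a sparse dict of uni-gram, emoticon, and punctuation features."""
    features = {}
    stripped = tweet
    # Emoticon presence features (removed before tokenizing so ':D' is not split)
    for emo in EMOTICONS:
        if emo in stripped:
            features["emo=" + emo] = 1
            stripped = stripped.replace(emo, " ")
    # Uni-gram presence features
    for token in re.findall(r"[a-z']+", stripped.lower()):
        features["uni=" + token] = 1
    # Punctuation counts ('!' and '?' carry arousal information)
    for ch in "!?":
        if tweet.count(ch):
            features["punct=" + ch] = tweet.count(ch)
    return features

print(extract_features("So excited!! :D"))
# {'emo=:D': 1, 'uni=so': 1, 'uni=excited': 1, 'punct=!': 2}
```

In practice, such dictionaries would be vectorized (e.g. with a dict or hashing vectorizer) before training the Naïve Bayes, logistic regression, KNN, and SVM classifiers compared above.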

Fig. 9 HappyActive word clouds
Fig. 10 HappyInactive word clouds




original essay, we noticed that foreign users often spend their leisure time on Netflix. Most people feel excited when they arrive in Canada, which is an extraordinarily beautiful place. And students at NDSU always express their activity during the exam period or graduation.

Fig. 11 UnhappyActive word clouds

Fig. 12 UnhappyInactive word clouds

6 Conclusions
This project addresses current tweet sentiment analysis technology, moving beyond traditional positive-negative and neutral attitude classification to multivariate emotion model classification. At the same time, we apply machine learning methods for the emotion classification and deep learning methods for entity recognition.
Innovation points of this project are the following:

1. In current emotional analysis, most research only takes into account the classification of positive and negative attitudes, and little work worldwide has examined the causes behind the emotions. But in our daily life, the emotional composition analysis of hot issues and public opinion monitoring has a very wide range of applications.
2. In order to use deep learning methods in named-entity recognition, previous researchers used POS tags. But for data with a lot of noise, it is difficult for this method to achieve a considerable level of accuracy. However, the identification accuracy of deep learning-based entity recognition can reach more than 95 %.

The technology provided in this paper can be applied to public opinion monitoring on social media. The training of the classifier can also be used for Sina Weibo in Chinese public opinion analysis. In this research, we find that the scale of the data used in the experiment has a huge effect on the precision of the final results. So, in order to improve the accuracy of the classifier, the extensibility of the classifier to all tweet data is of great significance. Besides, our future work will transfer the emotional mining technology to Chinese texts, so as to mine the feelings of Chinese Sina Weibo users.

Acknowledgements
This research is supported in part by the National Nature Science Foundation of China No. 61440054; Fundamental Research Funds for the Central Universities of China No. 216274213; Nature Science Foundation of Hubei, China No. 2014CFA048; Outstanding Academic Talents Startup Funds of Wuhan University No. 216-410100003; National Natural Science Foundation of China No. 61462004; Natural Science Foundation of Jiangxi Province No. 20151BAB207042; and Youth Funds of Science and Technology in Jiangxi Province Department of Education No. GJJ150572.

Competing interests
The authors declare that they have no competing interests.

Author details
1 International School of Software, Wuhan University, Wuhan 430079, China.
2 School of Software, East China University of Technology, Nanchang 330013, China.

Received: 26 May 2016 Accepted: 2 October 2016

References
1. L Xiangwen, X Hongbo, S Le, Y Tianfang, Construction and analysis of the third Chinese Opinion Analysis Evaluation (COAE2011) corpus. J. Chin. Inform. Process. 01, 56–63 (2013)
2. A Apoorv, X Boyi, V Ilia, R Owen, P Rebecca, Sentiment analysis of Twitter data, in Proceedings of the Workshop on Languages in Social Media, LSM '11 (Association for Computational Linguistics, Stroudsburg, PA, USA, 2011), pp. 30–38
3. A Pak, P Paroubek, Twitter as a corpus for sentiment analysis and opinion mining, in Seventh Conference on International Language Resources and Evaluation (2010)
4. J Read, Using emoticons to reduce dependency in machine learning techniques for sentiment classification, in Proceedings of the ACL Student Research Workshop (Association for Computational Linguistics, 2005), pp. 43–48
5. A Go, R Bhayani, L Huang, Twitter sentiment classification using distant supervision. CS224N Project Report (2009), pp. 1–12
6. R Pavitra, PCD Kalaivaani, Weakly supervised sentiment analysis using joint sentiment topic detection with bigrams, in Electronics and Communication Systems (ICECS), 2015 2nd International Conference on (IEEE, 2015), pp. 889–893
7. W Zhibo, H Kuai, C Luyao, C Xiaohui, Exploiting feature selection algorithm on group events data based on news data, in IIKI2015 (2015 International Conference on Identification, Information, and Knowledge in the Internet of Things) (2015), pp. 62–65
8. C Xiaohui, Y Nanhai, W Zhibo, H Cheng, Z Weiping, L Hanjie, J Yujie, L Cheng, Chinese social media analysis for disease surveillance. Pers. Ubiquit. Comput. 19, 1125–1132 (2015)




9. W Zhibo, Z Chongyi, S Jiawen, Y Ying, Z Weiping, C Xiaohui, Key technology research on user identity resolution across multi-social media, in CCBD2015 (2015)
10. CL Goh, M Asahara, Y Matsumoto, Chinese unknown word identification using character-based tagging and chunking, in Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo (2003), pp. 197–200
11. H Degen, Y Yuansheng, W Xing, Z Yanli, Z Wanxie, Identification of Chinese names based on statistics. J. Chin. Inform. Process. 02, 31–37+44 (2001)
12. D Küçük, G Jacquet, R Steinberger, Named entity recognition on Turkish tweets, in Language Resources and Evaluation Conference (2014)
13. L Derczynski, D Maynard, G Rizzo et al., Analysis of named entity recognition and linking for tweets. Inform. Process. Manage. 51(2), 32–49 (2015)
14. G Xu, H Li, J Guo, Named entity recognition in query. US Patent US9009134 (2015)



