
Using Freeware Resources to Analyse Sentiments in Social Media

Abstract

Social media is gaining importance among business houses, academicians, medical practitioners, and politicians, among others, because of its role in creating awareness about products, services, and socio-political views. The end users of these products, services, and views provide their feedback in the form of comments. An accurate determination of the sentiments of end users is crucial in designing future policies and plans for products and services. As the processing power and storage capacity of computers have increased severalfold, researchers can focus more on the accuracy of sentiment detection than on the consumption of computational resources. In this paper, we apply a set of heuristics to analyse sentiments using freely available dictionary resources and open source tools. We have tested these heuristics over a large data set collected from standard sources. The experimental results are promising and open new research directions in dictionary-based sentiment analysis.

I. INTRODUCTION

With the growing importance of social media in all walks of life, including education, business, healthcare, and politics, increasing attention is being given to users' views about entities and products. Internet penetration is increasing due to the availability of electronic devices in the form of desktops, laptops, tablets, and smartphones. Easy access to the Internet encourages users of these devices to browse, view content, perform transactions, and, most importantly, express their opinions about the content presented on e-business portals and social media. The views expressed by users accumulate into a large amount of data, which, if analysed properly, can provide interesting observations. In the social media research lexicon, users' comments are termed sentiments, and their deep analysis is considered an important task from both the academic and the industrial point of view. Some applications of sentiment analysis include book and movie reviews, recommender systems, and political campaign analysis. A careful analysis of the textual data generated by users on social media gives useful insight into products, entities, and events. This helps other stakeholders, such as management and end users, to take informed decisions. As these opinions are highly unstructured, analysing them is a daunting task for the research community.

Keeping in view the importance of users' sentiments, several sentiment analysis tools have been created. Some of the freely available online sentiment analysis tools are Social Mention (http://socialmention.com), for searching a term in blogs, microblogs, images, videos, etc., and Twitter Sentiment (http://www.sentiment140.com), for discovering Twitter sentiment about any product, brand, or person. In addition, some commercial tools such as Conversation Miner (http://converseon.com/miner), Attensity Analyze (http://www.attensity.com/attensity-analyze), and Factiva (http://new.dowjones.com/products/factiva/) are also available. Some of these tools rely on a limited set of emotions and determine the sentiment based on a set of keywords. Therefore, these tools are not able to capture the implicitly expressed views of users. On the other hand, many other tools use sophisticated data mining or natural language processing techniques. These tools have been tested over users' views about movies or their tweets on the microblogging website Twitter. Their performance is rarely evaluated against more complicated posts, such as Facebook updates in political and social contexts. This gap motivates us to conduct comparative research to judge their performance over a large data set of movie reviews and over data collected from Facebook status updates. The research presented in this paper studies the utility of open source dictionary-based sentiment analysis tools in predicting sentiments over complex user feedback in social media. It aims to determine the shortcomings of the existing resources and to identify novel uses of these tools in order to make them more economical and more usable for the business and academic communities. The analysis of the experimental results opens new research directions in the field of sentiment analysis.

The remainder of the paper is organized as follows: Section II describes some important research work done in the last decade. Section III describes the method and tools used to conduct this research. Section IV tabulates and analyses the results. Section V discusses conclusions and future directions in sentiment analysis.

II. RELATED WORK

The most basic task in sentiment analysis is to classify opinions as positive or negative. This task can be performed at three levels: document, sentence, and phrase [21]. In document level analysis, the sentiment polarity of the overall document is computed [6][30][39]. Sentence level analysis is based on the fact that a document may contain several sentiments, and hence individual sentences should be examined for positivity or negativity [12][15][17]. Some researchers [8][23] refined the analysis to the phrase level, where importance is assigned to individual words or phrases. Cesarano et al. [5] stressed sentiment classification based on adjective phrases only, and proposed a scale ranging from -1 to +1 for measuring the degree of polarity in sentiments. Later, Benamara et al. [2] proposed that a combination of adjectives and adverbs gives more accurate results than adjectives alone. Subrahmanian and Reforgiato [35] extended this concept to include verbs along with adjectives and adverbs, on the same scale as in [5], to get better results. There have also been attempts to predict sentiment polarity at one level of granularity and use it to predict sentiment polarity at another level [25]. For instance, Zhang et al. [41] used a rule-based approach to determine document level sentiment classification by aggregating the outcomes of sentence level analysis for Chinese documents.

The inclusion of one or more sentiment dictionaries has been central to sentiment analysis research [9][10][27][34]. Dictionary-based approaches to sentiment analysis require the development of a well-defined and comprehensive dictionary. Young and Soroka [40] used a dictionary-based approach consisting of a simple count of the frequency, in a text, of keywords from a predefined dictionary. They designed a sentiment dictionary (called the Lexicoder Sentiment Dictionary) and tested it against nine other dictionaries as well as against a body of human-coded news content in a political context. We use the Lexicoder Sentiment Dictionary for the research presented in this paper.

The source of user reviews may vary. They may come from the feedback section of product selling websites such as Amazon [7][11], where the language of the feedback is very clean. Other sources may be social networking sites, such as the Facebook pages or Twitter handles of celebrities or political parties [18][38]. In this case, the language is not clean at all, and the task is more challenging. Several researchers have taken Twitter as a data source to design and test sentiment analysis systems [1][13][28][42]. The popularity of Twitter is perhaps due to the short length of tweets, which pose challenges because of their colloquial nature but are very precise in the polarity of their sentiments [20].

The alternatives to purely dictionary-based approaches are approaches based on learning models, which are particularly useful for cross-domain sentiment analysis. Bollegala et al. [3] proposed a cross-domain sentiment classifier that builds an automatically extracted sentiment-sensitive thesaurus during the analysis. There have also been attempts to predict the sentiment inclination of a user's review by analysing the context or by pinpointing the target of the opinion (also called opinion features). Opinion extraction can be done using supervised learning models [16][19], which are more suitable for domain specific analysis, or unsupervised learning models [32][33]. Zhen et al. [14] proposed a method for opinion feature extraction from online reviews that exploits the differences in statistical features across domain specific and domain independent corpora. Some opinions may not contain strong emotions or may contain views on more than one issue. They require subjectivity analysis or opinion target identification [4]. To handle such opinions, Pang and Lee [29] used a machine-learning method based on text categorization techniques. They extract only the subjective sentences in the documents (based on the minimum cut principle in graph theory) and discard the objective sentences in order to prevent the polarity classifier from considering potentially irrelevant sentences. They tested their algorithm by classifying movie reviews as positive or negative over a data set consisting of 1000 positive and 1000 negative reviews written by 312 authors. The authors concluded that extracting only the most subjective sentences (about 22% of the total review) and analysing them for polarity gives results comparable to, and sometimes better than, those obtained from the full text of the review.

III. SENTIMENT ANALYSIS

This section describes the heuristics applied to conduct the research described in this paper. It also presents the nature and sources of the data collected. Experimental results of sentiment analysis research are more easily compared with each other when they rely on publicly available datasets and tools. For this reason, we have used freely available tools to determine the sentiments of user comments acquired from publicly available datasets. They are briefly introduced in this section.

A. Data Collection

Collecting data for sentiment analysis and labelling them according to sentiment is a daunting task. Fortunately, Bo Pang and Lillian Lee [29] have collected and classified 1000 reviews each for positive and negative sentiments. This collection is drawn from IMDB's archive of rec.arts.movies.reviews, and it is freely available to the research community from the website aliasi.com/lingpipe/demos/tutorial/sentiment/read-me.html.

The overall average size of feedbacks in the negative data set is 610.6 words and 33.42 sentences. The overall average size of feedbacks in the negative data set is 684.29 words and 34.55 sentences.

We prepared another dataset consisting of Facebook status updates. We tried to collect Facebook posts covering information that is as heterogeneous as possible. Six hundred Facebook status updates, posted between July 10, 2014 and January 29, 2015, were downloaded from the Facebook walls of 100 unique users. The average size of a Facebook status update was 68.82 words, with a minimum of 14 words and a maximum of 616 words. The topics included politics, sports, movies, and personal accomplishments, among others.

B. Sentiment Analysis Tools

We used three commonly used, freely available tools to conduct the research discussed in this paper. A brief description of these tools is given below.

RIOTScan:

RIOTScan is freely available software (http://riot.ryanb.cc/) designed for calculating meaningful indices from texts. It supports 35 dictionary schemes, including a financial sentiment dictionary [24] and a social ties dictionary [31], among others. We used the Lexicoder Sentiment Dictionary [40] and Opinion Lexicon [15][22]. Opinion Lexicon consists of a list of around 6800 English opinion words or sentiment words. RIOTScan offers stemming (using the Porter algorithm) and lemmatization as processing options.

SentiStrength: SentiStrength estimates the strength of positive and negative sentiments in short texts, even for informal language [36][37]. Besides its online trial version (http://sentistrength.wlv.ac.uk/), SentiStrength provides an executable version for Windows, as well as a Java implementation on request for academic research. SentiStrength scores negative sentiment on a scale ranging from -1 to -5 and positive sentiment on a scale ranging from 1 to 5, based on keywords found in the feedback. The overall sentiment of a sentence is obtained by comparing its total positive value with the magnitude of its total negative value. If the magnitude of the total negative value is higher than the total positive value, the sentence is considered a negative sentence.

Sentiment.vivekn.com: Sentiment.vivekn.com is a free online tool for sentiment analysis for research and academic purposes [26]. This tool works by examining individual words and short sequences of words (n-grams) and comparing them with a probability model. The probability model is built on a pre-labelled set of IMDB movie reviews. It can also detect negations in phrases; for example, the phrase "not bad" will be classified as positive even though both of its individual words carry a negative sentiment.

C. Sentiment Analysis Heuristics

We applied three heuristics to analyse the sentiments in feedback and posts. This section describes these heuristics. In the first heuristic, we tried to determine the effectiveness of dictionaries, in their basic forms, for the analysis of sentiments. That is, we measured what percentage of the words used in a review come from the negative sentiment dictionary and what percentage come from the positive sentiment dictionary. If the percentage of words from the negative dictionary exceeds that from the positive dictionary, the review is considered to express a negative sentiment; otherwise, it is considered to express a positive sentiment. These percentages can be determined using standard sentiment dictionaries. We used the Lexicoder Sentiment Dictionary and Opinion Lexicon to determine the percentage of negative and positive words in the feedback.
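The first heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists `POSITIVE` and `NEGATIVE` are tiny hypothetical stand-ins for the Lexicoder Sentiment Dictionary or Opinion Lexicon entries, the tokenizer is a simple regular expression, and the function names are ours; it does not reproduce RIOTScan's actual processing (for example, stemming or lemmatization).

```python
import re

# Hypothetical toy word lists; a real run would load dictionary entries instead.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "boring", "poor", "awful"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_percentages(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive %, negative %) of the tokens found in each word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0, 0.0
    pos_hits = sum(token in positive for token in tokens)
    neg_hits = sum(token in negative for token in tokens)
    return 100.0 * pos_hits / len(tokens), 100.0 * neg_hits / len(tokens)


def classify_by_dictionary(text, positive=POSITIVE, negative=NEGATIVE):
    """Negative if the negative percentage exceeds the positive one, else positive."""
    pos_pct, neg_pct = dictionary_percentages(text, positive, negative)
    return "negative" if neg_pct > pos_pct else "positive"


review = "The plot was boring and the acting was awful, but the music was good."
label = classify_by_dictionary(review)  # two negative hits against one positive hit
```

Note that, following the rule as stated above, a tie (including a text with no dictionary words at all) falls into the positive class.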

The second heuristic calculates sentiments for each sentence in the user's feedback and adds them to determine the overall sentiment of the feedback document. For every document manually marked as a positive review, the sentiment of the document is equal to the difference between the total of the positive sentiments summed over all sentences in the document and the total of the negative sentiments summed over all sentences in the document. The difference should be a positive number in order to mark the document as a positive sentiment. Similarly, for every document manually marked as a negative review, the sentiment of the document is equal to the difference between the total of the negative sentiments summed over all sentences in the document and the total of the positive sentiments summed over all sentences in the document. The difference should be a positive number in order to mark this document as a negative sentiment. We used SentiStrength to test this heuristic.
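As an illustration of the second heuristic, the following minimal Python sketch aggregates per-sentence scores into a document-level decision. It assumes that each sentence has already been scored as a pair (positive value in 1..5, negative value in -5..-1), in the style of SentiStrength output; the function names and the `negative_weight` parameter are hypothetical and only illustrate the idea of weighting negative scores, not SentiStrength's internal implementation.

```python
def document_totals(sentence_scores, negative_weight=1.0):
    """Sum positive scores and (weighted) magnitudes of negative scores.

    sentence_scores: list of (positive, negative) pairs, one pair per sentence,
    with positive in 1..5 and negative in -5..-1.
    """
    pos_total = sum(pos for pos, _ in sentence_scores)
    neg_total = negative_weight * sum(abs(neg) for _, neg in sentence_scores)
    return pos_total, neg_total


def prediction_matches(gold_label, sentence_scores, negative_weight=1.0):
    """True if the summed scores agree with the manual label of the document."""
    pos_total, neg_total = document_totals(sentence_scores, negative_weight)
    if gold_label == "positive":
        return pos_total - neg_total > 0
    return neg_total - pos_total > 0


# Hypothetical scores for a three-sentence review.
scores = [(3, -1), (2, -2), (4, -1)]
matches_positive = prediction_matches("positive", scores)  # 9 against 4
```

Under this reading, a document whose positive and negative totals are equal is counted as a miss for either label, because the difference is required to be strictly positive.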

The third and last heuristic tests the documents by examining each word and comparing it with a probability model defined over pre-labelled movie data. In this case, the decision is taken on the basis of the confidence level generated by the probability model. We used the online tool available at http://sentiment.vivekn.com/.
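The internals of the probability model are not described here, so the following Python sketch is only one plausible reading of the third heuristic: it assumes a multinomial Naive Bayes model with add-one smoothing over unigrams and bigrams, trained on a handful of hypothetical labelled sentences. The class name `TinyNaiveBayes`, the helper `ngrams`, and the training data are ours and are not part of the online tool.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return unigram and bigram features of a whitespace-tokenized text."""
    tokens = text.lower().split()
    features = list(tokens)
    for n in range(2, n_max + 1):
        features += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return features


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.counts = {"positive": Counter(), "negative": Counter()}
        self.docs = Counter()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            self.counts[label].update(ngrams(text))
            self.docs[label] += 1

    def predict(self, text):
        """Return (label, confidence), with confidence as a posterior in [0.5, 1]."""
        vocab = set(self.counts["positive"]) | set(self.counts["negative"])
        log_scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            score = math.log(self.docs[label] / sum(self.docs.values()))
            for feature in ngrams(text):
                if feature in vocab:  # skip features never seen in training
                    score += math.log((counter[feature] + 1) / (total + len(vocab)))
            log_scores[label] = score
        # Turn log scores into normalized posteriors in a numerically stable way.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


# Hypothetical pre-labelled training sentences.
model = TinyNaiveBayes()
model.train([
    ("a great and moving film", "positive"),
    ("not bad at all", "positive"),
    ("wonderful acting and a great story", "positive"),
    ("a bad and boring film", "negative"),
    ("not good at all", "negative"),
    ("terrible acting and a boring story", "negative"),
])
label, confidence = model.predict("not bad")
```

In this toy setting the bigram "not bad" occurs only in positive training text, which is what tips the decision towards the positive class even though "not" and "bad" individually appear equally often in both classes.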

IV. RESULTS

This section describes the results obtained after applying the heuristics and tools described in Section III to the data collected from movie reviews and Facebook. We describe these results in two phases. In the first phase, we describe the results over movie data. As seen from Table I, the percentage of correct predictions using RIOTScan with the Lexicoder Sentiment Dictionary is 63.4% for positive documents and 68.1% for negative documents, which is fairly consistent across both types of feedback (overall average 65.8%). However, the prediction fluctuates heavily when we use RIOTScan with Opinion Lexicon. For positive documents, the prediction (Table II) is as high as 96.6%, which would be a benchmark performance by any standard, but it falls to an abysmally low 15.2% for negative documents. This underlines the need to add more relevant negative words to Opinion Lexicon, which is a relatively old dictionary.
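The overall average quoted for the Lexicoder Sentiment Dictionary can be checked with a short calculation. The sketch below assumes that per-class accuracy is the share of documents of a class that are labelled correctly, and it reconstructs prediction counts from the reported percentages for 1000 positive and 1000 negative reviews (634 and 681 correct, respectively); the function name `accuracy_report` is ours.

```python
def accuracy_report(gold, predicted):
    """Return per-class and overall accuracy (in percent) for two label lists."""
    report = {}
    for label in sorted(set(gold)):
        indices = [i for i, g in enumerate(gold) if g == label]
        correct = sum(1 for i in indices if predicted[i] == label)
        report[label] = 100.0 * correct / len(indices)
    hits = sum(g == p for g, p in zip(gold, predicted))
    report["overall"] = 100.0 * hits / len(gold)
    return report


# Counts reconstructed from the reported 63.4% and 68.1% on 1000 reviews each.
gold = ["positive"] * 1000 + ["negative"] * 1000
predicted = (
    ["positive"] * 634 + ["negative"] * 366    # outcomes for positive reviews
    + ["negative"] * 681 + ["positive"] * 319  # outcomes for negative reviews
)
report = accuracy_report(gold, predicted)
```

With equal class sizes, the overall accuracy is the mean of the two per-class figures, (63.4 + 68.1) / 2 = 65.75%, which corresponds to the 65.8% quoted above after rounding.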


We applied the second heuristic to the movie review data using the Java version of SentiStrength. We changed the default SentiStrength policy of assigning 1.5 times higher weight to negative words to equal weight (1.0) for both positive and negative words. We also used the SentenceCombineTot option instead of the default "Maximum" option of SentiStrength. As shown in Table III, the performance improves for negative documents compared to the first heuristic with the Lexicoder Sentiment Dictionary, but it goes down for positive documents. This opens a research direction that requires experimenting with, and reassigning, the weights given to negative and positive words in the EmotionLookupTable of SentiStrength.

As shown in Table IV, the performance of the third heuristic is satisfactorily high and consistent. The good and consistent performance can be attributed to two factors: first, this tool is modelled and tested over a similar kind of data, and second, the labelled dataset is relatively new.

In the second phase of evaluation, we tested all heuristics using document sets prepared by collecting Facebook status updates. For the first heuristic with the Lexicoder Sentiment Dictionary, the performance went up for positive documents (from 63.4% to 72%), but went down sharply for negative documents, from 68.1% to 48% (Table I and Table V). For the Opinion Lexicon dictionary, the gap between the performance over positive and negative documents decreased, but it is still too wide (Table II and Table VI).

The performance of the second heuristic follows the same pattern as that of the first heuristic using the Lexicoder Sentiment Dictionary: performance for positive documents goes up and that for negative documents goes down (Table III and Table VII). On the other hand, the performance of the third heuristic goes down for both positive and negative documents (Table IV and Table VIII). This behaviour is consistent with our reasoning for its good performance over movie review data, namely that the tool is modelled and tested over that kind of data.

The overall performance of all heuristics drops when we use data from Facebook, with the exception of the Opinion Lexicon dictionary, where performance is very low for negative documents for both types of data. The fall in performance can be attributed to the different style in which negative comments are written on social websites like Facebook. Facebook users quite often write sarcastic posts. These posts are composed mostly of positive words, but they are negative in their true sense. For example, consider the Facebook post, "Someone takes blessings of his mother before filing nomination (the main agenda of the day. All other activities are just to support this main agenda). Starts from residence to file nomination with great fanfare. He is busy whole day doing everything except filing nomination. The day ends and the filing nomination is yet to be done. You decide how focused he (is) will be in his work. Does it give some clue on his working style? Ok. Remember the days when he was CM. He was doing everything (dharna, protest) except what he is supposed to do as a CM". This post contains more positive terms than negative terms. However, a human will easily judge that this post is a negative remark on the person under consideration. In addition, there are many negative posts on Facebook which, if presented to humans without the details of past events and the political or social inclination of the user, will be judged as positive comments. For example, consider the post "Tough competition between Sagarika & Shekhar to claim the position of "India's most Progressive Intellectual Sanctimonious Secular"". A reader without prior knowledge of the background of the names mentioned in this post would judge it to be a positive comment. However, in its real context, the user had posted it as a sarcastic comment. This aspect also opens a new research direction in sentiment analysis.

V. DISCUSSION, CONCLUSION AND FUTURE WORK

The growth of social media and the increasing participation of vocal users in expressing their views make sentiment analysis a crucial task for the success of businesses. Although there are several online tools for sentiment analysis, business houses cannot use them for privacy and security reasons. Using free sentiment analysis tools instead of costly commercial products makes good economic sense. Therefore, we have applied some heuristics over freely available tools to test their suitability with real life data from social media. We have presented the results and highlighted the comparative performance of these tools for both negative and positive documents. This research identified the shortcomings in the existing resources and tools, and underlined the potential for further improvement of these tools instead of developing an entirely new set of tools. It also opens several new research directions.

There is a lot of scope for research in sentiment analysis. As this paper highlights, there is an urgent need to conduct intensive studies to update sentiment dictionaries. There can be several ways to populate a sentiment dictionary. We recommend assigning dynamic weights to the terms in a dictionary instead of fixed values. To achieve this, these tools can be redesigned to update the weights of the dictionary terms dynamically with each use of the tool.

Keeping track of the details of users may require a lot of memory and computing resources, but it can help in deciphering the context and, as a result, improve the prediction of sentiments. Research can be conducted to predict the behaviour of users based on the frequency and nature of their posts in social media. This will help in handling sarcastic posts in a better way. It will also help business houses to understand and serve users better by providing them with customized products and services.

Research is needed to develop a continuous scale for analysing sentiments, instead of analysing them on a discrete scale or as a simple positive or negative sentiment. The posts or feedback of users can range from extremely negative, through neutral, to extremely positive on such a continuous scale. This will help in assessing user sentiments more accurately.

Users from countries where English is not a native language post their comments using a combination of English and their native languages. For example, a typical user in India mixes Hindi (written in either Devanagari script or Roman script) and English when writing posts on social media. Intensive research should be conducted to integrate language detection and language translation tools with sentiment analysis tools for in-depth and accurate analysis of sentiments.

Finally, the effective utilization of the semantic web may significantly help research in sentiment analysis.
