Sentiment Analysis Report


Twitter Sentiment Analysis for US Presidential Elections 2016
Operations Research Project

Introduction
People's opinions change over time. Being able to accurately detect this
change enables us to analyse its potential causes. It is of great
importance to know the main reasons behind the dynamics of public
opinion: for example, whether a change of sentiment is driven by how an
event is represented in the mass media or by the actual event itself, how
influential different networks are in the distribution of news, and
whether some of them present biased information. Being able to provide
quantitative analysis for these questions will help us gain deeper insight
into the system of social information transfer within a society. I
therefore decided to study Twitter as a representative of the mass
audience, using the text of tweets to find the crowd's dominant sentiment.

What is Sentiment Analysis or Opinion mining?


Sentiment analysis (also known as opinion mining) refers to the use
of text analysis and computational linguistics to identify and extract
subjective information in source materials. Sentiment analysis is widely
applied to reviews and social media for a variety of applications, ranging
from marketing to customer service.
Sentiment analysis aims to determine the attitude of a speaker or a writer
with respect to some topic or the overall contextual polarity of a
document. The attitude may be his or her judgment or evaluation,
affective state (that is to say, the emotional state of the author when
writing), or the intended emotional communication (that is to say, the
emotional effect the author wishes to have on the reader). Opinion mining
can be useful in several ways. It can help marketers evaluate the success
of an ad campaign or new product launch, determine which versions of a
product or service are popular and identify which demographics like or
dislike particular product features. For example, a review on a website
might be broadly positive about a digital camera, but be specifically
negative about how heavy it is. Being able to identify this kind of
information in a systematic way gives the vendor a much clearer picture
of public opinion than surveys or focus groups do, because the data
is created by the customer.

Types of Sentiment Analysis


Document-level sentiment analysis
Opinions are usually subjective expressions that describe people's
sentiments, appraisals or feelings towards an entity or an event. Many
blogs and forums allow people to express their opinions in the form of
reviews and comments. When opinions are expressed as free-text reviews
rather than a simple Yes or No, identifying the actual emotions requires
a subjective analysis of the words used in the review.
In document-level sentiment analysis, each document is assumed to focus on
a single entity or event and to contain the opinion of a single opinion
holder. The opinion can be classified into two simple classes, positive or
negative (and possibly neutral). For example, consider a product review:
"I bought a new phone a few days ago. It is a nice phone, though it is a
little big. The touch screen is good. The voice clarity is better. I
simply love the phone." Considering the words used in the review ("nice",
"good", "better", "love"), the subjective opinion is said to be positive.
Objective opinions are measured using a star or poll system, where 4 or 5
stars are positive and 1 or 2 stars are negative.
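
The word-counting idea in the phone review example can be sketched in
Python as follows. This is only a minimal illustration, not the method
used in the project: the word lists POSITIVE and NEGATIVE and the function
classify_document are hypothetical names introduced here, and a practical
system would need a far larger lexicon or a trained classifier.

```python
import re

# Hypothetical mini-lexicon for illustration only; the positive words are
# the ones highlighted in the phone review example.
POSITIVE = {"nice", "good", "better", "love"}
NEGATIVE = {"bad", "poor", "worse", "hate"}


def classify_document(text):
    """Return 'positive', 'negative', or 'neutral' for a whole document."""
    words = re.findall(r"[a-z']+", text.lower())
    # Net score: positive word hits minus negative word hits.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = (
    "I bought a new phone a few days ago. It is a nice phone, though it is "
    "a little big. The touch screen is good. The voice clarity is better. "
    "I simply love the phone."
)
label = classify_document(review)
print(label)
```

Because the whole review collapses into a single label, the remark that
the phone is "a little big" has no effect on the result.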

Sentence-level sentiment analysis


To obtain a more refined view of the different opinions expressed in a
document about its entities, we should move to the sentence level. This
level of sentiment analysis filters out sentences that contain no opinion
and determines whether the opinion in each remaining sentence is positive
or negative.

Aspect-based sentiment analysis


Document-level and sentence-level sentiment analysis work well when the
text refers to a single entity. However, in many cases people talk about
entities that have many aspects or attributes, and they hold different
opinions about different aspects. This often happens in product reviews
and discussion forums. For example: "I am a Nokia phone lover. I like the
look of the phone. The screen is big and clear. The camera is fantastic.
But there are a few downsides too; the battery life is not up to the mark
and access to WhatsApp is difficult." Classifying this review as simply
positive or negative hides valuable information about the product.
Therefore, aspect-based sentiment analysis focuses on recognising all
sentiment expressions within a given document and the aspects to which
the opinions refer.
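
A rough sketch of this idea in Python, using the Nokia review as input, is
given below. It assumes that each clause discusses at most one aspect and
that an opinion word applies to the aspect named in the same clause. The
aspect list, the cue words (including treating "big" as positive for a
screen) and the function aspect_sentiments are all hypothetical choices
made for this example.

```python
import re

# Hypothetical aspect terms and sentiment cues, chosen to fit the example.
ASPECTS = {"look", "screen", "camera", "battery", "whatsapp"}
POSITIVE = {"like", "big", "clear", "fantastic"}
NEGATIVE = {"difficult"}
NEGATIVE_PHRASES = ("not up to the mark",)


def aspect_sentiments(text):
    """Map each aspect mentioned in the text to 'positive' or 'negative'."""
    results = {}
    # Split into clauses on punctuation and on the word "and".
    for clause in re.split(r"[.;!?]|\band\b", text.lower()):
        words = set(re.findall(r"[a-z]+", clause))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        score -= sum(phrase in clause for phrase in NEGATIVE_PHRASES)
        if score == 0:
            continue  # no opinion detected in this clause
        for aspect in words & ASPECTS:
            results[aspect] = "positive" if score > 0 else "negative"
    return results


review = (
    "I am a Nokia phone lover. I like the look of the phone. The screen is "
    "big and clear. The camera is fantastic. But there are a few downsides "
    "too; the battery life is not up to the mark and access to WhatsApp is "
    "difficult."
)
opinions = aspect_sentiments(review)
for aspect, polarity in sorted(opinions.items()):
    print(aspect, polarity)
```

The result keeps the praise for the look, screen and camera separate from
the complaints about the battery and WhatsApp, which a single
document-level label would merge.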

Comparative sentiment analysis

In many cases, users express their opinions by comparing a product with a
similar product or brand. Therefore, the goal here is to identify
sentences that contain comparative opinions.

Challenges in Sentiment Analysis


There are four main factors that currently stop us from relying blindly on
tools for sentiment analysis:
1. Context: a positive or negative sentiment word can have the opposite
connotation depending on context (e.g. "My internet provider does a great
job when it comes to stealing money from me").
2. Sentiment ambiguity: a sentence with a positive or negative word
doesn't necessarily express any sentiment (e.g. "Can you recommend a good
tool I could use?" doesn't express any sentiment, although it uses the
positive sentiment word "good"). Likewise, sentences without sentiment
words can express sentiment too (e.g. "This browser uses a lot of memory"
doesn't contain any sentiment words, although it is clearly negative at a
document level).
3. Sarcasm: a positive or negative sentiment word can switch sentiment if
there is sarcasm in the sentence (e.g. "Sure, I'm happy for my browser to
crash right in the middle of my coursework").
4. Language: a word can change sentiment and meaning depending on the
language used. This is often seen in slang, dialects, and language
variations. An example is the word "sick", which can change meaning based
on context, tone and language, although its meaning is clear to the
target audience.
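
The first three challenges can be reproduced with a deliberately naive
word-counting scorer. The Python sketch below is only an illustration: the
tiny lexicon, the function naive_label and the "human" labels attached to
the example sentences are assumptions made here, not part of the project.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"great", "good", "happy"}
NEGATIVE = {"bad", "awful", "terrible"}


def naive_label(sentence):
    """Label a sentence by counting lexicon hits, ignoring all context."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


# (sentence, label a human reader would be expected to assign)
EXAMPLES = [
    ("My internet provider does a great job when it comes to stealing "
     "money from me", "negative"),
    ("Can you recommend a good tool I could use?", "neutral"),
    ("This browser uses a lot of memory", "negative"),
    ("Sure, I'm happy for my browser to crash right in the middle of my "
     "coursework", "negative"),
]

# Collect every sentence where the naive scorer disagrees with the reader.
mistakes = [
    (sentence, naive_label(sentence), human)
    for sentence, human in EXAMPLES
    if naive_label(sentence) != human
]
for sentence, predicted, human in mistakes:
    print(f"predicted={predicted:8} human={human:8} {sentence}")
```

Every example sentence is mislabelled, which is why such tools cannot be
relied on blindly.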

Abstract
Elections empower citizens to choose their leaders. They give everyone an
opportunity for an equal voice and representation in our government.
Democracy is government for the people and by the people, which means
government leaders are determined by participation in elections. As we
approach the November 2016 US presidential election, public sentiment
towards the candidates will influence who becomes the future leader of
the USA. I am interested in how the public views the top election
candidates, namely Donald Trump, Hillary Clinton, Ted Cruz and Ben Carson.
Feelings towards candidates fluctuate quickly as interviews, debates,
responses to global events, and other issues come to the fore. To obtain a
large, diverse dataset of current public opinions on the candidates, I
decided to use Twitter. Twitter provides us with live access to opinions
about the election from across the globe. The project reports the
percentage of tweets expressing positive, negative and neutral sentiment,
and also presents a word cloud showing the words used about each
candidate during the timeframe in which the tweets were extracted from
Twitter. The code for the project has been written in Python and is
somewhat inclined towards the machine learning paradigm. The input data
consists of around four thousand tweets in all. Data collection was the
most important step, as the data must be cleaned into a consistent format
before it can be analysed and studied further. The texts of the collected
tweets were used to study people's sentiments towards the presidential
candidates. Because the data set was large, it was important to split it
into training and test data. The model was trained on the training data
and then tested on the test data. A random selection of 80% of the data
was used for the training dataset and the remaining 20% for the test
dataset. The tweets obtained from Twitter were compiled in a CSV file
(which can be opened in Excel) and then loaded into Python using various
open source libraries.
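
The loading, splitting and percentage steps can be sketched with the
Python standard library alone. This is an assumption-laden outline rather
than the project's code: the column names "text" and "sentiment", the
placeholder rows in SAMPLE_CSV, the fixed random seed and all function
names are hypothetical, and the project itself may have used other
libraries.

```python
import csv
import io
import random
from collections import Counter

# Placeholder rows standing in for the real CSV export. The column names
# "text" and "sentiment" are assumptions, and the rows are not real tweets.
SAMPLE_CSV = """text,sentiment
placeholder tweet 1,positive
placeholder tweet 2,positive
placeholder tweet 3,positive
placeholder tweet 4,positive
placeholder tweet 5,negative
placeholder tweet 6,negative
placeholder tweet 7,negative
placeholder tweet 8,negative
placeholder tweet 9,neutral
placeholder tweet 10,neutral
"""


def load_tweets(handle):
    """Read CSV rows into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(handle))


def split_train_test(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sentiment_percentages(rows):
    """Return the percentage of rows carrying each sentiment label."""
    counts = Counter(row["sentiment"] for row in rows)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


rows = load_tweets(io.StringIO(SAMPLE_CSV))
train, test = split_train_test(rows)
percentages = sentiment_percentages(rows)
print(len(train), len(test), percentages)
```

With a real export, io.StringIO(SAMPLE_CSV) would be replaced by an open
file handle. Fixing the seed keeps the random 80/20 split reproducible
between runs.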

Literature Review
The following were read to get an idea of the ongoing research on the
topic:
1. Probabilistic relational models
2. Sentiment Analysis in Social Networks by Pozzi Alberto
3. Bayesian networks and the naive Bayes classifier
