School of Engineering and Technology: A Dissertation Report On
A
DISSERTATION REPORT ON
“SENTINA”
Sentiment Based Twitter Reply Bot
Submitted to
CMR University
School of Engineering and Technology, Bagalur
for the partial fulfillment of the requirement for the award of the Degree of
B.TECH
IN
COMPUTER SCIENCE
Submitted by:
K Abhinandhan (REG.NO.18BBTCS045)
KN Prajwal Sai (REG.NO.18BBTCS046)
Karthik G (REG.NO.18BBTCS047)
Manoj Bahadur (REG.NO.18BBTCS063)
DEPARTMENT OF
COMPUTER SCIENCE AND ENGINEERING CMR UNIVERSITY
BAGALUR
2018-19
School of Engineering and Technology
Main Campus, Off Hennur-Bagalur Main Road, Chagalahatti, Bengaluru-562149
CERTIFICATE
Certified that the project work entitled SENTINA-Sentiment Based Twitter Reply Bot was
carried out by Mr Karthik G, REG.NO 18BBTCS047, in partial fulfillment for the award
of Bachelor of Engineering / Bachelor of Technology in COMPUTER SCIENCE AND
TECHNOLOGY of the CMR University, Bagalur, during the year 2018-19. It is certified
that all corrections/suggestions indicated for Internal Assessment have been incorporated in
the report deposited in the departmental library. The project report has been approved as it
satisfies the academic requirements in respect of project work prescribed for the said degree.
External Viva
Date:
DECLARATION
I have not submitted the matter embodied in this report to any other university or institution
for the award of any other degree.
******
(REG.NO.18BBTCS047)
ACKNOWLEDGEMENT
******
(18BBTCS047)
ABSTRACT
Social media have received growing attention in recent years. Public and private opinions
on a wide variety of subjects are expressed and spread continually through numerous social
media platforms, and Twitter is one that is gaining popularity. Twitter offers
organizations a fast and effective way to analyze customers' perspectives on the factors
critical to success in the marketplace. Developing a program for sentiment analysis is one
approach to measuring customers' perceptions computationally. This report describes the design
of a sentiment analysis program that extracts a large number of tweets. Prototyping is used in
its development. The results classify customers' perspectives, as expressed in tweets, into
positive and negative, and are presented in a pie chart and an HTML page. The program was
planned to be developed as a web application, but because of the constraints of deploying
Django, which runs on a Linux server or LAMP stack, this is left for future work.
In this modern era, customer satisfaction is very important for the reputation of a service or
company. With widespread internet access, people express their reviews and issues
mostly through online social media. Customers expect the service provider to reply to what
they post. It is a huge task for a person to go through all the mentions and replies.
CHAPTER 1
INTRODUCTION
SENTINA is a sentiment-based tweet-replying bot. It is based on the principle of
sentiment analysis, which is a topic in Natural Language Processing (NLP). NLP is
one of the major subfields of computer science and artificial intelligence.
The basic function of our app, a Python script, is to read the tweets that mention a particular
Twitter profile. Each tweet read is analyzed to find its sentiment, which is
positive, negative, or neutral. Once the sentiment of the tweet is determined, we reply to the
tweet automatically, without manual input, using pre-written messages. The pre-written
messages are marked for different sentiments and are selected according to the sentiment found.
Sentiment Analysis, also called opinion mining or emotion AI, is the process of
determining whether a piece of writing is positive, negative, or neutral. A common use case
for this technology is to discover how people feel about a particular topic. Sentiment analysis
is widely applied to reviews and social media for a variety of applications.
The lexicon-based approach rests on the assumption that the contextual
sentiment orientation of a text is the sum of the sentiment orientations of its words or phrases.
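The summation assumption can be illustrated with a minimal sketch. The lexicon below is a toy dictionary invented for this example (the words and their scores are assumptions, not values from any real sentiment lexicon), and the function names are hypothetical.

```python
# Toy lexicon: the words and scores below are illustrative assumptions,
# not taken from any real sentiment lexicon.
TOY_LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "love": 2.0,
    "bad": -1.0,
    "terrible": -2.0,
    "hate": -2.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the orientation of every known word; unknown words count as 0."""
    words = text.lower().split()
    return sum(lexicon.get(word.strip(".,!?"), 0.0) for word in words)


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Map the summed score to positive, negative, or neutral."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_label("I love this great service!"))     # love + great
print(lexicon_label("Terrible support, bad app."))     # terrible + bad
print(lexicon_label("The parcel arrived on Monday."))  # no lexicon words
```

A plain sum like this ignores negation and context, which is one reason such an approach can misread some tweets.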
A sentiment classifier takes a piece of plain text as input and makes a classification decision
on whether its contents are positive or negative. For simplicity, let's assume that the input
text is known a priori to be opinionated (which we could ensure by filtering the input through
another classifier that separates opinionated text from neutral text).
Centralize all your social media data with one tool. Rather than logging in and out
of multiple social media analytics tools, add your social media profiles and those
of competing brands to a single dashboard. You’ll be able to analyze key metrics
from your customers, campaigns, competitors, and the industry as a whole.
Centralizing your social media accounts, along with those of your competitors,
will allow you to choose the stats that matter and draw comparisons, bringing
actionable insights to improve your social marketing strategy.
CHAPTER 2
COMPONENTS
Software Requirements
1. Tweepy
At the time of writing, the current version of Tweepy is 1.13. It was released on January 17
and offers various bug fixes and new functionality compared to the previous version. The 2.x
version is being developed, but it is currently unstable, so the vast majority of users should
use the regular version. Installing Tweepy is easy.
2. TextBlob
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for
diving into common natural language processing (NLP) tasks such as part-of-speech tagging,
noun phrase extraction, sentiment analysis, classification, translation, and more.
Sentiment Analysis
CHAPTER 3
PROCEDURE
● First of all, we create a TwitterClient class. This class contains all the methods for
interacting with the Twitter API and parsing tweets. We use the __init__ function to handle
the authentication of the API client.
● In the get_tweets function, we fetch the tweets matching a query with:
fetched_tweets = self.api.search(q = query, count = count)
TextBlob is a high-level library built on top of the NLTK library. First, we call the
clean_tweet method to remove links, special characters, etc. from the tweet using
a simple regular expression.
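A cleaning step of this kind might look like the following sketch. The report does not list the exact expression it used, so the three patterns here are assumptions chosen for illustration; in particular, stripping @mentions is an assumed extra step.

```python
import re

# Patterns are assumptions chosen for illustration; the report does not
# list the exact expressions it used.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
NON_ALNUM_PATTERN = re.compile(r"[^0-9A-Za-z \t]")


def clean_tweet(tweet):
    """Remove links, @mentions and special characters, then squeeze spaces."""
    text = URL_PATTERN.sub(" ", tweet)
    text = MENTION_PATTERN.sub(" ", text)
    text = NON_ALNUM_PATTERN.sub(" ", text)
    return " ".join(text.split())


print(clean_tweet("@sentina Loved it!!! see https://example.com/x #happy"))
```

Links are removed before the other patterns so that the punctuation inside a URL is not split into stray word fragments.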
Then, as we pass the tweet to create a TextBlob object, the following processing is done
on the text by the TextBlob library:
○ TextBlob uses a movie reviews dataset in which reviews have already been
labelled as positive or negative.
○ Positive and negative features are extracted from each positive and negative
review respectively.
○ Training data now consists of labelled positive and negative features. This data
is used to train a Naive Bayes classifier.
Note that these steps describe TextBlob's optional NaiveBayesAnalyzer. The default
analyzer, which produces the polarity score, is the lexicon-based PatternAnalyzer.
Then, we use the sentiment.polarity property of the TextBlob object to get the polarity of
the tweet, a value between -1 and 1.
Then, we classify the polarity as:
if analysis.sentiment.polarity > 0:
    return 'positive'
elif analysis.sentiment.polarity == 0:
    return 'neutral'
else:
    return 'negative'
Finally, the parsed tweets are returned. We can then perform various types of statistical
analysis on the tweets. For example, in the above program, we find the percentage of positive,
negative and neutral tweets about a query.
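As a rough sketch of that percentage calculation, the snippet below applies the same polarity thresholds to a list of scores and reports the share of each label. The function names and the sample scores are hypothetical; real scores would come from TextBlob.

```python
from collections import Counter


def label_from_polarity(polarity):
    """Apply the same thresholds as the classification rule above."""
    if polarity > 0:
        return "positive"
    if polarity == 0:
        return "neutral"
    return "negative"


def sentiment_percentages(polarities):
    """Return the share of each label as a percentage of all tweets."""
    labels = [label_from_polarity(p) for p in polarities]
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        # No tweets fetched: report zero for every label.
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    return {
        label: 100.0 * counts[label] / total
        for label in ("positive", "neutral", "negative")
    }


# Hypothetical polarity scores standing in for real TextBlob output.
sample_polarities = [0.5, 0.0, -0.25, 0.8]
print(sentiment_percentages(sample_polarities))
```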
CHAPTER 4
When the app (the Python script) is run, the reply bot starts reading tweets. The tweets in
the mentions timeline of the profile linked to the given authorization keys are read. Each
tweet is returned to the code as a string. The string is then analyzed for sentiment using the
polarity value provided by the TextBlob library. The polarity of the tweet gives the sentiment
associated with it. Once the sentiment of the tweet is determined, we send a reply to its
author using an @mention. The reply is chosen based on the sentiment: replies are pre-written
strings marked for different sentiments. All this happens automatically once the code is
running. We can therefore run this code in the cloud to automate tweet replies.
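The selection of a pre-written reply can be sketched without any network access. In the snippet below the reply texts are shortened placeholders, fake_classify is a stand-in for the TextBlob-based classifier, and all names are hypothetical; nothing is actually sent to Twitter.

```python
# Reply texts here are placeholders for illustration; the bot's actual
# messages are listed in the appendix.
REPLY_TEMPLATES = {
    "positive": "Thanks for sharing your experience.",
    "negative": "Sorry that you had a bad experience.",
    "neutral": "Thanks for reaching out to us.",
}


def build_reply(screen_name, sentiment, templates=REPLY_TEMPLATES):
    """Prefix the pre-written message for this sentiment with an @mention."""
    message = templates[sentiment]
    return "@" + screen_name + " " + message


def plan_replies(mentions, classify):
    """Pair each mention ID with its reply text, without sending anything.

    `mentions` is a list of (mention_id, screen_name, text) tuples and
    `classify` is any function mapping text to a sentiment label.
    """
    return [
        (mention_id, build_reply(screen_name, classify(text)))
        for mention_id, screen_name, text in mentions
    ]


def fake_classify(text):
    # Stand-in for the TextBlob-based classifier, for demonstration only.
    return "negative" if "bad" in text.lower() else "positive"


demo_mentions = [(1, "alice", "Great service"), (2, "bob", "Bad delivery")]
for mention_id, reply in plan_replies(demo_mentions, fake_classify):
    print(mention_id, reply)
```

Keeping the reply planning separate from the code that posts to Twitter makes it possible to check the chosen messages before anything is published.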
APPLICATIONS
Reputation management - or you could also call it brand monitoring. We all know
how much good reputation means these days when the majority of us check social
media reviews as well as review sites before making a purchase decision.
Negative reviews put people off, and how you handle them can define your future as a
business. You could ignore them (not recommended), respond rudely and
make your situation even worse, or apologize for whatever caused a person to write
a negative opinion and do your best to make up for it.
Competitor monitoring - chances are some of your competitors are getting bad
press online. This is where you could step in, as long as you are aware of those negative
mentions. At such times, the bot could automatically reply to the user with an invitation
to try our product or service.
CONCLUSION
In this modern era, customer satisfaction is very important for the reputation of a service or
company. With widespread internet access, people express their reviews and issues
mostly through online social media. Customers expect the service provider to reply to what
they post. It is a huge task for a person to go through all the mentions and replies.
Methods like this can be used to filter out unwanted messages and those that do not need
individual attention, while still replying to the customer appropriately about their concerns.
Automating replies with this method will greatly reduce the manpower
required and save a lot of time.
The method has disadvantages that should be mentioned at this point.
Sentiment analysis is a branch of natural language processing that is still under
development, so we cannot expect one hundred percent accuracy. There can be times when
sarcasm is misclassified, because the analysis does not consider the context of the
tweet but interprets it based on the words used. Most reports on the accuracy of
this kind of analysis suggest that it is close to eighty percent.
The accuracy can be increased by training the system on data sets.
APPENDICES
Main
import time

import tweepy

# NOTE: The authentication setup is not reproduced in this appendix. The code
# below assumes that `api` is an authenticated tweepy.API object and that
# `name_tag` is a banner string, both defined before this point.
print(name_tag, flush=True)

FILE_NAME = 'last_seen_id.txt'


def retrieve_last_seen_id(file_name):
    # Read the ID of the last mention that was processed.
    with open(file_name, 'r') as f_read:
        return int(f_read.read().strip())


def store_last_seen_id(last_seen_id, file_name):
    # Not shown in the original listing; reconstructed from the call below.
    # Persist the ID so that the same mention is not answered twice.
    with open(file_name, 'w') as f_write:
        f_write.write(str(last_seen_id))


def reply_to_tweets():
    print('Retrieving and replying to tweets...', flush=True)
    # DEV NOTE: use 1060651988453654528 for testing.
    last_seen_id = retrieve_last_seen_id(FILE_NAME)
    # NOTE: We need to use tweet_mode='extended' below to show
    # all full tweets (with full_text). Without it, long tweets
    # would be cut off.
    mentions = api.mentions_timeline(
        since_id=last_seen_id,
        tweet_mode='extended')
    # Process the oldest mention first.
    for mention in reversed(mentions):
        print(str(mention.id) + ' - ' + mention.full_text, flush=True)
        last_seen_id = mention.id
        store_last_seen_id(last_seen_id, FILE_NAME)
        # Classify once, then pick the matching pre-written reply.
        sentiment = get_tweet_sentiment(mention.full_text.lower())
        if sentiment == 'positive':
            print('found a positive tweet')
            print('responding back...')
            api.update_status(
                '@' + mention.user.screen_name + " " +
                "Hello, I am Sentina. It's good to hear from you. " +
                "Thanks for sharing your experience",
                in_reply_to_status_id=mention.id)
        elif sentiment == 'negative':
            print('found a negative tweet')
            print('responding back...')
            api.update_status(
                '@' + mention.user.screen_name + " " +
                "Sorry that you had a bad experience. " +
                "Please email us your issues at [email protected]",
                in_reply_to_status_id=mention.id)
        elif sentiment == 'neutral':
            print('found a neutral tweet')
            print('responding back...')
            api.update_status(
                '@' + mention.user.screen_name + " " +
                "Hello, I am Sentina. Thanks for reaching us. " +
                "If you have any issues in the future, " +
                "please email us your issues at [email protected]",
                in_reply_to_status_id=mention.id)


# Poll for new mentions every 15 seconds.
while True:
    reply_to_tweets()
    time.sleep(15)
Analysis
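from textblob import TextBlob
def get_tweet_sentiment(tweet):
    analysis = TextBlob(tweet)  # polarity ranges from -1 to 1
    if analysis.sentiment.polarity > 0:
        return 'positive'
    elif analysis.sentiment.polarity == 0:
        return 'neutral'
    else:
        return 'negative'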
REFERENCE
https://www.python.org/
https://textblob.readthedocs.io/en/dev/quickstart.html#sentiment-analysis
https://tweepy.readthedocs.io/en/latest/
https://github.com/tweepy/tweepy
https://csdojo.io/twitter