A Survey of Sentiment Analysis From Social Media Data
IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 7, NO. 2, APRIL 2020
Abstract: In the current era of automation, machines are constantly being channelized to provide accurate interpretations of what people express on social media. The human race nowadays is submerged in the idea of what and how people think, and the decisions taken thereafter are mostly based on the drift of the masses on social platforms. This article provides a multifaceted insight into the evolution of sentiment analysis into the limelight through the sudden explosion of a plethora of data on the internet. This article also addresses the process of capturing data from social media over the years, along with similarity detection based on similar choices of the users in social networks. The techniques of communalizing user data have also been surveyed in this article. Data, in its different forms, have also been analyzed and presented as a part of the survey in this article. Other than this, the methods of evaluating sentiments have been studied, categorized, and compared, and the limitations exposed in the hope that this shall provide scope for better research in the future.

Index Terms: Clustering, community, sentiment analysis, social media, social networks.

Manuscript received July 8, 2019; revised October 15, 2019; accepted November 18, 2019. Date of publication January 7, 2020; date of current version April 3, 2020. (Corresponding author: Siddhartha Bhattacharyya.)
K. Chakraborty and R. Bag are with the Computer Science and Engineering Department, Supreme Knowledge Foundation Group of Institutions, Chandannagar 712139, India (e-mail: [email protected]; [email protected]).
S. Bhattacharyya is with the Department of Computer Science and Engineering, Christ University, Bengaluru 560029, India (e-mail: [email protected]).
This article has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors.
Digital Object Identifier 10.1109/TCSS.2019.2956957
2329-924X © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

As humans, we always tend to get attracted to like-minded people. Studies even suggest that we are comfortable socializing with people who hold similar beliefs, people whom we can trust and who can help us achieve our aspirations. People have a tendency to associate with similar-minded communities. Multiple clusters make a community. Modularity is one of the prime mechanisms considered while determining the number of communities [1]. If the characteristics of the clusters are minutely analyzed, they can be instrumental in identifying the specific character set of individual clusters or groups of like-minded people.

Put another way, the presence of a common connection between a set of individuals suggests that similar principles and purposes are shared by that set of people.

To be more specific, there are two types of social media: social networks and online communities. Social networks are formed of people who are interconnected through some previous personal relationship, maintain those relationships socially, and would further prefer connecting to new associations to enlarge their personal contacts. A social network associates people who have a direct connection to one another. In contrast, communities comprise people from multiple fields having little or no connection between them. The main connection between individuals in a community lies in the fondness toward a familiar interest. People stay within a community for varied reasons: it might be a liking for a particular thing, the feeling that one should be associated with that community, or the expectation of achieving something by adhering to it. Social networks thus contain an organized arrangement, whereas communities contain arrangements that overlap and are nested within one another.

Social media is the method of sharing data with a vast audience. It can be regarded as a medium of propagating information through an interface. Social media in tandem with social networks helps individuals cater their content to a wider society and reach out to more people for sharing or promotion [2].

Sentiment analysis is the procedure of categorizing the views expressed over a particular object. With the advent of varied technological tools, it has become an important measure for being aware of the mass view in business, products, or matters of common like and dislike. Tracking down the emotion behind the posts on social media can help relate the context in which the user shall react and progress.

This article provides the flow of social network analysis (SNA) from over 200 papers and presents the research that has been performed in social networks and related fields. This article is organized in the following manner. Section II deals with the inception of social networks in research. Section III mentions the motivation for this article. Section IV contains the detailed methods employed to find clusters as well as communities from over 40 papers. Section V deals with the review done on 45 papers in the corresponding field, and Section VI consists of the variety of efforts that have been made for social media data. Section VIII deals with the varied techniques applied to detect accurate sentiments from data on social media. Section VIII concludes the paper with some future scope for research in this field. A crawler, a tool to evaluate semantics, an engine that allows language preprocessing, and a classifier are the main components of a sentiment analysis system [3].
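The four components that the Introduction attributes to a sentiment analysis system [3] (a crawler, a language preprocessing engine, a tool to evaluate semantics, and a classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from [3]: the function names, the toy polarity lexicon, the sign-based classification rule, and the use of an in-memory list in place of a real crawler.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical toy polarity lexicon; a real system would use a curated resource.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}


def crawl(source: Iterable[str]) -> List[str]:
    """Stand-in for a crawler: reads posts from an in-memory iterable, skipping blanks."""
    return [post for post in source if post.strip()]


def preprocess(post: str) -> List[str]:
    """Language preprocessing: lowercase, drop URLs and @mentions, keep word tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", post.lower())
    return re.findall(r"[a-z']+", cleaned)


def semantic_score(tokens: List[str]) -> int:
    """Semantic evaluation: sum the lexicon polarity of every token (unknown words count 0)."""
    return sum(LEXICON.get(token, 0) for token in tokens)


def classify(score: int) -> str:
    """Classifier: map the sign of the score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(source: Iterable[str]) -> List[Tuple[str, str]]:
    """Run the four stages in order and return (post, label) pairs."""
    return [(post, classify(semantic_score(preprocess(post)))) for post in crawl(source)]


if __name__ == "__main__":
    posts = [
        "I love this phone, great camera! https://t.co/x",
        "@bob terrible service, I hate waiting",
        "The parcel arrived today",
    ]
    for post, label in analyze(posts):
        print(f"{label:8s} | {post}")
```

Keeping the stages as separate functions mirrors the component view in the text: any one stage (for example, the lexicon scorer) could be swapped for a learned model without touching the others.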
Authorized licensed use limited to: Universiti Teknikal Malaysia Melaka-UTEM. Downloaded on April 17,2020 at 00:08:03 UTC from IEEE Xplore. Restrictions apply.
CHAKRABORTY et al.: SURVEY OF SENTIMENT ANALYSIS FROM SOCIAL MEDIA DATA 451
II. ONSET OF SOCIAL NETWORK AND ITS ANALYSIS

In SNA, a tightly interconnected group of people in a social network is called a clique, which can be defined as a structure in which every member of the group is directly and cohesively tied to every other member. Bron and Kerbosch [4] showed how to find the maximal complete subgraphs (cliques) of an undirected graph through a backtracking algorithm that uses branch and bound to cut off the branches that cannot lead to a clique. Research on algorithms to cluster correlated data from different applications of social networks was initiated as far back as 1975 [5], where the convergence of product-moment correlation matrices was used in the CONCOR algorithm to cluster data hierarchically. It should also be noted that, with the recent surge in the amount of social media data, indexing the billions of web pages must have been an enormous task for search engines. Accordingly, the framework of Google was explained in detail in [6], along with its features of scalability and robustness, which are intended to yield high-quality search results. Because returning accurate web page results has been a concern since the inception of social networks and their analysis, researchers proposed various techniques to find relevant data when information on a broad topic is searched for on the web [7]. Several factors were kept in mind: the amount of data was increasing at an exponential rate, and the results shown had to be of superior quality. For broad topics, the quality of the pages mattered because those pages would be most relevant to the user, as did finding the hub pages that were densely linked to the set of related authoritative pages, through which the outline of the social association could be judged.

The contribution of this article lies in the fact that the evolution of digital data, which is essential to understanding the opinion of the masses, has been studied since its inception. The path leading from website data to the final analysis of sentiments has been portrayed. The diverse areas in which data have been collected for sentiment analysis have been brought together in a single place. Ample options have been provided that can be applied to solve real-life problems by collecting data from social media and finally evaluating them.

III. MOTIVATION

With the plethora of data being amassed on the internet, it is high time that matters relating to social media and its data be given utmost importance. A trend is being perceived among the masses that what others think has to be followed by all. This article follows an almost chronological order: the development of social networks, the acquisition and analysis of data on social media, and finally the prediction of feelings within these data are discussed in detail. To the best of our knowledge, this article is the first attempt to amalgamate this huge amount of information from an assortment of over 200 papers, mentioning their contributions to this field. It is a known fact that qualitative study concerns how people observe the reality that is taking place around them; hence, the problems are studied in their natural setting. This article provides a qualitative insight into how social media data can be handled and how they are analyzed to help the world understand the emotions and behavioral patterns of its contemporaries. The authors strongly feel that this one-of-a-kind paper will give experienced researchers access to a wide variety of meaningful work in one place and will let them judge the amount of work that has already been done in this field. The future scopes for each of the papers mentioned in the tables act as an easy reference from which ideas for further work can be drawn. For novice researchers, this article can act as a base upon which they can start their work pertaining to social media and its analysis. As we have become extremely technology dependent, this article will serve as a benchmark for fellow researchers who wish to work further on the mentioned challenges faced in social media data.

As almost all conversations now take place online rather than offline, it is important that social media and its components be understood properly in order to analyze sentiments. It has been observed that the focus on sentiment analysis has increased since 2004 [207], and that is why the authors have considered papers on social media and its ancillaries from the year 2008 in this review. Reviewing only the work that has been done on sentiments does not suffice to properly grasp the methods by which the semantics of sentiments are to be judged. Hence, it was a strategic decision to provide a detailed description of social media and its components before moving to the main, most trending topic of sentiment analysis. In a nutshell, the main objective of the paper is to present the work done in this particular field and to address its limitations, leaving ample scope of research for scholars.

IV. DATA ACQUISITION

Acquiring data from social networks demands close attention in order to accurately predict the ideology of the user behind posting that data on social media. Popular social media sites like Facebook, LinkedIn and Twitter provide media for the community to post, share, like and comment along with other friends within the same network. On the other hand, sites like Youtube, Flickr and Digg are also gearing up to provide various facilities for enhancing the connection with social friends. Data collected from social media have many factors associated with them: they may or may not be noisy, they may be homogeneous or heterogeneous, they may be of diverse range, etc.

The main techniques through which data are obtained from social media include: 1) collection of fresh data; 2) reuse of previously available data; 3) reuse of data not belonging to the specific person; 4) procuring data; and 5) data obtained from the internet (social media, texts and photosets).

The data, after being obtained, are initially processed. Data processing comprises multiple actions like checking the authenticity, understanding the outline, and changing and assimilating the data into a suitable format for further use. After this step,
the processed data are analyzed to deduce the actual underlying emotion of the users behind those data.

Technically, the three main methods widely used to acquire data are as follows [8].

1) Network Traffic Analysis: This is the method in which packet streams are collected from a network connection, which further helps in tracking the browsing information in the network. Due to security concerns, this method is very rarely used, especially in private groups.
2) Ad hoc Applications: This is a set of application programming interfaces (APIs) that give information regarding the account holder on a specific site and can further track the dynamic behavior of the user.
3) Crawling: This is the most popular method used to acquire data from social media. Public information is provided on asking for a specific type of data through queries. Crawling also helps to obtain data through the APIs available on some of the social media sites.

V. SOCIAL NETWORKS

Social networks can be defined as the utilization of social media to connect to known or sometimes unknown acquaintances. The purpose might be to bond with friends or relatives, contemporaries or colleagues, or with users for business purposes as well. The entire history leading to the progress of SNA can be found in [9]. Early research in this area leads back to [10], where movie data were collected to construct models from relational statistics. Data in huge volumes were considered, along with a language to extract related information and an algorithm that applies relational probability theory to produce a classifier for relational data. In [11], a small sample of emails showed that graph-based algorithms were more efficient at identifying who knows what within the organization than content-driven algorithms. Relational dependency networks have been proposed in [12], emphasizing the fact that models capable of finding dependencies result in improved categorization of data.

VI. CLUSTER AND COMMUNITY IN SOCIAL NETWORKS

Though clustering is one of the most sought after methods used to distinguish and estimate community structures in social networks, the two terms differ. Whereas various types of characteristics are considered when dealing with clusters, a community generally adheres to a single type of attribute when it is detected. The two also differ in how connections are considered: cluster discovery can be done easily on dense connections, but community discovery presumes that there is very little connection in the network.

A. Clustering and Its Applications

Clustering can be defined as the technique to identify natural groupings within a group of units [13]. The exploratory nature of the different types of traditional clustering methods, along with the comparatively new spectral clustering methods, helps to solve the problem of finding similar behavior within a set of data [14]. On social media, specific clustering mechanisms like hierarchical clustering [15] have been applied to a great extent for folksonomy, which helps in understanding the intended interest of the user as well as the subject matter of the resource [16]. Studies show that social graphs have also been used to designate clusters of connections that are of importance to the user [17], where factors like the regularity, intimacy and recency of contact help in identifying a substantially significant online association. Social cohesion has been a matter of concern even before the advent of the computer age, and hence the detection of tightly knit constituents and their scrutiny are necessary for the proper understanding of social networks [18]. Users share opinions on many different issues on social media, and distinguishing those themes afterward is not an easy task, which motivates clustering web opinions for intelligent detection of security-threatening cases. In this regard, the scalable distance-based algorithm [19] shows high precision in mining the essential issues while eliminating noise. The effect of noisy links on clustering can also be reduced by using heterogeneous random fields that combine related data and social indicators to cluster information [20]. Clustering has also been applied to combine social and geographical information to find the proximity of visits to the related clusters [21]. This method outperforms traditional spatial clustering [22] in grouping a plethora of places in a fraction of the time. With the recent outburst of social data, the problem of finding frequent itemsets in large data has been addressed with the MapReduce model [23]: the k-means clustering algorithm [24] is deployed to preprocess the data, and frequent itemsets are mined through the Apriori [25] and Eclat [26] algorithms [27]. The experiment indicates that this method yields excellent results on big data at a very high pace.

B. Community Detection in Social Networks

To manage the incessant growth in web pages to be indexed and to preserve the balance of precision and recall, it was proposed that identifying cohesive communities and linking them to relevant links solves the problem [28]. The goal of community detection is to identify densely connected groups of individuals in social networks. Traditional approaches such as Kernighan–Lin were tailored to particular problems. Hence, many algorithms have been proposed to identify community structures within networks; some differ from traditional methods of detecting communities but ultimately yield similar qualitative results, even for a large number of vertices [29], while others make use of spectral techniques, cluster analysis and the notion of modularity [30]. Some techniques have also made use of traditional approaches with some shortcuts imposed, resulting in linear time execution of the algorithms [31]. In [32], the potential energy landscapes (PEL) of Lennard-Jones clusters were studied using the network topology, and their community structure was detected. Considering the centrality measure of an edge in a network [33], algorithms were also designed to identify the most vital edges and to remove edges successively so that disconnected groups are ultimately formed. Other variations of quick yet general divisive algorithms were designed in the same year [34], [35], one recalculating the betweenness after the removal of each edge, and the other refining the computationally expensive Girvan–Newman (GN) algorithm, which required nontopological data to determine which divisions are significant to the structure. Initial research concentrated on graphs whose complete structure was known. Subsequently, Clauset [36] proposed an agglomerative algorithm that works on graphs that are dynamic or too large to be known completely. This approach explores the graph one vertex at a time, which makes a straightforward application of the algorithm appropriate for crawler programs that help unearth local communities on the internet. Among the algorithms implemented on large networks, the study in [37] showed methods to explore highly overlapping, nested and linked groups of nodes in binary networks. A stricter definition of web community detection is given in [38], which incorporates the construction of a Gomory–Hu tree [39] and yields computationally efficient results. Apart from the varied approaches based on the work of GN, one technique has been mentioned in [40], which maps the community identification problem onto finding the ground state of an infinite-range Potts spin glass through an ansatz that combines information from both present and absent links. Though finding communities in networks has been well considered since its inception, a formal definition of the same was absent; taking a cue from this, the researchers in [41] made use of benchmark techniques and expressed community detection as an inference or maximum likelihood problem. A notable example of detecting communities using the eigenvectors of matrices has been portrayed in [42], where the modularity function is rewritten in matrix terms, which leads to expressing the optimization task as a spectral problem. In another case [43], the degree concept for a single vertex was extended to subgraphs, which reduces the overall complexity, and the same was applied to produce a tool referred to as ModuleNetwork (MoNet) [43] to find community structures in large networks. Random walks were used to compute structural similarity between vertices, which, when used in hierarchical algorithms, provided efficient results [44]. Methods that take into account the smallest loops passing through a particular node were cited in [45], which is claimed to be an adaptation of the local efficiency introduced by Latora and Marchiori [46]. Raghavan et al. [47] probed into a mechanism that considers only the structure of the network and implemented a simple label propagation algorithm in which every node is initially given a unique label and, at every step, each node acquires the label that most of its neighbors currently possess. Other notable works that need mention in the community detection field are techniques in which heuristics were used to optimize the modularity [48] and a genetic-algorithm-based approach used to discover communities by optimizing a simple yet successful fitness function, as mentioned in [49]. The concept in [47] was used to detect communities in real time in large networks [50] and was extended to include information about as many communities as there are parameters, so as to handle bipartite graphs [51], while visual data mining methodologies were used to identify overlapping communities, facilitating proper constraint selection after viewing the preliminary data visualizations [52]. Keeping in mind the narrowing distance between people and social media, a coclustering framework was adopted that uses the linked data among the users and the tags applied on social media to understand an individual's group preferences [53]. Other applications of community-related services extend to modeling the semantic relevance between question–answer pairs in social community data sets, for which a structure inspired by deep belief networks and using only word features was proposed in [54]. An overall survey of the methods to detect communities, especially in social networks, can be found in [55]–[58]. Other than these, edges have been considered to yield optimum community detection results [59], multiple data sources have been combined to enhance the performance of distinguishing communities [60], and predictions have been made of how popular information within communities will spread virally [61].

VII. CENTRALITY FACTORS TO MEASURE THE INFLUENCE OF A NODE IN SOCIAL NETWORKS

Another important aspect of networks is the centrality factor: a network with high centrality is dominated by the entity that controls the flow through the network. In [62], the methods for calculating centrality measures have been revisited and two variations have been suggested, one that finds the centrality measure through the course of traffic and the other through the extension of the network. Upon finding the suitable flow, outcomes relating to the significance or contribution of a node may also be obtained. How centrality measures degrade in the presence of error has been reported in [63]. An efficient algorithm that finds the top-ranked closeness centrality vertices in a network in less time is described in [64]. Zhou et al. [65] have demonstrated through parameterized measures how terrorism is being spread through social networks by connecting to like-minded clusters, and hence their usage should be monitored. As social networks comprise innumerable links and nodes, it becomes difficult to survey them in an organized manner. To balance systematic yet flexible exploration of social networks, a system called SocialAction was offered in [66] to efficiently consider several geometric and visual network analysis parameters. The parameter ranking methodology in the paper not only allows users to gain overviews, filter nodes and locate outliers but also helps to aggregate nodes to decrease complexity, find cohesive subgroups and focus on communities of interest. Social networks have been utilized to predict mental health as portrayed in [67], where dissimilar networks were found, the nonfamily and nonfriends networks, which helped to predict the symptoms of depression within these isolated individuals. With the passage of time, the increasing prominence of large networks led to
a major problem of handling the real-time data generated from those networks to understand the basic nuances of those groups. Backstrom et al. [68] devised methods that predicted how large groups within networks grow in terms of members and content. The work also explored networks at the individual level to ascertain whether an individual joins a certain overlapping community. Wikipedia [69], being one of the biggest repositories of information, faces doubts about the trustworthiness of the material contributed to it. To take charge of this situation, Sicilia et al. [70] proposed that this problem should be confronted at the outset, as reliability issues will increase with the growth of the network. The people contributing to the wiki and the references linked to external sources were mapped through a graph, and certain metrics were applied to extract data from the network. On the commercial front, social networks started playing an important role in the decision-making process of entrepreneurs. In this regard, Aldrich and Kim [71] defined three models of networks: random, small-world and truncated scale-free. Two types of association formation models were found, one formed logically and the other based on individual relationships. Although many variations of the system were proposed, it was recommended that, instead of relying on a specific type, entrepreneurs consider the entire entrepreneurial process if their business is to grow. Businessmen promoting their brands utilized social networks to reach out to a maximum number of customers. Further availability of data corresponding to a specific consumer network could help to identify the controller using the concept of centrality, though ascertaining the parameters to be considered still remains an issue. In [72], real-time network data were considered to evaluate varied centrality parameters for spreading messages within a network. SenderRank [72] is a novel centrality measure introduced in the paper that outperformed many existing measures, but the category of the message and its contents, along with the type of network, affect the prediction of favorable customers. As social networks seem to shape the habits of an individual, they should also reflect the mental state of a person along with geographical preferences, which ultimately influence the travel inclination of that person. However, the challenges faced in collecting data for this study showed that many factors, such as economic stability, mentality and future plans, affect the prediction [73]. On the academic front, retrieving the profiles of researchers through the ArnetMiner architecture has been explored in [74], where a unified tagging approach has been used to extract profiles of researchers from the internet without human intervention. This tool also helps in assimilating the existing information related to publications from online libraries into the network, helping to model the scholarly network in its entirety and provide search facilities to academicians. The construction of the tool also provides a probabilistic solution to tackle the redundancy problems arising from similar names. With the advent of mobile phones in our daily lives, Eagle et al. [75] showed how information could be collected from mobile phones instead of self-reports. This process resulted in a vivid representation of the actual dynamics between individuals and also made it possible to study the progression of these associations with time. However, privacy and security were to be considered the most important criteria while using data of these types. In this context, a security breach can be a matter of serious concern even for users with private profiles. Hence, users should be aware of their connections and participation in groups, as these can leak private information intended to be kept hidden [76]. A social networking middleware, MobiClique, has been presented in [77]; it connects with other devices when they meet, avoiding the need for an intermediary machine to exchange messages. Messages can be easily dispersed across separate networks through this middleware. To develop a better analysis of social networks, the fresh idea of ModuLand was presented in [78], where linked parts of a community that meet a selected centrality threshold are termed modules. The ModuLand technique comprises four steps: establishing the influence functions of a node or link, creating a community landscape, determining the hills of the community landscape, and resolving a hierarchy of higher-level networks.

VIII. SPAM DETECTION IN SOCIAL NETWORKS

Based on multiple assumptions, an experiment was performed and its results were demonstrated in [79], where the spread of a health behavior was implemented through social networks. Results showed that the greater the span and number of clusters in the network, the greater the effect on spreading the health behavior. It was observed that clustered networks performed better in adopting the behavior, and in less time, than random networks. As users may contribute any content or make use of the social platform to spread unwanted content with the help of social networks, Gao et al. [80] proposed a method to measure and distinguish spam campaigns performed through fake accounts in online networks. Based on a data set of messages from “Facebook” walls, it was observed that around 97% of the accounts were formed for the sole purpose of spreading spam in the networks and that the spam messages are usually sent in the early hours of the morning. Hence, it was more or less established that social networks were a target for spreading spam and malware and that detection methods should be devised for online social spam. Another framework was proposed in [81] that could be adopted by available social networking sites to restrict spam. This framework had many advantages, like identifying spam in the network and spreading information about it through the entire network. It was observed that the model worked well in terms of accuracy for a large amount of data. Hence, given the rise of enormous volumes of data on social sites, it would prove helpful in efficiently detecting spam on those sites. It was also anticipated that new social networking sites could prevent spam at an early stage if this method was adopted. Multiple classifiers were used whose results were combined through AND, OR, majority voting and Bayesian strategies to detect spam. As every entity has a positive and negative aspect to it, social networks bear no
Authorized licensed use limited to: Universiti Teknikal Malaysia Melaka-UTEM. Downloaded on April 17,2020 at 00:08:03 UTC from IEEE Xplore. Restrictions apply.
CHAKRABORTY et al.: SURVEY OF SENTIMENT ANALYSIS FROM SOCIAL MEDIA DATA 455
exception. An algorithm combining graph theory and machine learning concepts was also proposed for the same purpose in [82], although it can be applied only to small-scale networks. The specialty of this method is that it requires only the graph topology to detect spam. While social networking sites like Facebook are being utilized for malicious activities, they are also used for educational purposes [83]. As the teaching paradigm started to shift from conventional methods to more technologically enriched techniques, it was found that, back in 2010, 73% of teachers had an account on Facebook, compared to 93% of students. It was also seen that college students checked Facebook as regularly as email, whereas faculty members checked their email more often than the social networking site. However, neither college students nor faculty considered Facebook a noteworthy means of sharing instruction, so it lived up to its name as a social site rather than an educational one. In contrast, another work [84] employs the Apriori algorithm along with association rules to understand the role of Facebook in connecting students with each other versus students with teachers. Considering aspects such as the number of times students check their social media accounts and the time they spend on Facebook, it was found that students believe Facebook to be the best medium for accessing rich information.
IX. INFLUENCE ANALYSIS IN SOCIAL NETWORKS
Along with online social networks, mobile social networks were slowly making their mark in the digital world. Accordingly, to spread information and to convince people to adopt it, significant individuals had to be identified. A greedy approach was applied within communities to mine the most significant nodes in a mobile social network [85]. First, the communities to be used as a medium for disseminating the information were selected, followed by a dynamic programming methodology to choose the dominant nodes. This method shows better results than the traditional greedy approach, with minimal error. To facilitate operations across social networks, a framework was designed in [86] to find as many as possible of the user profiles that are being used by one person. To accomplish the task, the parameters considered for deciding whether two profiles belong to the same person were assigned weights both manually and automatically, semantic comparisons were made, and aggregate functions were used to take decisions. The results compared well with the existing traditional techniques. Another interesting service for active participation in social networks is the microblogging site Twitter. Twitter seems to behave contrary to ordinary networks of individuals. As posts on its platform are limited to 140 characters, analysis of the tweets shows that most posts on Twitter are based on headlines or trending news [87]. As with other social networking sites, an empirical investigation was also performed on Twitter in search of prevalent cybercriminal systems. The internal social associations reveal that illicit accounts are socially linked and that the influential hubs of the criminal system tend to follow such accounts more. Incorporating Mr. SPA [88] categorizes the accounts into divisions such as social butterflies, which indiscriminately follow back any account on Twitter. It should be kept in mind, however, that such accounts are very rare. Another algorithm, Criminal Account Inference [88], is also applied in this work to gather more details by analyzing social accounts and calculating the semantic similarity among accounts. A case study in [89] collected data from various sites, including homepages, blogs, and Twitter, against the backdrop of the National Assembly elections of Korea in 2010. The aim was to check whether this kind of social network was used purely to converse with fellow members and citizens or with some other intent. Experimentation showed that politicians were more connected to their contemporaries on Twitter than to citizens. A study conducted on 15 students who were learning to apply social network analysis (SNA) to data collected from the internet showed that beginners can easily grasp the concepts of SNA using NodeXL [90] within a short span of time. The intention was to provide beginners with high-quality material for studying online communities of their choice [91]. Initially, the network analysis and visualization (NAV) tool was used and the SNA tools were made available to the beginners. Another paper presents a systematic approach that helps individuals visually navigate very large networks [92]. This approach helps to view social communities as well as the exchanges among them in a combined view. The network structure is stored in the form of a social graph, and the corresponding interconnected subgraphs are presented visually. As already discussed, identifying potential users in clusters or communities has always been a topic of research; continuing in the same vein, [93] presents a technique to identify the most significant person based on the posts exchanged on a given theme. The first step prepares a graph that exhibits the relationships between the posts on the specific theme, and the next step prepares a user graph that represents the influential users. Based on attributes collected from both graphs, the most significant user is selected. Beyond the significant users, the behavior of online users is also influenced by their peers, as assessed with entropy-based techniques in [94]. To understand the nature of links present in social networks, clustering together with a combined sorting algorithm has been used to differentiate between positive and negative links [95]. The method ensures that there is a social equilibrium between the clusters, and preprocessing is done on the network before calculating the sign of the link. In contrast, a study in [96] argues that, as most previous works are based on supervised learning, the hidden principles that trigger a specific behavior of social members are not considered. Hence, a deep belief network-based technique using unsupervised learning has been proposed to determine links in the network. Beyond conventional machine learning models, deep learning mechanisms have also been used for modeling individual behavior as well as for prediction in health social networks.
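The greedy selection of significant nodes surveyed in this section [85] can be illustrated with a small sketch. The Python example below is not the community-based dynamic programming method of [85]. It is a minimal sketch that assumes influence can be approximated by one-hop coverage, that is, a chosen node reaches itself and its direct neighbors. The function name greedy_seed_selection, the adjacency-dictionary input format, and the toy graph are all illustrative assumptions.

```python
from typing import Dict, List, Optional, Set


def greedy_seed_selection(graph: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k seed nodes by greedy one-hop coverage.

    Assumption (not taken from [85]): a seed "influences" itself and its
    direct neighbors, so each round picks the node that newly covers the
    largest number of still-uncovered nodes.
    """
    covered: Set[str] = set()
    seeds: List[str] = []
    for _ in range(k):
        best_node: Optional[str] = None
        best_gain = 0
        # Sorted iteration gives deterministic tie-breaking (first name wins).
        for node in sorted(graph):
            if node in seeds:
                continue
            gain = len(({node} | graph[node]) - covered)
            if gain > best_gain:
                best_node, best_gain = node, gain
        if best_node is None:
            # No remaining node covers anything new, so stop early.
            break
        seeds.append(best_node)
        covered |= {best_node} | graph[best_node]
    return seeds


# Hypothetical undirected toy network: a hub "a" plus a chain d-e-f-g.
toy_graph: Dict[str, Set[str]] = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e", "g"},
    "g": {"f"},
}

print(greedy_seed_selection(toy_graph, 2))  # ['a', 'f']
```

On the toy graph, the first pick is the hub a, which covers four nodes, and the second pick is f, which reaches the three nodes that a leaves uncovered. Asking for more seeds than are useful simply returns the two seeds, because the loop stops once no node adds new coverage.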
456 IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 7, NO. 2, APRIL 2020
A study of how social networks could influence personal conduct, and of how assumptions could be made to select parameters for predicting the behavior of a person, is presented in detail in [97]. Factors such as personal motivation, implicit and explicit social influence, and environmental factors have been considered to predict the expected activity of the individual. Methods have also been presented in [98] to efficiently partition ego networks using a genetic-algorithm-based K-means clustering scheme combined with information derived from the behavior of the social circles. Promising results were achieved in studying the development of associations over time on social networking sites like Facebook and Twitter. Considering that Twitter data are extremely noisy and come loaded with additional information, a Twitter-Network topic model has been proposed to fully model the network, employing hierarchical Poisson–Dirichlet processes for the text and a Gaussian process random function for social network modeling [99]. The role of social media sites like Twitter in emergencies is widely accepted, but it has to be understood that the scarcity of labeled data at the onset of an emergency delays the learning process, and hence prediction takes time. This problem has been addressed in [100], where a convolutional neural network has been used to detect and swiftly categorize important tweets at the time of a disaster.
Social networks have had a major impact on the socio-economic facets of the world and will continue to do so in the coming years. Some of the many crucial factors that shape social networks are mentioned here, along with references to the surveys conducted in these areas. Identifying the significant users in a social network is a matter of serious concern to researchers worldwide. A survey of the techniques used, based on the structure of the network as well as the content available in the network, is given in [101]. The trend toward which social data are pointing, termed opinion propagation, is also a matter of concern in social analytics. Cercel and Trausan-Matu [102] present a detailed survey of the opinion propagation process along with the scope for research in that field.
With the recent surge of data on the internet, the need to gain meaningful insight from huge amounts of data calls for automatic analysis processes. Data mining techniques are appropriate for these issues. A list of possible procedures for understanding social data from various perspectives is available in [103].
X. SOCIAL MEDIA DATA USED IN SOCIAL NETWORKS
Social media facilitates the vast distribution of information through the virtual medium. The content on social media may vary from personal data to documents, photos, and official data. It was initially adopted as a means of communicating with each other, but as time goes by it has become entirely business based, as it has the advantage of reaching the entire population concurrently. A detailed study of the contributions made in the field of social media is presented in Supplementary Table I [104]–[174].
XI. ANALYZING SENTIMENTS ON SOCIAL MEDIA
As observed from Supplementary Table I, social media mainly deals with comments from its users, so it is very important to analyze them efficiently to help understand the emotions and opinions of the masses in general. The immediacy with which people use social media platforms to present their views on each and every event makes it necessary to explore sentiments and to find ways to evaluate them optimally. The priority given to including the sentiments of the public through online platforms has risen in the last few years. Sentiment analysis is defined as the method in which automatic procedures are formulated to infer the sentiment of a text. The subjective information identified through computation helps in forming strategic insights to be utilized by decision makers. Owing to rapid technological advancement and unbounded access to social media, sentiment analysis is constantly gaining popularity in the current business scenario. Sentiment analysis requires natural language processing and its varied tasks, such as analysis of microtexts, detection of irony, anaphora resolution, and context as well as feature identification.
Social media data from several domains are collected through multiple means to extract the sentiments from texts. The presence of multiple languages within the texts, informal spellings due to message size constraints, spelling mistakes, and grammatical and logical errors make the task of analyzing sentiments on social media difficult. In some cases, document representation in the form of N-gram graphs has been presented to evaluate content-based sentiment analysis [175]. It ably captures the sentiments of words by matching with a part of the string and makes no assumptions about the underlying language.
A. Sentiment Analysis Techniques
There are two main methods of extracting sentiments, viz., the lexicon-based approach and the classification-based approach. Using the former, [176] shows the performance of the Semantic Orientation Calculator (SO-CAL). Initially, emotion-related expressions (comprising different parts of speech) are used to evaluate the valence shifters that are responsible for communicating the attitude of the text, and finally the sentiment is calculated. Two assumptions have been made while calculating sentiments: first, that sentiments are independent of context, and second, that sentiments can be expressed numerically. This work emphasizes a treatment of negation that shifts the value of a word in the presence of a negator. Considering health as one of the prime issues of our lives, a study [177] shows how analyzing short text messages collected from Twitter could help in understanding the emotions of people toward vaccination against the respiratory tract infection A(H1N1). All the collected messages were related to vaccination and also provided the geographical position of the person behind the tweets. The accuracy of mining the sentiment was 84.29%, using a Naïve Bayes classifier [178] to identify positive and negative tweets and a maximum entropy classifier to identify neutral and irrelevant tweets [179]. Similarly, for the
case of disasters, a method has been presented in [180], in which visual analysis has been used to exploit location-related tweets bearing on the emotions of the public in general. The outbreak of Ebola was considered in this case, and questions such as whether differences between several sentiment classifiers can be exposed through the model and whether an optimistic attitude persists during disasters were addressed in the paper. Another instance shows the use of a hybrid unsupervised approach along with language processing, dictionary-based methods, and ontology procedures to create classifiers for emotions expressed on social media [181]. The Palavras software is used in the experiments to determine the sentiments from Portuguese texts in social networks. The process comprises collecting the corpus relating to a particular topic, normalizing the texts, recognizing the relevant entities, uncovering the context in which each entity was found, selecting identifying features, detecting sentiment, consolidating the sentiment scores, aggregating the retrieved data, and finally analyzing them.
B. Opinion Mining
While sentiment analysis deals with judging the feeling within texts, opinion mining is said to be the process of judging the attitude of people about an entity. A detailed study of opinion mining on social media, including a detailed problem definition, categorization of sentiments along various aspects, rules for designating an opinion, feature mining, extraction of comparative opinions, and finally spam detection within opinions, is available in [182]. Sentiments regarding the transportation service in Milan were analyzed on Twitter to provide enhanced itinerary services; the sentiments could also be used to adjust the services to the preferences of travelers [183]. Tweets both originating from and addressed to the respective transport agency were collected, and the model designed in the paper analyzed the contents to categorize the events as well as the attitudes about the transportation service. One of the straightforward rule-based sentiment analysis methods, named VADER, is described in [184] and results in a 0.96 accuracy compared to other methods. Factors relating to polarity and magnitude were considered to design a general-purpose, valence-based, manually crafted gold-standard word list suited to microblog texts of limited length.
C. Optical Sentiment Analysis
Human emotions are conveyed not only through texts but also through pictures. An optical sentiment analysis classification approach [185] relying on deep convolutional neural networks was applied over a million labeled images collected from Flickr. This method, implemented on the then-novel deep learning framework Caffe, helped to determine the emotions portrayed in the images through adjective-noun pairs automatically extracted from the images. This approach performed well against conventional methods such as support vector machine (SVM) classification techniques. Another deep convolutional neural framework has been employed in [186], where the automatic yet precise feature detection characteristic of deep learning has been utilized to find the poorly labeled images among half a million images from Flickr. These weak labels are fine-tuned with the help of a progressive and domain-transfer approach, which in turn hones the neural network. Beyond this, a large manually labeled visual sentiment ground-truth data set was created in this work via Amazon Mechanical Turk. From both works, it could be inferred that well-trained convolutional neural networks outperformed existing optical sentiment analysis methods on social media. Yuan et al. [187], however, are of the opinion that both texts and images are helpful in detecting the emotion of a user on social media. First, low-level features are extracted from the SUN database [188] and classified to produce 102 mid-level attributes, which are then used to predict sentiments. In addition, facial emotions are predicted using eigenfaces. The complete method performs well in identifying strong positive and negative sentiments, showing 82% accuracy. An example of this type of application is also found for microblogs in [189], which, along with proposing a framework, helps to acquire both a micro-level and a large-scale view of the details of the extracted emotions. Sentiment analysis on social media has been explored in many languages, such as Czech in [174], where supervised machine learning techniques have been employed for document-level emotion detection on 10 000 Facebook posts.
To help industries facing strong competition from each other, a comparative analysis of what people are saying on social media has been modeled in [190], which provides an option to devise the methods required for product-specific promotion strategies. The proposed model has also been implemented in an analytics tool, VOZIQ, and further tested on five companies to produce meaningful business reports. This framework aims to find the most important companies related to a specific business type and provides a detailed report of their performance, focusing on the vital features and ultimately paving the way for smart decision making. The Dynamic Architecture for Artificial Neural Networks (DAN2) [191] addresses such an issue, and its efficacy is tested on a Starbucks-related tweet data set, with above 80% accuracy in all test cases. The features in this case were obtained through supervised feature engineering, resulting in a feature vector with exactly seven dimensions. Three-class and five-class classification of emotions was applied to the data set to provide effective insights into mild sentiments, which might be required for crucial brand marketing strategies. As mentioned earlier, emotion detection and analysis have been performed on almost all languages of the world; one such multilingual application is presented in [192], where adjective-noun pairs have been used to create a large multilingual optical emotion system considering data in 12 languages from diverse origins.
D. Multiple Facets of Sentiments and Its Analysis
Other aspects of social media, where texts are posted along with images, gained attention in [193], which facilitated both
single-view and multi-view analysis of sentiments on social media. The multi-view sentiment analysis (MVSA) data set can be considered a benchmark that revealed positive correlations between textual and optical data. Data from social media can also be used to identify the unresolved issues of an area of Sejong City, as explained in [194], where the sentiment analysis model has been assessed using the Naïve Bayes classifier at an accuracy of 75%. The mere availability of a lexical resource for sentiment analysis, for example, SentiWordNet, is not enough; accuracy also matters when issues relating to trends in public opinion are involved. In [195], SentiMI has been created, which separates the subjective instances from the objective ones in SentiWordNet, extracts the parts of speech, and evaluates the mutual information for both positive and negative terms. Among the methods mentioned above, a novel approach was the focus of [196], where discourse processes were considered to detect the overall sentiment on a topic. Several differences and asymmetries between explicit and implicit expressions, along with the direct effect that discourse patterns have on sentiment strength, led to the implementation of the above concept. First, the level of arousal and the intensifiers and attenuators acting on emotion-laden words are studied; next, ways in which emotions can be expressed without the use of emotion-laden words are devised; finally, it is shown how the coherence between phrases determines the overall nature of a review. The first work on irony recognition was reported in [197], where sentiments were explored to find the sarcasm within texts. The use of personalized features and pretrained models for feature extraction yielded high-performance results. The classification was done by applying a CNN first, followed by an SVM [198].
An overall description of the advancement in the field of sentiment analysis over the last decade has been given in the previous sections. Supplementary Table II [199]–[218] provides, in tabular form, some major details about sentiment analysis.
In recent times, driven by the constant urge to identify sentiment accurately, researchers have been working extensively on varied aspects. Not only are individual techniques explored daily, but hybrid techniques are also used, which turn out to produce better results for accurate sentiment detection. A visual sentiment analysis framework that combines low- and mid-level image features is proposed in [219], resulting in a 9% rise in accuracy. Supervised learning methods such as K-nearest neighbor (KNN) [220] and SVM [198] have been used to extract the emotions within images. Features are designed using the singular value decomposition (SVD) [221] and hue-saturation-intensity (HSI) [222] methods. Sentiment extraction from social media to create health-related awareness without the influence of medicines was studied in [223]. The most popular medicine category with respect to handling, value, cost, and regularity of procurement could be derived owing to the high precision achieved in the process, and it was also observed that people currently pay attention to health problems pertaining to eye, skin, and sexual well-being.
XII. CONCLUSION AND FUTURE SCOPE
It is high time that humans prioritize the unremitting rise of data from social networks. As almost all real-life complex problems, ranging from biological to technological, can be represented by means of social networks, their challenges should also be addressed. Rumor detection [224], echoing of opinions, and trends of online conversations leading to chaotic situations and community shaming [225] bring a change to preconceived notions and help to understand social prevalence in the form of the number of likes, shares, and retweets. Issues such as finding the appropriate content and the right time to post are some of the important ones that need to be addressed in social networks before they are absorbed completely into the lives of humans. Even the detection of false comments should be addressed at the micro level of social sites like Twitter to avoid unnecessary harassment from spam [226], [227]. Health issues of serious concern should be addressed in further research so that they make a strong impact on social media users. It would be fitting in this era if a unified linguistic model were prepared that understands the sentiments of users while they are posting comments on social media. To make machines think like humans, object perception must be focused on so that the look and feel of any object is understood correctly, as humans do, and simultaneously behavioral patterns should be studied from responses to certain events [228]. Video analysis is a major research field that might gain popularity in the coming years. Influential nodes, which are responsible for sharing appropriate information, must be constrained by some form of feature mining so that irrelevant information does not become viral within a fraction of a second. Last but not least, personalization of content portrayal on social media and social networks should be given utmost importance to enhance the quality of web content. Effective methods to rate the comments of users on social sites for recommendation systems should be pursued [229]. Further, reducing ambiguity in the flood of data being generated daily in these networks always provides ample scope for research. Importance should also be given to the amalgamation of literature and technology, where consistency between original novels and their visual adaptations has recently been dealt with [230]. The authors of this article have recently entered the field of sentiment analysis, making minor contributions in finding the near semantic meaning of a word in sentences as well as in documents [231], [232], and have presented a survey of the works performed in native Indian languages [233].
This one-of-a-kind paper presents a detailed survey of social networks and their related terms. The works that have been accomplished relating to clusters, communities, and social networks have been described within its scope. This article mainly aims to bring out the shortfalls of a wide variety of papers, making it easy for researchers to apply sentiment analysis methods after accumulating data from social media. The novelties of papers in sentiment analysis have also been mentioned to help scholars think of innovative ideas to train machines more efficiently in recognizing the opinion of the masses. Papers from the 20th century have been considered mainly following
the rise in the trend of social media data and its corresponding analysis. It is anticipated that more of the omnipresent deep learning mechanisms can be employed for social networks, as they automatically detect features from patterns and hence will provide more structure to unstructured information with a minimum amount of human intervention.
REFERENCES
[1] M. Hoffman, D. Steinley, K. M. Gates, M. J. Prinstein, and M. J. Brusco, “Detecting clusters/communities in social networks,” Multivariate Behav. Res., vol. 53, no. 1, pp. 57–73, 2018, doi: 10.1080/00273171.2017.1391682.
[2] J. Leskovec, “Social media analytics: Tracking, modeling and predicting the flow of information through networks,” in Proc. 20th Int. Conf. Companion World Wide Web, Mar. 2011, pp. 277–278.
[3] F. Neri, C. Aliprandi, F. Capeci, M. Cuadros, and T. By, “Sentiment analysis on social media,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2012, pp. 919–926.
[4] C. Bron and J. Kerbosch, “Algorithm 457: Finding all cliques of an undirected graph,” Commun. ACM, vol. 16, no. 9, pp. 575–577, 1973.
[5] R. L. Breiger, S. A. Boorman, and P. Arabie, “An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling,” J. Math. Psychol., vol. 12, no. 3, pp. 328–383, 1975.
[6] S. Brin and L. Page, “The anatomy of a large-scale hypertextual Web search engine,” Comput. Netw. ISDN Syst., vol. 30, nos. 1–7, pp. 107–117, 1998.
[7] J. M. Kleinberg, “Authoritative sources in a hyperlinked environment,” J. ACM, vol. 46, no. 5, pp. 604–632, 1999.
[8] C. Canali, M. Colajanni, and R. Lancellotti, “Data acquisition in social networks: Issues and proposals,” in Proc. Int. Workshop Services Open Sources (SOS), Jun. 2011, pp. 1–12.
[9] B. Wellman, “The development of social network analysis: A study in the sociology of science,” Contemp. Sociol., vol. 37, no. 3, p. 221, 2008.
[10] D. Jensen and J. Neville, “Data mining in social networks,” in Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers (Computer Science Department Faculty Publication Series). Amherst, MA, USA: Univ. of Massachusetts, 2003, pp. 287–302.
[11] C. S. Campbell, P. P. Maglio, A. Cozzi, and B. Dom, “Expertise identification using email communications,” in Proc. 12th Int. Conf. Inf. Knowl. Manage., Nov. 2003, pp. 528–531.
[23] J. Tang, J. Sun, C. Wang, and Z. Yang, “Social influence analysis in large-scale networks,” in Proc. 15th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jun. 2009, pp. 807–816.
[24] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A k-means clustering algorithm,” J. Roy. Stat. Soc. C, Appl. Statist., vol. 28, no. 1, pp. 100–108, 1979.
[25] G. Schwarz, “Estimating the dimension of a model,” Ann. Statist., vol. 6, no. 2, pp. 461–464, 1978.
[26] C. Borgelt, “Efficient implementations of apriori and eclat,” in Proc. IEEE ICDM Workshop Frequent Itemset Mining Implement. (FIMI), Nov. 2003, pp. 1–10.
[27] S. Gole and B. Tidke, “Frequent itemset mining for big data in social media using ClustBigFIM algorithm,” in Proc. Int. Conf. Pervasive Comput. (ICPC), Jan. 2015, pp. 1–6.
[28] G. W. Flake, S. Lawrence, and C. L. Giles, “Efficient identification of Web communities,” in Proc. KDD, Aug. 2000, pp. 150–160.
[29] M. E. J. Newman, “Fast algorithm for detecting community structure in networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 69, no. 6, pp. 66133–66138, 2004.
[30] L. Donetti and M. A. Munoz, “Detecting network communities: A new systematic and efficient algorithm,” J. Stat. Mech., Theory Exp., vol. 2004, no. 10, 2004, Art. no. P10012.
[31] A. Clauset, M. E. Newman, and C. Moore, “Finding community structure in very large networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 70, no. 6, 2004, Art. no. 066111.
[32] C. P. Massen and J. P. K. Doye, “Identifying communities within energy landscapes,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 71, no. 4, 2005, Art. no. 046101.
[33] S. Fortunato, V. Latora, and M. Marchiori, “Method to find community structures based on information centrality,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 70, no. 5, 2004, Art. no. 056104.
[34] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 69, no. 2, 2004, Art. no. 026113.
[35] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi, “Defining and identifying communities in networks,” Proc. Nat. Acad. Sci. USA, vol. 101, no. 9, pp. 2658–2663, 2004.
[36] A. Clauset, “Finding local community structure in networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 72, no. 2, 2005, Art. no. 026132.
[37] G. Palla, I. Derényi, I. Farkas, and T. Vicsek, “Uncovering the overlapping community structure of complex networks in nature and society,” Nature, vol. 435, no. 7043, p. 814, 2005.
[12] J. Neville and D. Jensen, “Collective classification with relational [38] H. Ino, M. Kudo, and A. Nakamura, “Partitioning of Web graphs
dependency networks,” in Proc. Workshop Multi-Relational Data Min- by community topology,” in Proc. 14th Int. Conf. World Wide Web,
ing (MRDM), 2003, p. 77. May 2005, pp. 661–669.
[13] S. M. Van Dongen, “Graph clustering by flow simulation,” [39] R. E. Gomory and T. C. Hu, “Multi-terminal network flows,” J. Soc.
Ph.D. dissertation, Dept. Center Math. Comput. Sci., Univ. Utrecht, Ind. Appl. Math., vol. 9, no. 4, pp. 551–570, 1961.
Utrecht, The Netherlands, 2000. [40] J. Reichardt and S. Bornholdt, “Statistical mechanics of community
[14] U. Von Luxburg, “A tutorial on spectral clustering,” Statist. Comput., detection,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip.
vol. 17, no. 4, pp. 395–416, 2007. Top., vol. 74, no. 1, 2006, Art. no. 016110.
[15] S. C. Johnson, “Hierarchical clustering schemes,” Psychometrika, [41] M. B. Hastings, “Community detection as an inference problem,” Phys.
vol. 32, no. 3, pp. 241–254, 1967, doi: 10.1007/BF02289588. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 74,
[16] A. Shepitsen, J. Gemmell, B. Mobasher, and R. Burke, “Personalized no. 3, 2006, Art. no. 035102.
recommendation in social tagging systems using hierarchical cluster- [42] M. E. J. Newman, “Finding community structure in networks using
ing,” in Proc. ACM Conf. Recommender Syst., Oct. 2008, pp. 259–266. the eigenvectors of matrices,” Phys. Rev. E, Stat. Phys. Plasmas Fluids
[17] M. Roth et al., “Suggesting friends using the implicit social graph,” in Relat. Interdiscip. Top., vol. 74, no. 3, 2006, Art. no. 036104.
Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, [43] F. Luo, J. Z. Wang, and E. Promislow, “Exploring local community
Jul. 2010, pp. 233–242. structures in large networks,” Web Intell. Agent Syst., Int. J., vol. 6,
[18] V. E. Lee, N. Ruan, R. Jin, and C. Aggarwal, “A survey of algorithms no. 4, pp. 387–400, 2008.
for dense subgraph discovery,” in Managing and Mining Graph Data. [44] P. Pons and M. Latapy, “Computing communities in large networks
Boston, MA, USA: Springer, 2010, pp. 303–336. using random walks,” in Proc. Int. Symp. Comput. Inf. Sci. Berlin,
[19] C. C. Yang and T. D. Ng, “Analyzing and visualizing Web opinion Germany: Springer, Oct. 2005, pp. 284–293.
development and social interactions with density-based clustering,” [45] I. Vragović and E. Louis, “Network community structure and loop
IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 41, no. 6, coefficient method,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat.
pp. 1144–1155, Mar. 2011. Interdiscip. Top., vol. 74, no. 1, 2006, Art. no. 016105.
[20] G.-J. Qi, C. C. Aggarwal, and T. S. Huang, “On clustering heteroge- [46] V. Latora and M. Marchiori, “Efficient behavior of small-world net-
neous social media objects with outlier links,” in Proc. 5th ACM Int. works,” Phys. Rev. Lett., vol. 87, Oct. 2001, Art. no. 198701.
Conf. Web Search Data Mining, Feb. 2012, pp. 553–562. [47] U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm
[21] J. Shi, N. Mamoulis, D. Wu, and D. W. Cheung, “Density-based place to detect community structures in large-scale networks,” Phys. Rev. E,
clustering in geo-social networks,” in Proc. ACM SIGMOD Int. Conf. Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 76, no. 3, 2007,
Manage. Data, Jun. 2014, pp. 99–110. Art. no. 036106.
[22] S. Scellato, A. Noulas, R. Lambiotte, and C. Mascolo, “Socio-spatial [48] V. D. Blondel, J. L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast
properties of online location-based social networks,” in Proc. 5th Int. unfolding of communities in large networks,” J. Stat. Mech., Theory
AAAI Conf. Weblogs Social Media, Jul. 2011, pp. 1–8. Exp., vol. 2008, no. 10, 2008, Art. no. P10008.
[49] C. Pizzuti, “Ga-net: A genetic algorithm for community detection in social networks,” in Proc. Int. Conf. Parallel Problem Solving Nature. Berlin, Germany: Springer, Sep. 2008, pp. 1081–1090.
[50] I. X. Y. Leung, P. Hui, P. Lio, and J. Crowcroft, “Towards real-time community detection in large networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 79, no. 6, 2009, Art. no. 066107.
[51] S. Gregory, “Finding overlapping communities in networks by label propagation,” New J. Phys., vol. 12, no. 10, 2010, Art. no. 103018.
[52] J. Chen, O. Zaïane, and R. Goebel, “A visual data mining approach to find overlapping communities in networks,” in Proc. Int. Conf. Adv. Social Netw. Anal. Mining, Jul. 2009, pp. 338–343.
[53] X. Wang, L. Tang, H. Gao, and H. Liu, “Discovering overlapping groups in social media,” in Proc. IEEE Int. Conf. Data Mining, Dec. 2010, pp. 569–578.
[54] B. Wang, X. Wang, C. Sun, B. Liu, and L. Sun, “Modeling semantic relevance for question-answer pairs in Web social communities,” in Proc. 48th Annu. Meeting Assoc. Comput. Linguistics, Jul. 2010, pp. 1230–1238.
[55] S. Parthasarathy, Y. Ruan, and V. Satuluri, “Community discovery in social networks: Applications, methods and emerging trends,” in Social Network Data Analytics. Boston, MA, USA: Springer, 2011, pp. 79–113.
[56] S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos, “Community detection in social media,” Data Mining Knowl. Discovery, vol. 24, no. 3, pp. 515–554, May 2012.
[57] M. Plantié and M. Crampes, “Survey on social community detection,” in Social Media Retrieval. London, U.K.: Springer, 2013, pp. 65–85.
[58] J. Xie, S. Kelley, and B. K. Szymanski, “Overlapping community detection in networks: The state-of-the-art and comparative study,” ACM Comput. Surv., vol. 45, no. 4, p. 43, 2013.
[59] G. J. Qi, C. C. Aggarwal, and T. Huang, “Community detection with edge content in social media networks,” in Proc. IEEE 28th Int. Conf. Data Eng., Apr. 2012, pp. 534–545.
[60] J. Tang, X. Wang, and H. Liu, “Integrating social media data for community detection,” in Modeling and Mining Ubiquitous Social Media. Berlin, Germany: Springer, 2011, pp. 1–20.
[61] L. Weng, F. Menczer, and Y. Y. Ahn, “Virality prediction and community structure in social networks,” Sci. Rep., vol. 3, Aug. 2013, Art. no. 2522.
[62] S. P. Borgatti, “Centrality and network flow,” Social Netw., vol. 27, no. 1, pp. 55–71, 2005.
[63] S. P. Borgatti, K. M. Carley, and D. Krackhardt, “On the robustness of centrality measures under conditions of imperfect data,” Social Netw., vol. 28, no. 2, pp. 124–136, 2006.
[64] K. W. Axhausen, “Social networks, mobility biographies, and travel: Survey challenges,” Environ. Planning B, Planning Des., vol. 35, no. 6, pp. 981–996, 2008.
[65] Y. Zhou, E. Reid, J. Qin, H. Chen, and G. Lai, “US domestic extremist groups on the Web: Link and content analysis,” IEEE Intell. Syst., vol. 20, no. 5, pp. 44–51, Sep. 2005.
[66] A. Perer and B. Shneiderman, “Balancing systematic and flexible exploration of social networks,” IEEE Trans. Vis. Comput. Graphics, vol. 12, no. 5, pp. 693–700, Nov. 2006.
[67] K. L. Fiori, T. C. Antonucci, and K. S. Cortina, “Social network typologies and mental health among older adults,” J. Gerontol. B, Psychol. Sci. Social Sci., vol. 61, no. 1, pp. P25–P32, 2006.
[68] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan, “Group formation in large social networks: Membership, growth, and evolution,” in Proc. 12th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2006, pp. 44–54.
[69] J. Voß, “Measuring wikipedia,” in Proc. ISSI 10th Int. Conf. Int. Soc. Scientometrics Informetrics, 2005.
[70] M. A. Sicilia, N. T. Korfiatis, M. Poulos, and G. Bokos, “Evaluating authoritative sources using social networks: An insight from Wikipedia,” Online Inf. Rev., vol. 30, no. 3, pp. 252–262, 2006.
[71] H. E. Aldrich and P. H. Kim, “Small worlds, infinite possibilities? How social networks affect entrepreneurial team formation and search,” Strategic Entrepreneurship J., vol. 1, nos. 1–2, pp. 147–165, 2007.
[72] C. Kiss and M. Bichler, “Identification of influencers—Measuring influence in customer networks,” Decis. Support Syst., vol. 46, no. 1, pp. 233–253, 2008.
[73] K. Okamoto, W. Chen, and X.-Y. Li, “Ranking of closeness centrality for large-scale social networks,” in Proc. Int. Workshop Frontiers Algorithmics. Berlin, Germany: Springer, Jun. 2008, pp. 186–195.
[74] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su, “Arnetminer: Extraction and mining of academic social networks,” in Proc. 14th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2008, pp. 990–998.
[75] N. Eagle, A. S. Pentland, and D. Lazer, “Inferring friendship network structure by using mobile phone data,” Proc. Nat. Acad. Sci. USA, vol. 106, no. 36, pp. 15274–15278, 2009.
[76] E. Zheleva and L. Getoor, “To join or not to join: The illusion of privacy in social networks with mixed public and private user profiles,” in Proc. 18th Int. Conf. World Wide Web, Apr. 2009, pp. 531–540.
[77] A. K. Pietiläinen, E. Oliver, J. LeBrun, G. Varghese, and C. Diot, “MobiClique: Middleware for mobile social networking,” in Proc. 2nd ACM Workshop Online Social Netw., Aug. 2009, pp. 49–54.
[78] I. A. Kovács, R. Palotai, M. S. Szalay, and P. Csermely, “Community landscapes: An integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics,” PLoS ONE, vol. 5, no. 9, 2010, Art. no. e12528.
[79] D. Centola, “The spread of behavior in an online social network experiment,” Science, vol. 329, no. 5996, pp. 1194–1197, 2010.
[80] H. Gao, J. Hu, C. Wilson, Z. Li, Y. Chen, and B. Y. Zhao, “Detecting and characterizing social spam campaigns,” in Proc. 10th ACM SIGCOMM Conf. Internet Meas., Nov. 2010, pp. 35–47.
[81] D. Wang, D. Irani, and C. Pu, “A social-spam detection framework,” in Proc. 8th Annu. Collaboration, Electron. Messaging, Anti-Abuse Spam Conf., Sep. 2011, pp. 46–54.
[82] M. Fire, G. Katz, and Y. Elovici, “Strangers intrusion detection-detecting spammers and fake profiles in social networks based on topology anomalies,” Hum. J., vol. 1, no. 1, pp. 26–39, 2012.
[83] M. D. Roblyer, M. McDaniel, M. Webb, J. Herman, and J. V. Witty, “Findings on Facebook in higher education: A comparison of college faculty and student uses and perceptions of social networking sites,” Internet Higher Educ., vol. 13, no. 3, pp. 134–140, 2010.
[84] A. S. Bozkır, S. G. Mazman, and E. A. Sezer, “Identification of user patterns in social networks by data mining techniques: Facebook case,” in Proc. Int. Symp. Inf. Manage. Changing World. Berlin, Germany: Springer, Sep. 2010, pp. 145–153.
[85] Y. Wang, G. Cong, G. Song, and K. Xie, “Community-based greedy algorithm for mining top-k influential nodes in mobile social networks,” in Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2010, pp. 1039–1048.
[86] E. Raad, R. Chbeir, and A. Dipanda, “User profile matching in social networks,” in Proc. 13th Int. Conf. Netw.-Based Inf. Syst., Sep. 2010, pp. 297–304.
[87] H. Kwak, C. Lee, H. Park, and S. Moon, “What is Twitter, a social network or a news media?” in Proc. 19th Int. Conf. World Wide Web, Apr. 2010, pp. 591–600.
[88] C. Yang, R. Harkreader, J. Zhang, S. Shin, and G. Gu, “Analyzing spammers’ social networks for fun and profit: A case study of cyber criminal ecosystem on Twitter,” in Proc. 21st Int. Conf. World Wide Web, Apr. 2012, pp. 71–80.
[89] C.-L. Hsu and H. W. Park, “Mapping online social networks of Korean politicians,” Government Inf. Quart., vol. 29, no. 2, pp. 169–181, 2012.
[90] D. Hansen, B. Shneiderman, and M. A. Smith, Analyzing Social Media Networks With NodeXL: Insights From a Connected World. San Mateo, CA, USA: Morgan Kaufmann, 2010.
[91] D. L. Hansen et al., “Do you know the way to SNA?: A process model for analyzing and visualizing social media network data,” in Proc. Int. Conf. Social Inf., Dec. 2012, pp. 304–313.
[92] F. Zhao and A. K. H. Tung, “Large scale cohesive subgraphs discovery for social network visual analysis,” Proc. VLDB Endowment, vol. 6, no. 2, pp. 85–96, 2012.
[93] B. Sun and V. T. Ng, “Identifying influential users by their postings in social networks,” in Ubiquitous Social Media Analysis. Berlin, Germany: Springer, 2012, pp. 128–151.
[94] S. He, X. Zheng, D. Zeng, K. Cui, Z. Zhang, and C. Luo, “Identifying peer influence in online social networks using transfer entropy,” in Proc. Pacific-Asia Workshop Intell. Security Inform. Berlin, Germany: Springer, Aug. 2013, pp. 47–61.
[95] A. Javari and M. Jalili, “Cluster-based collaborative filtering for sign prediction in social networks with positive and negative links,” ACM Trans. Intell. Syst. Technol., vol. 5, no. 2, p. 24, 2014.
[96] F. Liu, B. Liu, C. Sun, M. Liu, and X. Wang, “Deep belief network-based approaches for link prediction in signed social networks,” Entropy, vol. 17, no. 4, pp. 2140–2169, 2015.
[97] N. Phan, D. Dou, B. Piniewski, and D. Kil, “Social restricted boltzmann machine: Human behavior prediction in health social networks,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2015, pp. 424–431.
[98] V. Agarwal and K. K. Bharadwaj, “Predicting the dynamics of social circles in ego networks using pattern analysis and GA K-means clustering,” Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 5, no. 3, pp. 113–141, 2015.
[99] K. W. Lim, C. Chen, and W. Buntine, “Twitter-network topic model: A full Bayesian treatment for social network and text modeling,” 2016, arXiv:1609.06791. [Online]. Available: https://arxiv.org/abs/1609.06791
[100] D. T. Nguyen, K. A. A. Mannai, S. Joty, H. Sajjad, M. Imran, and P. Mitra, “Robust classification of crisis-related data on social networks using convolutional neural networks,” in Proc. 11th Int. AAAI Conf. Web Social Media, May 2017, pp. 1–4.
[101] R. Rabade, N. Mishra, and S. Sharma, “Survey of influential user identification techniques in online social networks,” in Recent Advances in Intelligent Informatics. Cham, Switzerland: Springer, 2014, pp. 359–370.
[102] D. C. Cercel and T.-M. Stefan, “Opinion propagation in online social networks: A survey,” in Proc. 4th Int. Conf. Web Intell., Mining, Semantics (WIMS), Jun. 2014, p. 11.
[103] M. Adedoyin-Olowe, M. M. Gaber, and F. Stahl, “A survey of data mining techniques for social media analysis,” 2013, arXiv:1312.4617. [Online]. Available: https://arxiv.org/abs/1312.4617
[104] J. Bian, Y. Liu, E. Agichtein, and H. Zha, “Finding the right facts in the crowd: Factoid question answering over social media,” in Proc. 17th Int. Conf. World Wide Web, Apr. 2008, pp. 467–476.
[105] E. Agichtein, C. Castillo, D. Donato, A. Gionis, and G. Mishne, “Finding high-quality content in social media,” in Proc. Int. Conf. Web Search Data Mining, Feb. 2008, pp. 183–194.
[106] K. Denecke and W. Nejdl, “How valuable is medical social media data? Content analysis of the medical Web,” Inf. Sci., vol. 179, no. 12, pp. 1870–1880, 2009.
[107] F. Chen, P. N. Tan, and A. K. Jain, “A co-classification framework for detecting Web spam and spammers in social media Web sites,” in Proc. 18th ACM Conf. Inf. Knowl. Manage., Nov. 2009, pp. 1807–1810.
[108] B. Markines, C. Cattuto, and F. Menczer, “Social spam detection,” in Proc. 5th Int. Workshop Adversarial Inf. Retr. Web, Apr. 2009, pp. 41–48.
[109] H. Sayyadi, M. Hurst, and A. Maykov, “Event detection and tracking in social streams,” in Proc. 3rd Int. AAAI Conf. Weblogs Social Media, Mar. 2009, pp. 1–4.
[110] H. Becker, M. Naaman, and L. Gravano, “Event identification in social media,” in Proc. WebDB, Jun. 2009, pp. 1–6.
[111] R. W. Lariscy, E. J. Avery, K. D. Sweetser, and P. Howes, “An examination of the role of online social media in journalists’ source mix,” Public Relations Rev., vol. 35, no. 3, pp. 314–316, 2009.
[112] C. M. De, Y. R. Lin, H. Sundaram, K. S. Candan, L. Xie, and A. Kelliher, “How does the data sampling strategy impact the discovery of information diffusion in social media?” in Proc. 4th Int. AAAI Conf. Weblogs Social Media, May 2010, pp. 1–8.
[113] H. Becker, M. Naaman, and L. Gravano, “Learning similarity metrics for event identification in social media,” in Proc. 3rd ACM Int. Conf. Web Search Data Mining, Feb. 2010, pp. 291–300.
[114] O. Oh, K. H. Kwon, and H. R. Rao, “An exploration of social media in extreme events: Rumor theory and Twitter during the Haiti earthquake 2010,” in Proc. ICIS, vol. 231, Dec. 2010, pp. 7332–7336.
[115] M. Naaman, J. Boase, and C. H. Lai, “Is it really about me?: Message content in social awareness streams,” in Proc. ACM Conf. Comput. Supported Cooperat. Work, Feb. 2010, pp. 189–192.
[116] S. Asur and B. A. Huberman, “Predicting the future with social media,” in Proc. IEEE/WIC/ACM Int. Conf. Web Intell. Intell. Agent Technol., vol. 1, Aug. 2010, pp. 492–499.
[117] N. Diakopoulos, M. Naaman, and F. Kivran-Swaine, “Diamonds in the rough: Social media visual analytics for journalistic inquiry,” in Proc. IEEE Symp. Visual Anal. Sci. Technol., Oct. 2010, pp. 115–122.
[118] Z. Xiang and U. Gretzel, “Role of social media in online travel information search,” Tourism Manage., vol. 31, no. 2, pp. 179–188, 2010.
[119] W. C. Jacobsen and R. Forste, “The wired generation: Academic and social outcomes of electronic media use among University students,” Cyberpsychol., Behav., Social Netw., vol. 14, no. 5, pp. 275–280, 2011.
[120] K. Lee, J. Caverlee, Z. Cheng, and D. Z. Sui, “Content-driven detection of campaigns in social media,” in Proc. 20th ACM Int. Conf. Inf. Knowl. Manage., Oct. 2011, pp. 551–556.
[121] D. M. Romero, W. Galuba, S. Asur, and B. A. Huberman, “Influence and passivity in social media,” in Proc. Joint Eur. Conf. Mach. Learn. Knowl. Discovery Databases. Berlin, Germany: Springer, Sep. 2011, pp. 18–33.
[122] N. Michaelidou, N. T. Siamagka, and G. Christodoulides, “Usage, barriers and measurement of social media marketing: An exploratory investigation of small and medium B2B brands,” Ind. Marketing Manage., vol. 40, no. 7, pp. 1153–1159, 2011.
[123] K. Sedereviciute and C. Valentini, “Towards a more holistic stakeholder analysis approach. Mapping known and undiscovered stakeholders from social media,” Int. J. Strategic Commun., vol. 5, no. 4, pp. 221–239, 2011.
[124] F. Cheong and C. Cheong, “Social media data mining: A social network analysis of tweets during the 2010–2011 Australian floods,” PACIS, vol. 11, p. 46, Jul. 2011.
[125] E. Clark and K. Araki, “Text normalization in social media: Progress, problems and applications for a pre-processing system of casual English,” Procedia-Social Behav. Sci., vol. 27, pp. 2–11, Jan. 2011.
[126] C. C. Yang, L. Jiang, H. Yang, and X. Tang, “Detecting signals of adverse drug reactions from health consumer contributed content in social media,” in Proc. ACM SIGKDD Workshop Health Inf., Aug. 2012, pp. 1–8.
[127] J. Tang and H. Liu, “Feature selection with linked data in social media,” in Proc. SIAM Int. Conf. Data Mining, Apr. 2012, pp. 118–128.
[128] L. M. Aiello, A. Barrat, R. Schifanella, C. Cattuto, B. Markines, and F. Menczer, “Friendship prediction and homophily in social media,” ACM Trans. Web, vol. 6, no. 2, p. 9, 2012.
[129] B. Han, P. Cook, and T. Baldwin, “Geolocation prediction in social media data by finding location indicative words,” in Proc. COLING, Dec. 2012, pp. 1045–1062.
[130] G. Ver Steeg and A. Galstyan, “Information transfer in social media,” in Proc. 21st Int. Conf. World Wide Web, Apr. 2012, pp. 509–518.
[131] C. Sengstock and M. Gertz, “Latent geographic feature extraction from social media,” in Proc. 20th Int. Conf. Adv. Geograph. Inf. Syst., Nov. 2012, pp. 149–158.
[132] J. Cranshaw, R. Schwartz, J. Hong, and N. Sadeh, “The livehoods project: Utilizing social media to understand the dynamics of a city,” in Proc. 6th Int. AAAI Conf. Weblogs Social Media, May 2012, pp. 1–8.
[133] S. A. Moorhead, D. E. Hazlett, L. Harrison, J. K. Carroll, A. Irwin, and C. Hoving, “A new dimension of health care: Systematic review of the uses, benefits, and limitations of social media for health communication,” J. Med. Internet Res., vol. 15, no. 4, p. e85, 2013.
[134] H. A. Schwartz et al., “Personality, gender, and age in the language of social media: The open-vocabulary approach,” PLoS ONE, vol. 8, no. 9, p. e73791, 2013.
[135] M. Kandias, V. Stavrou, N. Bozovic, and D. Gritzalis, “Proactive insider threat detection through social media: The YouTube case,” in Proc. 12th ACM Workshop Privacy Electron. Soc., Nov. 2013, pp. 261–266.
[136] M. Thelwall, “The Heart and soul of the Web? Sentiment strength detection in the social Web with SentiStrength,” in Cyberemotions. Cham, Switzerland: Springer, 2013, pp. 119–134.
[137] F. Morstatter, J. Pfeffer, H. Liu, and K. M. Carley, “Is the sample good enough? Comparing data from Twitter’s streaming API with Twitter’s firehose,” in Proc. 7th Int. AAAI Conf. Weblogs Social Media, Jun. 2013, pp. 1–9.
[138] Y. Hu, S. D. Farnham, and A. Monroy-Hernández, “Whoo.ly: Facilitating information seeking for hyperlocal communities using social media,” in Proc. SIGCHI Conf. Hum. Factors Comput. Syst., Apr. 2013, pp. 3481–3490.
[139] F. Mitzlaff, M. Atzmueller, G. Stumme, and A. Hotho, “Semantics of user interaction in social media,” in Complex Networks IV. Berlin, Germany: Springer, 2013, pp. 13–25.
[140] M. De Choudhury, M. Gamon, S. Counts, and E. Horvitz, “Predicting depression via social media,” in Proc. 7th Int. AAAI Conf. Weblogs Social Media, Jun. 2013, pp. 1–10.
[141] W. He, S. Zha, and L. Li, “Social media competitive analysis and text mining: A case study in the pizza industry,” Int. J. Inf. Manage., vol. 33, no. 3, pp. 464–472, 2013.
[142] K. Casler, L. Bickel, and E. Hackett, “Separate but equal? A comparison of participants and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral testing,” Comput. Hum. Behav., vol. 29, pp. 2156–2160, Nov. 2013.
[143] T. Baldwin, P. Cook, M. Lui, A. MacKinlay, and L. Wang, “How noisy social media text, how diffrnt social media sources?” in Proc. 6th Int. Joint Conf. Natural Lang. Process., Oct. 2013, pp. 356–364.
[144] A. Majid, L. Chen, G. Chen, H. T. Mirza, and I. W. J. Hussain, “A context-aware personalized travel recommendation system based on geotagged social media data mining,” Int. J. Geograph. Inf. Sci., vol. 27, no. 4, pp. 662–684, 2013.
[145] M. Khan, M. Dickinson, and S. Kübler, “Does size matter? Text and grammar revision for parsing social media data,” in Proc. Workshop Lang. Anal. Social Media, 2013, pp. 1–10.
[146] L. Derczynski, A. Ritter, S. Clark, and K. Bontcheva, “Twitter part-of-speech tagging for all: Overcoming sparse and noisy data,” in Proc. Int. Conf. Recent Adv. Natural Lang. Process. (RANLP), 2013, pp. 198–206.
[147] Y. Zhang and M. Pennacchiotti, “Recommending branded products from social media,” in Proc. 7th ACM Conf. Recommender Syst., Oct. 2013, pp. 77–84.
[148] X. Ma, H. Wang, H. Li, J. Liu, and H. Jiang, “Exploring sharing patterns for video recommendation on YouTube-like social media,” Multimedia Syst., vol. 20, no. 6, pp. 675–691, 2014.
[149] S. Pei, L. Muchnik, J. S. Andrade, Jr., Z. Zheng, and H. A. Makse, “Searching for superspreaders of information in real-world social media,” Sci. Rep., vol. 4, Jul. 2014, Art. no. 5547.
[150] M. Abdul-Mageed, M. Diab, and S. Kübler, “SAMAR: Subjectivity and sentiment analysis for Arabic social media,” Comput. Speech Lang., vol. 28, no. 1, pp. 20–37, 2014.
[151] S. Mei, H. Li, J. Fan, X. Zhu, and C. R. Dyer, “Inferring air pollution by sniffing social media,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2014, pp. 534–539.
[152] J. Choi et al., “The placing task: A large-scale geo-estimation challenge for social-media videos and images,” in Proc. 3rd ACM Multimedia Workshop Geotagging Appl. Multimedia, Nov. 2014, pp. 27–31.
[153] M. Sap et al., “Developing age and gender predictive lexica over social media,” in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), Oct. 2014, pp. 1146–1151.
[154] I. C. L. Memon, A. Majid, M. Lv, and I. C. G. Hussain, “Travel recommendation using geo-tagged photos in social media for tourist,” Wireless Pers. Commun., vol. 80, no. 4, pp. 1347–1362, 2015.
[155] D. Bamman, J. Eisenstein, and T. Schnoebelen, “Gender identity and lexical variation in social media,” J. Sociolinguistics, vol. 18, no. 2, pp. 135–160, 2014.
[156] H. Lin et al., “User-level psychological stress detection from social media using deep neural network,” in Proc. 22nd ACM Int. Conf. Multimedia, Nov. 2014, pp. 507–516.
[157] R. Nivedha and N. Sairam, “A machine learning based classification for social media messages,” Indian J. Sci. Technol., vol. 8, no. 16, p. 1, 2015.
[158] A. P. López-Monroy, M. Montes-y-Gómez, H. J. Escalante, L. Villaseñor-Pineda, and E. Stamatatos, “Discriminative subprofile-specific representations for author profiling in social media,” Knowl.-Based Syst., vol. 89, pp. 134–147, Nov. 2015.
[159] P. Barberá, “Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data,” Political Anal., vol. 23, no. 1, pp. 76–91, 2015.
[160] C. Cao and J. Caverlee, “Detecting spam URLs in social media via behavioral analysis,” in Proc. Eur. Conf. Inf. Retr. Cham, Switzerland: Springer, Mar. 2015, pp. 703–714.
[161] T. Ma et al., “Social network and tag sources based augmenting collaborative recommender system,” IEICE Trans. Inf. Syst., vol. 98, no. 4, pp. 902–910, 2015.
[162] M. Santillana, A. T. Nguyen, M. Dredze, M. J. Paul, E. O. Nsoesie, and J. S. Brownstein, “Combining search, social media, and traditional data sources to improve influenza surveillance,” PLoS Comput. Biol., vol. 11, no. 10, 2015, Art. no. e1004513.
[163] F. Gelli, T. Uricchio, M. Bertini, A. Del Bimbo, and S.-F. Chang, “Image popularity prediction in social media using sentiment and context features,” in Proc. 23rd ACM Int. Conf. Multimedia, Oct. 2015, pp. 907–910.
[164] W. Chen, C. K. Yeo, C. T. Lau, and B. S. Lee, “Real-time Twitter content polluter detection based on direct features,” in Proc. 2nd Int. Conf. Inf. Sci. Security (ICISS), Dec. 2015, pp. 1–4.
[165] J. Ma et al., “Detecting rumors from microblogs with recurrent neural networks,” in Proc. IJCAI, Jul. 2016, pp. 3818–3824.
[166] V. Kulkarni, B. Perozzi, and S. Skiena, “Freshman or fresher? Quantifying the geographic variation of language in online social media,” in Proc. 10th Int. AAAI Conf. Web Social Media, Mar. 2016, pp. 1–6.
[167] B. Bischke, D. Borth, C. Schulze, and A. Dengel, “Contextual enrichment of remote-sensed events with social media streams,” in Proc. 24th ACM Int. Conf. Multimedia, Oct. 2016, pp. 1077–1081.
[168] L. Liu, D. Preotiuc-Pietro, Z. R. Samani, M. E. Moghaddam, and L. Ungar, “Analyzing personality through social media profile picture choice,” in Proc. 10th Int. AAAI Conf. Web Social Media, Mar. 2016, pp. 1–10.
[169] J. P. Mazer et al., “Communication in the face of a school crisis: Examining the volume and content of social media mentions during active shooter incidents,” Comput. Hum. Behav., vol. 53, pp. 238–248, Dec. 2015.
[170] B. Dhingra, Z. Zhou, D. Fitzpatrick, M. Muehl, and W. W. Cohen, “Tweet2vec: Character-based distributed representations for social media,” 2016, arXiv:1605.03481. [Online]. Available: https://arxiv.org/abs/1605.03481
[171] L. Song, R. Y. K. Lau, R. C. W. Kwok, K. Mirkovski, and W. Dou, “Who are the spoilers in social media marketing? Incremental learning of latent semantics for social spam detection,” Electron. Commerce Res., vol. 17, no. 1, pp. 51–81, 2017.
[172] F. Krebs, B. Lubascher, T. Moers, P. Schaap, and G. Spanakis, “Social emotion mining techniques for Facebook posts reaction prediction,” 2017, arXiv:1712.03249. [Online]. Available: https://arxiv.org/abs/1712.03249
[173] J. Su, A. Shukla, S. Goel, and A. Narayanan, “De-anonymizing Web browsing data with social networks,” in Proc. 26th Int. Conf. World Wide Web, Apr. 2017, pp. 1261–1269.
[174] H. Aleid et al., “Framework to classify and analyze social media content,” Social Netw., vol. 7, no. 2, p. 79, 2018.
[175] F. Aisopos, G. Papadakis, and T. Varvarigou, “Sentiment analysis of social media content using N-Gram graphs,” in Proc. 3rd ACM SIGMM Int. Workshop Social Media, Nov. 2011, pp. 9–14.
[176] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-based methods for sentiment analysis,” Comput. Linguistics, vol. 37, no. 2, pp. 267–307, 2011.
[177] I. Habernal, T. Ptácek, and J. Steinberger, “Reprint of ‘supervised sentiment analysis in Czech social media’,” Inf. Process. Manage., vol. 51, no. 4, pp. 532–546, 2015.
[178] K. P. Murphy, “Naive Bayes classifiers,” Univ. Brit. Columbia, Vancouver, BC, Canada, Tech. Rep., 2006, p. 60, vol. 18.
[179] M. Salathé and S. Khandelwal, “Assessing vaccination sentiments with online social media: Implications for infectious disease dynamics and control,” PLoS Comput. Biol., vol. 7, no. 10, 2011, Art. no. e1002199.
[180] Y. Lu, X. Hu, F. Wang, S. Kumar, H. Liu, and R. Maciejewski, “Visualizing social media sentiment in disaster scenarios,” in Proc. 24th Int. Conf. World Wide Web, May 2015, pp. 1211–1215.
[181] R. Baracho, M. Bax, L. G. F. Ferreira, and G. Ca’ires Silva, Sentiment Analysis in Social Networks. Stanford, CA, USA: Association for the Advancement of Artificial Intelligence, 2012.
[182] B. Liu and L. Zhang, “A survey of opinion mining and sentiment analysis,” in Mining Text Data. Boston, MA, USA: Springer, 2012, pp. 415–463.
[183] A. Candelieri and F. Archetti, “Detecting events and sentiment on Twitter for improving urban mobility,” in Proc. ESSEM AAMAS, May 2015, pp. 106–115.
[184] C. J. Hutto and E. Gilbert, “VADER: A parsimonious rule-based model for sentiment analysis of social media text,” in Proc. 8th Int. AAAI Conf. Weblogs Social Media, May 2014, pp. 1–10.
[185] T. Chen, D. Borth, T. Darrell, and S. F. Chang, “DeepsentiBank: Visual sentiment concept classification with deep convolutional neural networks,” 2014, arXiv:1410.8586. [Online]. Available: https://arxiv.org/abs/1410.8586
[186] Q. You, J. Luo, H. Jin, and J. Yang, “Robust image sentiment analysis using progressively trained and domain transferred deep networks,” in Proc. 29th AAAI Conf. Artif. Intell., Feb. 2015, pp. 1–8.
[187] J. Yuan, Q. You, and J. Luo, “Sentiment analysis using social multimedia,” in Multimedia Data Mining and Analytics. Cham, Switzerland: Springer, 2015, pp. 31–59.
[188] A. Hanjalic, C. Kofler, and M. Larson, “Intent and its discontents: The user at the wheel of the online video search engine,” in Proc. 20th ACM Int. Conf. Multimedia, Oct. 2012, pp. 1239–1248.
[189] D. Cao, R. Ji, D. Lin, and S. Li, “A cross-media public sentiment analysis system for microblog,” Multimedia Syst., vol. 22, no. 4, pp. 479–486, 2016.
[190] W. He, H. Wu, G. Yan, V. Akula, and J. Shen, “A novel social media competitive analytics framework with sentiment benchmarks,” Inf. Manage., vol. 52, no. 7, pp. 801–812, 2015.
[191] D. Zimbra, M. Ghiassi, and S. Lee, “Brand-related Twitter sentiment analysis using feature engineering and the dynamic architecture for artificial neural networks,” in Proc. 49th Hawaii Int. Conf. Syst. Sci. (HICSS), Jan. 2016, pp. 1930–1938.
[192] B. Jou, T. Chen, N. Pappas, M. Redi, M. Topkara, and S. F. Chang, “Visual affect around the world: A large-scale multilingual visual sentiment ontology,” in Proc. 23rd ACM Int. Conf. Multimedia, Oct. 2015, pp. 159–168.
[193] T. Niu, S. Zhu, L. Pang, and A. E. Saddik, “Sentiment analysis on multi-view social data,” in Proc. Int. Conf. Multimedia Modeling. Cham, Switzerland: Springer, Jan. 2016, pp. 15–27.
[194] J. S. Jang, B. I. C. H. Lee Choi, J. H. Kim, D. M. Seo, and W. S. Cho, “Understanding pending issue of society and sentiment analysis using social media,” in Proc. 8th Int. Conf. Ubiquitous Future Netw. (ICUFN), Jul. 2016, pp. 981–986.
[195] F. H. Khan, U. Qamar, and S. Bashir, “SentiMI: Introducing point-wise mutual information with SentiWordNet to improve sentiment polarity detection,” Appl. Soft Comput., vol. 39, pp. 140–153, Feb. 2016.
[196] O. F. Villarroel, S. Ludwig, R. K. De, D. Grewal, and M. Wetzels, “Unveiling what is written in the stars: Analyzing explicit, implicit, and discourse patterns of sentiment in social media,” J. Consum. Res., vol. 43, no. 6, pp. 875–894, 2017.
[197] S. Poria, E. Cambria, D. Hazarika, and P. Vij, “A deeper look into sarcastic tweets using deep convolutional neural networks,” 2016, arXiv:1610.08815. [Online]. Available: https://arxiv.org/abs/1610.08815
[198] L. Wang and Z. Zhang, Support Vector Machines: Theory and Applications. Berlin, Germany: Springer-Verlag, 2005.
[199] B. Liu, Sentiment Analysis and Opinion Mining (Synthesis Lectures on Human Language Technologies), vol. 5, no. 1. Ras al Khaimah, United Arab Emirates: Science Publishing Corporation, RAK Free Trade Zone, 2012, pp. 1–167.
[200] D. M. E.-D. M. Hussein, “A survey on sentiment analysis challenges,” J. King Saud Univ.-Eng. Sci., vol. 34, no. 4, pp. 330–338, 2016.
[201] S. Behdenna, F. Barigou, and G. Belalem, “Document level sentiment analysis: A survey,” EAI Endorsed Trans. Context-Aware Syst. Appl., vol. 4, Mar. 2018, Art. no. 154339, doi: 10.4108/eai.14-3-2018.154339.
[202] V. S. Jagtap and K. Pawar, “Analysis of different approaches to sentence-level sentiment classification,” Int. J. Sci. Eng. Technol., vol. 2, no. 3, pp. 164–170, 2013.
[203] K. Schouten and F. Frasincar, “Survey on aspect-level sentiment analysis,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 3, pp. 813–830, Oct. 2015.
[204] M. K. Dalal and M. A. Zaveri, “Opinion mining from online user reviews using fuzzy linguistic hedges,” Appl. Comput. Intell. Soft Comput., vol. 2014, p. 2, Jan. 2014.
[205] P. Gonçalves, M. Araújo, F. Benevenuto, and M. Cha, “Comparing and combining sentiment analysis methods,” in Proc. 1st ACM Conf. Online Social Netw., Oct. 2013, pp. 27–38.
[206] F. Å. Nielsen, “A new ANEW: Evaluation of a word list for sentiment analysis in microblogs,” 2011, arXiv:1103.2903. [Online]. Available: https://arxiv.org/abs/1103.2903
[207] A. Hogenboom, B. Heerschop, F. Frasincar, U. Kaymak, and F. de Jong, “Multi-lingual support for lexicon-based sentiment analysis guided by semantics,” Decis. Support Syst., vol. 62, pp. 43–53, Jun. 2014.
[208] P. Ray and A. Chakrabarti, “Twitter sentiment analysis for product review using lexicon method,” in Proc. Int. Conf. Data Manage., Anal. Innov. (ICDMAI), Feb. 2017, pp. 211–216.
[209] R. Moraes, J. F. Valiati, and W. P. G. Neto, “Document-level sentiment classification: An empirical comparison between SVM and ANN,” Expert Syst. Appl., vol. 40, no. 2, pp. 621–633, 2013.
[210] B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data. New York, NY, USA: Elsevier, 2007.
[211] J. Smailović, M. Grčar, N. Lavrač, and M. Žnidaršič, “Stream-based active learning for sentiment analysis in the financial domain,” Inf. Sci., vol. 285, pp. 181–203, Nov. 2014.
[212] A. Hasan, S. Moin, A. Karim, and S. Shamshirband, “Machine learning-based sentiment analysis for Twitter accounts,” Math. Comput. Appl., vol. 23, no. 1, p. 11, 2018.
[213] P. D. Turney, “Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews,” in Proc. 40th Annu. Meeting Assoc. Comput. Linguistics, Jul. 2002, pp. 417–424.
[214] F. Bravo-Marquez, M. Mendoza, and B. Poblete, “Meta-level sentiment models for big social data analysis,” Knowl.-Based Syst., vol. 69, pp. 86–99, Oct. 2014.
[215] M. Ahmad, S. Aftab, I. Ali, and N. Hameed, “Hybrid tools and techniques for sentiment analysis: A review,” Int. J. Multidiscip. Sci. Eng., vol. 8, no. 3, pp. 1–6, 2017.
[216] K. Elshakankery and M. F. Ahmed, “HILATSA: A hybrid incremental learning approach for Arabic tweets sentiment analysis,” Egyptian Inform. J., to be published.
[217] Q. T. Ain et al., “Sentiment analysis using deep learning techniques: A review,” Int. J. Adv. Comput. Sci. Appl., vol. 8, no. 6, p. 424, 2017.
[218] A. Kalaivani and D. Thenmozhi, “Sentiment analysis using deep learning techniques,” Int. J. Recent Technol. Eng., vol. 7, no. 6S5, pp. 1–7, 2019.
[219] A. M. El-Gazzar, T. M. Mohamed, and R. A. Sadek, “A hybrid SVD-HSV visual sentiment analysis system,” in Proc. 8th Int. Conf. Intell. Comput. Inf. Syst. (ICICIS), Dec. 2017, pp. 360–365.
[220] H. Zhang, A. C. Berg, M. Maire, and J. Malik, “SVM-KNN: Discriminative nearest neighbor classification for visual category recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., vol. 2, Jun. 2006, pp. 2126–2136.
[221] M. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, “Coding facial expressions with Gabor wavelets,” in Proc. 3rd IEEE Int. Conf. Autom. Face Gesture Recognit., Apr. 1998, pp. 200–205.
[222] I. S. P. James, “Face image retrieval with HSV color space using clustering techniques,” SIJ Trans. Comput. Sci. Eng. Appl., vol. 1, no. 1, pp. 1–4, 2013.
[223] K. Mahboob and F. Ali, “Sentiment analysis of pharmaceutical products evaluation based on customer review mining,” J. Comput. Sci. Syst. Biol., vol. 11, pp. 190–194, Mar. 2018, doi: 10.4172/jcsb.1000271.
[224] G. Liang, W. He, C. Xu, L. Chen, and J. Zeng, “Rumor identification in microblogging systems based on users’ behavior,” IEEE Trans. Comput. Social Syst., vol. 2, no. 3, pp. 99–108, Sep. 2015.
[225] R. Basak, S. Sural, N. Ganguly, and S. K. Ghosh, “Online public shaming on Twitter: Detection, analysis, and mitigation,” IEEE Trans. Comput. Social Syst., vol. 6, no. 2, pp. 208–220, Apr. 2019.
[226] S. Madisetty and M. S. Desarkar, “A neural network-based ensemble approach for spam detection in Twitter,” IEEE Trans. Comput. Social Syst., vol. 5, no. 4, pp. 973–984, Dec. 2018.
[227] H. Tajalizadeh and R. Boostani, “A novel stream clustering framework for spam detection in Twitter,” IEEE Trans. Comput. Social Syst., vol. 6, no. 3, pp. 525–534, Jun. 2019, doi: 10.1109/TCSS.2019.2910818.
[228] Y. Tyshchuk and W. A. Wallace, “Modeling human behavior on social media in response to significant events,” IEEE Trans. Comput. Social Syst., vol. 5, no. 2, pp. 444–457, Jun. 2018.
[229] R. C. Chen, “User rating classification via deep belief network learning and sentiment analysis,” IEEE Trans. Comput. Social Syst., vol. 6, no. 3, pp. 535–546, Jun. 2019.
[230] T. Chowdhury, S. Muhuri, S. Chakraborty, and S. N. Chakraborty, “Analysis of adapted films and stories based on social network,” IEEE Trans. Comput. Social Syst., vol. 6, no. 5, pp. 858–869, Oct. 2019.
[231] K. Chakraborty, S. Bhattacharyya, R. Bag, and A. E. Hassanien, “Comparative sentiment analysis on a set of movie reviews using deep learning approach,” in Proc. Int. Conf. Adv. Mach. Learn. Technol. Appl. (AMLTA), Cairo, Egypt, Feb. 2018, pp. 311–318.
[232] K. Chakraborty, S. Bhattacharyya, R. Bag, and A. E. Hassanien, “Sentiment analysis on a set of movie reviews using deep learning techniques,” in Social Network Analytics: Computational Research Methods and Techniques. Amsterdam, The Netherlands: Elsevier, 2018.
[233] K. Chakraborty, R. Bag, and S. Bhattacharyya, “Relook into sentiment analysis performed on Indian languages using deep learning,” in Proc. 4th Int. Conf. Res. Comput. Intell. Commun. Netw. (ICRCICN), Nov. 2018, pp. 208–213.
Koyel Chakraborty was born in 1988. She received the B.Sc. degree in computer science from the University of Burdwan, Bardhaman, India, in 2009, the M.Sc. degree in computer science from West Bengal State University, Kolkata, India, in 2011, and the M.Tech. degree in computer science from the Maulana Abul Kalam Azad University of Technology, Kolkata, in 2016.
She is currently an Assistant Professor with the Department of Computer Science and Engineering, Supreme Knowledge Foundation Group of Institutions, Chandannagar, India, under the Maulana Abul Kalam Azad University of Technology. Her research interest is in the fields of deep learning and sentiment analysis.
464 IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 7, NO. 2, APRIL 2020
Siddhartha Bhattacharyya (M’10–SM’13) received the bachelor’s degree in physics, and the bachelor’s and master’s degrees in optics and optoelectronics from the University of Calcutta, Kolkata, India, in 1995, 1998, and 2000, respectively, and the Ph.D. degree in computer science and engineering from Jadavpur University, Kolkata, in 2008.
He is currently serving as a Professor with the Department of Computer Science and Engineering, Christ University, Bengaluru, India. Prior to this, he served as the Principal of the RCC Institute of Information Technology, Kolkata. He also served as a Senior Research Scientist with the Faculty of Electrical Engineering and Computer Science, VSB Technical University of Ostrava, Ostrava, Czech Republic. He has coauthored five books, coedited 40 books, and has authored or coauthored more than 250 research publications in international journals and conference proceedings. He holds three patents. His research interests include soft computing, pattern recognition, multimedia data processing, hybrid intelligence, social networks, and quantum computing.
Rajib Bag was born in 1969. He received the B.Sc. degree (Hons.) in physics from Calcutta University, Kolkata, India, in 1991, the M.Sc. degree in physics from Vinoba Bhave University, Hazaribagh, India, in 1996, and the M.Tech. and Ph.D. degrees in control systems in engineering from Jadavpur University, Kolkata, in 2007 and 2012, respectively.
He is currently a Professor and the Head of the Department of Computer Science and Engineering, Supreme Knowledge Foundation Group of Institutions, Chandannagar, India, under the Maulana Abul Kalam Azad University of Technology, Kolkata. Five research scholars are currently carrying out research in different areas under his supervision. He has authored or coauthored more than 40 publications in reputed refereed journals and conference proceedings. His research interests include image and signal processing, education technology, machine learning, deep learning, and Internet-of-Things security, in addition to control systems.