Application of Sentiment Analysis On Product Review E-Commerce
Yuniarta Basani, Harris V. Sibuea, Sinta Ida Patona Sianipar and Jen Presly
Samosir
Informatics Technology, Institut Teknologi Del, Laguboti 22381, Indonesia
Abstract. The drawback of buying products online is that the consumer cannot touch, try, or even
see the product directly. How, then, can consumers be confident that the product they like is the
right one to buy? The key is product reviews from consumers who have bought and tried the
product. However, the large number of reviews for popular products makes it difficult for
consumers to decide which product to choose. As a solution, the authors apply sentiment analysis
to classify every product review as positive, negative, or neutral, and produce a summary of
product reviews organized by product feature to support reading, reviewing, and decision making.
The steps are (1) data collection and preprocessing, (2) product feature extraction using Double
Propagation, (3) determining sentiment orientation, (4) classification using the Naïve Bayes
Classifier and Support Vector Machine, and (5) summary generation based on product features.
These stages run in a simulator, and the classification results of the two methods are compared.
1. Introduction
The number of internet users and e-commerce sales in Indonesia continues to increase. The rapid
expansion of e-commerce leads more and more people to buy products online. E-commerce has
great potential, but some drawbacks hinder its growth. One disadvantage of buying products online
is that consumers cannot touch, try, or even look directly at the product. How, then, can consumers
be confident that the product they like is the right product to buy? The key is product reviews from
consumers who have bought and tried the product (Hu & Liu, 2004). Consumers, e-commerce
owners, and product manufacturers should therefore be able to view reviews easily and
appropriately (Hu & Liu, 2004). It is thus necessary to apply sentiment analysis to label every
product review as positive, negative, or neutral based on the review text written by the customer,
and to produce a summary of product reviews based on product features. A brief overview and a
classification of product reviews by product feature can add value for consumers, as on sites such
as www.rottentomatoes.com, a trusted site for assessing the quality of a movie or TV show based on
hundreds of reviews (Anon., 2010), where all reviews are displayed briefly together with the
percentage of reviewers who rated the movie or TV show as positive (Pang, et al., 2002).
Related work includes the 2004 study by Hu and Liu, which applied sentiment analysis to product
reviews of five electronic products: two digital cameras, one DVD player, one mp3 player, and one
cellular phone. The product review data were collected from the Amazon.com dataset. Hu and Liu
also produced a summary of product reviews based on product features. These summaries make it
easy for potential customers to learn how previous customers feel about the products they have
purchased. Potential customers who are particularly interested in certain features can easily check
whether customers who bought a product are satisfied with it or complain about it.
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
1st International Conference on Advance and Scientific Innovation (ICASI) IOP Publishing
IOP Conf. Series: Journal of Physics: Conf. Series 1175 (2019) 012103 doi:10.1088/1742-6596/1175/1/012103
This study determines the classification of reviews based on sentiment orientation. The authors
use the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM) classification methods
to assign a positive, negative, or neutral label to each product review. In the final stage, the
authors produce a summary of the classification results grouped by the features of each product
and compare the two methods.
achieved by the hyperplane that has the greatest distance to the nearest training data point of each
class (the so-called functional margin), because in general the larger the margin, the lower the
generalization error of the classifier. Although the original problem may be stated in a
finite-dimensional space, the sets to be discriminated are often not linearly separable in that space.
For this reason, it was proposed that the original finite-dimensional space be mapped into a much
higher-dimensional space, which may make the separation easier.
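The mapping idea can be illustrated with a minimal sketch. The one-dimensional points, their labels, the feature map phi(x) = (x, x^2), and the separating line in the mapped space are all illustrative assumptions; they are not data or kernels used in this study.

```python
# Toy illustration: 1-D points that no single threshold can separate
# become linearly separable after mapping x -> (x, x**2).
# The data and the feature map are assumptions chosen for illustration.

points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]  # outer points are +1, inner points are -1


def separable_by_threshold(xs, ys):
    """Return True if a single threshold splits the 1-D points by class."""
    sorted_labels = [y for _, y in sorted(zip(xs, ys))]
    # Separable in 1-D only if the label changes at most once along the line.
    changes = sum(1 for a, b in zip(sorted_labels, sorted_labels[1:]) if a != b)
    return changes <= 1


def phi(x):
    """Map a 1-D point into 2-D space."""
    return (x, x * x)


def linear_decision(z, w=(0.0, 1.0), b=-2.5):
    """Sign of w.z + b; with these values the boundary is x**2 = 2.5."""
    score = w[0] * z[0] + w[1] * z[1] + b
    return 1 if score > 0 else -1


in_original_space = separable_by_threshold(points, labels)
predictions = [linear_decision(phi(x)) for x in points]
in_mapped_space = predictions == labels
```

In this toy case the classes cannot be split by one threshold on the original axis, while a straight line in the mapped space classifies every point correctly.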
The main idea of the SVM method is the maximum-margin hyperplane. Once the maximum-margin
hyperplane is found, it divides the data into the optimal classification. Some examples of
hyperplanes that may be used to classify data are shown in Figure 1.
In Figure 1, the H3 line (green) does not separate the two classes. The H1 line (blue) separates
them with a small margin, and the H2 line (red) separates them with the maximum margin. Data
classification is a common task in machine learning. Suppose that each given data point belongs to
one of two classes, and the goal is to determine the class of a new data point.
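The comparison of candidate hyperplanes can be sketched numerically. In the example below, the six training points and the three lines are assumptions made up for illustration. The names H1, H2, and H3 only mirror the roles described in the text (small margin, largest margin, no separation) and do not reproduce the lines drawn in Figure 1.

```python
import math

# Toy 2-D training set: (point, class label). Illustrative assumption only.
data = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((1.0, 2.0), -1),
        ((4.0, 4.0), 1), ((5.0, 4.0), 1), ((4.0, 5.0), 1)]


def geometric_margin(w, b, samples):
    """Smallest signed distance y * (w.x + b) / ||w|| over all samples.

    A negative value means at least one point lies on the wrong side,
    so the hyperplane does not separate the two classes.
    """
    norm = math.hypot(w[0], w[1])
    return min(y * (w[0] * x[0] + w[1] * x[1] + b) / norm for x, y in samples)


# Candidate hyperplanes given as (w, b) for the line w.x + b = 0.
candidates = {
    "H1": ((1.0, 0.0), -3.5),  # vertical line x1 = 3.5
    "H2": ((1.0, 1.0), -5.5),  # line x1 + x2 = 5.5
    "H3": ((1.0, -1.0), 0.0),  # line x1 = x2, cuts through both classes
}

margins = {name: geometric_margin(w, b, data) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)

for name in sorted(margins):
    status = "separates" if margins[name] > 0 else "does not separate"
    print(f"{name}: margin = {margins[name]:.3f} ({status})")
print("largest margin among the candidates:", best)
```

An SVM trainer searches over all hyperplanes rather than a fixed list, but the quantity it maximizes is the same margin computed here.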
6. The classification process classifies the test data into positive, negative, or neutral
orientations.
7. The classification results are used in the summary generation process, which produces a
summary of features and opinions along with their sentiment orientation.
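The summary generation step can be sketched as grouping classified opinions by product feature. The input triples, the function names `summarize` and `format_summary`, and the output layout are hypothetical; the paper does not specify the simulator's data structures.

```python
from collections import defaultdict

# Hypothetical classifier output: (product feature, opinion word, orientation).
classified = [
    ("battery", "long-lasting", "positive"),
    ("battery", "weak", "negative"),
    ("screen", "bright", "positive"),
    ("screen", "sharp", "positive"),
    ("price", "average", "neutral"),
]


def summarize(results):
    """Group opinion words by feature and by sentiment orientation."""
    summary = defaultdict(lambda: {"positive": [], "negative": [], "neutral": []})
    for feature, opinion, orientation in results:
        summary[feature][orientation].append(opinion)
    return dict(summary)


def format_summary(summary):
    """Render one line per feature with the count of each orientation."""
    lines = []
    for feature in sorted(summary):
        counts = {k: len(v) for k, v in summary[feature].items()}
        lines.append(
            f"{feature}: positive={counts['positive']} "
            f"negative={counts['negative']} neutral={counts['neutral']}"
        )
    return "\n".join(lines)


summary = summarize(classified)
print(format_summary(summary))
```

Keeping the opinion words alongside the counts lets a reader drill down from a feature to the opinions behind each orientation.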
2. If no feature is found, the next product feature is generated for each review sentence.
2. The training data are used to train the NBC and SVM models.
3. Each trained model (NBC and SVM) is tested on the test data, which have passed through the
preprocessing and feature classification stages; both models are tested at the same time.
4. The classification results of NBC and SVM are passed on to the next stage.
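The training and testing steps can be sketched for the NBC side with a minimal multinomial Naive Bayes classifier using Laplace smoothing. The tiny training and test sets, the whitespace tokenization, and the function names `train_nb` and `predict_nb` are assumptions for illustration and are not the simulator's implementation; an SVM would be trained and tested on the same data in the same way.

```python
import math
from collections import Counter, defaultdict

# Tiny hand-made training and test sets (illustrative assumptions only).
train = [
    ("great camera love the picture quality", "positive"),
    ("excellent battery and great screen", "positive"),
    ("terrible battery poor quality", "negative"),
    ("bad screen and poor sound", "negative"),
    ("the phone arrived on monday", "neutral"),
    ("the box contains a charger", "neutral"),
]
test = [
    ("great picture and excellent sound", "positive"),
    ("poor battery bad camera", "negative"),
]


def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(train)
predictions = [predict_nb(model, text) for text, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is unseen in one class from forcing that class's probability to zero.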
4. Experimental Result
5. Conclusion
Sentiment analysis can be used to classify product reviews into positive, negative, and neutral
categories. The simulator built by the researchers handles every step, from loading the dataset to
producing a summary of the classification results.
Feature extraction using double propagation can extract more than one product feature from a
single review sentence. The amount of data, its composition, and the technique used to select
training and test data can affect accuracy and execution time. The methods used to classify product
reviews are considered to give good classification results. Based on the experiments, of the two
classification methods used, Naive Bayes and Support Vector Machine, SVM achieved higher
accuracy and a shorter execution time than NBC.
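A comparison of accuracy and execution time can be sketched with a small evaluation helper. The function `evaluate`, the test set, and the two stand-in classifiers below are hypothetical; they are not the NBC and SVM models from the experiments and do not reproduce any reported result.

```python
import time


def evaluate(classify, test_set):
    """Return (accuracy, elapsed seconds) for a classifier function."""
    start = time.perf_counter()
    predictions = [classify(text) for text, _ in test_set]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, test_set))
    return correct / len(test_set), elapsed


# Hypothetical stand-in classifiers, used only to exercise the helper.
def keyword_classifier(text):
    words = text.split()
    if "good" in words:
        return "positive"
    if "bad" in words:
        return "negative"
    return "neutral"


def always_positive(text):
    return "positive"


test_set = [
    ("good battery", "positive"),
    ("bad screen", "negative"),
    ("arrived today", "neutral"),
    ("good price", "positive"),
]

results = {
    "keyword": evaluate(keyword_classifier, test_set),
    "always_positive": evaluate(always_positive, test_set),
}
for name, (acc, seconds) in results.items():
    print(f"{name}: accuracy = {acc:.2f}, time = {seconds:.6f} s")
```

Running both classifiers on the same test set keeps the accuracy figures comparable; timings on such small inputs are noisy and would need repeated runs to be meaningful.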
References
[1] Go, A., Bhayani, R. & Huang, L., 2009. Twitter sentiment classification using distant
supervision. CS224N Project Report, Volume I, p. 12.
[2] McCallum, A., Freitag, D. & Pereira, F., 2000. Maximum Entropy Markov Models for
Information Extraction and Segmentation. ICML 2000.
[3] Anon., 2010. Rotten Tomatoes Movie Reviews. [Online] Available at:
https://www.rottentomatoes.com [Accessed 3 October 2016].
[4] Ratnaparkhi, A., 1996. A Maximum Entropy Model for Part-Of-Speech Tagging. Proceedings
of the Conference on Empirical Methods in Natural Language Processing.
[5] Pang, B., Lee, L. & Vaithyanathan, S., 2002. Thumbs up? Sentiment Classification using
Machine Learning Techniques. In Proceedings of the Conference on Empirical Methods in
Natural Language Processing (EMNLP), Philadelphia.
[6] Frank, E. & Witten, I. H., 2005. Data Mining: Practical Machine Learning Tools and
Techniques. 2nd ed. San Francisco: Morgan Kaufmann Publishers, Elsevier Inc.
[7] Garcia, D. & Schweitzer, F., 2011. Emotions in Product Reviews – Empirics and Models.
Switzerland.
[8] Qiu, G., Liu, B., Bu, J. & Chen, C., 2009. Expanding Domain Sentiment Lexicon through
Double Propagation. Proceedings of IJCAI.
[9] Han, J., Kamber, M. & Pei, J., 2012. Data Mining: Concepts and Techniques. 3rd ed.
Waltham: Morgan Kaufmann, Elsevier.
[10] Hu, M. & Liu, B., 2004. Mining and Summarizing Customer Reviews. Chicago: Department of
Computer Science, University of Illinois at Chicago.
[11] McAuley, J., n.d. Amazon Review Data. [Online] Available at:
http://jmcauley.ucsd.edu/data/amazon/ [Accessed 2 October 2016].
[12] Jurafsky, D., 2015. Stanford | Natural Language Processing - Course Introduction.
[Online]