Final Project Poster, Informatics Engineering, Institut Teknologi Bandung
Authors and Affiliations
Fajar Muslim / [email protected]
Dr. Eng. Ayu Purwarianti, ST., MT. / [email protected]
Fariska Zakhralativa R, S.T., M.T. / [email protected]
School of Electrical Engineering and Informatics, Institut Teknologi Bandung
Introduction
Social media has become a necessity for many people. Twitter is one of them, with 353 million monthly active users (Wearesocial, Digital 2020 October Statshot Report). Freedom of expression on Twitter is misused by some people to carry out offensive actions such as cursing, hate speech, and slurs against ethnicity, race, and religion through offensive language.

Objective
To build a system for identifying and categorizing offensive language in social media, find its best configuration, and compare its performance with previous research by adapting cost-sensitive learning techniques and ensembles of BERT models.
Architecture

Preprocessing Module
Converts emojis to English phrases, segments hashtags into words, changes URL tokens to 'http', removes punctuation except '?' and '!', lowercases the text, and limits consecutive '@USER' mentions to three, as in the sketch below.
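A minimal Python sketch of these preprocessing steps, assuming the emoji and wordsegment packages and OLID-style 'URL' and '@USER' placeholder tokens (the poster does not name the exact tools):

import re
import emoji
from wordsegment import load, segment

load()  # load wordsegment's word-frequency data once

def preprocess(tweet: str) -> str:
    # Convert emojis to English phrases, e.g. a smiley -> "slightly_smiling_face"
    tweet = emoji.demojize(tweet)
    # Segment hashtags into words, e.g. "#HateSpeech" -> "hate speech"
    tweet = re.sub(r"#(\w+)", lambda m: " ".join(segment(m.group(1))), tweet)
    # Change the dataset's 'URL' placeholder to 'http'
    tweet = re.sub(r"\bURL\b", "http", tweet)
    # Limit runs of '@USER' mentions to three
    tweet = re.sub(r"(?:@USER\s*){4,}", "@USER @USER @USER ", tweet)
    # Remove punctuation except '?' and '!', then lowercase and collapse spaces
    tweet = re.sub(r"[^\w\s?!]", " ", tweet)
    return " ".join(tweet.lower().split())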
BERT with Cost-Sensitive Learning Module
Changes the loss function of BERT to take cost-sensitive learning into account.
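The poster's loss formula is not reproduced here; a common cost-sensitive choice, shown only as an illustrative assumption, is class-weighted cross-entropy, L = -(1/N) * sum_i w(y_i) * log p(y_i | x_i), sketched in PyTorch with illustrative values:

import torch
import torch.nn as nn

# Hypothetical class weights (illustrative only), e.g. inverse class
# frequencies so the rarer offensive class costs more to misclassify.
class_weights = torch.tensor([0.3, 0.7])

# Class-weighted cross-entropy scales each example's loss by the
# weight of its gold class before averaging.
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)          # stand-in for BERT classifier outputs
labels = torch.randint(0, 2, (8,))  # stand-in gold labels
loss = loss_fn(logits, labels)
print(loss.item())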
Ensemble Module
Aggregates the prediction results of the BERT models using majority voting, as sketched below.
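A minimal sketch of hard majority voting over per-tweet predictions from several fine-tuned BERT models (label names and the number of models are illustrative):

from collections import Counter

def hard_majority_vote(per_model_preds):
    # per_model_preds: one list of labels per model, aligned by tweet
    voted = []
    for labels in zip(*per_model_preds):  # all models' labels for one tweet
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Example: three BERT models voting on four tweets
preds = [
    ["OFF", "NOT", "OFF", "NOT"],
    ["OFF", "OFF", "NOT", "NOT"],
    ["OFF", "NOT", "NOT", "NOT"],
]
print(hard_majority_vote(preds))  # ['OFF', 'NOT', 'NOT', 'NOT']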
Result

Subtask A: The best evaluation result on the subtask A test data is the baseline model, with an F1-score of 0.823. Cost-sensitive learning did not improve the model's performance on subtask A, while the ensemble technique with the hard majority voting approach increased the F1-score by 0.78% compared to the cost-sensitive learning technique.

Subtask B: The best evaluation result on the subtask B test data is the ensemble model with the hard majority voting approach, with an F1-score of 0.777. Cost-sensitive learning improves the F1-score by 3.16% compared to the baseline model, while the ensemble technique with the hard majority voting approach increases the F1-score by 1.72%.

Subtask C: The best evaluation result on the subtask C test data is the cost-sensitive learning model, with an F1-score of 0.657. Cost-sensitive learning improves the baseline model's performance by 6.9%, while the ensemble technique with the hard majority voting approach does not improve performance on subtask C.