Twitter Dataset for Hate Speech and Cyberbullying Detection in Indonesian Language

Trisna Febriana, Arif Budiarto

Conference: International Conference on Information Management and Technology 2019, Bali, Indonesia

Abstract: During the 2019 election period in Indonesia, many cases of hate speech and cyberbullying occurred on social media platforms, including Twitter. The government tried to filter negative content to keep it from spreading during this period. However, detecting hate speech is not an easy task. This paper presents the process of developing a dataset that can be used to build a hate speech detection model. More than 1 million tweets were collected using the Twitter API. Basic preprocessing and a preliminary study using machine learning were carried out. The Latent Dirichlet Allocation (LDA) algorithm was used to extract a topic for each tweet to see whether these topics could be associated with the debate themes. A pretrained sentiment analysis model was also applied to the dataset to generate a polarity score for each tweet. Among the 83,752 tweets included in the analysis step, the numbers of positive and negative tweets were almost the same.
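The abstract mentions basic preprocessing but does not list the steps. As a rough illustration only, the sketch below assumes a typical tweet cleanup (lowercasing, dropping the retweet marker, URLs, mentions, and non-letter characters) before topic modeling or sentiment scoring. The function name `clean_tweet` and the specific rules are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical cleanup rules; the paper does not specify its preprocessing.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @username mentions
RETWEET_RE = re.compile(r"^rt\b")               # leading retweet marker
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, symbols


def clean_tweet(text: str) -> str:
    """Return a lowercased tweet with common Twitter noise removed."""
    text = text.lower().strip()
    text = RETWEET_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")        # keep the hashtag word, drop the symbol
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())        # collapse repeated whitespace


if __name__ == "__main__":
    sample = "RT @user: Debat capres malam ini! https://t.co/abc #Pemilu2019"
    print(clean_tweet(sample))
```

Cleaned text of this kind could then be passed to an LDA implementation or a pretrained sentiment model; which tools the authors actually used is not stated in the abstract.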
