Ten-Year Compilation of #SaveKPK Twitter Dataset

Politics is one of the most popular topics of discussion on social media in Indonesia. This is evidenced by #SaveKPK, a public movement in support of the Corruption Eradication Commission (KPK), which has remained active for ten years and has become a trending topic on Twitter several times. In this research, all tweets containing ‘#SaveKPK’ were crawled and compiled using an alternative algorithm for retrieving historical Twitter data instead of the Twitter API. The results describe the statistical characteristics of the dataset, from the most frequently used words to the most active users. A clustering algorithm named Latent Dirichlet Allocation (LDA) was run over the collected text to discover the most relevant keywords using an unsupervised learning approach.
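The descriptive statistics mentioned in the abstract (most frequently used words, most active users) can be illustrated with a minimal sketch. It is not the authors' pipeline: the sample records, the field names `user` and `text`, and the `tokenize` helper are all hypothetical, chosen only to show the kind of counting involved.

```python
from collections import Counter
import re

# Hypothetical sample records; the actual dataset schema is not given in the abstract.
tweets = [
    {"user": "alice", "text": "Dukung KPK! #SaveKPK"},
    {"user": "budi", "text": "#SaveKPK lawan korupsi"},
    {"user": "alice", "text": "Lawan korupsi sekarang #SaveKPK"},
]


def tokenize(text):
    """Lowercase the text and return word tokens, keeping a leading '#' on hashtags."""
    return re.findall(r"#?\w+", text.lower())


# Count how often each token appears across all tweets.
word_counts = Counter(tok for t in tweets for tok in tokenize(t["text"]))

# Count how many tweets each user posted.
user_counts = Counter(t["user"] for t in tweets)

top_words = word_counts.most_common(3)
top_users = user_counts.most_common(1)

print(top_words)
print(top_users)
```

A real analysis would likely also remove stop words and the hashtag itself before counting, since every tweet in the dataset contains it by construction.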

International Conference on Information Management and Technology 2020

Reza Rahutomo, Arif Budiarto, Kartika Purwandari, Anzaludin Samsinga Perbangsa, Tjeng Wawan Cenggoro, and Bens Pardamean
