Preprocessing Methods and Tools in Modelling Japanese for Text Classification

As a subfield of Artificial Intelligence, Natural Language Processing (NLP) is a breakthrough in overcoming language barriers. The characteristics of the Japanese language pose their own challenges for morphological analysis because of the uniqueness of the Japanese grammatical system. With the rapid development of NLP tools, many Japanese NLP tools have been developed that are limited in capability yet specialized in particular preprocessing methods. This paper presents a compilation of methods and newly discovered tools for preprocessing Japanese text, to help practitioners decide which Japanese NLP tools to use for a given preprocessing method. All of the Japanese preprocessing methods and tools were collected through a literature review. The paper concludes that relying on a single NLP tool is not recommended, since a combination of Japanese NLP tools is required to complete the Japanese preprocessing phase.

Conference: International Conference on Information Management and Technology 2019, Bali, Indonesia

Reza Rahutomo, Febrian Lubis, Hery Harjono Muljo, Bens Pardamean
