Features Importance in Classification Models for Colorectal Cancer Cases Phenotype in Indonesia

Recently, there has been interest in predicting complex disease risk using models that combine the effects of many genetic factors. These are known as polygenic models. However, such models often omit non-genetic factors that are important for predicting the disease. In this paper, we explore the prediction of colorectal cancer in Indonesia from non-genetic factors using two common machine learning algorithms: XGBoost and Elastic Net. The study identified 8 features that both XGBoost and Elastic Net rated as strongly important. We highly recommend including these features in future polygenic models of colorectal cancer data in Indonesia.

Conference: 2019 International Conference on Computer Science and Computational Intelligence, Vol. 157, Yogyakarta, Indonesia

Tjeng Wawan Cenggoro, Bharuno Mahesworo, Arif Budiarto, James W Baurley, Teddy Suparyanto, Bens Pardamean
