Explainable Tree-Based Boosting Algorithms for Smoker Detection Using Bio-Signals

Smoking harms the body's organs and causes diseases that often lead to death. Rather than treating smokers who are already seriously ill, a preventive approach targets those who have not yet developed disease; however, many of them do not admit to smoking, which creates a further problem. This study therefore aims to identify the bio-signal features that indicate whether a person is a smoker, using SHAP values, and provides a comprehensive analysis of the gradient-boosting algorithms used: XGBoost, LightGBM, and CatBoost. Based on the CatBoost results, which reach an AUC score of up to 0.8612 with the selected features, the study found that triglyceride, Gtp, hemoglobin, and several other features contribute most to predicting whether someone is a smoker.
Authors:
Jason Sebastian Sulistyawan, Kuncahyo Setyo Nugroho, Bens Pardamean
2024 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)
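
The abstract relies on SHAP values to rank bio-signal features. As a minimal sketch of the idea behind those values, the Python below computes exact Shapley values by brute-force enumeration for a single person. Everything in it is an assumption made for illustration: `toy_score` is a hypothetical hand-written scoring function standing in for a trained model (it is not the paper's CatBoost model), the coefficients and the `person` and `baseline` values are invented, and a single reference point replaces the background dataset that the SHAP library would normally average over. Only the three feature names come from the abstract.

```python
from itertools import combinations
from math import factorial

# Feature names taken from the abstract; all numbers below are invented.
FEATURES = ["triglyceride", "Gtp", "hemoglobin"]


def toy_score(x):
    """Hypothetical smoker score (NOT the paper's model).

    Includes one interaction term so the attribution is not purely additive.
    """
    linear = 0.002 * x["triglyceride"] + 0.004 * x["Gtp"] + 0.03 * x["hemoglobin"]
    interaction = 0.00001 * x["triglyceride"] * x["Gtp"]
    return linear + interaction


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`.

    Features inside a coalition take the instance's value; the rest take the
    baseline's value. Cost grows as 2^n, so this only suits a few features.
    """
    n = len(FEATURES)

    def value(coalition):
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = set(subset)
                total += weight * (value(without_f | {f}) - value(without_f))
        phi[f] = total
    return phi


# Invented reference point and invented person, for illustration only.
baseline = {"triglyceride": 100.0, "Gtp": 25.0, "hemoglobin": 14.0}
person = {"triglyceride": 180.0, "Gtp": 60.0, "hemoglobin": 16.0}

phi = shapley_values(toy_score, person, baseline)

# Rank features by the magnitude of their contribution to this prediction.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the person's score and the baseline score, which is the property that makes such values usable as a per-feature breakdown of a prediction. For tree ensembles such as those named in the abstract, the SHAP library offers tree-specific algorithms that avoid this exponential enumeration.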