Determining Variables Associated with Annual Oil Palm Yield: An Explainable Gradient Boosting Approach

With the growing demand for Precision Agriculture (PA) in Indonesia, researchers have evaluated the use of Machine Learning for predicting oil palm yields and determining the variables that affect them. Previous studies showed that meteorological variables, including rainfall and wind speed, are associated with oil palm yields. In this research, the Extreme Gradient Boosting (XGBoost) model and SHapley Additive exPlanations (SHAP) were used to analyze the importance of 15 agrometeorological variables in predicting oil palm yield. The best model attained an RMSE of 1.911 and an R² of 0.855. Analysis of the weights and gains of the XGBoost model, along with the SHAP values, showed that the previous year's yield, the age and number of plants, the area of peat land, and meteorological parameters such as rainfall rates and the number of rainy days in the previous three years were important. The previous year’s yield in particular has the greatest weight and gain values according to the model, and the highest SHAP value among all input variables. However, the meteorological data used in this research are limited to rainfall rates and the number of rainy days. Future work can analyze a more diverse set of variables.
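
To make the idea behind the SHAP analysis concrete, the sketch below computes exact Shapley values by brute force for a tiny hand-written model. It is an illustration only, not the authors' pipeline and not the SHAP library: the feature names, the baseline and sample values, and the linear `toy_model` are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and toy values; none of this is data from the study.
FEATURES = ["prev_year_yield", "plant_age", "rainfall"]
BASELINE = {"prev_year_yield": 18.0, "plant_age": 10.0, "rainfall": 2000.0}
SAMPLE = {"prev_year_yield": 22.0, "plant_age": 12.0, "rainfall": 2400.0}


def toy_model(x):
    """Stand-in for a trained regressor (a hand-written linear rule)."""
    return 0.8 * x["prev_year_yield"] + 0.1 * x["plant_age"] + 0.001 * x["rainfall"]


def coalition_value(subset):
    """Model output when features in `subset` take the sample's values
    and every other feature stays at its baseline value."""
    x = {f: (SAMPLE[f] if f in subset else BASELINE[f]) for f in FEATURES}
    return toy_model(x)


def shapley_values():
    """Exact Shapley value of each feature for SAMPLE relative to BASELINE."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        # Enumerate every coalition of the remaining features.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(set(subset) | {f})
                without_f = coalition_value(set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


phi = shapley_values()
# Rank features by the magnitude of their attribution.
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the model's output for the sample and for the baseline, which is the additivity property that lets SHAP values be read as per-variable contributions to a prediction. Brute-force enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.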

Authors:
Gregorius Natanael Elwirehardja, Teddy Suparyanto, Miftakhurrokhmat, and Bens Pardamean

8th International Conference on Computer Science and Computational Intelligence, ICCSCI 2023
