Research Article Open Access

Random Forest and Extreme Gradient Boosting with Bayesian Hyperparameter Optimization for Landslide Susceptibility Mapping in Penang Island, Malaysia

Dorothy Anak Martin Atok1, Soo See Chai1, Kok Luong Goh2, Neha Gautam3 and Kim On Chin4
  • 1 Faculty of Computer Science and Information Technology, University of Malaysia Sarawak, Malaysia
  • 2 School of Science and Technology, International University College of Advanced Technology Sarawak, Malaysia
  • 3 Data Science Department of Jain University, Bangalore, India
  • 4 Faculty of Computing and Informatics, Universiti Malaysia Sabah (UMS), Sabah, Malaysia

Abstract

Landslide susceptibility models are often prone to overfitting and overestimation. This research aims to improve the predictive capabilities of the Extreme Gradient Boosting (XGBoost) and Random Forest (RF) algorithms by applying Bayesian Hyperparameter Optimization (BayesOpt). Penang Island, a region in Malaysia prone to frequent landslides, was chosen as the study area. Ten Landslide Conditioning Factors (LCFs), including elevation, slope angle, NDVI, and proximity to streams and roads, were derived using Geographic Information Systems (GIS). The 886 landslide and non-landslide data points were split 70:30 for training and testing, respectively. BayesOpt-RF was the top-performing model among those assessed, with an AUC of 99.50% for the Success Rate (SR) and 95.80% for the Prediction Rate (PR). It was followed by RF (SR: 100.00%, PR: 95.60%), XGBoost (SR: 100.00%, PR: 95.20%), and BayesOpt-XGBoost (SR: 96.70%, PR: 93.00%). While BayesOpt did not consistently improve prediction performance, it effectively minimized overfitting and ensured optimal model operation. The generated landslide susceptibility maps are significant for effective site selection, infrastructure planning, and disaster mitigation.
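The evaluation protocol summarized in the abstract can be illustrated with a minimal sketch. It assumes that the Success Rate is the AUC computed on the training portion and the Prediction Rate is the AUC computed on the testing portion. The labels and scores below are synthetic stand-ins rather than the study's data or model outputs, and the helper names `split_70_30` and `auc` are hypothetical.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle the samples and split them 70:30 into training and testing sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for 886 labeled points (assumption: 1 = landslide,
# 0 = non-landslide; the scores imitate a model's predicted susceptibility).
data_rng = random.Random(42)
points = []
for _ in range(886):
    label = data_rng.randint(0, 1)
    score = data_rng.gauss(0.7 if label else 0.3, 0.2)
    points.append((label, score))

train, test = split_70_30(points)
success_rate = auc([y for y, _ in train], [s for _, s in train])
prediction_rate = auc([y for y, _ in test], [s for _, s in test])

print(f"train={len(train)}, test={len(test)}")
print(f"SR AUC={success_rate:.3f}, PR AUC={prediction_rate:.3f}")
```

With 886 points, this split yields 620 training and 266 testing samples. In a real workflow the scores would come from a fitted model, and a success rate well above the prediction rate would be one sign of overfitting.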

Journal of Computer Science
Volume 21 No. 10, 2025, 2273-2291

DOI: https://doi.org/10.3844/jcssp.2025.2273.2291

Submitted On: 3 September 2024
Published On: 16 December 2025

How to Cite: Atok, D. A. M., Chai, S. S., Goh, K. L., Gautam, N. & Chin, K. O. (2025). Random Forest and Extreme Gradient Boosting with Bayesian Hyperparameter Optimization for Landslide Susceptibility Mapping in Penang Island, Malaysia. Journal of Computer Science, 21(10), 2273-2291. https://doi.org/10.3844/jcssp.2025.2273.2291

Keywords

  • Bayesian Hyperparameter Optimization
  • Extreme Gradient Boosting
  • Landslide Susceptibility Mapping
  • Random Forest