Day-Ahead Hourly Forecasting of Solar Radiation using a Physics-based Hybrid Machine Learning Model in Some Selected Locations in Nigeria
Keywords:
Stacked hybrid model, SARIMAX, LSTM, XGBoost
Abstract
Day-ahead hourly forecasting of Solar Radiation (SR) is crucial for optimizing renewable energy generation, grid stability, and storage management, especially for locations with high solar potential but low utilization. This study develops a stacked hybrid forecasting model that combines statistical and machine learning approaches to forecast day-ahead hourly SR across eight Nigerian locations. Relevant historical meteorological data spanning one to two years were used. The datasets were divided into five temporal scales: annual, pre-rainy (MAM), rainy (JJA), post-rainy (SON), and dry (DJF) seasons. Three baseline models, namely Seasonal Auto-regressive Integrated Moving Average with exogenous variables (SARIMAX), Long Short-Term Memory (LSTM) networks, and Extreme Gradient Boosting (XGBoost), were developed using 70% and 20% of the data in each temporal scale for training and testing, respectively; the remaining 10% was reserved for the final evaluation of the linear-regression stacked hybrid model. Model performance was evaluated using Root Mean Square Error (RMSE), Mean Bias Error (MBE), and Coefficient of Determination (R²). All six possible pairings of the models were subjected to the Diebold-Mariano (DM) test of significance, and the RMSE and DM results were then combined to rank the models. The results showed that while SARIMAX (R²: 0.60 to 0.88) captured seasonal patterns, it consistently under-performed against the machine learning models. LSTM excelled (R²: 0.76 to 0.92) on datasets with clearer temporal patterns, and XGBoost was more effective (R²: 0.75 to 0.90) under irregular or noisy conditions. The hybrid model demonstrated the most robust and reliable performance overall (R²: 0.68 to 0.92), with potential to reduce bias (MBE: -7.24 to 17.21 W/m²).
The models performed best in DJF (R²: 0.90 to 0.92) at most locations, whereas SR was over-predicted in MAM and under-predicted in SON at most locations, owing to higher cloud-cover fluctuations. XGBoost and Hybrid models delivered the best...
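As a rough illustration of the evaluation protocol summarized in the abstract, the following minimal Python sketch shows a chronological 70:20:10 split and the three reported metrics. The function names, the unshuffled time-ordered split, and the MBE sign convention (forecast minus observation, so positive values mean over-prediction) are assumptions made for this example; the abstract does not specify them.

```python
import math


def chronological_split(series, train_pct=70, test_pct=20):
    """Split a time-ordered sequence into train, test and hold-out parts.

    Assumption: the split is chronological (no shuffling); whatever is left
    after the train and test parts forms the final hold-out set.
    """
    n = len(series)
    i = n * train_pct // 100
    j = n * (train_pct + test_pct) // 100
    return series[:i], series[i:j], series[j:]


def rmse(obs, pred):
    """Root Mean Square Error, in the units of the data (e.g. W/m^2)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mbe(obs, pred):
    """Mean Bias Error; positive means over-prediction (assumed convention)."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly SR values in W/m^2, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 320.0, 400.0]

train, test, holdout = chronological_split(list(range(100)))
print(len(train), len(test), len(holdout))
print(rmse(observed, forecast), mbe(observed, forecast), r2(observed, forecast))
```

Integer percentages are used for the split points to avoid floating-point rounding at the boundaries.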
Copyright (c) 2026 Taofeek A. Otunla, Emmanuel O. Ayegboyin

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.