Hierarchical forecasting of COVID-19 cases in Africa using machine learning models

dc.contributor.author: Shoko, Claris
dc.contributor.author: Makatjane, Katleho
dc.contributor.author: Sigauke, Caston
dc.date.accessioned: 2026-04-07T13:15:06Z
dc.date.available: 2026-04-07T13:15:06Z
dc.date.issued: 2026
dc.description.abstract: Introduction: The COVID-19 pandemic posed significant challenges for public health systems, especially in Africa, where data scarcity, inadequate healthcare infrastructure, and regional disparities hindered effective forecasting and response efforts. Conventional forecasting methods have struggled to capture the complexity and detail needed for effective policy interventions at various administrative levels. This study addresses the challenge of producing accurate and coherent forecasts of COVID-19 cases within the hierarchical structure of Africa, which comprises the continental, regional, and national levels. Methods: We develop a forecasting framework that combines hierarchical time series forecasting (HTSF), using a bottom-up reconciliation approach, with machine learning algorithms. We employ extreme gradient boosting (XGBoost) and random forest models, then improve predictive accuracy with a weighted average ensemble. Forecasts are produced at the national level and then aggregated to ensure consistency across all hierarchical levels. The models are evaluated against conventional methods such as ARIMA and exponential smoothing. Results: Empirical findings indicate that XGBoost is the best of the single forecasting models used in this study. Combining the XGBoost and random forest forecasts, with more weight assigned to XGBoost, outperforms all other models in terms of mean absolute error, root mean square error, and mean absolute scaled error. The results further show that Southern Africa, despite its low population density, reported the highest number of cases, indicating underlying health vulnerabilities and socioeconomic factors. In summary, the bottom-up HTSF method, when combined with machine learning, is an effective forecasting tool in environments with limited data availability.
Discussion: It is advisable to apply similar models to other infectious diseases and to expand their use to guide health interventions, resource allocation, and early warning systems in future pandemics.
dc.identifier.citation: Shoko, C., Sigauke, C. and Makatjane, K., 2026. Hierarchical forecasting of COVID-19 cases in Africa using machine learning models. Frontiers in Epidemiology, 6, p.1696282.
dc.identifier.uri: https://doi.org/10.3389/fepid.2026.1696282
dc.identifier.uri: https://hdl.handle.net/10566/22176
dc.language.iso: en
dc.publisher: Frontiers in Epidemiology
dc.relation.ispartofseries: N/A
dc.subject: bottom-up reconciliation
dc.subject: ensemble
dc.subject: hierarchical time series
dc.subject: random forest
dc.subject: weighted average
dc.subject: XGBoost
dc.title: Hierarchical forecasting of COVID-19 cases in Africa using machine learning models
dc.type: Article
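
The abstract describes three mechanical steps: a weighted average of two model forecasts, bottom-up aggregation from national to regional and continental totals, and evaluation with MAE, RMSE, and MASE. The sketch below illustrates those steps in plain Python. It is not the authors' code: the hierarchy, the country codes, the function names, and the 0.7 weight on XGBoost are all illustrative assumptions (the abstract says only that XGBoost receives more weight), and MASE is scaled here by the in-sample error of a one-step naive forecast, which is a common convention rather than a detail stated in the record.

```python
from math import sqrt

# Hypothetical two-region hierarchy (illustrative only, not the paper's data).
HIERARCHY = {
    "Southern": ["ZAF", "BWA"],
    "Eastern": ["KEN", "ETH"],
}


def weighted_ensemble(f_xgb, f_rf, w_xgb=0.7):
    """Weighted average of two forecast paths; w_xgb=0.7 is an assumed weight."""
    return [w_xgb * a + (1.0 - w_xgb) * b for a, b in zip(f_xgb, f_rf)]


def bottom_up(country_forecasts, hierarchy):
    """Sum national forecasts into regional totals, then a continental total."""
    horizon = len(next(iter(country_forecasts.values())))
    regional = {
        region: [sum(country_forecasts[c][t] for c in members) for t in range(horizon)]
        for region, members in hierarchy.items()
    }
    continental = [sum(r[t] for r in regional.values()) for t in range(horizon)]
    return regional, continental


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of a one-step naive forecast (assumed scaling)."""
    naive_errors = [abs(train[i] - train[i - 1]) for i in range(1, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale
```

Because the regional and continental series are computed as sums of the national forecasts, the aggregated forecasts are coherent by construction, which is the property bottom-up reconciliation provides.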

Files

Original bundle

Name:
shoko_hierarchical_forecasting_of_2026.pdf
Size:
3.77 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description: