ORIGINAL RESEARCH
Incorporating Hybrid Prediction with
Feature Selection to Estimate Carbon
Emissions with Limited Data
1
State Grid Shanxi Electric Power Research Institute, Taiyuan, Shanxi 030000, China
Submission date: 2024-07-12
Final revision date: 2024-10-25
Acceptance date: 2024-12-29
Online publication date: 2025-03-17
Corresponding author
Weiru Wang
State Grid Shanxi Electric Power Research Institute, Taiyuan, Shanxi 030000, China
ABSTRACT
Carbon dioxide (CO2) emission forecasting is crucial for efficient carbon reduction management.
Most carbon emission prediction models are developed from limited data, which are often
collected annually and are spatially sparse, and hence suffer from overfitting and low robustness.
To estimate CO2 emissions reliably from small-scale data, we propose a CO2 prediction
framework that incorporates a hybrid predictor with feature selection. The hybrid predictor, formed
by a fractional-order Grey multivariate model (FGM) and an ensemble learning model, XGBoost, can
capture both linear and nonlinear variations of CO2 emissions, demonstrating strong predictive ability.
ReliefF is used for feature selection due to its ability to balance features’ importance and diversity,
which helps reduce model overfitting. The forecasting performance of the proposed framework is validated on
county-level CO2 emissions in Shanxi Province, China, from 2012 to 2022. The results show that the
proposed model outperforms other linear and machine learning prediction models, achieving
RMSE, MAE, and R² values of 1.73, 1.03, and 0.93, respectively. The likelihood
ratio (LR) test, the soundness test, and the heterogeneity test confirm the generalizability and
stability of the proposed hybrid model for CO2 emission prediction, as well as the effectiveness of
feature selection. Consequently, the prediction results of Shanxi’s CO2 emissions provide a reliable basis
for spatial correlation analysis using Moran’s Index.
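
For readers unfamiliar with the three accuracy measures reported above, the following minimal sketch shows how RMSE, MAE, and R² are conventionally computed from observed and predicted values. It uses the standard textbook definitions and is not taken from the study's code. The function names and the sample numbers are hypothetical and serve only as an illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only; these are NOT data from the study.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Lower RMSE and MAE indicate smaller prediction errors, and RMSE penalizes large individual errors more heavily than MAE does. An R² closer to 1 indicates that the predictions explain more of the variance in the observed emissions.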