Page 125 - Plateau Meteorology (《高原气象》), 2025, No. 6
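No. 6  LIU Jie et al.: Hail Forecasting and Key Feature Factor Analysis for the Plateau Region of Qinghai Province Based on Decision Tree Algorithms  1533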
Hail Forecasting and Key Feature Analysis in the Qinghai Plateau
using Decision Tree Algorithms
LIU Jie¹,², ZHANG Guojing¹,², WANG Xiaoying³, GUAN Qin⁴
(1. School of Computer Technology and Applications, Qinghai University, Xining 810016, Qinghai, China;
2. Qinghai Provincial Laboratory of Intelligent Computing and Applications, Qinghai University, Xining 810016, Qinghai, China;
3. School of Computer and Information Science, Qinghai Institute of Technology, Xining 810018, Qinghai, China;
4. Qinghai Institute of Meteorological Science, Xining 810001, Qinghai, China)
Abstract: Due to its unique geographical environment, Qinghai Province is highly susceptible to frequent hail
events. Considering the complex topography of high-altitude regions, particularly the Qinghai Plateau, this study
constructs a hail forecasting dataset by integrating hail observations from 52 meteorological stations in Qinghai
from 2009 to 2023, corresponding hail disaster records, and the ERA5 atmospheric reanalysis dataset. Based on
this dataset, three ensemble decision tree models (Random Forest, XGBoost, and LightGBM) are employed to
develop hail forecasting models, with separate analyses conducted on hail samples with diameters of ≥2 mm
and ≥5 mm. Experimental results demonstrate that the LightGBM model consistently outperforms both Random
Forest and XGBoost, with particularly superior performance in forecasting large hail events (diameter ≥5 mm).
Specifically, for small hail samples (diameter ≥2 mm), the LightGBM model achieves a hit rate of 0.923, a
false alarm rate of 0.041, a Critical Success Index (CSI) of 0.858, an accuracy of 0.946, and a recall rate of
0.924, while for large hail samples (diameter ≥5 mm), it attains a hit rate of 0.938, a false alarm rate of
0.038, a CSI of 0.908, an accuracy of 0.960, and a recall rate of 0.964. Further analysis of the hail forecasting
model in the complex terrain of the plateau reveals that the most influential meteorological factors for hail forecasting
in Qinghai Province include thermodynamic conditions (vertically integrated temperature p54.162, vertically
integrated thermal energy p60.162, and 2-meter dew point temperature d2m), characteristic height layer
conditions (100 hPa temperature t100, 400 hPa temperature t400, and 20 hPa geopotential height z20), and dynamic
conditions (500 hPa zonal wind component u500, 200 hPa meridional wind component v200, and 200
hPa zonal wind component u200). Kernel density estimation analysis indicates that most feature variables exhibit
limited separability, suggesting that no single factor alone can determine the occurrence of hail events. A case
study demonstrates that the LightGBM-based hail forecasting model exhibits strong spatial forecasting capabilities.
Analysis of the 24-hour evolution of key meteorological variables preceding a large-scale hail event at the
Chaka station identifies several crucial atmospheric indicators: (1) significant fluctuations in vertically integrated
temperature (p54.162), indicating intense convective activity; (2) persistently high 2-meter dew point temperature
(d2m), reflecting abundant near-surface moisture; (3) strong 500 hPa zonal wind speed (u500), suggesting
enhanced mid-level atmospheric dynamics; and (4) low 100 hPa temperature (t100), capturing upper-atmosphere
characteristics. The coordinated evolution of these atmospheric variables not only reveals key stages in
the development of severe convective weather systems but also provides a scientific foundation for improving
hail potential forecasting methods in Qinghai Province.
Key words: hail; forecasting; decision tree modeling; plateau region
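
The verification scores quoted in the abstract (hit rate, false alarm rate, CSI, accuracy) are usually derived from a 2x2 contingency table of forecast versus observed events. The following is a minimal sketch of that calculation under the standard textbook definitions; the abstract does not state which false alarm definition the authors use, so both the false alarm ratio (FAR) and the probability of false detection (POFD) are shown. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from math import nan


def contingency(observed, forecast):
    """Count hits, misses, false alarms and correct negatives.

    Both inputs are equal-length sequences of 0/1 labels (1 = hail).
    """
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    hits = misses = false_alarms = correct_negatives = 0
    for obs, fc in zip(observed, forecast):
        if obs and fc:
            hits += 1
        elif obs and not fc:
            misses += 1
        elif fc:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a score is undefined.
    return numerator / denominator if denominator else nan


def verification_scores(observed, forecast):
    """Standard dichotomous forecast scores (assumed definitions)."""
    h, m, f, c = contingency(observed, forecast)
    return {
        "pod": _ratio(h, h + m),          # hit rate, identical to recall
        "far": _ratio(f, h + f),          # false alarm ratio
        "pofd": _ratio(f, f + c),         # probability of false detection
        "csi": _ratio(h, h + m + f),      # critical success index
        "accuracy": _ratio(h + c, h + m + f + c),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    observed = [1, 1, 1, 0, 0, 0, 0, 1]
    forecast = [1, 1, 0, 0, 1, 0, 0, 1]
    for name, value in verification_scores(observed, forecast).items():
        print(f"{name}: {value:.3f}")
```

Because CSI ignores correct negatives, it is less inflated than accuracy when non-hail samples dominate, which is one common reason for reporting both.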

