TY - JOUR
T1 - A Two-Part Framework for Depth to Bedrock Prediction and Uncertainty Assessment in Sweden
AU - Lin, Yiqi
AU - Peterson, Gustaf
AU - Karlsson, Cecilia
AU - Westphal, Florian
AU - Lidberg, William
AU - Agren, Anneli M.
PY - 2026
Y1 - 2026
N2 - Accurate mapping of depth to bedrock (DTB) in complex post-glacial landscapes is challenging due to high spatial variability and the prevalence of bedrock outcrops, which introduce "structural zeros" that violate standard regression modelling assumptions. To address this, we developed a two-part machine learning framework that separates bedrock outcrop classification from continuous depth prediction and applied it to a Swedish case study. The binary classifier effectively distinguished outcrops from sediment-covered areas (AUC = 0.96, F1-score = 0.83), whereas the regression component provided reliable DTB estimates in non-outcrop areas (R² = 0.68, RMSE = 5.74 m). The final fused model (R² = 0.67, RMSE = 5.80 m) outperformed both the existing national Inverse Distance Weighting interpolation model (R² = 0.61, RMSE = 6.61 m) and a global model evaluated over the study area (R² = 0.23, RMSE = 9.03 m). The two-part model remains robust in data-sparse regions. However, a depth-stratified uncertainty analysis revealed miscalibration in the uncertainty estimates of the regression component: in shallow ranges (2-15 m), the model overestimates uncertainty and produces overly wide prediction intervals. In deep ranges (> 30 m), it underestimates uncertainty and systematically underpredicts depth (mean error = 12.44 m). Our findings emphasize that zero-inflated datasets require special consideration in modelling approaches, and that depth-stratified evaluation is essential for understanding model reliability.
AB - Accurate mapping of depth to bedrock (DTB) in complex post-glacial landscapes is challenging due to high spatial variability and the prevalence of bedrock outcrops, which introduce "structural zeros" that violate standard regression modelling assumptions. To address this, we developed a two-part machine learning framework that separates bedrock outcrop classification from continuous depth prediction and applied it to a Swedish case study. The binary classifier effectively distinguished outcrops from sediment-covered areas (AUC = 0.96, F1-score = 0.83), whereas the regression component provided reliable DTB estimates in non-outcrop areas (R² = 0.68, RMSE = 5.74 m). The final fused model (R² = 0.67, RMSE = 5.80 m) outperformed both the existing national Inverse Distance Weighting interpolation model (R² = 0.61, RMSE = 6.61 m) and a global model evaluated over the study area (R² = 0.23, RMSE = 9.03 m). The two-part model remains robust in data-sparse regions. However, a depth-stratified uncertainty analysis revealed miscalibration in the uncertainty estimates of the regression component: in shallow ranges (2-15 m), the model overestimates uncertainty and produces overly wide prediction intervals. In deep ranges (> 30 m), it underestimates uncertainty and systematically underpredicts depth (mean error = 12.44 m). Our findings emphasize that zero-inflated datasets require special consideration in modelling approaches, and that depth-stratified evaluation is essential for understanding model reliability.
KW - depth to bedrock
KW - digital soil mapping
KW - quantile regression forest
KW - two-part model
KW - uncertainty visualization
KW - zero-inflated data
UR - https://res.slu.se/id/publ/146768
U2 - 10.1111/ejss.70308
DO - 10.1111/ejss.70308
M3 - Journal article
SN - 1351-0754
VL - 77
JO - European Journal of Soil Science
JF - European Journal of Soil Science
IS - 2
M1 - e70308
ER -