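Affiliations: [1] Key Laboratory of Tectonics and Petroleum Resources, Ministry of Education, China University of Geosciences (Wuhan), Wuhan 430074; [2] School of Earth Resources, China University of Geosciences (Wuhan), Wuhan 430074
Source: Bulletin of Geological Science and Technology (《地质科技通报》), 2025, No. 1, pp. 321-331 (11 pages)
Funding: National Key Research and Development Program of China, project "Big-data intelligent analysis technology and platform construction for risk prediction in the precise development of geological resources" (2022YFF0801200).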
Abstract: [Objective] Well logging is a key technique for determining subsurface lithology, strata, and geological fluids, and it plays a pivotal role in the petroleum exploration industry. However, instrument damage, wellbore conditions, and similar factors frequently cause gaps in logging data or incomplete curves. Traditional multiple linear regression and empirical formulas cannot build a reasonable relationship model among well logging curves, so their reconstruction accuracy is relatively low. Machine learning algorithms can find a suitable mapping within large amounts of data and thereby improve model accuracy, but the black-box models they produce are hard to explain. Recently, interpretability algorithms have made the use of machine learning for log curve reconstruction better justified. [Methods] In this work, support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost) are compared with traditional multiple linear regression (LR) in reconstructing the acoustic logging curve of well 22-30b-11 of the UK energy authority (NDR), and the XGBoost model is interpreted with the Shapley additive explanations (SHAP) algorithm. [Results] On the test set, XGBoost achieves a coefficient of determination (R²) of 0.996 and a mean squared error (MSE) of 6.371, outperforming SVR (R² of 0.990, MSE of 15.755) and RF (R² of 0.993, MSE of 9.871). In contrast, LR yields an R² of 0.969 and an MSE of 48.895, indicating that XGBoost reconstructs the acoustic time difference curve with higher accuracy and better generalization performance. The SHAP explanation of the XGBoost black-box model shows that, in selecting important features during model construction, the XGBoost model's reliance on formation temperature data is clearly more reasonable than the caliper log data relied on by multiple linear regression. Finally, the model is interpreted with SHAP at the single-point level and for global feature interactions. [Conclusion] The results show that machine learning algorithms are significantly better than traditional multiple linear regression for acoustic logging curve reconstruction, and they demonstrate the feasibility of the SHAP algorithm for interpreting machine learning models of acoustic logging curve reconstruction, offering a new approach for the further development of machine learning in well log interpretation.
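Keywords: well logging curve reconstruction; machine learning; model interpretation; SHAP algorithm; acoustic logging
Classification code: P631.814 [Astronomy and Earth Sciences - Geological and Mineral Exploration]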
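The abstract reports R² and MSE scores and a SHAP interpretation without spelling out either computation. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. The function `toy_model`, its coefficients, and the sample feature values are invented for illustration, and the Shapley computation uses a single reference point as the baseline, whereas the SHAP library averages over background data and uses a tree-specific algorithm for models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def mse(y_true, y_pred):
    """Mean squared error between measured and reconstructed values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical stand-in for a trained regressor (invented coefficients).

    Inputs: gamma ray, bulk density, formation temperature.
    Output: an acoustic-transit-time-like value.
    """
    gamma_ray, density, temperature = z
    return 50.0 + 0.2 * gamma_ray - 10.0 * density + 0.5 * temperature \
        + 0.01 * gamma_ray * temperature


# Invented sample point and reference point, for illustration only.
x = [80.0, 2.4, 90.0]
baseline = [60.0, 2.5, 70.0]
phi = shapley_values(toy_model, x, baseline)

# Additivity: the contributions sum to the prediction shift from the baseline.
shift = toy_model(x) - toy_model(baseline)
```

The additivity check at the end is what makes a single-point explanation readable: each feature's contribution is a signed share of the difference between the prediction and the reference prediction.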