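1. School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
2. Technology Center, China Tobacco Chongqing Industrial Co., Ltd., Chongqing 400060, China
SHAO Li-min, PhD, Associate Professor; research interests: development and application of chemometric methods; E-mail: lshao@ustc.edu.cn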
PAN Zheng-hao, WANG Peng, CHEN Kun-yan, et al. Study on the Optimization Limit of Near-infrared Spectroscopy Calibration Model by Variable Selection Methods[J]. Journal of Instrumental Analysis, 2023, 42(12): 1659-1665. DOI: 10.19969/j.fxcsxb.23082802.
Variable selection is often used to optimize linear calibration models for near-infrared (NIR) spectroscopy, eliminating redundant information and improving the accuracy and interpretability of the regression. In this paper, a Monte Carlo-based method is designed to evaluate the best performance that different linear calibration methods (such as partial least squares regression, PLSR) can achieve in the subspaces produced by variable selection, and thereby to find the optimization limit of variable selection for linear calibration models. By obtaining the distribution of the validation metric, the root mean square error of prediction (RMSEP), the method reveals both the optimization effect and the optimization limit of variable selection on a dataset. The method was applied to NIR spectroscopic modeling of three sample sets. On the tobacco-pectin dataset, the optimizable rate was about 24.98% and RMSEP was reduced by 15.2%; on the wheat-protein dataset, the optimizable rate was about 13.90% and RMSEP was reduced by 9.5%; on the corn-starch dataset, the optimizable rate was about 14.05% and RMSEP was reduced by 57.1%. The method quickly yields the optimization limit of variable selection for a given model and provides a reference for the design, application and evaluation of variable selection methods.
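The Monte Carlo idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the sampling scheme or the formulas, so the sketch assumes that variable subsets are drawn uniformly at random, that the "optimizable rate" is the fraction of random subsets whose RMSEP is lower than that of the full-spectrum model, and that the RMSEP reduction is the relative improvement of the best subset found. The function names (`pls1_fit`, `rmsep`, `monte_carlo_rmsep`, `summarize`) and the synthetic data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit single-response PLS by NIPALS; return (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean          # centered residual matrices
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr                         # weight vector
        norm = np.linalg.norm(w)
        if norm < 1e-12:                      # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                            # scores
        tt = t @ t
        p = Xr.T @ t / tt                     # X loadings
        c = (yr @ t) / tt                     # y loading
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - c * t                       # deflate y
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)    # regression coefficients
    return beta, x_mean, y_mean

def rmsep(X_cal, y_cal, X_val, y_val, cols, n_comp):
    """RMSEP on the validation set using only the selected columns."""
    beta, xm, ym = pls1_fit(X_cal[:, cols], y_cal, n_comp)
    pred = (X_val[:, cols] - xm) @ beta + ym
    return float(np.sqrt(np.mean((y_val - pred) ** 2)))

def monte_carlo_rmsep(X_cal, y_cal, X_val, y_val, n_draws=300, n_comp=3, seed=0):
    """Return the full-spectrum RMSEP and the RMSEP of random variable subsets."""
    rng = np.random.default_rng(seed)
    n_vars = X_cal.shape[1]
    full = rmsep(X_cal, y_cal, X_val, y_val, np.arange(n_vars), n_comp)
    errors = np.empty(n_draws)
    for i in range(n_draws):
        k = int(rng.integers(n_comp, n_vars + 1))        # random subset size
        cols = rng.choice(n_vars, size=k, replace=False)  # random subset
        errors[i] = rmsep(X_cal, y_cal, X_val, y_val, cols, n_comp)
    return full, errors

def summarize(full, errors):
    """Assumed definitions: share of subsets beating the full model, best relative gain."""
    rate = float(np.mean(errors < full))
    reduction = float(max(0.0, (full - errors.min()) / full))
    return rate, reduction

# Synthetic stand-in for spectra: only the first 5 of 40 variables carry signal.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(90, 40))
y = X[:, :5] @ np.array([1.0, -0.5, 0.8, 0.3, -1.2]) + 0.1 * data_rng.normal(size=90)
full, errors = monte_carlo_rmsep(X[:60], y[:60], X[60:], y[60:])
rate, reduction = summarize(full, errors)
print(f"full RMSEP={full:.3f}, optimizable rate={rate:.1%}, best reduction={reduction:.1%}")
```

A histogram of `errors` with a vertical line at `full` would correspond to the RMSEP distribution plot mentioned in the abstract. With real spectra, the number of draws and the number of PLS components would need to be chosen for the dataset at hand.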
Keywords: chemometrics; near-infrared spectroscopy; chemical calibration; linear model; variable selection; Monte Carlo method