School of Forensic Science and Technology, Criminal Investigation Police University of China, Shenyang 110854, Liaoning, China
Received: 28 October 2024
Revised: 16 December 2024
Accepted: 26 December 2024
Published Online: 16 May 2025
Published: 15 June 2025
ZHANG Wen-ji, LI Hao-yang, DING Hai-yuan, HAN Qi-rui, SONG Hui, LUO Ying-chao. Blood Species Identification Method and Model Evaluation Based on Volatile Organic Compounds Testing [J]. Journal of Instrumental Analysis (张文骥,李昊洋,丁海媛,韩祺瑞,宋辉,罗颖超. 基于挥发性有机物检验的血液种属鉴别方法与模型评估. 分析测试学报), 2025, 44(06): 1-8. DOI: 10.12452/j.fxcsxb.241028489.
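To extend the application of volatilomics to blood species identification (BSI), this study measured the volatile organic compounds (VOCs) in blood from multiple species, built machine learning classification models, and evaluated their performance. VOCs in the blood of eight common species were detected by headspace solid-phase microextraction/gas chromatography-mass spectrometry (HS-SPME/GC-MS). Potential markers were screened by partial least squares discriminant analysis (PLS-DA) and orthogonal partial least squares discriminant analysis (OPLS-DA). One-vs-rest classification models were built with nine common machine learning classification algorithms, followed by benchmarking, hyperparameter tuning, and algorithm performance evaluation; the effects of the resampling method and of marker variable selection were also assessed. A total of 17 potential markers related to the species characteristics of blood from humans and seven different animals were screened. Benchmarking identified K-nearest neighbor (KNN) as the best algorithm, and the optimal hyperparameter combination of the classification model was: K = 5, the triweight function as the distance-weighting kernel, and a Minkowski distance parameter p of 0.324 0. On the validation set, the accuracy, area under the curve, and Brier score of the optimal model were 0.928 4, 0.997 0, and 0.057 6, respectively. The results of the model using all component variables did not differ significantly from those of the model using only the potential marker variables (t-test, p > 0.05), nor did the results of models using different resampling methods (t-test, p > 0.05). The study indicates that volatilomics analysis has great potential for blood species identification, with highly reliable potential markers, high model accuracy, and strong resistance to interference.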
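As an illustration of the reported KNN configuration (K = 5, triweight distance-weighting kernel, Minkowski parameter p = 0.324 0), the following is a minimal Python sketch, not the authors' implementation. It assumes a scheme in which the distances to the K nearest neighbours are scaled by the distance to the (K+1)-th neighbour before the kernel is applied; the abstract does not state which software or scaling was used. With p < 1 the Minkowski expression is no longer a true metric, but it can still rank neighbours. All function names, feature values, and labels below are hypothetical.

```python
import numpy as np

def minkowski(X, x, p):
    """Minkowski-type distance from every row of X to the vector x."""
    return np.sum(np.abs(X - x) ** p, axis=1) ** (1.0 / p)

def triweight(u):
    """Triweight kernel for scaled distances u in [0, 1]."""
    u = np.clip(u, 0.0, 1.0)
    return (35.0 / 32.0) * (1.0 - u ** 2) ** 3

def weighted_knn_predict(X_train, y_train, x, k=5, p=0.324):
    """Return (predicted label, class scores normalised to sum to 1)."""
    if len(X_train) <= k:
        raise ValueError("need more than k training samples")
    dist = minkowski(X_train, x, p)
    order = np.argsort(dist)[: k + 1]   # k neighbours plus one used for scaling
    scale = dist[order[-1]]             # assumed scaling: (k+1)-th neighbour distance
    nearest = order[:k]
    weights = triweight(dist[nearest] / scale) if scale > 0 else np.ones(k)
    if weights.sum() == 0:              # all neighbours tied with the scale: plain vote
        weights = np.ones(k)
    classes = np.unique(y_train)
    scores = np.array([weights[y_train[nearest] == c].sum() for c in classes])
    return classes[np.argmax(scores)], scores / scores.sum()

# Hypothetical toy data: two "VOC" features, two well-separated species.
X_train = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
                    [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3]])
y_train = np.array(["human"] * 4 + ["pig"] * 4)

label, probs = weighted_knn_predict(X_train, y_train, np.array([0.15, 0.15]))
print(label, probs)
```

In this toy case the four nearby "human" samples receive weights close to the kernel maximum, while the single "pig" sample among the five neighbours lies near the scaling distance and contributes almost nothing to the vote.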
Blood species identification (BSI) plays a significant role in criminal investigation, import and export inspection, animal protection, and other fields. Volatilomics analysis of blood volatile organic compounds (VOCs) is a novel approach to blood species identification. To screen potential biomarkers of blood from different species, the study established multiple machine learning (ML) classification models and compared their predictive value for blood species identification. Headspace solid-phase microextraction (HS-SPME) coupled with gas chromatography-mass spectrometry (GC-MS) was used to analyze VOCs in the blood of eight common species. Partial least squares discriminant analysis (PLS-DA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were employed to screen potential biomarkers. Samples were randomly divided into training and testing sets at a ratio of 7:3. Nine common classification models were established, and the best algorithm was selected by comparing all models and then optimized. A model using all VOCs as variables was constructed to verify the reliability of the potential biomarkers, and different resampling methods were used to assess the impact of the training/testing set division on the model. A total of 17 VOCs related to the species characteristics of blood from humans and seven different animals were screened. The accuracies of the multi-layer perceptron, naive Bayes, multinomial logistic regression, K-nearest neighbor (KNN), Gaussian-kernel support vector machine, polynomial-kernel support vector machine, decision tree, random forest, and extreme gradient boosting tree models were 0.859 7, 0.575 1, 0.859 7, 0.942 1, 0.815 0, 0.734 2, 0.842 9, 0.923 1, and 0.872 9, respectively. Among them, the accuracy, area under the receiver operating characteristic curve (AUC), and Brier score of the KNN model on the testing set were 0.918 4, 0.999 0, and 0.037 6, respectively. KNN was selected as the optimal algorithm, and the best model's hyperparameter combination was: a K value of 5, the triweight function as the distance-weighting kernel, and a Minkowski distance parameter p of 0.324 0. The best-performing model achieved an accuracy of 0.928 4, an AUC of 0.997 0, and a Brier score of 0.057 6 on the validation set. There was no significant difference between the results of the model using all component variables and those of the model using only the potential biomarker variables (t-test, p > 0.05), and no significant difference among the results of models using different resampling methods (t-test, p > 0.05). Volatilomics analysis shows great potential for blood species identification, with highly reliable potential biomarkers, high model accuracy, and strong resistance to interference.
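The evaluation steps named in the abstract (a random 7:3 split, accuracy, and the Brier score) can be sketched as follows. This is a hedged illustration using NumPy only and is not the authors' pipeline. The Brier score here is assumed to be the original multiclass form, the mean over samples of the summed squared differences between one-hot labels and predicted probabilities; some software reports half of this value, and the abstract does not say which convention was used. AUC is omitted for brevity. All names, labels, and probabilities below are hypothetical.

```python
import numpy as np

def random_split(n_samples, train_frac=0.7, seed=0):
    """Randomly split sample indices into training and testing parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    cut = int(round(train_frac * n_samples))
    return idx[:cut], idx[cut:]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def brier_score(y_true, proba, classes):
    """Multiclass Brier score (unhalved form); lower is better."""
    onehot = (np.asarray(y_true)[:, None] == np.asarray(classes)[None, :]).astype(float)
    return float(np.mean(np.sum((onehot - np.asarray(proba)) ** 2, axis=1)))

# 7:3 split of ten hypothetical samples.
train_idx, test_idx = random_split(10)

# Hypothetical predicted probabilities for three test samples and two classes.
classes = np.array(["human", "pig"])
y_true = np.array(["human", "pig", "human"])
proba = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.4, 0.6]])
y_pred = classes[np.argmax(proba, axis=1)]

acc = accuracy(y_true, y_pred)
bs = brier_score(y_true, proba, classes)
print(len(train_idx), len(test_idx), acc, bs)
```

Accuracy only counts whether the top-ranked class is right, whereas the Brier score also penalises poorly calibrated probabilities, which is why the two are usually reported together.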