[Construction and preliminary validation of machine learning predictive models for cervical cancer screening based on human DNA methylation].

  • Additional Information
    • Source:
      Publisher: Chinese Medical Association; Country of Publication: China; NLM ID: 7910681; Publication Model: Print; Cited Medium: Print; ISSN: 0253-3766 (Print); Linking ISSN: 02533766; NLM ISO Abbreviation: Zhonghua Zhong Liu Za Zhi; Subsets: MEDLINE
    • Publication Information:
      Publication: Peking : Chinese Medical Association
      Original Publication: Beijing, Zhonghua yi xue hui.
    • Subject Terms:
    • Abstract:
      Objective: To construct machine learning predictive models for screening for cervical cancer and precancerous lesions using the methylation characteristics of human genes.
      Methods: Human DNA methylation testing was performed on 224 cervical exfoliated cell specimens collected between April 2014 and March 2015 from the Cancer Hospital of the Chinese Academy of Medical Sciences, Tianjin Central Hospital of Gynecology Obstetrics, Xinmi Maternal and Child Health Hospital of Henan Province, West China Second Affiliated Hospital of Sichuan University, and Heping Hospital Affiliated to Changzhi Medical College. Hypermethylated gene fragments associated with cervical cancer were selected by screening for CpG high-density, high-association, hypermethylated gene fragments combined with the LASSO regression algorithm. With cervical intraepithelial neoplasia grade 2 (CIN2) or more severe lesions as the study outcome, three machine learning predictive models were constructed, based on the random forest (RF), naive Bayes (NB), and support vector machine (SVM) algorithms, respectively. A total of 144 outpatient specimens served as the training set, and 80 cervical exfoliated cell specimens from women participating in a cervical cancer screening program served as the validation set. With histological diagnosis as the gold standard, the detection performance of the three models for CIN2 or more severe lesions was compared with that of human papillomavirus (HPV) testing and cytological diagnosis.
      Results: In the training set of 144 cases, 34 were HPV-positive (positive rate 23.61%). Cytologically, 37 cases were diagnosed as negative for intraepithelial lesion or malignancy (NILM) and 107 as atypical squamous cells of undetermined significance (ASC-US) or above. Histologically, there were 28 cases with no cervical intraepithelial neoplasia or with benign cervical lesions, 31 cases of CIN1, 18 cases of CIN2, 31 cases of CIN3, and 36 cases of squamous cell carcinoma. Seven hypermethylated gene fragments were selected from 45 genes, and three machine learning predictive models, based on the RF, NB, and SVM algorithms, were constructed. In the validation set of 80 cases, 28 were HPV-positive (positive rate 35.00%). Cytologically, 65 cases were diagnosed as NILM and 15 as ASC-US or above. Histologically, there were 39 cases with no cervical intraepithelial neoplasia or with benign cervical lesions, 10 cases of CIN1, 10 cases of CIN2, 11 cases of CIN3, and 10 cases of squamous cell carcinoma. In the validation set, the area under the receiver operating characteristic curve (AUC) for detecting CIN2 or above was 0.90 for the RF model, 0.88 for the NB model, 0.82 for the SVM model, 0.68 for HPV testing, and 0.45 for cytological diagnosis. The DeLong test showed no statistically significant differences in AUC among the RF, NB, and SVM models (all P >0.05); the AUCs of the RF and NB models were higher than that of HPV testing (both P <0.01), and the AUCs of the RF, NB, and SVM models were higher than that of cytological diagnosis (all P <0.01). The RF and NB models had similar sensitivity (80.65% vs. 77.42%), but the specificity of the NB model was much higher than that of the RF model (93.88% vs. 73.47%).
      Conclusion: Among the machine learning predictive models for cervical cancer and precancerous lesions constructed from human DNA methylation, the NB model showed good predictive performance for CIN2 or more severe lesions and may be useful for screening for cervical cancer and precancerous lesions.
    • Grant Information:
      81973136 National Natural Science Foundation of China; 2021-I2M-1-004 CAMS Innovation Fund for Medical Sciences; 21YYJC3520 Sichuan Provincial Science and Technology Program Applied Basic Research Program
    • Contributed Indexing:
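      Local Abstract: [Publisher, translated from Chinese] Objective: To use the methylation characteristics of human genes to construct machine learning predictive models for predicting cervical cancer and precancerous lesions. Methods: Human DNA methylation testing was performed on 224 cervical exfoliated cell specimens collected from April 2014 to March 2015 at the Cancer Hospital of the Chinese Academy of Medical Sciences, Tianjin Central Hospital of Gynecology Obstetrics, Xinmi Maternal and Child Health Hospital of Henan Province, West China Second Affiliated Hospital of Sichuan University, and Heping Hospital Affiliated to Changzhi Medical College in Shanxi. Hypermethylated gene fragments associated with cervical cancer lesions were selected through screening for CpG high-density, high-association, hypermethylated gene fragments and the LASSO regression algorithm. With cervical intraepithelial neoplasia grade 2 (CIN2) or more severe lesions as the study outcome, 144 outpatient specimens were used as the training set to construct three machine learning predictive models, random forest (RF), naive Bayes (NB), and support vector machine (SVM), and 80 cervical exfoliated cell specimens from women participating in a cervical cancer screening program were used as the validation set to validate the models. With histological diagnosis as the gold standard, the detection performance of the three machine learning predictive models for CIN2 or more severe lesions was compared with that of HPV testing and cytological diagnosis. Results: Among the 144 training-set cases, 34 were HPV-positive, a positive rate of 23.61%. Cytology diagnosed 37 cases as negative for intraepithelial lesion or malignancy (NILM) and 107 cases as atypical squamous cells of undetermined significance (ASC-US) or above. Histology diagnosed 28 cases with no cervical intraepithelial lesion or with benign cervical lesions, 31 cases of CIN1, 18 cases of CIN2, 31 cases of CIN3, and 36 cases of squamous cell carcinoma. Seven hypermethylated gene fragments were selected from 45 genes, and the RF, NB, and SVM machine learning predictive models were constructed. Among the 80 validation-set cases, 28 were HPV-positive, a positive rate of 35.00%. Cytology diagnosed 65 cases as NILM and 15 cases as ASC-US or above. Histology diagnosed 39 cases with no cervical intraepithelial lesion or with benign cervical lesions, 10 cases of CIN1, 10 cases of CIN2, 11 cases of CIN3, and 10 cases of squamous cell carcinoma. In the validation set, the areas under the receiver operating characteristic curve (AUC) of the RF model, NB model, SVM model, HPV testing, and cytological diagnosis for CIN2 or more severe lesions were 0.90, 0.88, 0.82, 0.68, and 0.45, respectively. The DeLong test showed no statistically significant difference in AUC among the RF, NB, and SVM models (all pairwise comparisons P >0.05); the AUCs of the RF and NB models were higher than that of HPV testing (both P <0.01), and the AUCs of the RF, NB, and SVM models were higher than that of cytological diagnosis (all P <0.01). The RF and NB models had similar sensitivity (80.65% and 77.42%, respectively), but the specificity of the NB model was much higher than that of the RF model (93.88% and 73.47%, respectively). Conclusion: Among the machine learning predictive models for cervical cancer and precancerous lesions constructed from human DNA methylation, the NB model has good predictive performance for CIN2 or more severe lesions and may be usable for screening women for cervical cancer and precancerous lesions.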
    • Entry Dates:
      Date Created: 20250212 Date Completed: 20250507 Latest Revision: 20250507
    • Publication Date:
      20250508
    • DOI:
      10.3760/cma.j.cn112152-20230925-00156
    • Accession Number:
      39939021
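
The sensitivity and specificity figures in the abstract can be cross-checked against the validation-set histology counts it reports. The following is a minimal sketch, not the authors' analysis: the function name `matching_counts` is hypothetical, and the per-model true-positive and true-negative counts it recovers are back-calculated from the reported percentages rather than stated in the abstract.

```python
# Histology counts for the 80-case validation set, as reported in the abstract.
validation_histology = {
    "normal_or_benign": 39,
    "CIN1": 10,
    "CIN2": 10,
    "CIN3": 11,
    "SCC": 10,
}

# The study outcome is CIN2 or more severe lesions (CIN2+).
positive_grades = {"CIN2", "CIN3", "SCC"}
n_pos = sum(n for grade, n in validation_histology.items() if grade in positive_grades)
n_neg = sum(validation_histology.values()) - n_pos


def matching_counts(reported_percent, denominator):
    """Integer counts k whose percentage k/denominator rounds to the reported value.

    Hypothetical helper: it back-calculates counts that the abstract does not state.
    """
    return [
        k
        for k in range(denominator + 1)
        if abs(100 * k / denominator - reported_percent) < 0.005
    ]


# Reported sensitivities (denominator: CIN2+ cases) and specificities (denominator: <CIN2 cases).
rf_true_positives = matching_counts(80.65, n_pos)
nb_true_positives = matching_counts(77.42, n_pos)
nb_true_negatives = matching_counts(93.88, n_neg)
rf_true_negatives = matching_counts(73.47, n_neg)

print(n_pos, n_neg)
print(rf_true_positives, nb_true_positives, nb_true_negatives, rf_true_negatives)
```

Under these assumptions each reported percentage corresponds to exactly one whole-number count, which suggests the figures are internally consistent with 31 CIN2+ and 49 non-CIN2+ validation cases.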