Abstract:
Objective: This study examines the risk of venous thromboembolism (VTE) in patients with gastric or esophageal cancer (GC/EC) and investigates the risk factors for VTE in this population. Using machine learning techniques, it aims to develop an interpretable VTE risk prediction model that identifies GC/EC patients at high risk of VTE early in clinical practice, thereby enabling precise anticoagulant prophylaxis and thrombus management.
Methods: This real-world study aimed to predict VTE in patients with GC/EC. Data were collected from inpatients diagnosed with GC/EC at Sichuan Provincial People’s Hospital between 1 January 2018 and 31 June 2023. Nine supervised learning algorithms were used to develop 576 prediction models based on 56 available variables. Simplified models were then built using the top 12 feature variables from the best-performing model. The primary metric for assessing predictive performance was the area under the ROC curve (AUC). In addition, the training data used to construct the best model were used to externally validate several existing assessment models: the Padua, Caprini, Khorana, and COMPASS-CAT scores.
Results: A total of 3,742 GC/EC cases were collected after excluding duplicate visit records. The study included 861 (23.0%) patients, of whom 124 (14.4%) developed VTE. The top five full-variable models by AUC were GBoost (0.9646), Logistic Regression (0.9443), AdaBoost (0.9382), CatBoost (0.9354), and XGBoost (0.8097). The top five simplified models were Simp-CatBoost (0.8811), Simp-GBoost (0.8771), Simp-Random Forest (0.8736), Simp-AdaBoost (0.8263), and Simp-Logistic Regression (0.8090). After weighing predictive performance against practicality, Simp-GBoost was selected as the best model for this study.
External validation of the Padua score, Caprini score, Khorana ...
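For readers unfamiliar with the primary metric, the following is a minimal sketch of how an AUC can be computed from predicted risk scores. It is not the study's code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration. The sketch uses the pairwise interpretation of AUC, that is, the probability that a randomly chosen VTE case receives a higher score than a randomly chosen non-VTE case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 outcomes (1 = event, e.g. VTE).
    scores: iterable of predicted risk scores, higher = higher risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; for a cohort of several hundred patients a rank-based library routine would normally be used instead.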