The CT-based intratumoral and peritumoral machine learning radiomics analysis in predicting lymph node metastasis in rectal carcinoma | BMC Gastroenterology

Patients enrollment

This retrospective study was approved by the Medical Ethics Committee of our hospital (No. 2021QT339), and the requirement for informed consent was waived. After searching the surgical database of our hospital, a cohort of 788 patients with histopathologically confirmed rectal carcinoma from January 2015 to January 2021 was enrolled in this study. The inclusion criteria were: lesions located in the rectum or at the junction between the rectum and sigmoid colon; a histopathological diagnosis of classical adenocarcinoma, signet-ring cell carcinoma, or mucinous carcinoma; a triphasic CT examination; and surgery within two weeks after the CT examination. The exclusion criteria were: a history of metachronous or recurrent malignancy; chemotherapy or radiation therapy before surgery; and lesions located in the ascending, descending, or sigmoid colon.

The general technical workflow is illustrated in Fig. 1. The final cohort, comprising 303 RCs with LNM and 485 RCs without LNM (non-LNM), was randomly divided into a training group (212 LNM and 339 non-LNM) and a validation group (91 LNM and 146 non-LNM) at a ratio of 7:3.
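The group sizes reported above can be reproduced with a class-wise random split. The following is a minimal sketch, assuming the split was stratified by LNM status (the reported counts are consistent with this, but the text does not state it); the function name and the seed are illustrative.

```python
import random

def stratified_split(labels, num=7, den=10, seed=0):
    """Split indices per class so that about num/den of each class goes to training."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = len(idx) * num // den  # floor of 70% of this class
        train_idx.extend(idx[:n_train])
        valid_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(valid_idx)

# 1 = LNM, 0 = non-LNM; class counts taken from the text (303 and 485).
labels = [1] * 303 + [0] * 485
train_idx, valid_idx = stratified_split(labels)
train_lnm = sum(labels[i] for i in train_idx)
valid_lnm = sum(labels[i] for i in valid_idx)
print(len(train_idx), train_lnm, len(valid_idx), valid_lnm)  # 551 212 237 91
```

Which patients land in each group depends on the seed, but the group sizes do not.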

Fig. 1

The general technical workflow of this study

CT examination

All patients underwent triphasic CT examinations on a 64- or 128-slice CT scanner (Somatom Definition AS, Siemens, Germany) with the same parameters: tube voltage 120 kV, tube current 200 mA, collimation 64 × 0.625, field of view 360 mm, rotation time 0.75 s, and slice thickness and interval 5 mm. The triphasic CT examination, comprising unenhanced, arterial, and venous phases, was performed using computer-aided bolus tracking, with contrast medium (1.3 mL/kg iomeprol 350, 3.0 mL/s) injected via an elbow vein. The arterial and venous phases were acquired 35 s and 60 s after the unenhanced phase, respectively.
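As a worked illustration of the weight-based injection protocol, the sketch below converts the stated dose (1.3 mL/kg) and flow rate (3.0 mL/s) into a contrast volume and injection duration. The 70 kg body weight is a hypothetical value chosen for the example, not a figure from the study.

```python
def contrast_injection(weight_kg, dose_ml_per_kg=1.3, rate_ml_per_s=3.0):
    """Return (volume in mL, injection duration in s) for a weight-based dose."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    volume_ml = dose_ml_per_kg * weight_kg
    duration_s = volume_ml / rate_ml_per_s
    return volume_ml, duration_s

# Hypothetical 70 kg patient (illustrative only).
volume_ml, duration_s = contrast_injection(70.0)
print(f"{volume_ml:.1f} mL over {duration_s:.1f} s")  # 91.0 mL over 30.3 s
```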

Clinical characteristics

LNM status was diagnosed histopathologically according to the American Joint Committee on Cancer TNM staging system and the ESMO Clinical Practice Guideline for the diagnosis of colon cancer [13]. One or more positive regional lymph nodes was regarded as LNM; the absence of positive regional lymph nodes was classified as non-LNM. The clinical characteristics included gender, age, long diameter, location (low, medium, or high, according to a lesion distance from the anal margin of within 5 cm, between 5 and 10 cm, or more than 10 cm, respectively), perineural invasion (PNI), extramural venous invasion (EMVI), microsatellite instability (MSI), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), and history of diabetes, hypertension, smoking, and drinking. Additionally, tumors located in the recto-sigmoid region and more than 10 cm from the anal margin were classified as high RC. PNI refers to neoplastic invasion of nerves, nerve sheaths, and the surrounding tissues, which is recognized as a route of metastatic spread [14]. EMVI was defined as tumor involvement of the vasculature beyond the muscularis propria [15]. Tumors lacking one or more of the mismatch repair proteins MLH1, MSH2, MSH6, and PMS2 were regarded as having MSI status [16].
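The categorical definitions above can be written as simple rules. The sketch below is illustrative only: the function names are hypothetical, and the handling of the exact 5 cm and 10 cm boundaries is an assumption, since the text does not say which category a lesion at exactly 5 cm or 10 cm belongs to.

```python
MMR_PROTEINS = ("MLH1", "MSH2", "MSH6", "PMS2")

def lnm_label(positive_nodes):
    """'LNM' when at least one regional lymph node is positive, else 'non-LNM'."""
    if positive_nodes < 0:
        raise ValueError("positive_nodes cannot be negative")
    return "LNM" if positive_nodes >= 1 else "non-LNM"

def location_class(distance_cm):
    """Location by distance from the anal margin.

    Assumption: 5 cm counts as 'low' and 10 cm as 'medium'.
    """
    if distance_cm <= 5:
        return "low"
    if distance_cm <= 10:
        return "medium"
    return "high"

def is_msi(expressed):
    """True when at least one mismatch repair protein is not expressed.

    `expressed` maps each protein name to True (present) or False (lost).
    """
    return any(not expressed[p] for p in MMR_PROTEINS)

print(lnm_label(0), lnm_label(3))                 # non-LNM LNM
print(location_class(4.0), location_class(12.5))  # low high
print(is_msi({"MLH1": True, "MSH2": True, "MSH6": False, "PMS2": True}))  # True
```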

CT-based machine learning radiomics analysis

Before radiomics analysis, the intratumoral (VOI-it) and peritumoral (VOI-pt) volumes of interest (VOI) were delineated in three steps: (1) the original CT images were standardized in A.K. software (Artificial Intelligence Kit, GE Healthcare) by resampling the voxels to 1.0 mm along the X/Y/Z axes and rescaling the image gray levels to a range of 1 to 32; (2) the standardized triphasic CT images were loaded into ITK-SNAP software (Version 3.4.0), and the VOI-it (Fig. 2a) was segmented manually by two radiologists with 7 and 10 years of diagnostic experience; (3) the VOI-pt (Fig. 2b) was obtained by expanding the tumor margin by 5 mm in A.K. software.
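Two of these steps can be sketched in a few lines. The code below is not the A.K. implementation, which the text does not describe. It assumes a fixed-bin-number rule for the 1 to 32 gray-level discretization and a Euclidean-distance expansion for the 5 mm margin. It also assumes that the VOI-pt is the ring outside the tumor rather than the whole expanded volume. For brevity the mask is 2D, whereas the study used 3D volumes.

```python
def discretize(values, n_levels=32):
    """Map intensities to integer gray levels 1..n_levels (fixed bin number)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1] * len(values)
    levels = []
    for v in values:
        level = int(n_levels * (v - lo) / (hi - lo)) + 1
        levels.append(min(level, n_levels))  # the maximum falls into the top bin
    return levels

def peritumoral_ring(mask, margin_mm=5.0, spacing_mm=1.0):
    """Return a mask of pixels within margin_mm of the tumor but outside it."""
    rows, cols = len(mask), len(mask[0])
    tumor = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    limit_sq = (margin_mm / spacing_mm) ** 2
    ring = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                continue  # tumor pixels are excluded from the ring
            if any((r - tr) ** 2 + (c - tc) ** 2 <= limit_sq for tr, tc in tumor):
                ring[r][c] = 1
    return ring

print(discretize([-100.0, 0.0, 100.0]))  # [1, 17, 32]

# Toy mask: a single tumor pixel in the centre of a 13 x 13 grid of 1 mm pixels.
mask = [[0] * 13 for _ in range(13)]
mask[6][6] = 1
ring = peritumoral_ring(mask)
print(ring[6][11], ring[6][12], ring[6][6])  # 1 0 0
```

With 1.0 mm isotropic voxels, as produced by the resampling step, a 5 mm margin corresponds to a distance of 5 voxels.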

Fig. 2

The VOI-it (a) and VOI-pt (b) delineated in ITK-SNAP software

After VOI segmentation, the radiomics features of the intratumoral and peritumoral tissue were calculated automatically in A.K. software. The reproducibility of the VOIs between the two radiologists was then evaluated with the intraclass correlation coefficient (ICC) across all 788 recruited patients. Radiomics features with an ICC greater than 0.75 were selected, and the mean values of the selected features from the two radiologists were used for further analysis. The following steps were then applied to screen the radiomics features: (1) the cohort of 788 patients was randomly assigned to the training group (551 patients) and the validation group (237 patients) at a ratio of 7:3; (2) before analysis, variables with zero variance were excluded, outlier values were replaced by the median, and the data were standardized; (3) variance analysis, correlation analysis, and gradient boosting decision tree (GBDT) were employed to select radiomics features. Details of the segmentation and radiomics analysis are listed in the Supplementary Material.
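The reproducibility filter and the preprocessing in step (2) can be sketched as follows. Several choices here are assumptions, because the text does not specify them: the ICC form (two-way random effects, absolute agreement, single measurement), the outlier rule (values more than 3 standard deviations from the mean), and z-score standardization. The feature names and values are hypothetical.

```python
from statistics import mean, median, pstdev

def icc_2_1(ratings):
    """ICC(2,1) from a list of per-patient rows, one value per reader."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    msr = ss_rows / (n - 1)                                        # between patients
    msc = ss_cols / (k - 1)                                        # between readers
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))     # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def select_reproducible(reader_a, reader_b, threshold=0.75):
    """Keep features with ICC > threshold and average the two readers."""
    kept = {}
    for name in reader_a:
        pairs = list(zip(reader_a[name], reader_b[name]))
        if icc_2_1(pairs) > threshold:
            kept[name] = [(a + b) / 2 for a, b in pairs]
    return kept

def preprocess(column, z_cut=3.0):
    """Replace outliers by the median, then z-score; None if variance is zero."""
    sd = pstdev(column)
    if sd == 0:
        return None
    mu, med = mean(column), median(column)
    cleaned = [med if abs(v - mu) > z_cut * sd else v for v in column]
    mu2, sd2 = mean(cleaned), pstdev(cleaned)
    if sd2 == 0:
        return None
    return [(v - mu2) / sd2 for v in cleaned]

# Hypothetical feature values for four patients, measured by two readers.
reader_a = {"stable": [1.0, 2.0, 3.0, 4.0], "noisy": [1.0, 2.0, 3.0, 4.0]}
reader_b = {"stable": [1.1, 2.0, 2.9, 4.1], "noisy": [4.0, 1.0, 3.5, 2.0]}
kept = select_reproducible(reader_a, reader_b)
print(list(kept))  # ['stable']
```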

Finally, five machine learning radiomics models were constructed: Bayes, k-nearest neighbor (KNN), logistic regression (LR), support vector machine (SVM), and decision tree (DT). The relative standard deviation (RSD) over 100 bootstrap replications in the training group was calculated, and the machine learning radiomics model with the minimal RSD value, which indicates the highest stability, was selected for further analysis [17]. The equation and detailed results of the RSD are listed in the Supplementary Material. A combined intratumoral and peritumoral machine learning model was then constructed. Ten-fold cross-validation was performed in the training group to select the best diagnostic classifier. Receiver operating characteristic (ROC) curves were plotted and the area under the curve (AUC) with 95% confidence interval (CI) was calculated to evaluate the efficacy of the models; AUCs were compared with the DeLong test.
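The RSD equation is given only in the Supplementary Material, so the sketch below assumes the usual definition, RSD = 100 × SD / mean, applied to the AUC over bootstrap resamples. The choice of AUC as the bootstrapped quantity, the function names, and the simulated scores are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_rsd(labels, scores, n_boot=100, seed=0):
    """Relative standard deviation (%) of the AUC over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a class is missing
        aucs.append(auc(ys, [scores[i] for i in idx]))
    return 100.0 * stdev(aucs) / mean(aucs)

# Simulated model outputs for 30 non-LNM and 30 LNM cases (not study data).
sim = random.Random(1)
labels = [0] * 30 + [1] * 30
scores = [sim.gauss(0.4, 0.15) for _ in range(30)] + [sim.gauss(0.6, 0.15) for _ in range(30)]
rsd = bootstrap_rsd(labels, scores)
print(f"RSD = {rsd:.2f}%")
```

Under this definition, a lower RSD means the model's AUC varies less across resamples, which matches the stability criterion described above.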

Statistical analysis

The general clinical characteristics, including gender, age, long diameter, location, PNI, EMVI, MSI, CEA, CA19-9, and history of diabetes, hypertension, smoking, and drinking, were analyzed in SPSS software (Version 22). Continuous variables conforming to a normal distribution were compared with the independent t-test, and categorical variables were compared with the chi-square test. The radiomics analyses, including variance analysis, correlation analysis, GBDT, machine learning algorithms, and the logistic-based nomogram, were performed in R software (Version 3.4.1) and Python (Version 3.5.6). ICC and ROC analyses were performed in MedCalc software (Version 18.2.1). A two-tailed p-value < 0.05 was considered statistically significant.
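For readers unfamiliar with the chi-square test on categorical variables, the sketch below computes the Pearson statistic for a 2 × 2 table. It is an illustration rather than a reproduction of the SPSS analysis: it applies no continuity correction (an assumption), and the table used is the LNM status by group cross-tabulation built from the counts given in the patient enrollment section.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table; returns (statistic, p_value)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: training, validation; columns: LNM, non-LNM (counts from the text).
stat, p_value = chi_square_2x2([[212, 339], [91, 146]])
print(f"chi-square = {stat:.4f}, p = {p_value:.3f}")
print("significant" if p_value < 0.05 else "not significant")
```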
