Integrating No.3 lymph nodes and primary tumor radiomics to predict lymph node metastasis in T1-2 gastric cancer

Background This study aimed to develop and validate a radiomics nomogram that integrates the quantitative radiomics characteristics of No.3 lymph nodes (LNs) and primary tumors to better predict lymph node metastasis (LNM) preoperatively in T1-2 gastric cancer (GC) patients. Methods A total of 159 T1-2 GC patients who had undergone surgery with lymphadenectomy between March 2012 and November 2017 were retrospectively collected and divided into a training cohort (n = 80) and a testing cohort (n = 79). For each patient, radiomic features were extracted from both the tumor region and the No.3 station LNs on computed tomography (CT) images. Key features were then selected using the minimum redundancy maximum relevance algorithm and used to build two radiomic signatures, one for each region. Meanwhile, the predictive performance of clinical risk factors was studied. Finally, a nomogram was built by merging the radiomic signatures and clinical risk factors and was evaluated by the area under the receiver operator characteristic curve (AUC) as well as by decision curve analysis. Results Two radiomic signatures, reflecting the phenotypes of the tumor and the LNs respectively, were significantly associated with LN metastasis. A nomogram incorporating the two radiomic signatures and CT-reported LN metastasis status showed good discrimination of LN metastasis in both the training cohort (AUC 0.915; 95% confidence interval [CI] 0.832–0.998) and the testing cohort (AUC 0.908; 95% CI 0.814–1.000). The decision curve also indicated its potential clinical usefulness. Conclusions The nomogram achieved favorable accuracy in predicting No.3 LNM in T1-2 GC and also showed a positive role in predicting LNM in No.4 LNs. The nomogram may be used to predict LNM in T1-2 GC and could assist the choice of therapy. Supplementary Information The online version contains supplementary material available at 10.1186/s12880-021-00587-3.

are used for the treatment of EGC [3]. However, endoscopic resection is only considered for tumors with a low risk of lymph node metastasis (LNM) [4].
In recent years, artificial intelligence (AI) has been widely used in medicine. Machine learning (ML)-based tools have been applied to the prediction of LNM, cancer risk assessment, lesion detection, staging, prognosis evaluation, and analysis of curative effect. Radiomics based on this technology may improve patient management, screening strategies, and customized treatment planning [16][17][18][19][20][21][22]. Radiomics, the technique of converting medical images into mineable, high-dimensional feature data, has been shown to improve diagnostic and prognostic accuracy in oncology [23][24][25][26]. It has been widely applied to the prediction of LN metastasis in GC and colorectal cancer and of occult peritoneal metastasis in advanced GC, with satisfactory results [27][28][29][30][31][32][33][34][35].
At present, published radiomics studies seeking predictors of LNM mainly use the radiomics characteristics of the tumor or other patient-related characteristics. However, relying solely on the radiomics characteristics of the primary tumor may limit the ability to predict LNM accurately [33]. Thus, this study aimed to predict preoperative LNM in T1-2 GC patients by integrating the radiomics characteristics of LNs and primary tumors. The LNs of the stomach are assigned station numbers from No.1 to No.16 [36,37]. Previous studies showed that, in EGC, the incidence of LNM was highest at the No.3 station (No.1, 2.5%; No.2, 4.8%; No.3, 11.6%; No.4, 6.5%; No.5, 0.5%; No.6, 7.6%) [38][39][40]. Therefore, by integrating the quantitative radiomics characteristics of No.3 LNs and primary tumors, we developed and validated a radiomics nomogram to better predict preoperative LNM in patients with T1-2 GC.

Patients
The Institutional Review Board of our hospital approved this retrospective study and the requirement for informed consent was waived.
The inclusion criteria for the training and testing cohorts were as follows: (a) surgery with curative intent for T1-2 GC, with pathological results available; (b) LN dissection performed; (c) excised LNs with detailed grouping and pathological diagnosis; (d) standard contrast-enhanced CT performed less than 10 days before surgical resection. The exclusion criteria were: (a) contraindications to the hypotensive drug (such as glaucoma or prostatic hypertrophy); (b) preoperative therapy (radiotherapy, chemotherapy, or chemoradiotherapy); (c) concurrent other tumors or diseases; (d) variation of the left gastric artery; (e) lesions invisible on CT images.

CT data acquisition
All patients fasted for at least 4 h, and 20 mg anisodamine (654-2) was administered intramuscularly 10 min prior to CT examination to reduce gastrointestinal peristalsis. Patients drank 800-1000 mL of warm water to distend the stomach. CT was performed using a 256-slice (Brilliance iCT, ROYAL PHILIPS, Eindhoven, Netherlands) or a 64-slice (SOMATOM Sensation 64, SIEMENS Healthineers, Muenchen, Germany) multi-slice spiral CT scanner. Patients underwent both unenhanced and two-phase enhanced CT examinations (arterial phase: 35 s after injection; venous phase: 70 s after injection). The CT scans, covering the entire stomach region, were acquired during a breath-hold with the patient supine in all phases. During the enhanced CT scan, patients were infused with 1.5 mL/kg of non-ionic contrast material (iohexol, Yangzi River Pharmaceutical Group, Jiangsu, China; iodine concentration: 300 mg/mL) with a pump injector (Ulrich CT Plus 150, Ulrich Medical, Ulm, Germany) at a rate of 3.0 mL/s into the antecubital vein. The imaging parameters were as follows: 120 kV; 220-250 mAs; rotation time: 0.5 s; detector collimation: 128 × 0.625 mm or 32 × 0.6 mm; field of view: 400 × 400 mm; matrix: 512 × 512; reconstruction slice thickness: 5 mm for the axial plane and 3 mm for the coronal and sagittal planes.

Pipeline
The pipeline of this study includes five steps: lesion detection, region of interest (ROI) segmentation, radiomic feature extraction, radiomic signature building, and nomogram construction and evaluation (Fig. 1).

Detection of lesion on CT images
All CT images were reviewed by a radiologist with more than 10 years of experience in GC diagnosis. Localization of GC lesions: all 159 patients selected in this study had both gastroscopy and CT results, and the lesions were located by combining the gastroscopy findings with the CT images (axial, coronal and sagittal). The diagnostic criteria for CT-reported LN metastasis-positive status were as follows: short-axis diameter of LN ≥ 5 mm, ratio of short diameter to long diameter of LN ≥ 0.7, and plain CT value of LN ≥ 25 HU or venous phase CT value of LN ≥ 75 HU; or multiple LNs fused together, even if the above conditions were not satisfied.
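The criteria above amount to a small decision rule, which the following minimal sketch spells out. The grouping of the conditions (size AND shape AND either density threshold, with fusion overriding everything) is our reading of the sentence, and the function and parameter names are hypothetical rather than taken from the study.

```python
def ct_reported_ln_positive(short_axis_mm, long_axis_mm, plain_hu, venous_hu, fused=False):
    """Return True if a lymph node meets the CT-reported metastasis-positive criteria.

    Assumed reading of the text: fused nodes are always positive; otherwise all of
    the size, shape and density conditions must hold.
    """
    if fused:
        return True
    size_ok = short_axis_mm >= 5
    shape_ok = long_axis_mm > 0 and short_axis_mm / long_axis_mm >= 0.7
    density_ok = plain_hu >= 25 or venous_hu >= 75
    return size_ok and shape_ok and density_ok


# Illustrative (made-up) measurements
round_dense_node = ct_reported_ln_positive(6, 8, 30, 60)
small_node = ct_reported_ln_positive(4, 5, 30, 80)
elongated_node = ct_reported_ln_positive(6, 10, 30, 80)
fused_cluster = ct_reported_ln_positive(4, 5, 10, 10, fused=True)
```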

ROI segmentation on CT images
Two 2-dimensional ROIs were manually segmented by a radiologist with more than 10 years of experience in GC diagnosis. The first ROI (ROI-1) was delineated on the tumor in the slice with the maximum tumor lesion. The second ROI (ROI-2) was delineated on the region of No.3 station LNs around the lesser curvature of the stomach. ROI segmentation was performed using ITK-SNAP software (version 2.2.0; www.itksnap.org) on the venous phase CT images in the axial view (see Additional file 1: A1 for details).

Extraction of radiomic features
Two feature groups were extracted from two ROIs, with each group containing 273 features [41,42]. These features were divided into 4 categories: (a) shape and size features, (b) gray intensity features, (c) texture features, and (d) wavelet features. The feature extraction was implemented using MATLAB (version 2014a; Mathworks, Natick, MA, USA). Radiomic features of all patients were standardized by the z-score method, based on the parameters calculated from the training cohort. More information about the radiomic feature extraction is shown in Additional file 1: A2.
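To illustrate what standardizing "based on the parameters calculated from the training cohort" means in practice, here is a minimal sketch: the mean and standard deviation of each feature are estimated on the training rows only and then reused unchanged for the testing rows. The function names and toy values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify it.

```python
from statistics import mean, pstdev


def fit_zscore(train_rows):
    """Return per-feature (mean, std) computed from training-cohort rows only."""
    columns = list(zip(*train_rows))
    return [(mean(col), pstdev(col)) for col in columns]


def apply_zscore(rows, params):
    """Standardize rows with previously fitted parameters (constant features map to 0)."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, (m, s) in zip(row, params)]
        for row in rows
    ]


# Toy data: 3 training patients and 1 testing patient, 2 features each
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.0, 40.0]]

params = fit_zscore(train)            # estimated on the training cohort only
train_z = apply_zscore(train, params)
test_z = apply_zscore(test, params)   # testing cohort reuses training parameters
```

Fitting the parameters on the training cohort alone keeps information from the testing cohort out of the model-building steps.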

Radiomic signature construction
Radiomic feature selection and signature building were performed in the training cohort for ROI-1 and ROI-2, respectively. More details are described as follows. In order to avoid model over-fitting and improve performance, feature selection was performed to match the sample size (Additional file 1: A3).
First, the minimum redundancy maximum relevance (mRMR) algorithm ranked each feature by its relevance to LN metastasis status while simultaneously accounting for the redundancy among features [43]. Since the number of predictors should be kept within 1/10-1/3 of the size of the smallest outcome group in the training cohort [44], the number of potential features was limited to 7 or fewer in this study.
Second, five-fold cross-validation was performed multiple times on the training cohort to find, among the ranked features, the number of features giving the best performance. Then a radiomic signature (RS1) reflecting the phenotype of ROI-1 and a radiomic signature (RS2) reflecting the phenotype of ROI-2 were each built from the selected features as independent predictors of LN metastasis. It should be noted that feature selection and radiomic signature construction were based on the training cohort alone. For each radiomic signature, a signature score was calculated to reflect the risk of LN metastasis. The predictive performance of the radiomic signatures was quantitatively tested using the area under the receiver operator characteristic (ROC) curve in both the training and testing cohorts.
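The ranking step can be sketched as a greedy mRMR procedure. This is a minimal illustration, not the implementation used in the study. It assumes the common "difference" criterion (relevance minus mean redundancy), mutual information estimated after equal-width binning into three bins, and toy feature names (`f_a`, `f_a_dup`, `f_b`) that are purely hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def discretize(values, bins=3):
    """Equal-width binning of a continuous feature (an assumption of this sketch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mrmr_rank(features, labels, k):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance-to-label minus mean redundancy with already selected features."""
    disc = {name: discretize(vals) for name, vals in features.items()}
    relevance = {name: mutual_information(d, labels) for name, d in disc.items()}
    selected, remaining = [], set(disc)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(disc[name], disc[s]) for s in selected)
        return relevance[name] - redundancy / len(selected)

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example: f_a_dup is a rescaled copy of f_a, f_b carries weak independent signal
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f_a": [1, 2, 1, 2, 8, 9, 8, 2],
    "f_a_dup": [10, 20, 10, 20, 80, 90, 80, 20],
    "f_b": [0, 0, 0, 1, 0, 0, 1, 1],
}
ranking = mrmr_rank(features, labels, k=2)
```

In this toy example the duplicated feature is as relevant as `f_a` but fully redundant with it, so the weaker yet complementary `f_b` is ranked second. Choosing how many of the ranked features to keep would then be done by cross-validation on the training cohort, as described above.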

Construction and evaluation of nomogram
Univariate and multivariate analyses were used to screen for significant clinical risk factors. For univariate analysis, differences between groups were assessed with the independent t-test or Mann-Whitney U test for continuous variables and with the Chi-squared test for categorical variables. A two-sided P value < 0.05 was considered statistically significant. For multivariate analysis, we performed multivariate logistic regression to screen for key factors. Furthermore, multivariate logistic regression was used to merge the two radiomic signatures and the clinical risk factors into a nomogram. As with the signatures, the radiomic nomogram was built on the training cohort alone. For comparison, we constructed two more models that combined the clinical risk factors with RS1 and RS2, respectively. After that, calibration curves and the Hosmer-Lemeshow test were used to assess the goodness-of-fit of the nomogram, and the AUC was used to quantify its predictive performance. To assess overfitting, the DeLong test was adopted to compare AUCs between the training and testing cohorts. Moreover, we used the net reclassification index (NRI) to compare the performance of the nomogram with that of the clinical risk factors and to quantify the improvement in predictive performance.
Furthermore, a stratified analysis was used to evaluate the influence of clinical factors on the nomogram. In addition, we performed a subgroup analysis to evaluate the additional value of the nomogram in the CT-reported LN metastasis-negative (CT-LNM0) subgroup. Since the number of metastasis in No. 4 Finally, to estimate the clinical utility of the nomogram, decision curve analysis (DCA) was performed by calculating the net benefits over a range of threshold probabilities.

Table 1 summarizes the patients' clinical risk factors in both the training and testing cohorts. There was no significant difference in the probability of LN metastasis between the two cohorts (P = 0.384). Univariable analysis showed that the CT-reported LN metastasis status from the radiologist was significantly correlated with pathological LN metastasis status (P < 0.05), while CA125 was significantly correlated with LN metastasis status only in the training cohort and tumor infiltration depth only in the testing cohort. After multivariable analysis, we chose the CT-reported LN metastasis status to predict LN metastasis.
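As a rough illustration of how an NRI quantifies improvement, the sketch below computes a two-category NRI at a single probability cutoff. The text does not state whether a categorical or continuous NRI was used, so the categorical form, the 0.5 cutoff, the function name and the toy probabilities are all assumptions.

```python
def categorical_nri(old_prob, new_prob, labels, threshold=0.5):
    """Two-category net reclassification index.

    NRI = [P(up | event) - P(down | event)] + [P(down | non-event) - P(up | non-event)],
    where "up"/"down" mean moving above/below the threshold under the new model.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    for p_old, p_new, y in zip(old_prob, new_prob, labels):
        old_high, new_high = p_old >= threshold, p_new >= threshold
        if new_high and not old_high:
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif old_high and not new_high:
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1
    n_event = sum(labels)
    n_nonevent = len(labels) - n_event
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent


# Toy example: the new model correctly moves one event up and one non-event down
labels = [1, 1, 1, 0, 0, 0]
old_model = [0.6, 0.4, 0.3, 0.6, 0.4, 0.2]
new_model = [0.7, 0.6, 0.4, 0.4, 0.3, 0.1]
nri = categorical_nri(old_model, new_model, labels)
```

A positive value indicates that, on balance, the new model reclassifies patients in the correct direction more often than the old one.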

Establishment of radiomic signature
During feature selection, mRMR selected the top 10 radiomic features from ROI-1 and the top 10 radiomic features from ROI-2 in the training cohort. As shown in Additional file 1: Figure S1 and Table S3, cross-validation retained 4 features from ROI-1 and 2 features from ROI-2. The heatmaps of these features and the unsupervised cluster partitioning are shown in Additional file 1: Figure S2. A significant association was found between these features and LN metastasis status. Two radiomic signatures were built as linear combinations of these radiomic features (4 features from ROI-1 for RS1 and 2 features from ROI-2 for RS2), and the signature score calculation is presented in Additional file 1: A4. As shown in Fig. 2 and Table 2, both radiomic signatures showed significant predictive ability for LN metastasis in the training cohort (AUC of RS1: 0.831, 95% confidence interval [CI] 0.725-0.937, and AUC of RS2: 0.761, 95% CI 0.629-0.893) and the testing cohort (AUC of RS1: 0.852, 95% CI 0.742-0.962, and AUC of RS2: 0.763, 95% CI 0.626-0.900).
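A signature built as a linear combination of features, and its AUC, can be sketched as follows. The weights and feature values are placeholders, not the coefficients reported in Additional file 1: A4, and the AUC is computed with the pairwise (Mann-Whitney) definition as one standard way to obtain it.

```python
def signature_score(feature_values, weights, intercept=0.0):
    """Linear-combination score: intercept + sum(weight_i * feature_i)."""
    return intercept + sum(w * x for w, x in zip(weights, feature_values))


def auc(scores, labels):
    """AUC as the probability that a random positive scores higher than a random
    negative (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical two-feature signature evaluated on four toy patients
weights = [0.8, -0.5]                      # placeholder coefficients
patients = [[1.0, 0.0], [0.5, 1.0], [0.2, 0.1], [-1.0, 0.5]]
labels = [1, 1, 0, 0]                      # 1 = LN metastasis, 0 = none

scores = [signature_score(p, weights) for p in patients]
signature_auc = auc(scores, labels)
```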

Construction of nomogram
During the multivariate logistic regression analysis, the two radiomic signatures and one clinical risk factor (CT-reported LN metastasis status, CTR) were identified as independent predictors of LN metastasis in T1-2 GC patients (Additional file 1: Table S4). An individualized nomogram was built using the regression method to predict the LN metastasis probability (Fig. 3a).
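Since the nomogram is a graphical form of a logistic regression model, the predicted probability can be sketched as below. The coefficient values are hypothetical placeholders (the fitted values are in Additional file 1: Table S4 and are not reproduced here), and the function name is ours.

```python
from math import exp


def nomogram_probability(rs1, rs2, ctr_positive, coef):
    """Predicted LN metastasis probability from a logistic model:
    p = 1 / (1 + exp(-(b0 + b1*RS1 + b2*RS2 + b3*CTR)))."""
    linear_predictor = (
        coef["intercept"]
        + coef["rs1"] * rs1
        + coef["rs2"] * rs2
        + coef["ctr"] * (1 if ctr_positive else 0)
    )
    return 1.0 / (1.0 + exp(-linear_predictor))


# Placeholder coefficients for illustration only
coef = {"intercept": -1.0, "rs1": 1.5, "rs2": 1.0, "ctr": 2.0}

p_ct_negative = nomogram_probability(rs1=0.0, rs2=1.0, ctr_positive=False, coef=coef)
p_ct_positive = nomogram_probability(rs1=0.0, rs2=1.0, ctr_positive=True, coef=coef)
```

With positive coefficients, a CT-positive report or a higher signature score shifts the predicted probability upward, which is what the point scales of a nomogram encode graphically.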

Evaluation of nomogram
As shown in Fig. 2 and Table 2, our nomogram reached an AUC of 0.915 (95% CI 0.832-0.998) in the training cohort and an AUC of 0.908 (95% CI 0.814-1.000) in the testing cohort, which were higher than those of CTR, RS1, RS2, RS1 + CTR, and RS2 + CTR. The NRI also demonstrated that the nomogram had better predictive ability than the CT-reported LN metastasis status in the training cohort (NRI = 0.339, P < 0.001) and the testing cohort (NRI = 0.301, P < 0.001). The DeLong test revealed no significant difference between the AUCs of our nomogram in the training and testing cohorts (P = 0.908), further indicating the robustness of our nomogram. As shown in Fig. 3b, c, the calibration curves of the nomogram demonstrate a good fit in both the training and testing cohorts.
The Hosmer-Lemeshow test also showed good performance of our nomogram in the training cohort (P = 0.147) and testing cohort (P = 0.903).
We also implemented a stratified analysis; more details are presented in Additional file 1: A5 and Figure S3. The results showed that our nomogram worked well across the gender, age, pathologic grade and tumor infiltration depth subsets (DeLong test, P > 0.05).
Moreover, we selected 9 patients with LN metastasis and 11 patients without LN metastasis at the No.4 station as a validation set to further validate our nomogram. Interestingly, our nomogram also performed well at this station (AUC 0.824; 95% CI 0.517-1; Additional file 1: Figure S4). The decision curve of the nomogram is presented in Fig. 5. For threshold probabilities between 0 and 0.85, using the nomogram provides more net benefit than the all-metastasis or no-metastasis strategies.
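The net benefit plotted in a decision curve can be computed with the standard formula, net benefit = TP/n − (FP/n) · pt/(1 − pt), where pt is the threshold probability. The sketch below applies it to toy data; the function names and values are hypothetical, and the "treat all" curve corresponds to assuming every patient has LN metastasis.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on predictions >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is assumed to have LN metastasis."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy predictions: two metastatic and three non-metastatic patients
probs = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0]

nb_model = net_benefit(probs, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
nb_none = 0.0  # assuming no patient has metastasis always yields zero net benefit
```

A decision curve repeats this calculation over a grid of thresholds and plots the three quantities against the threshold probability.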

Discussion
In this study, an easy-to-use radiomic nomogram was established to identify LN metastasis of T1-2 GC preoperatively. The nomogram, incorporating two radiomic signatures and CT-reported LN metastasis status, showed the best discrimination of LN metastasis in both the training and testing cohorts. The nomogram could assist in formulating clinical treatment plans. Although the lymphatic system around the stomach is very complex [45], previous studies showed that the incidence of LN metastasis in EGC was highest at the No.3 station (No.3, 11.6%; No.1, 2.5%; No.2, 4.8%; No.4, 6.5%; No.5, 0.5%; No.6, 7.6%) (Additional file 1: Table S5) [38][39][40]. Therefore, we developed and validated a radiomics nomogram by integrating the radiomics characteristics of No.3 LNs and primary tumors to better predict preoperative LNM in T1-2 GC patients.
We analyzed the radiomic features in the two significant radiomic signatures. The radiomic features used in RS1 included: (1) 'X1_fos_skewness', which describes the shape of the probability distribution of the voxel intensity histogram and reflects its symmetry; (2) 'X0_fos_variance', which measures the spread of the intensity distribution about the mean value and reflects the uniformity of the distribution; (3) 'X3_fos_root_mean_square', which is the root mean square of the voxel intensity values; and (4) 'X1_GLCM_dissimilarity', a high value of which means there is a great disparity in intensity among neighboring voxels. These radiomic features might quantify intratumor heterogeneity, and thus could predict the invasiveness of the tumor and the probability of LN metastasis [46]. The final selected radiomic features of the lymph nodes consisted of: (1) 'X1_GLRLM_energy', which measures the magnitude of voxel values in an image and describes the overall density of the lymph volume; and (2) 'X1_GLCM_cluster_prominence', a high value of which implies more asymmetry. These radiomic features might indicate high image intensity and heterogeneity in the No.3 station LN region, and thus be a sign of LN metastasis. We have shown two examples of patients with and without LN metastasis (Additional file 1: A6 and Figure S5). The CT images also demonstrated that higher heterogeneity of the primary tumor and the No.3 LN region led to a higher probability of LN metastasis.
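For readers unfamiliar with the first-order statistics named above, the following minimal sketch computes skewness, variance and root mean square from a list of voxel intensities. The formulas are the common textbook definitions (population moments); the exact definitions in the authors' MATLAB implementation, and the meaning of the 'X0'/'X1'/'X3' prefixes, are not given in the text, so this should be read as an assumption-laden illustration.

```python
from math import sqrt


def first_order_features(intensities):
    """Skewness, variance and root mean square of voxel intensities in an ROI."""
    n = len(intensities)
    mu = sum(intensities) / n
    m2 = sum((v - mu) ** 2 for v in intensities) / n   # population variance
    m3 = sum((v - mu) ** 3 for v in intensities) / n   # third central moment
    return {
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "variance": m2,
        "root_mean_square": sqrt(sum(v * v for v in intensities) / n),
    }


# Toy ROI: mostly low intensities with one bright voxel (right-skewed histogram)
roi = [1, 2, 3, 4, 10]
features = first_order_features(roi)
```

The single bright voxel produces a positive skewness and inflates the variance, the kind of intensity heterogeneity these features are meant to capture.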
In this study, CT-reported LN metastasis status from the radiologist was significantly correlated with LN metastasis in univariable analysis. This subjective judgement was also included in our nomogram. We also found that CA125 was significantly associated with LN metastasis in the training cohort (P = 0.035), but had no significance in the testing cohort. This may be caused by the relatively small sample size and baseline deviation. Moreover, the positive rate of CA125 was very low in EGC [47].
We conducted stratified analyses, and the results showed that the performance of our nomogram was not affected by gender, age, pathologic grade or tumor infiltration depth. In addition, we tested the correlations between the radiomic features and clinical risk factors using Pearson correlation analysis (Additional file 1: Figure S6). There was no correlation between the radiomic features and the clinical risk factors, which suggests that the radiomic features might be a good supplement to clinical factors. The good performance of our nomogram in the CT-LNM0 subgroup also demonstrated the additional value of the nomogram to radiologists.
More interestingly, the nomogram trained from phenotype of No.3 station LNs also showed a positive role in predicting LN metastasis in No.4 station LNs. This finding indicated that the radiomic signature from the LN region did reflect the early change of phenotype of LNs. Thus, our nomogram may be used in other stations of LNs.
There are some limitations in this study. First, the sample size was relatively small. Second, external validation was lacking. Third, the presence of lymphatic invasion and LN micrometastasis have also been considered important risk factors for LN metastasis in EGC [48][49][50]; however, these factors were not routinely collected in our center. Fourth, given the use of manual segmentation, the reproducibility of the radiomic features should be further evaluated. Finally, cases with invisible lesions on CT images were excluded, so the nomogram cannot be applied to such patients. These problems need to be further studied.

Decision curve analysis for the nomogram. The y-axis represents the net benefit, and the pink line represents the nomogram. The blue line represents the hypothesis that all patients had lymph node (LN) metastases, and the black line represents the hypothesis that no patients had LN metastases. The x-axis represents the threshold probability. The threshold probability is where the expected benefit of treatment is equal to the expected benefit of avoiding treatment. The decision curves in the training cohort showed that if the threshold probability is between 0 and 0.85, using the nomogram to predict LN metastases adds more benefit than treating either all or no patients.

Conclusions
In summary, the nomogram achieved favorable accuracy in predicting No.3 LNM in T1-2 GC and also showed a positive role in predicting LNM in No.4 LNs. The radiomics nomogram may assist in formulating clinical treatment plans.