Differentiation of mediastinal metastatic lymph nodes in NSCLC based on radiomic features of different phase CT imaging

non-small cell lung cancer (NSCLC). Methods: We selected 231 pathologically confirmed mediastinal lymph nodes as the subjects and divided them into training (n=163) and validation (n=68) cohorts. The regions of interest (ROIs) were delineated on CT scans of the plain, arterial and venous phases. Radiomic features were extracted from the CT images of each phase. The least absolute shrinkage and selection operator (LASSO) was used to select features, and multivariate logistic regression analysis was used to build models. We constructed six models (models 1-6) based on radiomic features of the single- and dual-phase CT images. The performance of each radiomic model was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV). Results: A total of 841 features were extracted from each ROI, and 10, 9, 5, 2, 2, and 9 features were chosen to develop models 1-6, respectively. All of the models showed good discrimination, with AUCs greater than 0.8. The plain CT radiomic model (model 1) yielded the highest AUC, specificity, accuracy and PPV (0.926 vs 0.925, 0.860 vs 0.769, 0.871 vs 0.882 and 0.906 vs 0.870 in the training vs validation sets, respectively). When the plain and venous phase CT radiomic features were combined with the arterial phase CT images, the sensitivity increased from 0.879 and 0.919 to 0.949 and 0.979, and the NPV increased from 0.821 and 0.789 to 0.878 and 0.900, respectively, in the training group. Conclusion: CT radiomic models based on different phases all showed high accuracy and precision in the diagnosis of LNM in NSCLC patients. When combined with arterial phase CT, the sensitivity and NPV of the model can be further improved.

imaging technique for the diagnosis of metastatic LNs, with relatively high specificity for LN staging in patients with NSCLC [2,3]. However, the low prevalence and high cost of PET/CT equipment limit its clinical application. Additionally, CT has disadvantages in the identification of metastatic LNs in that high rates of false-positive and false-negative results occur when images are judged according to morphological changes, including size, shape, necrosis, and external capsule invasion [4,5]. Hence, a great need exists for sensitive and accurate methods to preoperatively assess the status of LNs, which could help to improve the degree of radical surgery, select appropriate chemotherapy regimens, and delineate the radiotherapy target area.
With the emergence of personalized medicine and targeted therapy and the rapid growth of routinely acquired medical data, the need for quantitative image analysis has increased. Radiomics provides promising opportunities in this regard, endowing medical imaging with an increasingly important role in analyzing tumor heterogeneity [6]. Previous studies have shown that objective and quantitative image features could potentially be used as prognostic or predictive biomarkers [7]. However, most studies have focused on single-phase CT images, an approach that may not yield the best model obtainable from a multiphase CT series.
Therefore, in the present study, we investigated the accuracy of radiomic and delta-radiomic features between different CT phase scans in the preoperative discrimination of metastatic LNs in NSCLC patients to provide the best reference model for the clinical diagnosis of mediastinal lymph nodes.

Patient information
The Institutional Review Board approved the retrospective review of the medical records for this analysis. The inclusion criteria were as follows: (I) all patients underwent plain and enhanced CT; (II) no patients received any treatments before the scans were performed; (III) LNM was confirmed by pathology results; and (IV) distant metastasis, multiple tumors and other manifestations were absent.
The exclusion criteria were as follows: (I) clinical data were incomplete, or statistical analysis could not be performed; (II) poor image quality affected the quantitative analysis; and (III) CT images were reconstructed using different algorithms, thicknesses, or equipment.

CT image acquisition
All patients received routine and enhanced CT scanning, and a Philips scanner (Holland, CT Lightspeed 16) was used with an imaging protocol of tube voltage 120 kV, tube current 300 mA, thickness 2 mm and in-plane resolution 0.97×0.97. The contrast medium was injected into the elbow vein at an injection rate of 2.3 to 3.0 ml/s, and the maximum dose was 100 ml. An arterial phase scan was performed 25 to 30 seconds after contrast medium injection, and a venous phase scan was performed 90 seconds later. Plain, arterial and venous phase images were obtained. All images were exported in DICOM format for image feature extraction.

Lesion segmentation
We performed manual segmentation on the arterial-phase CT images using MIM Maestro software (MIM Software, Cleveland, OH), and pathologically confirmed LNs were defined as regions of interest (ROIs). Using the arterial-phase CT image as the reference, the plain and venous-phase CT images were aligned by nonrigid registration, and the contours were mapped to the plain and venous-phase images, respectively. The target images were delineated by two senior radiologists with 20 years of experience in chest CT diagnosis, and disagreements were resolved by a third high-ranking radiologist. Figure 1 shows schematic diagrams of the ROIs on three CT images of different phases.

Feature extraction
Radiomic features were extracted from LNs using 3D Slicer software, an open-source Python package for the extraction of features from medical images (version 4.6, http://www.slicer.org) [8]. In total, 841 radiomic features were extracted and were organized into two categories: (I) based on original images; and (II) based on wavelet images. Eighteen first-order features derived from the tumor intensity histogram reflect the distribution of values of individual voxels without concern for spatial relationships. Thirteen shape features describe the geometry of the segmented volume. Seventy-four texture features describe the spatial arrangement of voxels as calculated from different parent matrices, including the gray level dependence matrix (GLDM), the gray level co-occurrence matrix (GLCM), the gray level size zone matrix (GLSZM), the gray level run length matrix (GLRLM) and the neighborhood gray-tone difference matrix (NGTDM) [9]. In addition, 736 wavelet features derived from eight filtering modes were obtained.
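The feature counts above can be cross-checked with a few lines of arithmetic. The Python sketch below assumes that the wavelet features are the first-order and texture features recomputed on each of the eight filtered images, with shape features computed once from the ROI; this assumption is not stated in the text but reproduces the reported 736 and 841.

```python
# Feature counts reported in the text.
FIRST_ORDER = 18
SHAPE = 13
TEXTURE = 74
WAVELET_FILTERS = 8

# Features computed on the original (unfiltered) image.
original_count = FIRST_ORDER + SHAPE + TEXTURE

# ASSUMPTION: shape features are not recomputed on wavelet images,
# so each filtered image contributes first-order + texture features only.
wavelet_count = WAVELET_FILTERS * (FIRST_ORDER + TEXTURE)

total_count = original_count + wavelet_count
print(original_count, wavelet_count, total_count)
```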

Feature selection and development of radiomic models
The least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was used to select significant features with nonzero coefficients to develop models. In this study, we constructed six models based on radiomic features of single-phase imaging and joint two-phase imaging, which included models 1, 2, and 3 (based on the plain, arterial and venous phase radiomic features, respectively), and models 4, 5, and 6 (based on the delta radiomic features between plain and arterial phase imaging, plain and venous phase imaging, and arterial and venous phase imaging, respectively). The above process was implemented in R software (version: 3.3.3, https://www.r-project.org). The classification performance of the radiomic models was quantified by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) in both the training and validation cohorts.
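The construction of the delta radiomic inputs for models 4 to 6 can be illustrated with a short Python sketch. It assumes that a delta feature is the plain difference between the values of the same feature in two phases; the text does not give the exact formula, and the feature names and values below are hypothetical, not study data.

```python
# Phase pair used by each delta model (pairing taken from the text).
DELTA_MODELS = {
    4: ("plain", "arterial"),
    5: ("plain", "venous"),
    6: ("arterial", "venous"),
}


def delta_features(earlier, later):
    """Feature-wise difference (later - earlier) over features present in both.

    ASSUMPTION: a plain difference; a relative change would be another option.
    """
    shared = sorted(set(earlier) & set(later))
    return {name: later[name] - earlier[name] for name in shared}


# Hypothetical feature values for one lymph node (illustration only).
node = {
    "plain": {"firstorder_Mean": 40.0, "glcm_Contrast": 2.5},
    "arterial": {"firstorder_Mean": 65.0, "glcm_Contrast": 3.0},
    "venous": {"firstorder_Mean": 80.0, "glcm_Contrast": 4.0},
}

# One delta feature vector per model, keyed by model number.
model_inputs = {
    model: delta_features(node[first], node[second])
    for model, (first, second) in DELTA_MODELS.items()
}
print(model_inputs[4])
```

Because the ROI is mapped across phases by registration, each feature is computed over the same region in every phase, which is what makes a per-feature difference meaningful.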

Statistical analysis
The data analysis was performed using Statistical Package for Social Sciences (SPSS) software version 23.0 (SPSS, Chicago, IL, USA) and R software (version 3.4.0, https://www.r-project.org). We compared continuous variables (age) between the training and validation groups by independent samples t tests, and the χ² test was used to compare categorical variables (sex and pathological outcome) between the training and validation groups. P values less than 0.05 were considered statistically significant.
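The χ² comparison of a two-level categorical variable between two cohorts can be sketched as follows. This is a minimal Python illustration of the Pearson statistic for a 2×2 table without continuity correction (an assumption; the text does not say which variant was reported), and the counts are hypothetical rather than study data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are cohorts, columns are the two categories.
statistic, p_value = chi_square_2x2(10, 20, 20, 10)
print(round(statistic, 3), p_value < 0.05)
```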

Selection of features and construction of radiomic models
A total of 841 features were extracted from each phase CT image of the training cohort. We screened these features and chose 10, 9, 5, 2, 2, and 9 features that had nonzero coefficients as potential predictors using the LASSO logistic regression model.
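How an L1 penalty leaves only a few features with nonzero coefficients can be shown with a small, self-contained Python sketch. It fits an L1-penalized logistic regression by proximal gradient descent on toy data; it is an illustration of the technique, not the R implementation used in the study, and the penalty strength, step size and data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """L1-penalized logistic regression by proximal gradient descent.

    X: list of feature rows, y: list of 0/1 labels, lam: penalty strength.
    Returns (weights, intercept); the intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data (hypothetical): column 0 tracks the label, columns 1-2 are noise.
X = [[1.5, 1.0, -2.0], [1.5, -1.0, 2.0], [0.5, 1.0, 3.0], [0.5, -1.0, -3.0],
     [-1.5, 1.0, -2.0], [-1.5, -1.0, 2.0], [-0.5, 1.0, 3.0], [-0.5, -1.0, -3.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_lasso_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(selected)
```

In practice the penalty strength is usually chosen by cross-validation, and the features that keep nonzero coefficients are the ones carried forward into the final regression model.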

Analysis of models based on the single- and joint-phase CT
Based on single-phase CT images and joint two-phase CT images, we constructed six models in this study. The sensitivity, specificity, accuracy, NPV, and PPV of each model are listed in Table 2.
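For reference, the reported performance measures follow directly from a confusion matrix and from the ranking of model scores. The Python sketch below uses hypothetical counts and scores (not study data) and computes the AUC as the probability that a metastatic node scores higher than a non-metastatic one, with ties counted as one half.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, PPV and NPV from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 12 metastatic and 8 non-metastatic nodes.
metrics = diagnostic_metrics(tp=8, fp=2, tn=6, fn=4)
auc = auc_from_scores([0.9, 0.8, 0.3, 0.4, 0.2], [1, 1, 1, 0, 0])
print(metrics, auc)
```

A higher NPV means that a node the model calls negative is more likely to be truly non-metastatic, and a higher sensitivity means that fewer metastatic nodes are missed.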

Discussion
The International Association for the Study of Lung Cancer (IASLC) showed that, based on a newly established large database, the 5-year survival rates for patients with LNM ranged from 26 to 53% [10]. The systematic dissection of LNs in lung cancer patients has been widely accepted, but the extent of LN dissection has remained a matter of debate because metastatic LNs are difficult to assess precisely [11,12]. LNM is an important factor that affects tumor and LN staging. Therefore, noninvasive preoperative evaluation of the LN status is crucial for determination of the lung cancer stage, surgical plan, and prognosis [13].
Currently, CT and PET/CT are the most routinely used noninvasive methods for the clinical diagnosis of LNs. The international standard for the diagnosis of metastatic LNs by CT in lung cancer is a short-axis LN diameter larger than 10 mm. However, because this is a single diagnostic criterion, the accuracy of the diagnosis is limited to some extent. PET/CT is a noninvasive method for staging cancer that is increasingly employed by multidisciplinary lung cancer teams. Many studies have reviewed the diagnostic performance of PET/CT for LN staging in patients with NSCLC [14-16]. A systematic review showed that the summary sensitivity and specificity estimates for a maximum standardized uptake value (SUVmax) ≥2.5, which is the PET/CT positivity criterion, were 81.3% and 79.4%, respectively [17]. However, the low prevalence and high cost of PET/CT equipment result in it being less commonly applied than CT alone in preoperative examinations. If the accuracy of CT in the diagnosis of lymph nodes could be improved, it would provide more important clinical guidance for the delineation of radiotherapy targets and the selection of surgical range.
Recently, the development of radiomics has enabled medical images to be converted into high-throughput quantitative data, providing information that can be explored and used to guide clinical decision-making. In contrast to subjective descriptions of the volume and shape of lesions, radiomic features can more comprehensively describe the state of lesions, overcoming the disadvantages of traditional diagnostic methods [18-20]. Radiomics is therefore expected to improve the accuracy of CT image diagnosis. Moreover, studies have demonstrated the feasibility of using radiomic features to predict LNM in rectal, breast and esophageal cancers, providing theoretical support for this study.
In the present study, we constructed radiomic models based on pathological diagnostic results to facilitate the preoperative identification of metastatic LNs in NSCLC patients. The results showed that the diagnostic models exhibited favorable discrimination (AUC values greater than 0.8, a maximum sensitivity of 97.9%, and a maximum specificity of 86.0%). Yao et al. [24] summarized the PET/CT diagnostic results of 2543 NSCLC patients from 22 research centers and found that the overall sensitivity and specificity of PET/CT for identifying mediastinal LNM were 0.66 and 0.82, respectively.
In addition, another study showed that the sensitivity and specificity of CT in the diagnosis of mediastinal lymph node metastasis were 0.79 and 0.72, respectively [25]. Compared to those in previous studies, the methods proposed in this study have the advantages of being quantitative and reproducible, with higher sensitivity and specificity than previously reported.
Moreover, unlike previous studies that analyzed only single-phase CT images, this study not only extracted radiomic features from plain, arterial, and venous phase CT images but also calculated delta radiomic feature values between different phase CT images. The arterial phase mainly reflects the tissue perfusion of the tumor, and the venous phase mainly reflects the clearing of the tissue blood flow, which is also an important imaging feature of tumor metastasis [26]. The results revealed that the sensitivity and NPV of models 4 and 6 were significantly improved. In clinical practice, for NSCLC patients treated with neoadjuvant therapy and routine radical surgery, false-positive LNs will not result in insufficient treatment or lead to treatment delay. However, the higher NPV of this approach means that negative LNs will be more accurately identified, which may change the clinical treatment plan [27]. These findings suggest that combining features from two CT phases can improve model performance in future clinical applications.
This method of integrating a large number of features in CT images that cannot be recognized or