
Contrast-enhanced CT-based radiomics model for differentiating risk subgroups of thymic epithelial tumors

Abstract

Background

To validate a contrast-enhanced CT (CECT)-based radiomics model (RM) for differentiating various risk subgroups of thymic epithelial tumors (TETs).

Methods

A retrospective study was performed on 164 patients with TETs who underwent CECT scans before treatment. A total of 130 patients (approximately 79%, from 2012 to 2018) were designated as the training set, and 34 patients (approximately 21%, from 2019 to 2021) were designated as the testing set. The analysis of variance and least absolute shrinkage and selection operator algorithm methods were used to select the radiomics features. A logistic regression classifier was constructed to identify various subgroups of TETs. The predictive performance of RMs was evaluated based on receiver operating characteristic (ROC) curve analyses.

Results

Two RMs included 16 and 13 radiomics features to identify three risk subgroups of traditional risk grouping [low-risk thymomas (LRT: Types A, AB and B1), high-risk thymomas (HRT: Types B2 and B3), thymic carcinoma (TC)] and improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC], respectively. For traditional risk grouping, the areas under the ROC curves (AUCs) of LRT, HRT, and TC were 0.795, 0.851, and 0.860, respectively, and the accuracy was 0.65 in the training set; the AUCs were 0.621, 0.754, and 0.500, respectively, and the accuracy was 0.47 in the testing set. For improved risk grouping, the AUCs of LRT*, HRT*, and TC were 0.855, 0.862, and 0.869, respectively, and the accuracy was 0.72 in the training set; the AUCs were 0.778, 0.716, and 0.879, respectively, and the accuracy was 0.62 in the testing set.

Conclusions

CECT-based RMs help to differentiate three risk subgroups of TETs, and the RM established according to improved risk grouping performed better than the RM established according to traditional risk grouping.


Background

Thymic epithelial tumors (TETs) originate from the thymus and are the most common primary neoplasms in the anterior mediastinum, accounting for approximately 47% of cases [1]. Pathological subtypes of TETs were determined by the World Health Organization (WHO) in 2004, including thymomas (Types A, AB, B1, B2, and B3) and thymic carcinoma (TC), based on morphologic manifestations of epithelial cells and the ratio of lymphocytes to epithelial cells [2]. In 2014, the International Thymic Malignancy Interest Group (ITMIG) affirmed the description of WHO histologic subtypes of TETs [3]. The six different subtypes were divided into three risk subgroups according to increasing grade of malignancy: low-risk thymomas (LRT; Types A, AB and B1), high-risk thymomas (HRT; Types B2 and B3), and TC in 2004 [4]. It has been agreed that TC has a poorer prognosis and a higher recurrence rate than HRT and LRT. According to the different subgroups of TETs, different standardized and appropriate treatment options and methods of predicting the clinical course and prognosis of the disease are used for each patient by the clinical multidisciplinary team [5, 6]. Therefore, accurate and noninvasive identification of TETs before treatment, and even of the subgroups, is of clinical significance.

According to the 2021 National Comprehensive Cancer Network (NCCN) guidelines for thymomas and thymic carcinomas, chest contrast-enhanced CT (CECT) is still the first choice for imaging evaluation before treatment [7]. Chest CECT imaging can provide many general morphologic parameters. However, the histological subgroups of TETs share many overlapping features, which can make it difficult to distinguish them [8, 9]. Radiomics, a diagnostic technology based on radiomics signatures, has attracted increasing attention, mainly because it can extract large numbers of diverse, high-throughput imaging features and transform medical images into mineable high-dimensional data [10, 11]. The subsequent quantitative analysis of these data can assist in differential diagnosis, risk classification, prognosis prediction and efficacy evaluation of tumors based on different kinds of medical images [12, 13, 14, 15]. Although several CT-based radiomics analyses have been used to identify the risk classification of thymic epithelial tumors, most studies were based on binary classification [16, 17]. Only one study was based on three-class classification, and the accuracy of its clinical-semantic radiomics model (RM) in the risk assessment of three subgroups in the validation group was only 48.3% [18]. Therefore, radiomics based on three-class classification needs further investigation.

Previous studies have found that although type B1 thymomas are LRTs in terms of biological characteristics and invasive performance, their imaging features are more similar to those of types B2 and B3 thymomas [19]. In addition, the results of Kim et al. showed that the disease-free survival at 5 years of type B1, B2 and B3 thymomas was basically similar [20]. Therefore, we tried to regroup the six subtypes into three risk subgroups: LRT* (Types A and AB), HRT* (Types B1, B2, and B3), and TC. In this article, the subgroups were named traditional risk grouping (LRT, HRT, and TC) and improved risk grouping (LRT*, HRT*, and TC) to facilitate the description of articles and statistics of data.

This study aimed to build two CECT-based RMs and validate their predictive abilities in differentiating three different risk subgroups of TETs in the two simplified groups.

Methods

Patients

This retrospective study was approved by the institutional review board of Shanxi Province Tumor Hospital, and the requirement for individual written informed consent was waived. The study included 179 patients with pathologically confirmed TETs in the anterior mediastinum from October 2012 to March 2021. Accurate pathological classifications were obtained in 164 patients (45 by biopsy and 119 by surgical resection) but could not be obtained in the remaining 15 patients (14 by biopsy and 1 by surgical resection). All 164 patients who were included in this radiomics study underwent CECT scans before treatment. The inclusion criteria were as follows: (a) solid anterior mediastinal TETs; (b) lesions > 2.0 cm in diameter based on the longest diameter; (c) good-quality CECT images without movement artifacts; and (d) patients who did not undergo biopsy, treatment with chemotherapy, radiation therapy, or surgery before CT scan.

The patients were assigned to the training and testing sets chronologically. A total of 130 patients (approximately 79%, from 2012 to 2018) were designated as the training set, and 34 patients (approximately 21%, from 2019 to 2021) were designated as the testing set. The distribution of the training set and testing set of 164 patients is shown in Table 1. The workflow is shown in Fig. 1.
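A chronological split of this kind can be sketched as follows (Python 3). The record layout (a dict with a hypothetical `scan_year` field) and the helper names are assumptions made for illustration; the text states only the year ranges and the resulting set sizes.

```python
# Hypothetical record layout: each patient is a dict with a "scan_year" field.
TRAIN_LAST_YEAR = 2018  # per the text: 2012-2018 -> training, 2019-2021 -> testing


def split_by_year(patients, train_last_year=TRAIN_LAST_YEAR):
    """Split records chronologically so that no later case enters the training set."""
    train = [p for p in patients if p["scan_year"] <= train_last_year]
    test = [p for p in patients if p["scan_year"] > train_last_year]
    return train, test


def share(part, whole):
    """Return the percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100.0 * part / whole)


# Set sizes reported in the text.
n_train, n_test = 130, 34
print(share(n_train, n_train + n_test), share(n_test, n_train + n_test))  # 79 21
```

Splitting by date rather than at random means the testing set consists only of cases seen after the training period, which is closer to how a model would be applied prospectively.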

Table 1 The distribution of the training set and testing set of 164 patients
Fig. 1

Radiomics analysis workflow. First, 164 TETs in the anterior mediastinum on CECT were collected. Second, the TET lesions were segmented on the RadCloud platform, the volumes of interest (VOIs) were checked manually, and the radiomics features of the VOIs were calculated automatically. Third, two sets of valuable radiomics features were selected by the automated high-throughput feature analysis algorithm according to the two risk groupings in the training set. Finally, statistical analysis was applied, and ROC curve analysis was used to illustrate the prediction performance of the RMs for the risk subgroups of TETs

CT images

The CECT images, stored in Digital Imaging and Communications in Medicine (DICOM) format, were acquired with a GE Discovery CT 750HD scanner (Waukesha, WI) and a GE Healthcare LightSpeed CT scanner. Automatic tube current modulation techniques were adopted with the tube voltage set at 120 kVp. Before scanning, patients were instructed to hold their breath to avoid motion artifacts. The first series was a thorax noncontrast CT study (helical scan type, 100 kV and automatic mAs, the rotation time was 0.6 s, the slice thickness and interval were each 5 mm, the pitch was 1.375:1, the scanning field of view (SFOV) was 50 cm, and the matrix was 512 × 512); the scan range was from the thoracic inlet to the diaphragmatic level. A total of 50 to 120 mL (1 mL/kg weight) of contrast medium (iohexol, 300 mg/mL, iodine) was injected by using a pump injector at a rate of 3.0 mL/s. Venous phase scanning began 35 s after the trigger attenuation threshold (120 HU) was reached at the level of the thoracic aorta. The scanning parameters were the same as those in the noncontrast CT study.

Lesion delineation and segmentation

All DICOM CECT images were loaded into the RadCloud platform (Huiying Medical Technology Co., Ltd. https://mics.radcloud.cn). The RadCloud radiomics platform uses open-source code, which can be obtained online (https://readthedocs.org/projects/pyradiomics/downloads/). The region of interest (ROI) of each lesion was manually delineated slice by slice on 5 mm thick venous-phase CECT images on the platform by a radiologist with 10 years of experience (X.L.). Volumes of interest (VOIs) were then generated automatically (Fig. 2).

Fig. 2

TET lesion segmentation. On all consecutive CECT images, the contour of each lesion was drawn manually along its edge, and VOIs were automatically obtained

Radiomics features

In total, 1409 quantitative imaging features were extracted from venous-phase CECT images with the RadCloud platform, whose feature extraction module is based on the “pyradiomics” package (version 2.2.0, https://pyradiomics.readthedocs.io/) in Python (Version 2.7). They were grouped into four categories. Category 1 (intensity features) comprised 18 descriptors that quantitatively delineated the distribution of voxel intensities within the CT image through common basic metrics. Category 2 (shape features) comprised 14 three-dimensional (3D) features that described the geometry of the target area, such as shape and size. Category 3 (texture features) comprised 75 features that described the spatial distribution of voxel intensity levels and were divided into five types based on the gray level co-occurrence matrix (GLCM), gray level size zone matrix (GLSZM), gray level run length matrix (GLRLM), gray level dependence matrix (GLDM), and neighboring gray tone difference matrix (NGTDM). The above three categories were all extracted from the VOIs of the original image. Category 4 (higher-order features), with 1302 features, included the intensity and texture features derived from wavelet-transformed and filtered versions of the original image. In this study, a total of 14 filters were applied to the original image: exponential, square, square root, logarithm, gradient, local binary pattern, and eight wavelet sub-bands (wavelet-LLL, wavelet-LLH, wavelet-LHL, wavelet-LHH, wavelet-HLL, wavelet-HLH, wavelet-HHL, wavelet-HHH). Before feature extraction, the images were resampled to 1 × 1 × 1, and gray-level normalization was applied to standardize the CT images.
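The feature tally can be checked with a few lines of arithmetic. The assumption that each of the 14 filters yields the full set of intensity and texture descriptors, and no shape descriptors, is ours; it reproduces the reported total of 1302 higher-order features but is not stated explicitly in the text.

```python
# Feature counts as reported in the text.
INTENSITY = 18
SHAPE = 14
TEXTURE = 75
N_FILTERS = 14  # 6 non-wavelet filters + 8 wavelet sub-bands

# Assumption: every filter yields all intensity and texture descriptors,
# while shape descriptors are computed once, on the original VOI only.
higher_order = N_FILTERS * (INTENSITY + TEXTURE)
total = INTENSITY + SHAPE + TEXTURE + higher_order

print(higher_order, total)  # 1302 1409
```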

Radiomics feature selection and model establishment

All statistical analyses were performed in Python (Version 2.7) using “scikit-learn” (V0.2 https://scikit-learn.org/stable/). Before feature selection, Z-score standardization was applied to the features. We used analysis of variance (ANOVA) and the least absolute shrinkage and selection operator (LASSO) algorithm for feature selection to identify the optimal features. The cost function of the LASSO method is:

$$\mathop {\min }\limits_{w} \frac{1}{2n}\left\| {Xw - y} \right\|_{2}^{2} + \alpha \left\| w \right\|_{1}$$

where X is the matrix of radiomic features, y is the vector of the sample labels, n is the number of samples, w is the coefficient vector of the regression model, and \(\alpha \left\| w \right\|_{1}\) is the LASSO penalty with the constant \(\alpha\) and the \(l_{1}\)-norm of coefficient vector \(\left\| w \right\|_{1}\).
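To make the role of \(\alpha\) concrete, the sketch below evaluates the cost function above in plain Python and solves the special case of a single feature, where the minimizer has a closed form obtained by soft-thresholding. The function names and the toy data are illustrative assumptions; this is not the scikit-learn routine used in the study.

```python
def lasso_objective(X, y, w, alpha):
    """Evaluate (1 / 2n) * ||Xw - y||_2^2 + alpha * ||w||_1 for list-based inputs."""
    n = len(y)
    residual_sq = 0.0
    for row, yi in zip(X, y):
        prediction = sum(xij * wj for xij, wj in zip(row, w))
        residual_sq += (prediction - yi) ** 2
    return residual_sq / (2.0 * n) + alpha * sum(abs(wj) for wj in w)


def soft_threshold(z, alpha):
    """Shrink z toward zero by alpha; values within [-alpha, alpha] become exactly 0."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def lasso_single_feature(x, y, alpha):
    """Closed-form LASSO coefficient when the model has one feature and no intercept."""
    n = len(y)
    rho = sum(xi * yi for xi, yi in zip(x, y)) / n   # feature-label covariance term
    scale = sum(xi * xi for xi in x) / n             # mean squared feature value
    return soft_threshold(rho, alpha) / scale


# Toy data (made up): y = 2x exactly.
x = [-1.0, 0.0, 1.0]
y = [-2.0, 0.0, 2.0]
print(lasso_single_feature(x, y, alpha=0.5))  # shrunk below 2, approximately 1.25
print(lasso_single_feature(x, y, alpha=2.0))  # 0.0: the feature is dropped
```

A larger \(\alpha\) pushes more coefficients to exactly zero, which is why the penalty doubles as a feature selector: only features with nonzero coefficients are kept.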

We used a logistic regression (LR) classifier on the selected CECT features. A logistic function or logistic curve is a common "S"-shaped (sigmoid) curve, with the following equation:

$$y = \frac{L}{{1 + \exp \left( { - k\left( {x - x_{0} } \right)} \right)}}$$

where exp denotes the exponential function with base e (Euler's number, the base of the natural logarithm), \(x_{0}\) is the x-value of the sigmoid's midpoint, L is the curve's maximum value, and k is the steepness of the curve.
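The logistic curve can be written directly from its definition. In the sketch below the default values L = 1, k = 1 and \(x_{0}\) = 0 are an assumption chosen to give the standard sigmoid that maps a linear score to a probability; the text does not specify them.

```python
import math


def logistic(x, L=1.0, k=1.0, x0=0.0):
    """General logistic curve: L / (1 + exp(-k * (x - x0)))."""
    return L / (1.0 + math.exp(-k * (x - x0)))


# At the midpoint x0 the curve reaches exactly half of its maximum value L.
print(logistic(0.0))  # 0.5

# The curve is symmetric about the midpoint: f(x0 + d) + f(x0 - d) is close to L.
print(logistic(2.0) + logistic(-2.0))
```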

The cost function of LR is as follows:

$$\mathop {\min }\limits_{w,c} \frac{1}{2}w^{T} w + \mathop \sum \limits_{i = 1}^{n} \log \left( {\exp \left( { - y_{i} \left( {X_{i}^{T} w + c} \right)} \right) + 1} \right)$$

where \(X_{i}\) is the feature vector of the i-th sample, \(y_{i}\) is its label, c is the intercept, and the remaining parameters are the same as in the cost function for LASSO [21].
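A minimal sketch of this cost, assuming the linear term is \(X_{i}^{T} w + c\) with c the intercept and binary labels coded as -1 and +1 (the coding under which this form of the loss is valid). The function name and toy data are illustrative, and how the three-class problem was reduced to binary problems is not stated in the text.

```python
import math


def lr_cost(X, y, w, c):
    """Return 0.5 * w.w + sum_i log(exp(-y_i * (x_i.w + c)) + 1).

    Assumes labels y_i are -1 or +1.
    """
    penalty = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for row, yi in zip(X, y):
        score = sum(xij * wj for xij, wj in zip(row, w)) + c
        loss += math.log(math.exp(-yi * score) + 1.0)
    return penalty + loss


# Toy data (made up): one feature that separates the two classes.
X_toy = [[1.0], [-1.0]]
y_toy = [1, -1]

print(lr_cost(X_toy, y_toy, w=[0.0], c=0.0))  # n * log(2): an uninformative model
print(lr_cost(X_toy, y_toy, w=[1.0], c=0.0))  # lower: the weight fits the labels
```

The first term penalizes large weights and the second penalizes confident wrong predictions, so the minimizer trades fit against coefficient size.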

Assessment of inter- and intraobserver agreement with intraclass correlation coefficients (ICCs)

To ensure reproducibility of radiomics feature extraction, we employed intraclass correlation coefficients (ICCs) to assess the intra- and interobserver agreement of VOI delineation. Thirty lesions were selected randomly by statistical software. After 1 month, another radiologist (Z.Z.K) with 13 years of clinical experience used the same method to extract radiomics features. An ICC > 0.75 was considered to represent good agreement.

Predictive performance of RMs after machine learning

Receiver operating characteristic (ROC) curve analysis was used to evaluate the predictive ability of the two RMs. The optimal cutoff value was selected as the point at which both the sensitivity and specificity were maximal. The area under the curve (AUC) and accuracy were calculated in both the training and testing sets. Three indicators were used to evaluate the performance of the LR classifier: precision (P = true positives/(true positives + false positives)), recall (R = true positives/(true positives + false negatives)), and f1-score (f1-score = P × R × 2/(P + R)). The clinical benefits of the two RMs were estimated by decision curve analyses, and their goodness of fit was evaluated by calibration curves. These analyses were performed with R 4.0.3 (www.R-project.org/).
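For a three-class problem, precision, recall and f1-score are computed per class from the confusion matrix by treating that class as positive and the other two as negative. The sketch below (Python 3) illustrates this; the matrix values are made up for illustration and are not the study's results.

```python
def per_class_metrics(confusion, labels):
    """Compute (precision, recall, f1) per class from a square confusion matrix.

    confusion[i][j] is the number of cases of true class i predicted as class j.
    """
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(len(labels))) - tp  # column minus diagonal
        fn = sum(confusion[k]) - tp                                 # row minus diagonal
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        metrics[label] = (p, r, f1)
    return metrics


def accuracy(confusion):
    """Overall accuracy: correctly classified cases divided by all cases."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Made-up confusion matrix (rows: true class, columns: predicted class).
toy_labels = ["LRT", "HRT", "TC"]
toy_confusion = [
    [8, 2, 0],
    [1, 6, 3],
    [1, 2, 7],
]

for label, (p, r, f1) in per_class_metrics(toy_confusion, toy_labels).items():
    print(label, round(p, 2), round(r, 2), round(f1, 2))
print("accuracy", round(accuracy(toy_confusion), 2))
```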

Results

General data

A total of 164 patients with TETs (78 men and 86 women; mean age: 54 ± 10.33 years, age range: 24–78 years) who underwent CECT scans were enrolled. According to the histological and immunohistochemical results, with regard to WHO pathological subtypes, there were 15 (9.1%) Type A patients, 19 (11.6%) Type AB, 26 (15.9%) Type B1, 34 (20.7%) Type B2, 24 (14.6%) Type B3, and 46 (28.0%) TC (including 4 cases of thymic carcinoid) (Table 1).
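The two grouping schemes can be expressed as lookup tables from WHO subtype to risk subgroup. The sketch below (Python 3.7 or later, for ordered dictionaries) applies both mappings to the subtype counts given above; the resulting whole-cohort subgroup sizes are derived here by simple addition, and the variable names are illustrative.

```python
# Subtype counts reported in the text for the whole cohort (n = 164).
COUNTS = {"A": 15, "AB": 19, "B1": 26, "B2": 34, "B3": 24, "TC": 46}

# Traditional grouping: Type B1 is counted as low-risk.
TRADITIONAL = {"A": "LRT", "AB": "LRT", "B1": "LRT", "B2": "HRT", "B3": "HRT", "TC": "TC"}

# Improved grouping: Type B1 is moved to the high-risk subgroup.
IMPROVED = {"A": "LRT*", "AB": "LRT*", "B1": "HRT*", "B2": "HRT*", "B3": "HRT*", "TC": "TC"}


def group_sizes(counts, mapping):
    """Sum subtype counts into the risk subgroups defined by `mapping`."""
    sizes = {}
    for subtype, n in counts.items():
        group = mapping[subtype]
        sizes[group] = sizes.get(group, 0) + n
    return sizes


print(group_sizes(COUNTS, TRADITIONAL))  # {'LRT': 60, 'HRT': 58, 'TC': 46}
print(group_sizes(COUNTS, IMPROVED))     # {'LRT*': 34, 'HRT*': 84, 'TC': 46}
```

Moving Type B1 changes only the boundary between the two thymoma subgroups; the TC subgroup is identical under both schemes.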

Radiomics features selection

The inter- and intraobserver reproducibility of feature extraction was good, with ICCs > 0.75 between the two radiologists. The ANOVA and LASSO methods selected 16 and 13 features, and the corresponding optimal values of the LASSO tuning parameter (alpha) were 1.241 and 1.239, respectively. The two RMs thus included 16 and 13 radiomics features to identify three subgroups of TETs according to traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC] and improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC], respectively (Figs. 3, 4).

Fig. 3

Valuable radiomics feature selection of traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC] using LASSO regression. The optimal value of the LASSO tuning parameter (alpha = 1.241) was found, and the 16 features corresponding to the optimal alpha value were selected from CECT images according to their coefficients

Fig. 4

Valuable radiomics feature selection of improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC] using LASSO regression. The optimal value of the LASSO tuning parameter (alpha = 1.239) was found, and the 13 features corresponding to the optimal alpha value were selected from CECT images according to their coefficients

The features in the two RMs were all higher-order features, with no intensity, shape or texture features from the original image. Four features were shared by both RMs: wavelet-LLL_glcm_InverseVariance, wavelet-LLH_glcm_Imc2, gradient_glcm_Imc1 and wavelet-LLH_glszm_GrayLevelNonUniformityNormalized.

Diagnostic performance of the two RMs

The 16- and 13-feature RMs were trained with the LR classifier on CECT images, and the ROC curve analysis results are shown in Figs. 5 and 6. In the training set of traditional risk grouping, the areas under the ROC curve (AUCs) of LRT, HRT, and TC were 0.795, 0.851, and 0.860, respectively, and the accuracy was 0.65; in the testing set, the AUCs were 0.621, 0.754, and 0.500, respectively, and the accuracy was 0.47. In the training set of improved risk grouping, the AUCs of LRT*, HRT*, and TC were 0.855, 0.862, and 0.869, respectively, and the accuracy was 0.72; in the testing set, the AUCs were 0.778, 0.716, and 0.879, respectively, and the accuracy was 0.62. For the testing set, the AUC of TC in improved risk grouping was 0.879, which was markedly higher than the 0.500 in traditional risk grouping (Table 2). The confusion matrices are shown in Additional file 1: Tables S1–S4. The calibration curves showed that the predictions of the RM according to the improved risk grouping for HRT* and TC were in satisfactory agreement with the actual risk level, while the performance of the RM according to the traditional risk grouping was unsatisfactory (Fig. 7). In addition, decision curve analyses showed that the RM according to the improved risk grouping obtained higher clinical utility for HRT* and TC (Fig. 8).
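A per-class AUC in a three-class setting is commonly obtained one-vs-rest: the class of interest is treated as positive and the other two as negative. The sketch below computes such an AUC as the probability that a positive case receives a higher score than a negative one (the Mann-Whitney formulation). The function name and toy scores are illustrative assumptions, and the text does not state how the study's AUCs were computed. Under this definition, an AUC of 0.500 corresponds to chance-level ranking.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for one class against all others.

    scores[i] is the model's score for the `positive` class on case i.
    Ties between a positive and a negative case count as one half.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == positive]
    neg = [s for s, lab in zip(scores, labels) if lab != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: predicted TC scores for four cases.
toy_labels = ["TC", "TC", "LRT", "HRT"]
toy_scores = [0.9, 0.6, 0.7, 0.2]
print(auc_one_vs_rest(toy_scores, toy_labels, "TC"))  # 0.75

# A model that gives every case the same score cannot rank cases at all.
print(auc_one_vs_rest([0.5, 0.5, 0.5, 0.5], toy_labels, "TC"))  # 0.5
```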

Fig. 5

Receiver operating characteristic (ROC) curves of the CECT-based RM according to traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC]. a The 16-feature RM was trained in the training set with the LR classifier. The areas under the ROC curve (AUCs) of LRT, HRT, and TC were 0.795, 0.851, and 0.860, respectively. b The 16-feature RM was tested in the testing set with the LR classifier. The AUCs of LRT, HRT, and TC were 0.621, 0.754, and 0.500, respectively

Fig. 6

Receiver operating characteristic (ROC) curves of the CECT-based RM according to improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC]. a The 13-feature RM was trained in the training set with the LR classifier. The areas under the ROC curve (AUCs) of LRT*, HRT*, and TC were 0.855, 0.862, and 0.869, respectively. b The 13-feature RM was tested in the testing set with the LR classifier. The AUCs of LRT*, HRT*, and TC were 0.778, 0.716, and 0.879, respectively

Table 2 The prediction performance of the two RMs
Fig. 7

The calibration curves of the two RMs in the testing sets. a For the traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC], the prediction performance of the RM for LRT, HRT and TC did not show satisfactory consistency with the actual risk level. b For the improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC], the prediction performance of the RM for HRT* and TC showed satisfactory consistency with the actual risk level

Fig. 8

The decision curve analyses of the two RMs in the testing sets. a The RM according to the traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC] had general clinical utility for LRT, HRT and TC. b The RM according to the improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC] had good clinical utility for HRT* and TC

Discussion

This study built two RMs based on CECT images using LASSO to extract the features and LR as the classifier to identify three different subgroups of TETs. After machine learning, the 13-feature RM (accuracy = 0.62) established according to improved risk grouping [LRT* (Types A and AB), HRT* (Types B1, B2 and B3), TC] showed a better predictive performance than the 16-feature RM (accuracy = 0.47) established according to traditional risk grouping [LRT (Types A, AB and B1), HRT (Types B2 and B3), TC] in the test set.

Recently, six popular machine learning algorithms have been used to construct RMs: k-nearest neighbor (KNN), support vector machine (SVM), eXtreme Gradient Boosting (XGBoost), random forest (RF), logistic regression (LR), and decision tree (DT). Among them, the LR algorithm gave the best results in many CT-based radiomics studies predicting different risk subgroups of TETs or thymomas [18, 21, 22]. Therefore, in this study, we chose only the LR algorithm. The prediction accuracy of the 16-feature RM according to traditional risk grouping was not ideal (only 0.47 in the testing set), which was basically consistent with the result (0.45) of Liu et al. [18]. Taken together, our study and that of Liu et al. indicate that the ability of CT-based RMs to distinguish the three conventional risk groups of TETs is limited.

Several studies have shown that although type B1 thymoma belongs to LRT, its conventional CECT findings overlap with type B2 and B3 thymomas in HRT to a certain extent, especially with type B2 thymoma [9, 19]. At the same time, a study showed that the prognosis of type B1 thymoma is not significantly different from that of type B2 and B3 thymomas [20]. Therefore, based on the above contradictions, we propose the idea of regrouping, and we hypothesized that regrouping may be more conducive to the identification of TETs. To the best of our knowledge, this is the first study to propose the concept of improved risk grouping of TETs. In this study, we found that the prediction accuracy of 13-feature RM according to improved risk grouping was 0.62, which was higher than the 0.45 of the simple CECT-based model and the 0.48 of the CECT-based clinical-semantic-radiomics model of Liu et al. [18] in the testing set. The results of this study verified our hypothesis. In pathology, type B thymomas apparently represent a continuum from B1 to B3 thymomas, which shows a spectrum of lymphocyte to epithelial predominance [23]. It can also be understood that the pathological similarity between type B1 thymoma and type B2 thymoma is higher than that between type B1 thymoma and type A or AB thymoma. Therefore, pathologists may overlap in the diagnosis of type B1 and B2 thymomas (approximately 15% disagreement) [3]. This pathological manifestation may explain the phenomenon that there was a certain overlap between type B1 thymoma and type B2 and B3 thymomas on conventional CT features, and it is also a feasible basis for regrouping. Therefore, we applied the improved risk grouping method to fundamentally reduce the interference of type B1 thymoma in LRT and HRT, and the established RM improved the accuracy of diagnosis. In this study, for the improved risk grouping, the performance of the CECT-based RM also declined when moving from training set to testing set (from 0.72 to 0.62). 
Significant TET atypia may be one of the main reasons for the general decline in performance. We also found that the AUC of TC according to improved risk grouping was 0.879, which was markedly higher than the 0.500 according to traditional risk grouping in the testing set. This indicated that the RM established according to the improved risk grouping method may have a higher accuracy in predicting the risk of TC. We speculated that the reason may be that the selected radiomics features were more specific for TC or that some thymomas in LRT* and HRT* were very similar in pathological manifestations.

The 3D analysis of the whole lesion could reflect the heterogeneity of the tumor more representatively and provide more comprehensive information. Chaddad et al. [24] found that a 3D wavelet transform can distinguish colorectal cancer classifications with higher accuracy and sensitivity than a 2D wavelet transform. Therefore, we manually delineated ROIs along the lesion contour on each image and converted the ROIs to VOIs. Finally, there were 11 and 9 3D-wavelet texture features in the two RMs, respectively. In our study, no shape feature was selected in either RM, indicating that the shape features were not significantly different among the three risk subgroups of TETs. Han et al., using conventional CT imaging to identify different risks of TETs, showed that tumor size and contour significantly differed between LRT and HRT [25]. Our results were inconsistent with these findings, which might be due to the relatively small number of cases, especially of type A and AB thymomas.

Chest CECT was the first choice of imaging evaluation before treatment for TETs. In this study, the images with 5 mm thickness in the venous phase of conventional CECT were used for radiomics analysis because the image stability in the venous phase was better than that in the arterial phase. In the arterial phase, the concentration of contrast medium in the superior vena cava or brachiocephalic vein was quite high, and the adjacent area had obvious artifacts, which may affect the display of lesions. Wang et al. [26] used radiomics based on CECT images and noncontrast-enhanced CT (NECT) images to identify high-risk and low-risk thymomas with similar AUCs. We did not use the NECT image because in some of the NECT images, the obvious artifact in the lesion may affect the authenticity of the tumor heterogeneity, and the unclear edge is not conducive to the segmentation of the lesion. Therefore, we think that radiomics analysis based on CECT and 3D segmentation of all lesions may have broader application prospects for the evaluation of TETs. According to the improved risk grouping method, we only selected the images with a 5 mm thickness of the venous phase as the training set, segmented them to generate VOIs, and used LR as the classifier to extract features and establish the most simplified RM. After machine learning, the prediction accuracy of the test set was significantly higher than that of the CECT-based clinical-semantic-radiomics model of Liu et al. [18]. This indicated that improved risk grouping may have potential clinical popularization and application value. In addition, several studies have shown that the iodine concentration (IC) value of dual-energy CT (DECT) is valuable for distinguishing different risks of TETs [27, 28]. The radiomics evaluation of TETs based on DECT images combined with IC values is worthy of further study.

We know that only when patients obtain accurate pathological diagnosis results can a multidisciplinary diagnosis and treatment team give them the most appropriate treatment plan [29]. Although pathological diagnosis is the gold standard, not all patients can obtain a specific pathological diagnosis after biopsy. Similarly, we found that 15 patients with TETs did not obtain accurate pathological classification during our follow-up. For patients who could not obtain accurate pathological diagnosis results in time, we could use RM to evaluate their risk level before treatment and provide a multidisciplinary diagnosis and treatment team with suggestions on the tumor risk level. Our RM may also have important value for the risk assessment of TET patients without specific pathological classification.

This study had some limitations. First, only a single medical center was included in the study, and the number of cases was small. Multicenter studies with a larger number of patients will be needed to verify our results. Second, as a retrospective cohort study, it was subject to selection bias. Third, to compare the prediction performance of the two RMs, cross validation was not used in this study. We grouped the data according to time, which may avoid the selection bias caused by machine learning to a certain extent. Further research is needed to verify our results. In addition, drawing the contours manually was time-consuming and subjective. Therefore, it is necessary to develop a more efficient and accurate method of image contour drawing.

Conclusions

Our study established a simple RM based only on venous-phase CECT images to distinguish the three risk subgroups [low-risk thymoma (Types A, AB and B1), high-risk thymoma (Types B2 and B3), thymic carcinoma] of TETs. If type B1 thymoma is reclassified as high-risk thymoma, the RM established according to the improved grouping mode may have higher accuracy in predicting the three risk subgroups.

Availability of data and materials

The datasets generated and/or analysed during the current study are not publicly available due to patient privacy protection, but are available from the corresponding author on reasonable request.

Abbreviations

TETs: Thymic epithelial tumors

TC: Thymic carcinoma

LRT: Low-risk thymoma

HRT: High-risk thymoma

CECT: Contrast-enhanced CT

RM: Radiomics model

VOI: Volume of interest

GLCM: Gray level co-occurrence matrix

GLSZM: Gray level size zone matrix

GLRLM: Gray level run length matrix

GLDM: Gray level dependence matrix

NGTDM: Neighbouring gray tone difference matrix

LASSO: Least absolute shrinkage and selection operator

LR: Logistic regression

References

  1. Engels EA. Epidemiology of thymoma and associated malignancies. J Thorac Oncol. 2010;5:260–5.

    Article  Google Scholar 

  2. Travis WD, Brambilla E, Burke AP, et al. World Health Organization classification of tumours: pathology and genetics: tumours of the lung, pleura, thymus and heart. 4th ed. Lyon: World Health Organization; 2004.

    Google Scholar 

  3. Marx A, Ströbel P, Badve SS, et al. ITMIG consensus statement on the use of the WHO histological classification of thymoma and thymic carcinoma: refined definitions, histological criteria, and reporting. J Thorac Oncol. 2014;9:596–611.

    Article  CAS  Google Scholar 

  4. Strobel P, Bauer A, Puppe B, et al. Tumor recurrence and survival in patients treated for thymomas and thymic squamous cell carcinomas: a retrospective analysis. J Clin Oncol. 2004;22:1501–9.

    Article  Google Scholar 

  5. Moser B, Scharitzer M, Hacker S, et al. Thymomas and thymic carcinomas: prognostic factors and multimodal management. Thorac Cardiovasc Surg. 2014;62:153–60.

    PubMed  Google Scholar 

  6. Kondo K, Yoshizawa K, Tsuyuguchi M, et al. WHO histologic classifcation is a prognostic indicator in thymoma. Ann Thorac Surg. 2004;77:1183–8.

    Article  Google Scholar 

  7. Ettinger DS, Wood DE, Aisner DL, et al. National Comprehensive Cancer Network (NCCN) clinical practice guidelines in oncology: thymomas and thymic carcinomas, Version 1.2021. https://www.nccn.org/professionals/physician_gls/default.aspx#thymic. Accessed 4 Dec 2020.

  8. Nishino M, Ashiku SK, Kocher ON, et al. The thymus: a comprehensive review. Radiographics. 2017;37:1004.

    Article  Google Scholar 

  9. Marom EM. Advances in thymoma imaging. J Thorac Imaging. 2013;28:69–80.

    Article  Google Scholar 

  10. Van Ginneken B. Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning. Radiol Phys Technol. 2017;10:23–32.

    Article  Google Scholar 

  11. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–77.

    Article  Google Scholar 

  12. Fasmer KE, Hodneland E, Dybvik JA, et al. Whole-volume tumor MRI radiomics for prognostic modeling in endometrial cancer. J Magn Reson Imaging. 2021;53:928–37.

    Article  Google Scholar 

  13. Chetan MR, Gleeson FV. Radiomics in predicting treatment response in non-small-cell lung cancer: current status, challenges and future perspectives. Eur Radiol. 2021;31:1049–58.

    Article  Google Scholar 

  14. Conti A, Duggento A, Indovina I, et al. Radiomics in breast cancer classification and prediction. Semin Cancer Biol. 2021;72:238–50.

  15. Cerfolio RJ, Moore WH. Can CT radiomics differentiate benign from malignant N2 adenopathy in non-small cell lung cancer? Transl Lung Cancer Res. 2020;9:1710–1.

  16. Ren CY, Li ML, Zhang YY, et al. Development and validation of a CT-texture analysis nomogram for preoperatively differentiating thymic epithelial tumor histologic subtypes. Cancer Imaging. 2020;20:86.

  17. Chen XM, Feng B, Li CL, et al. A radiomics model to predict the invasiveness of thymic epithelial tumors based on contrast-enhanced computed tomography. Oncol Rep. 2020;43:1256–66.

  18. Liu J, Yin P, Wang SC, et al. CT-based radiomics signatures for predicting the risk categorization of thymic epithelial tumors. Front Oncol. 2021;11:628534.

  19. Hu YC, Wu L, Yan LF, et al. Predicting subtypes of thymic epithelial tumors using CT: new perspective based on a comprehensive analysis of 216 patients. Sci Rep. 2014;10:1–7.

  20. Kim HK, Choi YS, Kim J, et al. Type B thymoma: is prognosis predicted only by World Health Organization classification? J Thorac Cardiovasc Surg. 2010;139:1431–5.

  21. Sui H, Liu L, Li X, et al. CT-based radiomics features analysis for predicting the risk of anterior mediastinal lesions. J Thorac Dis. 2019;11:1809–18.

  22. Kayi Cangir A, Orhan K, Kahya Y, et al. CT imaging-based machine learning model: a potential modality for predicting low-risk and high-risk groups of thymoma: “Impact of surgical modality choice.” World J Surg Oncol. 2021;19:147.

  23. Marx A, Chan JKC, Coindre J-M, et al. The 2015 WHO classification of tumors of the thymus: continuity and changes. J Thorac Oncol. 2015;10:1383–95.

  24. Chaddad A, Daniel P, Niazi T. Radiomics evaluation of histological heterogeneity using multiscale textures derived from 3D wavelet transformation of multispectral images. Front Oncol. 2018;4:96.

  25. Han X, Gao W, Chen Y, et al. Relationship between computed tomography imaging features and clinical characteristics, Masaoka-Koga stages, and World Health Organization histological classifications of thymoma. Front Oncol. 2019;11:1041.

  26. Wang X, Sun W, Liang H, et al. Radiomics signatures of computed tomography imaging for predicting risk categorization and clinical stage of thymomas. Biomed Res Int. 2019;28:3616852.

  27. Chang S, Hur J, Im DJ, et al. Volume-based quantification using dual-energy computed tomography in the differentiation of thymic epithelial tumours: an initial experience. Eur Radiol. 2017;27:1992–2001.

  28. Yu CH, Li T, Zhang RP, et al. Dual-energy CT perfusion imaging for differentiating WHO subtypes of thymic epithelial tumors. Sci Rep. 2020;10:5511.

  29. Conforti F, Pala L, Giaccone G, et al. Thymic epithelial tumors: from biology to treatment. Cancer Treat Rev. 2020;86:1014.

Acknowledgements

Not applicable.

Funding

This study has received funding from the Scientific Research Program of the Health Commission of Shanxi Province in China (Grant No. 2021070).

Author information

Contributions

Y.C.H. wrote the initial draft of the manuscript. Y.X.T. and Z.R.P. guaranteed the integrity of the entire study. Y.C.H., Y.X.T. and Z.R.P. contributed to the study concepts and design. Y.C.H. and C.J.J. contributed to revising the manuscript. Y.C.H., Z.Z.K. and X.L. contributed to segmenting tumors with software. Y.C.H., L.T. and C.J.J. contributed to acquiring, analyzing and interpreting the data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiaotang Yang.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the ethics committee of Shanxi Province Cancer Hospital (Approval No. 201995), and individual written informed consent for this retrospective analysis was waived by the ethics committee of Shanxi Province Cancer Hospital.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplementary Table 1. The confusion matrix of the training set for the traditional risk grouping. Supplementary Table 2. The confusion matrix of the testing set for the traditional risk grouping. Supplementary Table 3. The confusion matrix of the training set for the improved risk grouping. Supplementary Table 4. The confusion matrix of the testing set for the improved risk grouping.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Yu, C., Li, T., Yang, X. et al. Contrast-enhanced CT-based radiomics model for differentiating risk subgroups of thymic epithelial tumors. BMC Med Imaging 22, 37 (2022). https://doi.org/10.1186/s12880-022-00768-8

  • DOI: https://doi.org/10.1186/s12880-022-00768-8

Keywords

  • Radiomics
  • Computed tomography
  • Thymoma
  • Thymic carcinoma