Improving the diagnostic performance of CT image biomarkers in combination with clinical parameters for the common parotid tumors

Objective To develop and validate diagnostic models of the common parotid tumors based on whole-volume CT textural image biomarkers (IBMs) in combination with clinical parameters at a single institution. Methods The study cohort was composed of 51 pleomorphic adenoma (PA) patients and 42 Warthin tumor (WT) patients. Clinical parameters and conventional image features were scored retrospectively, and textural IBMs were extracted from arterial-phase CT images. The independent-samples t test or chi-square test was used to evaluate the significance of differences in clinical parameters, conventional CT image features, and textural IBMs. The diagnostic performance of the univariate and multivariate models was evaluated via the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). Results Significant differences were found in clinical parameters (age, gender, disease duration, smoking), conventional image features (site, maximum diameter, time-density curve, peripheral vessels sign) and textural IBMs (mean, uniformity, energy, entropy) between the PA group and the WT group (P<0.05). ROC analysis showed that the clinical parameter (age) and the quantitative textural IBMs (mean, energy, entropy) were able to categorize the patients into the PA group and the WT group, with AUCs of 0.784, 0.902, 0.910, and 0.805, respectively. When IBMs were added to the clinical model, the multivariate models including age-mean, age-entropy, and age-energy performed significantly better than the univariate models, with improved AUCs of 0.940, 0.944, and 0.861, respectively (P<0.001).

Abbreviations: AUC=area under the ROC curve, PSP=partial superficial parotidectomy, ECD=extracapsular dissection, DICOM=digital imaging and communications in medicine, ROI=region of interest, VOI=volume of interest, TDC=time-density curve, FS=Frey’s syndrome, FNA=fine needle aspiration, MRI=magnetic resonance imaging, PET=positron emission tomography


Background
Parotid tumors are the most common type of salivary gland tumor, and 75-80% of them are benign [1].
Pleomorphic adenoma (PA) and Warthin tumor (WT) are the two most common types of parotid tumor, and the incidence of WT has gradually increased in recent years [1,2]. Both are benign, but their biological behavior and surgical management differ considerably. PA may undergo malignant transformation if excision is delayed, and postoperative recurrence is another notable characteristic. In contrast, WT grows slowly, rarely recurs, and seldom undergoes malignant transformation [3,4]. At present, partial superficial parotidectomy (PSP) is the most common surgical procedure for PA and extracapsular dissection (ECD) for WT, which is in line with the current trend of minimising surgical dissection.
Therefore, the risk of short-term and long-term complications might be decreased [5]. ECD is a safe and time-efficient surgical approach, offering earlier recovery and better preservation of salivary function compared with PSP. ECD should therefore be considered as a surgical approach for parotid tumors, especially those in the parotid tail, such as WT [6]. However, ECD is not well suited to PA, because it increases the postoperative recurrence rate compared with PSP.
Preoperative knowledge of the pathological type of the tumor would be of great importance for optimizing the individualized operative plan and would help to inform preoperative patient counselling. In routine clinical practice, common factors such as age, performance status, tumor size, and tumor site are used to guide treatment decision-making. Nevertheless, patients with similar factors may have different outcomes. Neither these clinical factors nor conventional image features are sufficient for identifying patients who will benefit most from specific surgical strategies. Thus, more detailed information that reflects the characteristics of the whole tumor is needed to improve diagnostic accuracy.
Recent studies have demonstrated the potential value of IBMs, which are significantly associated with pathology and prognosis [7,8]. IBMs can be extracted from medical images and provide quantitative information regarding intensity, shape and textural characteristics of the region of interest (ROI) [9].
There is a great deal of quantitative information in medical images, far beyond what is currently used for routine interpretation. Consequently, there is increasing interest in evaluating tumors on medical images using advanced software in order to derive additional, clinically relevant information, namely texture analysis. Texture analysis has been used to address various clinical questions, including tumor heterogeneity, patient prognosis, and response to therapy [10][11][12]. In this study, texture features were extracted from CT images and incorporated into the diagnostic model for histologic classification of the two most common benign parotid tumors, WT and PA.
Although many IBMs are significantly associated with outcome, it remains unclear to what extent the addition of IBMs improves the diagnostic power of models consisting of clinical parameters and conventional image features. Moreover, there is a relative paucity of literature on such combined models. The aim of this study was to test whether, for WT and PA, the diagnostic performance of prediction models could be improved by the addition of IBMs compared with models based solely on clinical parameters or conventional image features.

Patient selection
This retrospective study was approved by the local ethics review board. Informed consent was waived for this single-institution study because of its retrospective nature. All patient data were used confidentially and anonymously. The research involved no more than minimal risk to the patients.
The waiver did not adversely affect the rights and welfare of the patients. Clinical and image data of all patients were obtained through the medical record system and follow-up. Patients with PA or WT who underwent surgery between January 2016 and May 2017 were eligible and were identified from the institution's database. The clinical information is shown in Table 1.
The inclusion criteria were as follows: (1) PA or WT confirmed by postoperative pathological diagnosis; (2) complete contrast-enhanced CT images of the neck containing the parotid gland, obtained within two weeks prior to surgery; and (3) maximum lesion diameter ≥ 1.0 cm. The exclusion criteria were as follows: (1) CT images with obvious artifacts, such as false teeth artifacts or motion artifacts; and (2) lesions with scarcely any solid component, which are difficult to assess by texture analysis.
As a result, a total of 122 patients were identified, and 29 patients were excluded (Fig. 1). The final study population comprised 93 patients.

Image acquisition
CT scans were performed using 64/128-multidetector scanners (LightSpeed VCT; GE Healthcare, Waukesha, WI, USA) with the following parameters: 120 kVp; 150 mA; slice thickness and interval for axial images, 5 mm/5 mm. The scanning range extended from the skull to the ribcage level. Contrast-enhanced CT scanning was performed with a total of 80-100 mL (1.35 mL per kg of body weight) of nonionic iodinated contrast material (320 mg/mL; Iopamidol, Shanghai Bracco Sine Pharmaceutical Co., Ltd., Shanghai, China) at an injection rate of 3.0 mL/s, followed by 50 mL of saline solution via a power injector. The contrast-enhanced CT images were obtained during the arterial and balanced phases at 35 and 120 seconds after contrast material injection, respectively.

Clinical parameters
All clinical parameters, including age, gender, disease duration, and smoking status, were collected from the medical record system.

Conventional CT image features
All conventional CT image features were derived from the original CT image data, including tumor site (in the parotid tail or not), maximum diameter, time-density curve (washout type or not), and peripheral vessels sign, which was defined as increased tortuous vascular shadows clinging to the edge of the lesion.

CT textural image biomarkers
Complete arterial-phase CT images of all patients were stored in Digital Imaging and Communications in Medicine (DICOM) format and uploaded to ITK-SNAP software for three-dimensional manual segmentation of the region of interest (ROI). The ROI was manually drawn to cover as much of the tumor as possible, keeping a distance of 1 mm from the boundary and carefully avoiding inclusion of the retromandibular vein and excessive parotid parenchyma, which could misrepresent the internal structure of the tumor and affect the accuracy of the texture analysis results. The ROI of each case was manually drawn by a head and neck radiologist who had no knowledge of the patients' clinical information, and the segmentation was then checked by a senior radiologist. Areas of tumor heterogeneity, including cystic change or necrosis, were not excluded, because the information captured by texture analysis could potentially contribute to tumor discrimination and classification. In-house software running in Matlab 2017b (Mathworks, Natick, MA, USA) was used to extract the texture parameters automatically. Six frequently used texture parameters obtained from the gray-level histogram were included in the study, namely uniformity (a measure of the sum of the squares of each intensity value), mean (the average gray-level intensity within the ROI), energy (a measure of the magnitude of voxel values in an image), entropy (the distribution of gray levels within the VOI), skewness (the degree of histogram asymmetry around the mean), and kurtosis (a measure of the histogram sharpness). An overview of the textural IBM extraction process and analysis is shown in Fig. 2.
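As an illustration only, the six histogram features named above can be sketched in a few lines of Python. This is not the in-house Matlab software used in the study. The function name `first_order_features`, the 25 HU bin width, and the use of population moments and base-2 logarithms are assumptions made for this sketch, following common first-order radiomics definitions.

```python
import math
from collections import Counter

def first_order_features(voxels, bin_width=25):
    """Histogram (first-order) features of the voxel intensities inside a VOI.

    `voxels` is a flat list of intensities (e.g. HU values).
    `bin_width` is an assumed discretisation step; the paper does not state one.
    """
    n = len(voxels)
    mean = sum(voxels) / n
    # Energy: magnitude of the voxel values (sum of squared intensities).
    energy = sum(v * v for v in voxels)
    # Central moments (population form) for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in voxels) / n
    m3 = sum((v - mean) ** 3 for v in voxels) / n
    m4 = sum((v - mean) ** 4 for v in voxels) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    # Discretise intensities into bins and estimate the gray-level probabilities.
    counts = Counter(math.floor(v / bin_width) for v in voxels)
    probs = [c / n for c in counts.values()]
    # Uniformity: sum of squared bin probabilities (1.0 for a single gray level).
    uniformity = sum(p * p for p in probs)
    # Entropy: spread of the gray-level distribution, in bits.
    entropy = -sum(p * math.log2(p) for p in probs)
    return {"mean": mean, "energy": energy, "uniformity": uniformity,
            "entropy": entropy, "skewness": skewness, "kurtosis": kurtosis}

# Hypothetical four-voxel VOI, used only to show the call.
features = first_order_features([0, 0, 50, 50])
```

In practice the voxel list would come from the segmented whole-tumor mask; a heterogeneous lesion spreads its voxels over more bins, which raises entropy and lowers uniformity.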

Data analysis
Data management and statistical analysis were conducted with the IBM SPSS Statistics package (version 22, SPSS Inc., Chicago, IL, USA). The Kolmogorov-Smirnov test was used to test intra-group normality, and the Levene test was used to test intra-group homogeneity of variance. Parameters with a normal distribution and homogeneous variances were expressed as mean ± standard deviation and compared with the independent-samples t test. Parameters that did not satisfy normality or homogeneity of variance were expressed as median and interquartile range and compared with the two-independent-samples Mann-Whitney U test. Qualitative data were presented as ratios and analyzed with the chi-square test. For the clinical parameter (gender)-textural IBM models, a two-component diagnostic model was fitted with binary logistic regression analysis. The diagnostic performance of each index was tested via receiver operating characteristic (ROC) analysis.
Cutoff values were established by maximizing the Youden index (Youden index = sensitivity + specificity - 1). A two-tailed P value of less than 0.05 was considered statistically significant.
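The ROC analysis and the Youden cutoff described above can be sketched as follows. This is a minimal pure-Python illustration, not the SPSS procedure used in the study. The function names are hypothetical, the ages and labels are invented toy values, and WT is arbitrarily coded as the positive class.

```python
def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for every observed score.

    A case is called positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Threshold that maximises sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda r: r[1] + r[2] - 1)

# Invented toy data: age in years, label 1 = WT, 0 = PA.
ages = [30, 40, 45, 50, 55, 60, 65, 70]
is_wt = [0, 0, 0, 1, 1, 0, 1, 1]

area = auc(ages, is_wt)
cutoff, sensitivity, specificity = youden_cutoff(ages, is_wt)
```

The pairwise form of the AUC is equivalent to the Mann-Whitney U statistic divided by the product of the two group sizes, which is why it needs no explicit curve integration.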
Step 1 Clinical model Potential clinical parameters considered for their diagnostic ability in the PA and WT datasets included age (> median vs. ≤ median), gender (female vs. male), disease duration (> median vs. ≤ median), and smoking status (yes vs. no).
Step 2 Conventional image model Potential conventional CT image features considered for their diagnostic ability in the PA and WT datasets included tumor site (in the parotid tail, defined as the inferior 2 cm of the superficial lobe of the gland, vs. not), maximum diameter (> median vs. ≤ median), time-density curve (TDC) (washout type vs. not), and peripheral vessels sign, defined as increased tortuous vascular shadows clinging to the edge of the lesion in the arterial phase (yes vs. no).
Step 3 Textural IBM model Six texture features, namely mean, uniformity, energy, entropy, skewness, and kurtosis, were calculated. The potential textural IBMs were analyzed for their diagnostic power, and the median values (> median vs. ≤ median) in the WT and PA datasets were used as the threshold values in the univariate analysis.
Step 4 Combined models According to the ROC analysis and AUC, the potential clinical parameters, conventional image features, and textural IBMs were included in the multivariate analysis to create combined models, and the optimal one was selected.
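To make the combined-model step concrete, the sketch below fits a two-predictor binary logistic regression (a clinical parameter plus one textural IBM) and compares its AUC with that of each predictor alone. It is an illustration under stated assumptions: the data are an invented, deliberately symmetric toy set, the model is fitted by plain gradient descent rather than by the SPSS routine used in the study, and all function and variable names are hypothetical.

```python
import math

def standardise(column):
    """Rescale a feature to zero mean and unit variance so gradient descent behaves well."""
    m = sum(column) / len(column)
    sd = math.sqrt(sum((v - m) ** 2 for v in column) / len(column))
    return [(v - m) / sd for v in column]

def predict_proba(w, x):
    """Predicted probability of WT; w = [intercept, coef_1, ..., coef_k]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clip so exp() stays well inside float range
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Binary logistic regression fitted by batch gradient descent (assumed optimiser)."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def auc(scores, labels):
    """Probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: age (years), mean attenuation (HU), label 1 = WT, 0 = PA.
age = [45, 45, 45, 45, 65, 65, 65, 65]
mean_hu = [60, 60, 90, 90, 60, 60, 90, 90]
is_wt = [0, 0, 0, 1, 0, 1, 1, 1]

X = list(zip(standardise(age), standardise(mean_hu)))
weights = fit_logistic(X, is_wt)
combined = [predict_proba(weights, x) for x in X]

auc_age = auc(age, is_wt)
auc_mean = auc(mean_hu, is_wt)
auc_combined = auc(combined, is_wt)
```

On this toy set each predictor alone gives an AUC of 0.75, while the fitted two-predictor score gives 0.875. The numbers only show the mechanics of an age-mean type model and say nothing about the study cohort.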

Results
Step 1 Clinical model Univariate analysis showed that patients in the WT group were significantly older than those in the PA group (P<0.001). The disease duration in the PA group was significantly longer than that in the WT group (P=0.01). Significant differences were also found in gender and smoking status between the two groups (P<0.001). Thus, all the selected clinical parameters differed significantly between the two groups (Table 1).
Step 2 Conventional image model
Step 3 Textural IBM model The mean, energy, and entropy of the WT group were significantly higher than those of the PA group (all P<0.001), while the uniformity of the WT group was significantly lower than that of the PA group (P<0.001). No statistically significant differences in skewness or kurtosis were found between the two groups (P=0.05 and P=0.151, respectively) (Table 3).
Step 4 Diagnostic performance of univariate and multivariate models Univariate analysis showed significant differences in age, gender, disease duration, smoking status, site (in the parotid tail), peripheral vessels sign, TDC (washout type), maximum diameter, mean, energy, entropy, and uniformity between the WT and PA groups (all P<0.05). ROC analysis showed that the clinical parameter (age) and the textural parameters (mean, energy, entropy) performed well in differentiating the WT group from the PA group, yielding AUCs of 0.784, 0.902, 0.910, and 0.805, respectively (Figs. 3 and 4, Table 4). The multivariate models consisted of the clinical parameter and one textural parameter, namely age-mean, age-energy, and age-entropy, with AUCs of 0.940, 0.944, and 0.841, respectively. The optimal multivariate models were age-mean and age-energy (Table 5).

Discussion
This study presented a detailed analysis of different diagnostic models for PA and WT patients. We not only calculated the AUC, sensitivity, and specificity of the univariate and multivariate models, but also analyzed the differences between the AUC values. Our results indicate that textural IBMs might be helpful in diagnosing patients with parotid PA and WT. In particular, the multivariate models including a clinical parameter and textural IBMs showed better diagnostic performance in this study.
Nowadays, under the advocacy of precision medicine, surgeons make great efforts to reduce the incidence of post-parotidectomy complications such as Frey's syndrome (FS), facial palsy, and depressed facial deformity. As a result, the choice of operations for parotid tumors has become increasingly limited. PSP and ECD are the most commonly used surgical modalities at present [5,13]. ECD, which is often applied to excise tumors confined to the parotid tail, is regarded as a minimally invasive operation, leading to fewer complications, higher efficiency, and better preservation of salivary function [6,14,15]. The parotid tail, defined as the inferior 2 cm of the superficial lobe, lies anterolateral to the sternocleidomastoid muscle [16]. The surgical techniques used to treat neoplasms in the parotid tail may be quite different from those used in other parotid areas, especially for WT [6]. In our study, most WTs were located in the parotid tail, while few PAs were found in the tail (P<0.001). Consequently, ECD is not well suited to PA. Furthermore, a historical review has demonstrated a significantly higher rate of recurrent PA with ECD compared with PSP [17].
Parotid PA and WT typically present as painless, slow-growing lumps with no specific manifestations on laboratory or traditional imaging examinations. Previous studies showed that smoking, older age, and male sex suggest WT. WT shows a predominance of male smokers, with the ratio of male to female patients ranging from 1.7:1 to 11.5:1, and it tends to develop in older patients (mean age, 56.7-60 years) [18,19], which is consistent with the findings of the present study.
In contrast, PA was more likely than WT to occur in middle-aged women in this study (P < 0.001), which is also in line with the literature.
This study found that the disease duration of PA was longer than that of WT. We speculate that this is because WT was larger (P<0.001) and more superficial in location (P=0.003) than PA, which favors early detection.
However, many overlapping features of WT and PA make preoperative diagnosis difficult [20]. Fine needle aspiration (FNA), which is considered the gold standard for the diagnosis of parotid tumors, also has inevitable limitations and may be associated with poor diagnostic accuracy and low sensitivity [21]. The false positive rates of FNA in PA and WT were reported to be 9% and 8%, respectively, and the variegated cytomorphology of these tumors may lead to errors in interpretation [22]. Furthermore, intratumor heterogeneity can be underestimated from a single or limited tumor-biopsy sample [23]. Thus, preoperative imaging may be a noninvasive and better approach to distinguishing WT from PA. At present, dual- or three-phase contrast-enhanced CT is increasingly suggested for examination before parotidectomy and is currently the primary method for the assessment of parotid tumors [20,24]. It is known that the washout time of contrast agents in tumors can provide valuable pathophysiological information and be helpful in the differential diagnosis of parotid tumors [24,25].
For PA and WT, rapid washout of contrast material is characteristic of WT, while delayed enhancement is characteristic of PA, which was consistent with the findings of the present study. However, this is not always the case: some WTs also show delayed enhancement and some PAs show rapid washout [20], which was confirmed in this study. Furthermore, delayed imaging would increase the radiation dose or reduce the temporal resolution.
Additionally, the peripheral vessels sign was detected more frequently in WTs than in PAs in the arterial phase (P<0.001), which has not been previously reported. We speculate that WT is a hypervascular lesion with abundant dilated capillaries according to its histopathological features, which may contribute to its potential to stimulate peripheral angiogenesis.
Among all textural IBMs, energy had the best diagnostic efficiency in this cohort of parotid WT and PA patients, according to the univariate diagnostic models of this study. This was verified by whole-volume textural analysis of arterial-phase contrast-enhanced CT. That is to say, the textural IBMs provided a stronger association with the diagnosis of WT and PA than the clinical model and the conventional image model. In this study, the textural IBMs were quantified by extracting features from the complete tumor volume, which differs from other texture analyses of parotid disease, in which the ROI was manually drawn around the tumor on its largest cross-sectional area rather than over the whole tumor volume [26]. Consequently, the textural IBMs reflected the features of the whole tumor. In addition to the features seen on plain CT, contrast-enhanced CT can also reflect heterogeneity of the tumor blood supply. Because arterial supply is the main source of blood for parotid tumors, arterial-phase CT images were selected to analyze the texture features of PA and WT in this study.
Imaging features can be derived from CT, magnetic resonance imaging (MRI), and positron emission tomography (PET) without modification of the acquisition protocols or additional costs for patients [27,28]. Currently, texture analysis is mainly used to evaluate the treatment effect and prognosis of lung cancer, colorectal cancer, liver cancer, and other malignancies [29][30][31]. It is rarely applied to the parotid gland, except in several reports focusing on the alterations of parotid morphology and secretory function induced by radiotherapy for head and neck cancers [32,33]. To the best of our knowledge, there is a paucity of studies to date on the potential diagnostic value of CT textural IBMs in parotid tumors, or on multivariate models.
Given the different surgical management of PA and WT patients, it is hoped that combining clinical clues with CT IBMs could strengthen treatment decision-making and aid in anticipating surgical scenarios. In this study, the diagnostic performance was clearly improved by the combination of a clinical parameter and textural IBMs compared with the univariate models, especially the conventional image model and the clinical model. Our findings show that, with minimal cost and no additional imaging burden, texture analysis of routine preoperative contrast-enhanced CT may provide useful information for PA and WT patients, who undergo different surgical resections. As our understanding of CT textural IBMs continues to develop, it is hoped that this may provide more insight into the likely benefit of new operative regimes in patients with parotid tumors.
The present study had several limitations. First, the sample size was relatively small, although it was larger than those of previous CT texture analysis studies on PA and WT [26]. Despite the relatively small sample size, we found a strong association of the textural IBMs, as well as of the multivariate models, with the diagnosis of PA and WT. The findings of this study encourage investigation of the association of a wider range of radiomic features with parotid tumors of other pathological types in a larger sample.
Second, all patients enrolled in this study were from one institution, so a large-scale randomized controlled trial needs to be performed to validate our results. Finally, the underlying relationship between textural IBMs and the histopathology of parotid PA and WT is poorly understood and requires further work. Further work is also needed to address the repeatability of these quantitative IBMs as part of a biomarker validation process.

Conclusions
In conclusion, the addition of CT textural image biomarkers significantly improved the diagnostic performance of the clinical model for both the WT and PA datasets, and may facilitate individualized operative planning for patients. Consequently, the trauma of parotidectomy may be minimized and postoperative complications may be reduced. CT textural IBMs are worth exploring to determine whether they can improve the clinical programme currently applied.

Figure 1 Schematic shows recruitment pathway of patients for this study.

ROC curves for distinguishing WT from PA based on CT textural IBMs.

ROC curves for distinguishing WT from PA based on multivariate models, composed of clinical parameter and CT textural IBMs.