- Research article
- Open Access
A model based on CT radiomic features for predicting RT-PCR becoming negative in coronavirus disease 2019 (COVID-19) patients
BMC Medical Imaging volume 20, Article number: 118 (2020)
Coronavirus disease 2019 (COVID-19) has emerged as a global pandemic. According to the diagnosis and treatment guidelines of China, negative reverse transcription-polymerase chain reaction (RT-PCR) is the key criterion for discharging COVID-19 patients. However, repeated RT-PCR tests lead to medical waste and prolonged hospital stays for COVID-19 patients during the recovery period. Our purpose is to assess a model based on chest computed tomography (CT) radiomic features and clinical characteristics to predict RT-PCR negativity during clinical treatment.
From February 10 to March 10, 2020, 203 mild COVID-19 patients in Fangcang Shelter Hospital were retrospectively included (training: n = 141; testing: n = 62), and clinical characteristics were collected. Lung abnormalities on chest CT images were segmented with a deep learning algorithm. CT quantitative features and radiomic features were automatically extracted. Clinical characteristics and CT quantitative features were compared between RT-PCR-negative and RT-PCR-positive groups. Univariate logistic regression and Spearman correlation analyses identified the strongest features associated with RT-PCR negativity, and a multivariate logistic regression model was established. The diagnostic performance was evaluated for both cohorts.
The RT-PCR-negative group had a longer time interval from symptom onset to CT exams than the RT-PCR-positive group (median 23 vs. 16 days, p < 0.001). There was no significant difference in the other clinical characteristics or CT quantitative features. In addition to the time interval from symptom onset to CT exams, nine CT radiomic features were selected for the model. ROC curve analysis revealed AUCs of 0.811 and 0.812 for differentiating the RT-PCR-negative group, with sensitivity/specificity of 0.765/0.625 and 0.784/0.600 in the training and testing datasets, respectively.
The model combining CT radiomic features and clinical data helped predict RT-PCR negativity during clinical treatment, indicating the proper time for RT-PCR retesting.
Coronavirus disease 2019 (COVID-19) is a major threat to the health of people worldwide. According to the diagnosis and treatment guidelines proposed by the National Health Commission of the People’s Republic of China (7th Edition), negative reverse transcription-polymerase chain reaction (RT-PCR) is the key criterion for discharging COVID-19 patients. Clinically predicting when RT-PCR becomes negative is critical for choosing the proper retesting time, which prevents medical waste from repeated RT-PCR tests and unnecessarily prolonged hospital stays. Doctors need an objective and accurate method for predicting RT-PCR negativity during clinical treatment.
Chest computed tomography (CT) can directly demonstrate lung lesions, and the CT manifestations of COVID-19 pneumonia have been reported in many studies [2,3,4]. Chest CT exams are useful for supplementing RT-PCR tests in diagnosis [5,6,7] and for evaluating disease stage [2, 3, 8, 9] and severity [10,11,12]. Recently, deep learning techniques have been widely used in the detection and segmentation of COVID-19 lesions in chest CT images [13,14,15,16]. Given a reliable segmentation method, the high-throughput and high-dimensional radiomic features on chest CT show strong potential for predicting the true status of RT-PCR.
We hypothesized that a model incorporating CT radiomic features and clinical characteristics can predict RT-PCR becoming negative. We collected the clinical data and chest CT features of mild COVID-19 patients in Fangcang Shelter Hospital in Wuhan, Hubei, aiming to establish a predictive model for RT-PCR becoming negative during the recovery period.
Patients and methods
The study was approved by the institutional review board of the First Affiliated Hospital of China Medical University. Informed consent was waived due to the nature of the retrospective study.
Between February 10, 2020, and March 10, 2020, the clinical data and CT images of COVID-19 patients at Fangcang Shelter Hospital in Hongshan Gymnasium, Wuhan, Hubei, were reviewed retrospectively. All cases were mild from the onset and during the course of hospitalization, defined as the absence of hypoxemia or respiratory distress (that is, none of the following: respiratory rate ≥ 30 breaths/min, requirement for oxygen treatment or mechanical ventilation, or SpO₂ ≤ 93% on room air). Patients were included if they met the following criteria: (1) no abnormal clinical symptoms (fever and severe respiratory symptoms) for more than 3 days; (2) RT-PCR tests performed at least 3 times after abnormal clinical symptoms disappeared; (3) the first RT-PCR test performed between 3 and 5 days after abnormal clinical symptoms disappeared; and (4) a chest CT exam performed within 2 days after the first RT-PCR test. Patients with inconsistent results in the first two consecutive RT-PCR tests were excluded (Fig. 1a, b). A novel coronavirus 2019-nCoV nucleic acid detection kit (fluorescence PCR method) (Sansure Biological Technology Co., Ltd., Changsha, China, Serial Number: 20150036) was used for RT-PCR tests.
The enrolled patients were divided into two groups: RT-PCR-negative and RT-PCR-positive groups (Fig. 1a, b). Inclusion criteria for the RT-PCR-negative group were: (1) all RT-PCR tests were negative; (2) no worsening of clinical symptoms during hospitalization and the 2-week isolation after discharge. The inclusion criterion for the RT-PCR-positive group was that the first two RT-PCR tests were positive.
We collected 20 available clinical characteristics, including general characteristics (age, gender, time interval from symptom onset to CT exams), comorbidities, vital signs on the CT scan day and laboratory tests on admission. Comorbidities included diabetes, hypertension, cardiovascular disease, chronic obstructive pulmonary disease, chronic liver disease and cancer. Vital signs on the CT scan day included heart rate, systolic blood pressure, diastolic blood pressure, respiratory rate, and blood oxygen saturation. Laboratory tests included white blood cell count, neutrophil count, lymphocyte count, platelet count, hemoglobin and neutrophil/lymphocyte ratio (NLR) (NLR = neutrophil count/lymphocyte count).
The first RT-PCR tests for all enrolled patients were performed between 3 and 5 days after abnormal clinical symptoms disappeared. Then, all patients underwent CT exams within 2 days after the first RT-PCR test. Chest CT scanning used a mobile cabin CT (CT-NeuVz Prime, Neusoft) with a single breath-hold in the supine position. The scan parameters were as follows: tube voltage of 120 kVp, tube current of 100–200 mA, detector collimation of 64 or 128 × 0.625 mm, field of view of 350 mm × 350 mm, and matrix size of 512 × 512. Imaging data were reconstructed using a medium sharp reconstruction algorithm with a slice thickness of 5 mm and an interval of 1 mm.
Image segmentation and feature extraction
CT image analysis was performed on a dedicated workstation—Lung intelligence Kit (LK) Version V2.1.1. R (GE Healthcare, China). The main processes included data import and preprocessing, lung lobe segmentation, lesion segmentation and feature extraction (Fig. 2). Lung lobes were segmented with the purpose of improving the accuracy of lesion segmentation and calculating the proportion of lesions in each lung lobe.
Lobe and lesion segmentation
Before lung lobe and lesion segmentation, the images were resampled to a voxel size of 1 × 1 × 1 mm³, and a Gaussian filter was applied for denoising. Then, a fully automatic segmentation of three-dimensional lung lobes and lesions based on deep learning algorithms was performed. In cases of unsatisfactory lung lobe and lesion segmentation, two thoracic radiologists (with 5 and 15 years of experience, respectively) blinded to the clinical information and RT-PCR results manually adjusted the contour and resolved discrepancies by consensus.
Quantitative feature extraction
After segmentation, 86 CT quantitative parameters were automatically calculated: the statistical results of lung lobe and lesion (volume, volume percentage, pneumonia score, average density, standard deviation of density) and the component analysis of the lesion (partial solidity, solidity and total lesions) (Additional file 1: Supplementary Data 1).
Radiomic feature extraction
After segmentation, 120 radiomic features of 7 categories were automatically calculated: (1) first-order features (n = 19); (2) 2D and 3D shape features (n = 26); (3) gray level cooccurrence matrix features (n = 24); (4) gray level run length matrix features (n = 16); (5) gray level size zone matrix features (n = 16); (6) neighboring gray tone difference matrix features (n = 5); and (7) gray level dependence matrix features (n = 14). Detailed names and definitions of all 120 features can be found in Additional file 1: Supplementary Data 2.
Missing values were replaced by the median, and the data were standardized by the following formula: standardized value = (original value − average value) / standard deviation.
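The preprocessing step described here can be sketched for a single feature as follows. This is a minimal illustration, not the authors' code: the function name, the sample values and the use of the sample standard deviation (n − 1) are assumptions, and the text does not state whether the median, mean and standard deviation were estimated on the training cohort only or on all patients.

```python
from statistics import mean, median, stdev


def impute_and_standardize(values):
    """Fill missing entries (None) with the median, then z-score the feature."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    mu = mean(filled)
    sd = stdev(filled)  # assumption: sample standard deviation (n - 1)
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in filled]
    return [(v - mu) / sd for v in filled]


# Hypothetical values of one feature for five patients, one of them missing.
feature = [12.0, None, 15.0, 9.0, 18.0]
z = impute_and_standardize(feature)
```

After this transformation each feature has mean 0 and standard deviation 1, which puts the regression coefficients of different features on a comparable scale.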
The patients were randomly assigned at a 7:3 ratio to either the training cohort or the testing cohort. All patients in the training cohort were used to build the predictive model, while patients in the testing cohort were used to independently evaluate the model’s performance. To obtain the strongest features that were significantly associated with negative RT-PCR results in the training cohort, we performed univariate logistic regression analysis, and features with a p value < 0.10 were used for subsequent analysis. Then, Spearman correlation analysis was used to remove features that were highly correlated with others, using a threshold of |r| > 0.9.
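The redundancy filter based on Spearman correlation can be sketched as below. The text does not say which member of a highly correlated pair was kept, so the greedy rule used here (keep the feature encountered first) is an assumption, as are the feature names and values.

```python
def ranks(xs):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def drop_correlated(features, threshold=0.9):
    """Keep a feature unless |r| > threshold with a feature already kept."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical standardized features for five patients.
features = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_b": [2.1, 3.9, 6.2, 8.1, 9.8],  # increases with feat_a, so r = 1
    "feat_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(features)
```

Here `feat_b` is dropped because it is perfectly rank-correlated with `feat_a`, while `feat_c` is retained.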
Model establishment and evaluation
We constructed a multivariate logistic regression model to identify a strategy to best classify RT-PCR-negative patients in the training dataset. Radiomics scores (Rad-scores) were calculated for each patient as a linear combination of the selected features weighted by their respective coefficients. Predictive performance was evaluated in terms of discrimination (receiver operating characteristic (ROC) curve), calibration (calibration curve) and clinical usefulness (decision curve).
Categorical variables are presented as the number and percentage of the total. The normality of continuous variables was evaluated by using the Shapiro–Wilk test. Normally distributed variables are shown as the mean ± standard deviation, and non-normally distributed variables as the median (25th percentile, 75th percentile). Differences between subgroups were assessed by the t test or the Mann–Whitney U test, as appropriate, for continuous variables and by the chi-squared test for categorical variables. All statistical analyses for the present study were performed with R 3.5.1 and Python 3.5.6. A two-tailed p value < 0.05 indicated statistical significance.
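To make the Rad-score concrete, the sketch below computes a linear combination of standardized features and maps it to a probability with the logistic function. The coefficients, the intercept and the feature names are hypothetical placeholders (the fitted values are in Table 2, which is not reproduced here), and whether the intercept is counted as part of the Rad-score is an assumption.

```python
import math


def rad_score(features, coefs, intercept):
    """Linear predictor: intercept plus the coefficient-weighted features."""
    return intercept + sum(coefs[name] * features[name] for name in coefs)


def probability(score):
    """Logistic function: maps the linear predictor to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients, not the fitted values from Table 2.
coefs = {"time_onset_to_ct": 1.0, "radiomic_feature_x": -0.5}
intercept = 0.2

# Hypothetical standardized feature values for one patient.
patient = {"time_onset_to_ct": 0.8, "radiomic_feature_x": 0.4}

score = rad_score(patient, coefs, intercept)  # 0.2 + 0.8 - 0.2
p_negative = probability(score)
```

A higher score corresponds to a higher predicted probability of belonging to the RT-PCR-negative group; a score of 0 corresponds to a probability of 0.5.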
Analysis of clinical and CT quantitative features
The flow diagram summarizing the selection of the enrolled patients is shown in Fig. 1b. For 203 patients included in our study, the average number of RT-PCR tests was 6 ± 3, ranging from 3 to 12 during hospitalization. 122/203 (60.1%) were categorized in the RT-PCR-negative group, and 81 (39.9%) were categorized in the RT-PCR-positive group. Figure 3 shows CT images for cases in the RT-PCR-negative and RT-PCR-positive groups. Clinical information of the training and the testing cohort is shown in Table 1. The RT-PCR-negative group had a longer time interval from symptom onset to CT exams than the RT-PCR-positive group (median 23 vs. 16 days for the total patients, p < 0.001). There was no significant difference in the other clinical characteristics. The CT quantitative features are summarized in Additional file 1: Supplementary Data 1, and none of them differed between the two groups.
A total of 226 characteristics from each patient were collected: 20 clinical characteristics, 86 quantitative features and 120 radiomic features. After the univariate logistic regression analysis was performed, 20/226 parameters were retained. Then, 10 features that were highly correlated (|r| > 0.9) with other features were removed due to their redundancy based on the Spearman correlation analysis. Ultimately, 10/20 parameters (Table 2) were retained to build the model.
Model establishment and evaluation
The statistical summary of the multivariate logistic regression model is shown in Table 2. The time interval from symptom onset to CT exams and original_firstorder_Minimum had the highest odds ratio (OR) values (OR = 2.84 and 2.10, respectively) among all parameters. Figure 4 shows the Rad-score for each patient in the training and testing datasets. ROC curves of the model (Fig. 5) showed an area under the curve (AUC) of 0.811 with a sensitivity of 76.5%, specificity of 62.5% and accuracy of 70.9% in the training dataset and an AUC of 0.812 with a sensitivity of 78.4%, specificity of 60.0% and accuracy of 71.0% in the testing dataset. The calibration curve of Rad-scores for the differentiation of the RT-PCR-negative group demonstrated good agreement between prediction and observation in the training and testing cohorts (Fig. 6). The decision curve analysis showed that the model had a significantly improved performance within a certain threshold range in the training and testing datasets (Fig. 7).
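The reported sensitivity, specificity and accuracy follow from a 2 × 2 confusion matrix in which the RT-PCR-negative group is the target class. The sketch below shows the arithmetic. The counts are not given in the text; they are integers back-calculated to be consistent with the reported percentages, the cohort sizes (141 and 62) and the group sizes (122 and 81), and should be read as an illustrative assumption rather than the authors' tabulated data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: RT-PCR-negative patients predicted correctly / incorrectly.
    tn/fp: RT-PCR-positive patients predicted correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts, back-calculated from the reported percentages.
training = classification_metrics(tp=65, fn=20, tn=35, fp=21)  # 141 patients
testing = classification_metrics(tp=29, fn=8, tn=15, fp=10)    # 62 patients

for name, (sens, spec, acc) in (("training", training), ("testing", testing)):
    print(f"{name}: sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With these counts the training cohort would contain 85 RT-PCR-negative and 56 RT-PCR-positive patients and the testing cohort 37 and 25, which sums to the 122 and 81 patients reported for the two groups.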
We demonstrated the usefulness of CT radiomic features for predicting RT-PCR negativity and established a predictive model based on CT radiomic features combined with clinical data in COVID-19 patients during the recovery period. With AUCs of 0.811 and 0.812 for the training and testing datasets, respectively, we expect the model to help doctors effectively predict RT-PCR negativity during clinical treatment.
The unsatisfactory sensitivity of RT-PCR detection is a major concern [5, 6, 17, 18]. To avoid the possibility of false negative RT-PCR in our study, we included patients with repeated RT-PCR tests (average times: 6; range 3–12) during hospitalization. Only the patients with consistent results of the consecutive RT-PCR tests were included to ensure true negative or positive RT-PCR status for the corresponding CT. A 2-week isolation after discharge was further performed to avoid any possibility of false negative RT-PCR.
Accurate lesion segmentation is the key to feature extraction and model construction. Colombi et al.’s study divided the lung parenchyma into upper, middle and lower zones in severe COVID-19 patients and found that quantification of well-aerated lung parenchyma was a predictor of adverse outcome. In the present study, we used automatic pneumonia segmentation software based on a deep learning algorithm. It detected the respiratory tract and lung lesions based on the actual segmentation of the lung lobes, so more comprehensive and complicated quantitative parameters and radiomic features were evaluated for model construction. Recently, deep learning algorithms have been widely used in the detection of COVID-19 lesions in chest CT images [13,14,15,16]. Most studies [13,14,15] applied them to chest CT images in the early stage of the disease course for diagnosis and differential diagnosis, while there are few studies regarding chest CT images of COVID-19 patients during the recovery period. We analyzed chest CT images after the abnormal clinical symptoms disappeared, and proposed a combination model of radiomic features and clinical data to predict RT-PCR negativity.
Radiomic features played important roles in the model. Among the 10 parameters in the model, 9 of them were CT radiomic features. The top five radiomic features are original_firstorder_Minimum, original_gldm_Small Dependence Low Gray Level Emphasis, original_glszm_Large Area High Gray Level Emphasis, original_firstorder_10Percentile, and original_shape_Sphericity (Table 2). These indicators represent lesion internal heterogeneity of morphology, density, texture and distribution, thus indicating disease severity. The time interval from symptom onset was the only clinical parameter selected in the model, with the strongest correlation with the RT-PCR-negative group (OR = 2.84). As expected, the longer the disease course, the more patients received negative RT-PCR.
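As a reading aid for the odds ratios, the sketch below converts an OR into the corresponding logistic regression coefficient and back. It assumes that the OR is reported per one unit of the standardized predictor, that is, per one standard deviation, which is plausible given the standardization step but is not stated explicitly in the text.

```python
import math


def coefficient_from_or(odds_ratio):
    """A logistic regression coefficient is the natural log of the OR."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, delta):
    """Factor by which the odds change when the predictor rises by delta units."""
    return odds_ratio ** delta


# OR reported for the time interval from symptom onset to CT exams.
beta_time = coefficient_from_or(2.84)

# Under the per-SD assumption, a patient two standard deviations later in the
# disease course has odds of RT-PCR negativity multiplied by 2.84 squared.
two_sd_factor = odds_multiplier(2.84, 2)
```

Under this reading, an OR of 2.84 corresponds to a coefficient of about 1.04, and each additional standard deviation in the time interval multiplies the odds of RT-PCR negativity by 2.84, with the other features held fixed.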
We also analyzed the chest CT quantitative parameters, but none of them were included in the model. Increased numbers, extents, and densities of ground-glass opacities (GGOs) and consolidations represent progression in COVID-19 patients, as does the transformation of GGOs into consolidation. Decreased sizes, extents, and degrees of such lesions could indicate improvement [21,22,23,24,25]. In our study, the recovering patients who had a negative RT-PCR result were expected to show smaller lesion volumes and lower CT values, but the quantitative parameters were not sensitive enough to capture these changes. The high-throughput and high-dimensional radiomic features could reflect more detailed changes inside the lesions than the CT quantitative parameters.
No laboratory tests were included in the model. Neutrophils and lymphocytes are the main hematological indicators reflecting systemic inflammation. Lymphocytopenia occurred in more than 80% of critically ill patients, while in a mostly mild study population, only 35% of patients had mild lymphocytopenia. Elevated baseline neutrophils were not common in mild cases, and only 6.3% of non-severe patients showed increases in Zhang et al.’s study. Neutrophils also did not increase over the disease course for patients with mild disease and survivors [22, 29]. The patients included in our study were mild COVID-19 patients from Fangcang Shelter Hospital. Most laboratory tests were normal or slightly exceeded normal limits, and we did not find a significant difference in lymphocytes and neutrophils between the RT-PCR-negative and RT-PCR-positive groups.
This study has several limitations. First, as a retrospective study, the study only involved mild COVID-19 cases, so the model cannot be employed for severe and critical cases. For all mild COVID-19 patients in Fangcang Shelter Hospital, some laboratory tests such as erythrocyte sedimentation rate and C-reactive protein were not performed. Second, this is a single-center study, and multi-center data should be used for further verification. Moreover, we only built one model type and lacked comparative analysis with other model types, including decision trees, random forests and support vector machines. Finally, we did not explain the biological interpretation of the radiomic features. We are fully aware of the need for further exploration of these conclusions in subsequent studies.
In conclusion, the established model based on CT radiomic features and clinical data could help doctors predict RT-PCR negativity during the clinical treatment, indicating the proper time for RT-PCR retesting.
Availability of data and materials
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
COVID-19: Coronavirus disease 2019
RT-PCR: Reverse transcription-polymerase chain reaction
ROC: Receiver operating characteristic
AUC: Area under the curve
National Health Commission of the People’s Republic of China. COVID-19’s diagnosis and treatment plan (7th edition). 2020. https://www.gov.cn/zhengce/zhengceku/2020-03/04/content_5486705.htm.
Ye Z, Zhang Y, Wang Y, Huang Z, Song B. Chest CT manifestations of new coronavirus disease 2019 (COVID-19): a pictorial review. Eur Radiol. 2020;30(8):4381–9.
Cellina M, Orsi M, Valenti Pittino C, Toluian T, Oliva G. Chest computed tomography findings of COVID-19 pneumonia: pictorial essay with literature review. Jpn J Radiol. 2020. https://doi.org/10.1007/s11604-020-01010-7.
Long CJ, Fang P, Song TJ, Zhang JC, Yang Q. Imaging features of the initial chest thin-section CT scans from 110 patients after admission with suspected or confirmed diagnosis of COVID-19. BMC Med Imaging. 2020;20(1):64.
Ai T, Yang Z, Hou H, Zhan C, Chen C, Lv W, et al. Correlation of chest CT and RT-PCR testing in Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32–40.
Fang Y, Zhang H, Xie J, Lin M, Ying L, Pang P, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115–7.
Xie X, Zhong Z, Zhao W, Zheng C, Wang F, Liu J. Chest CT for typical 2019-nCoV pneumonia: relationship to negative RT-PCR testing. Radiology. 2020;296(2):E41–5.
Pan F, Ye T, Sun P, Gui S, Liang B, Li L, et al. Time course of lung changes on chest CT during recovery from 2019 novel Coronavirus (COVID-19) pneumonia. Radiology. 2020;295(3):715–21.
Bernheim A, Mei X, Huang M, Yang Y, Fayad ZA, Zhang N, et al. Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection. Radiology. 2020;295(3):685–91.
Li K, Wu J, Wu F, Guo D, Chen L, Fang Z, et al. The clinical and chest CT features associated with severe and critical COVID-19 pneumonia. Investig Radiol. 2020;55(6):327–31.
Yang R, Li X, Liu H, Zhen YL, Zhang XX, Xiong QX, et al. Chest CT severity score: an imaging tool for assessing severe COVID-19. Radiol Cardiothorac Imaging. 2020;2(2):e200047. https://doi.org/10.1148/ryct.2020200047.
Li K, Fang Y, Li W, Pan C, Qin P, Zhong Y, et al. CT image visual quantitative evaluation and clinical classification of coronavirus disease (COVID-19). Eur Radiol. 2020;30(8):4407–16.
Li L, Qin L, Xu Z, Yin Y, Wang X, Kong B, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65–71.
Gozes O, Frid-Adar M, Greenspan H, et al. Rapid AI development cycle for the coronavirus (COVID-19) pandemic: initial results for automated detection and patient monitoring using deep learning CT image analysis. https://arxiv.org/abs/2003.05037.
Singh D, Kumar V, Kaur M. Classification of COVID-19 patients from chest CT images using multi-objective differential evolution-based convolutional neural networks. Eur J Clin Microbiol Infect Dis. 2020;39:1379–89.
Shan F, Gao YZ, Wang J, et al. Lung infection quantification of COVID-19 in CT images with deep learning. https://arxiv.org/abs/2003.04655.
Kanne JP, Little BP, Chung JH, Elicker BM, Ketai LH. Essentials for radiologists on COVID-19: an update-radiology scientific expert panel. Radiology. 2020;296(2):E113–4.
Yang Y, Yang M, Shen C, et al. Evaluating the accuracy of different respiratory specimens in the laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections. medRxiv. 2020. https://doi.org/10.1101/2020.02.11.20021493.
Colombi D, Bodini FC, Petrini M, Maffi G, Morelli N, Milanese G, et al. Well-aerated lung on admitting chest CT to predict adverse outcome in COVID-19 pneumonia. Radiology. 2020;296(2):E86–96.
Chung M, Bernheim A, Mei X, Zhang N, Huang M, Zeng X, et al. CT imaging features of 2019 novel Coronavirus (2019-nCoV). Radiology. 2020;295(1):202–7.
Song F, Shi N, Shan F, Zhang Z, Shen J, Lu H, et al. Emerging 2019 novel Coronavirus (2019-nCoV) pneumonia. Radiology. 2020;295(1):210–7.
Wang D, Hu B, Hu C, Zhu F, Liu X, Zhang J, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel Coronavirus-infected pneumonia in Wuhan, China. JAMA. 2020;323(11):1061–9.
Shi H, Han X, Zheng C. Evolution of CT manifestations in a patient recovered from 2019 novel Coronavirus (2019-nCoV) pneumonia in Wuhan, China. Radiology. 2020;295(1):20.
Duan YN, Qin J. Pre- and posttreatment chest CT findings: 2019 novel Coronavirus (2019-nCoV) pneumonia. Radiology. 2020;295(1):21.
Wu Y, Xie YL, Wang X. Longitudinal CT findings in COVID-19 pneumonia: case presenting organizing pneumonia pattern. Radiol Cardiothorac Imaging. 2020;2:e200031. https://doi.org/10.1148/ryct.2020200031.
Yang X, Yu Y, Xu J, Shu H, Xia J, Liu H, et al. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med. 2020;8(5):475–81.
Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507–13.
Zhang G, Zhang J, Wang B, Zhu X, Wang Q, Qiu S. Analysis of clinical characteristics and laboratory findings of 95 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a retrospective analysis. Respir Res. 2020;21(1):74.
Liu J, Li S, Liu J, Liang B, Wang X, Wang H, et al. Longitudinal characteristics of lymphocyte responses and cytokine profiles in the peripheral blood of SARS-CoV-2 infected patients. EBioMedicine. 2020;55:102763.
We thank Yan Guo of GE Healthcare for her help regarding statistics. We also thank American Journal Experts for providing language editing services.
This study was supported by National Financial Appropriation Research Project (Grant Number 2017YFC1309100) and National Scientific Foundation of China (Grant Number 81971695). The funding body had no role in the design of the study, collection, analysis, and interpretation of data, or in writing the manuscript.
Ethics approval and consent to participate
The study was approved by the institutional review board of the First Affiliated Hospital of China Medical University. Informed consent was waived due to the nature of the retrospective study.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Cai, Q., Du, SY., Gao, S. et al. A model based on CT radiomic features for predicting RT-PCR becoming negative in coronavirus disease 2019 (COVID-19) patients. BMC Med Imaging 20, 118 (2020). https://doi.org/10.1186/s12880-020-00521-z