
Generation of virtual monoenergetic images at 40 keV of the upper abdomen and image quality evaluation based on generative adversarial networks

Abstract

Background

Abdominal CT scans are vital for diagnosing abdominal diseases but have limitations in tissue analysis and soft tissue detection. Dual-energy CT (DECT) can mitigate these limitations by offering low-keV virtual monoenergetic images (VMI), which enhance lesion detection and tissue characterization. However, its cost limits widespread use.

Purpose

To develop a model that converts conventional images (CI) from upper abdominal CT scans into generative virtual monoenergetic images at 40 keV (Gen-VMI40keV).

Methods

A total of 444 patients who underwent upper abdominal spectral contrast-enhanced CT were enrolled and assigned to the training and validation datasets (7:3). Portal-venous-phase 40-keV virtual monoenergetic images (VMI40keV) and CI, both generated from spectral CT scans, served as target and source images, respectively. These images were employed to build and train a CI-VMI40keV model. Mean Absolute Error (MAE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity (SSIM) were used to select the best generator model. An additional 198 cases were divided into three test groups: Group 1 (58 cases without visible abnormalities), Group 2 (40 cases with hepatocellular carcinoma [HCC]), and Group 3 (100 cases from a publicly available HCC dataset). Both subjective and objective evaluations were performed, together with comparisons, correlation analyses, and Bland-Altman plot analyses.

Results

The 192nd iteration produced the best generator model (lowest MAE and highest PSNR and SSIM). In Test groups 1 and 2, both VMI40keV and Gen-VMI40keV significantly improved CT values, as well as SNR and CNR, for all organs compared to CI. Significant positive correlations for objective indexes were found between Gen-VMI40keV and VMI40keV in various organs and lesions. Bland-Altman analysis showed that the differences between the two image types mostly fell within the 95% confidence interval. Pearson’s and Spearman’s correlation coefficients for objective indexes between Gen-VMI40keV and VMI40keV in Groups 1 and 2 ranged from 0.645 to 0.980. In Group 3, Gen-VMI40keV yielded significantly higher CT values for HCC (220.5 HU vs. 109.1 HU) and liver (220.0 HU vs. 112.8 HU) compared to CI (p < 0.01). The CNR for HCC/liver was also significantly higher in Gen-VMI40keV than in CI (2.0 vs. 1.2, p < 0.01). Additionally, Gen-VMI40keV was subjectively rated as having higher image quality than CI.

Conclusion

The CI-VMI40keV model can generate Gen-VMI40keV from conventional CT scans, and the generated images closely resemble VMI40keV.


Introduction

Enhanced abdominal computed tomography (CT) is a common diagnostic tool that has been applied for decades in abdominal diseases, including tumors, inflammation, and trauma. However, this method has certain limitations. Firstly, it reflects X-ray attenuation in the body as a whole, making it challenging to analyze tissues in detail; a common example is the difficulty of distinguishing between calcified plaques and iodine-enhanced blood [1]. Secondly, its ability to detect soft tissue abnormalities is limited, especially for small, low-contrast lesions. In situations where multiphasic CT yields a low-confidence diagnosis, additional imaging methods such as MRI or PET-CT may be required. Recently, dual-energy CT (DECT) has emerged as a technology that can potentially reduce the need for additional imaging and improve diagnostic efficiency in multiple disorders [2, 3].

DECT provides additional spectral information that cannot be obtained by conventional CT. This technology improves the sensitivity and accuracy of lesion detection, enables material characterization, and reduces metal artifacts. Therefore, DECT has emerged as a promising diagnostic imaging tool [4, 5]. Such an approach provides notable benefits in terms of suppressing artifacts and enhancing image quality [6, 7]. Compared with conventional contrast-enhanced CT images, virtual monoenergetic images (VMI) at 40–70 keV derived from spectral CT have enhanced contrast and image quality for blood vessels and enhanced tissues. This provides a theoretical and technical basis for optimizing contrast agent injection protocols in enhanced scanning [8]. In lower extremity Computed Tomography Angiography, hepatic portal vein angiography, and contrast-enhanced scanning of the thorax, abdomen, and pelvis, the contrast agent dose can be decreased by 50–65% [9,10,11]. In individuals with renal insufficiency, ensuring image quality while reducing the contrast agent concentration is essential to prevent potential renal toxicity. Furthermore, factors such as individual variations and circulatory disorders result in suboptimal enhancement of blood vessels and tissues on CT images. Spectral CT allows for retrospective enhancement of the CT value in blood vessels by applying low-energy VMI, thereby improving image quality and enhancing diagnostic accuracy and confidence. This approach eliminates the need for repeated examinations and reduces unnecessary radiation exposure [12]. Multiple studies have demonstrated that VMI40keV exhibits the highest contrast-to-noise ratio, which is advantageous for lesion detection. VMI40keV maximizes the contrast of liver tumors, improves the image quality of multiphase abdominal enhancement scans, and enhances the detection of liver and pancreatic lacerations [13, 14]. 
However, DECT is more expensive than conventional CT, which limits its widespread adoption. Given this constraint, there is an urgent need to identify a cost-effective alternative that can mimic the advantages of DECT without the substantial financial outlay. This is where the idea of converting CI into VMI40keV becomes critically important.

Deep learning is renowned for its reliability, consistency, and accuracy in delivering results. These attributes have led to its extensive application across various domains, particularly in medical imaging [15,16,17,18]. Recently, deep learning has significantly transformed medical imaging, yielding remarkable advancements in image segmentation, diagnosis, and treatment planning. For example, ConvUNeXt is a convolutional neural network (CNN) noted for its efficiency in medical image segmentation, and lightweight neural networks, such as those with multiscale feature enhancement, have demonstrated effectiveness in liver CT segmentation [19, 20]. Other notable models such as DRU-Net and Dense-PSP-UNet underscore the capabilities of deep CNNs in enhancing both speed and accuracy in medical image segmentation tasks, particularly in liver ultrasound imaging [21, 22]. Additionally, the integration of CNN and transformer architectures, exemplified by CoTr, has further boosted the efficiency of 3D medical image segmentation [23]. Ansari et al. reviewed liver segmentation methods in clinical surgeries over the past decade and proposed a classification based on clinical value to assist clinicians in selecting the most suitable method. They also systematically reviewed deep learning-based ultrasound image segmentation techniques from the past five years, summarizing methods, network architectures, loss functions, and the pros and cons of existing approaches for segmenting various organs [24, 25]. Akhtar et al. simulated hepatic resection surgery and assessed the indirect clinical risks of computer-aided diagnostic software, finding that it reduces the time to tumor recurrence compared to manual segmentation [26].

In addition, image generation tasks have attracted increasing attention in the field of computer vision. Among them, Generative Adversarial Network (GAN) models based on CNNs, including Pix2Pix-GAN, are commonly used for image-to-image translation and transformation tasks [27]. The Pix2Pix-GAN model achieves image generation by utilizing two neural networks, a generator and a discriminator. Its architecture typically consists of a U-Net generator, which allows for high-resolution image synthesis, and a patch-based discriminator, which evaluates the generated images at various scales. This design enables the model to focus on both local and global consistency, which is crucial for generating realistic images. Additionally, Pix2Pix-GAN employs a loss function that combines an adversarial loss, which encourages the generator to produce images indistinguishable from real ones, and a content loss, which ensures that the generated images retain the content of the input images. This combination of losses helps Pix2Pix-GAN learn a robust mapping from input to output images, making it a powerful tool for image-to-image translation tasks. Conte et al. applied a GAN to synthesize missing T1 and FLAIR MRI sequences for a multisequence brain tumor segmentation model [28]. Kawahara et al. employed a GAN to generate monoenergetic CT images in DECT from kilovoltage CT scans, concluding that the proposed model offers a viable alternative for reconstructing monoenergetic CT images in DECT from single-energy CT scans [29].

In this study, we developed a CI-VMI40keV model based on Pix2Pix-GAN. The model was enhanced by increasing the depth of its generator network, enabling it to generate Gen-VMI40keV from CI acquired from upper abdominal CT scans. We then assessed its performance by conducting a comparative analysis of image quality among Gen-VMI40keV, VMI40keV, and CI. The key contributions of this study are:

  • Each original 3D image was split into a series of 2D images, each with a size of 512 × 512. No cropping or resampling was performed during this transformation, and the metadata were preserved to allow for corresponding region of interest (ROI) delineation in the subsequent image quality assessment.

  • The best generator model was selected using a validation dataset based on Mean Absolute Error (MAE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity (SSIM).

  • The quality of Gen-VMI40keV was evaluated using three sets of test groups. The first two groups were compared against corresponding images from our center, while the third group consisted of external data.

  • An encoding layer and a decoding layer were added to the generator to accommodate 512 × 512 data and optimize its performance.

  • Image quality was assessed using a combination of objective and subjective evaluation methods.

The remainder of this paper is structured as follows. Section 2 describes dataset preparation, the characteristics of the patient datasets, the Pix2Pix framework with its architectural components and associated parameters, and the models employed. It also describes how the model was evaluated using objective indexes of image quality, as well as the objective and subjective assessments of image quality. Section 3 presents the results and the image quality analysis. Section 4 discusses the findings, summarizes the value of the work, acknowledges the study’s limitations, and outlines potential avenues for future research.

Materials and methods

This retrospective study was approved by the Ethics Committee of Zhongshan Hospital affiliated to Xiamen University (IRB approval number: XMZSYY-AF-SC-12-03), which waived the requirement for informed consent.

Patient datasets

The study included training and validation sets, as well as two test groups (Groups 1 and 2), of patients who underwent three-phase contrast-enhanced spectral CT scans of the upper abdomen. Additionally, another test group (Group 3) of patients who underwent conventional CT was included. The inclusion criterion for the study was the availability of portal venous phase CT images. Exclusion criteria were: (1) poor CT image quality with severe motion artifacts; and (2) metallic implants causing significant radiographic artifacts. Additional exclusion criteria for Test groups 2 and 3 were: (3) unclear lesion appearance in the portal venous phase; and (4) lesions occupying the liver to such an extent that normal tissue was difficult to distinguish.

The training and validation sets (n = 444) were randomly selected from patients examined between January 2021 and May 2021 at Zhongshan Hospital affiliated to Xiamen University. Test group 1 (58 cases with no apparent abnormalities) and Test group 2 (40 cases with HCC) were randomly selected, based on imaging reports, from examinations performed between July 2021 and December 2021 using Python (version 3.8). Test group 3 patients were obtained from The Cancer Imaging Archive (TCIA, https://www.cancerimagingarchive.net/) [30]. Figure 1 depicts the process of obtaining the datasets, including the application of exclusion criteria, as well as the stages of training, validation, and testing.

Fig. 1

Flowchart of the training and validation sets and three test groups
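The 7:3 assignment of the 444 enrolled patients to training and validation sets can be illustrated with a short sketch. The patient identifiers, the fixed seed, and the function name `split_patients` are hypothetical; the paper only states that the selection was random and performed with Python.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=42):
    """Randomly assign patient IDs to training and validation sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical anonymized IDs standing in for the 444 enrolled patients.
patient_ids = [f"P{i:03d}" for i in range(1, 445)]
train_ids, val_ids = split_patients(patient_ids)
print(len(train_ids), len(val_ids))  # 311 133
```

Splitting at the patient level, rather than at the slice level, keeps all slices of one patient in the same set, which matches the patient counts reported for the two sets.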

Image acquisition

All the examined patients underwent three-phase contrast-enhanced CT using a dual-layer spectral detector CT (IQon, Philips, The Netherlands). Patients fasted for 5–7 h, drank 600 ml of water each, and were positioned supine with raised arms. Scans were performed from the upper liver to above the umbilicus. The main scanning parameters were: collimation, 0.625 mm × 64; pitch, 1.2; rotation time, 0.75 s; field-of-view (FOV), 35 cm; tube voltage, 120 kVp; automatic tube current modulation (90–180 mAs). The reconstructed images had a matrix of 512 × 512 and a slice thickness of 1.0 mm. After a non-contrast scan, patients received 60–75 ml of an iodinated contrast agent (Optiray, 300 mg/ml, Bayer) via a power injector at 3.0–3.2 ml/s. Arterial and portal venous phase scans were acquired 25 and 60 s after injection, respectively.

Finally, two-phase enhanced data were reconstructed using projection space spectral reconstruction to generate Spectral Based Images (SBI). The obtained portal venous SBI were transferred to a dedicated workstation (IntelliSpace V9, Philips Healthcare) for further analysis. VMI40keV and CI were derived from SBI with a slice thickness of 3 mm.

Characteristics of patient datasets

Table 1 summarizes baseline patient data in the training and validation sets and the three test groups. The final training set from our institution included 311 patients (aged 52.2 ± 14.1 years, including 141 females), while the final validation set comprised 133 patients (aged 53.2 ± 15.2 years, including 61 females). The final numbers of patients in Test groups 1 (no apparent abnormalities) and 2 (HCC) from our institution were 58 (aged 48.4 ± 16.5 years, including 27 females) and 40 (aged 61.0 ± 13.5 years, including 6 females), respectively. Test group 3 included 100 patients from TCIA, whose characteristics were unknown.

Table 1 Characteristics of patient data sets

Training the CI-VMI40keV model to generate Gen-VMI40keV from CI

We utilized the Pix2Pix framework, a conditional GAN designed for image-to-image translation, to train the GAN model to generate Gen-VMI40keV from CI. Prior to training, CT intensities ranging from −1024 to 3071 HU were normalized to the range of −1 to 1. During training, the model was provided with paired sections, each pair consisting of one section from the source domain (CI) and the corresponding section from the target domain (VMI40keV). Through adversarial training, the GAN model learned to generate realistic VMI40keV, referred to as Gen-VMI40keV. The details of CI-VMI40keV model training and of Gen-VMI40keV generation from CI are shown in Fig. 2.
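The intensity normalization described above, and its inverse used later to restore CT values, can be sketched as follows. This is a minimal sketch assuming a linear min-max mapping with clipping; the paper does not state the exact formula, and the function names are hypothetical.

```python
import numpy as np

HU_MIN, HU_MAX = -1024.0, 3071.0  # intensity range stated in the text


def normalize_hu(hu):
    """Map CT values in [HU_MIN, HU_MAX] linearly onto [-1, 1] (assumed mapping)."""
    hu = np.clip(np.asarray(hu, dtype=np.float64), HU_MIN, HU_MAX)
    return 2.0 * (hu - HU_MIN) / (HU_MAX - HU_MIN) - 1.0


def denormalize_hu(x):
    """Inverse mapping: restore network outputs in [-1, 1] to CT values."""
    x = np.asarray(x, dtype=np.float64)
    return (x + 1.0) / 2.0 * (HU_MAX - HU_MIN) + HU_MIN


# Example: air, water-like tissue, and the upper bound of the range.
sample = np.array([-1024.0, 0.0, 3071.0])
normalized = normalize_hu(sample)
restored = denormalize_hu(normalized)
print(normalized)
print(restored)
```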

Fig. 2

(a) Schematic diagram of the CI-VMI40keV model. (b) Synthesis of Gen-VMI40keV upper abdominal CT images from conventional images

The CI-VMI40keV model in this study was trained for 250 epochs, following the procedure described by Isola et al. [27]. Training alternated between one gradient descent step on the discriminator and one on the generator. To keep training balanced, the discriminator loss was halved, which slows the discriminator’s learning relative to the generator. The final loss function combined BCEWithLogitsLoss and L1 loss. Mini-batch stochastic gradient descent was used with a batch size of 32. The Adam optimizer was used with a learning rate of 0.0002 and momentum parameters b1 = 0.5 and b2 = 0.999. These settings were important for optimizing the model’s performance and achieving the desired image translation results.
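The loss terms described above can be made concrete with a small numerical sketch. It is written in NumPy rather than PyTorch so that it runs without a deep learning framework; `bce_with_logits` mirrors what PyTorch's `BCEWithLogitsLoss` computes with mean reduction. The L1 weight of 100 is an assumption taken from Isola et al. and is not stated in this paper, and the exact way the discriminator loss was halved is likewise assumed.

```python
import numpy as np

LAMBDA_L1 = 100.0  # assumption: L1 weight from Isola et al., not reported here


def bce_with_logits(logits, target):
    """Numerically stable mean binary cross-entropy computed on raw logits."""
    logits = np.asarray(logits, dtype=np.float64)
    loss = np.maximum(logits, 0.0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return float(np.mean(loss))


def generator_loss(d_logits_fake, generated, target):
    """Adversarial term (fool the discriminator) plus weighted L1 content term."""
    adversarial = bce_with_logits(d_logits_fake, 1.0)
    l1 = float(np.mean(np.abs(np.asarray(generated) - np.asarray(target))))
    return adversarial + LAMBDA_L1 * l1


def discriminator_loss(d_logits_real, d_logits_fake):
    """Real/fake classification loss, halved to slow the discriminator (assumed form)."""
    real = bce_with_logits(d_logits_real, 1.0)
    fake = bce_with_logits(d_logits_fake, 0.0)
    return 0.5 * (real + fake)


# Toy example: an undecided discriminator (all logits 0) on a 2 x 2 patch map.
logits = np.zeros((2, 2))
image = np.full((4, 4), 0.25)
print(generator_loss(logits, image, image))
print(discriminator_loss(logits, logits))
```

With all logits at zero the discriminator assigns probability 0.5 everywhere, so each cross-entropy term equals ln 2, which gives a quick sanity check of the implementation.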

Python (version 3.8, Python Software Foundation) and PyTorch (version 1.12.1, https://pytorch.org/) were utilized. Model training and predictions were performed on a Linux workstation running Ubuntu version 20.04, equipped with an NVIDIA GeForce RTX 3090 GPU with 24 GB memory (NVIDIA, Santa Clara, CA, USA).

Model evaluation with objective indexes of image quality

To evaluate the performance of the models and select the best generative model, CI from the validation set were used as input to the CI-VMI40keV model. MAE, PSNR, and SSIM were used for model assessment. They were derived as follows:

$$\text{MAE}(I,K)=\frac{1}{N}\sum_{i=1}^{N}\left|I_{i}-K_{i}\right|$$
(1)
$$\text{MSE}(I,K)=\frac{1}{N}\sum_{i=1}^{N}\left(I_{i}-K_{i}\right)^{2}$$
(2)
$$\text{PSNR}=10\times\log_{10}\left(\frac{\left(2^{n}-1\right)^{2}}{\text{MSE}}\right)$$
(3)
$$\text{SSIM}(I,K)=\frac{\left(2\mu_{I}\mu_{K}+C_{1}\right)\left(2\sigma_{IK}+C_{2}\right)}{\left(\mu_{I}^{2}+\mu_{K}^{2}+C_{1}\right)\left(\sigma_{I}^{2}+\sigma_{K}^{2}+C_{2}\right)}$$
(4)

MAE is the average absolute difference between the generated (I) and actual (K) images, where N is the number of pixels and \(I_{i}\) and \(K_{i}\) are the values of the i-th pixel. The closer the MAE is to 0, the closer the Gen-VMI40keV is to VMI40keV. PSNR assesses the noise distribution difference between Gen-VMI40keV and VMI40keV, where n represents the number of bits used for pixel representation and MSE is the mean squared difference between I and K. A PSNR value of 20–30 dB indicates poor image quality; 30–40 dB implies noticeable image distortion but acceptable quality; and > 40 dB suggests extremely high image quality. SSIM is a full-reference image quality assessment metric. In this context, \(\mu_{I}\) is the mean of I, \(\mu_{K}\) is the mean of K, \(\sigma_{I}^{2}\) is the variance of I, \(\sigma_{K}^{2}\) is the variance of K, and \(\sigma_{IK}\) is the covariance between I and K. \(C_{1}\) and \(C_{2}\) are constants used to maintain numerical stability, where \(C_{1}=(k_{1}L)^{2}\) and \(C_{2}=(k_{2}L)^{2}\), with \(k_{1}=0.01\) and \(k_{2}=0.03\). L denotes the dynamic range of pixel values, typically set to L = 255. The SSIM value ranges from 0 to 1, with a larger value indicating lower image distortion.
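Equations 1 to 4 can be evaluated directly on a pair of images, as in the sketch below. It is a simplified illustration: SSIM is computed here from global image statistics exactly as written in Eq. 4, whereas common library implementations average SSIM over local sliding windows. The default of 8 bits (so that L = 255) and all function names are assumptions; the paper does not state which implementation or bit depth was used.

```python
import numpy as np


def mae(i, k):
    """Eq. 1: mean absolute error between two images."""
    return float(np.mean(np.abs(i - k)))


def mse(i, k):
    """Eq. 2: mean squared error between two images."""
    return float(np.mean((i - k) ** 2))


def psnr(i, k, bits=8):
    """Eq. 3: peak signal-to-noise ratio in dB for `bits`-bit pixel values."""
    err = mse(i, k)
    if err == 0:
        return float("inf")  # identical images
    peak = 2 ** bits - 1
    return float(10.0 * np.log10(peak ** 2 / err))


def ssim_global(i, k, bits=8, k1=0.01, k2=0.03):
    """Eq. 4 evaluated once over the whole image (no sliding window)."""
    dynamic_range = 2 ** bits - 1
    c1, c2 = (k1 * dynamic_range) ** 2, (k2 * dynamic_range) ** 2
    mu_i, mu_k = i.mean(), k.mean()
    var_i, var_k = i.var(), k.var()
    cov = ((i - mu_i) * (k - mu_k)).mean()
    numerator = (2 * mu_i * mu_k + c1) * (2 * cov + c2)
    denominator = (mu_i ** 2 + mu_k ** 2 + c1) * (var_i + var_k + c2)
    return float(numerator / denominator)


# Toy 2 x 2 "images" with a constant offset of 5 grey levels.
target = np.array([[10.0, 20.0], [30.0, 40.0]])
generated = target + 5.0
print(mae(generated, target), mse(generated, target))  # 5.0 25.0
print(psnr(generated, target))
print(ssim_global(generated, target))
```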

The best generative model was used to generate Gen-VMI40keV, and the CT values of the Gen-VMI40keV were restored to the range of -1024 to 3071 HU. The coordinates and spacing of the obtained CI were assigned to the Gen-VMI40keV, so that Gen-VMI40keV, CI, and VMI40keV had the same spacing and spatial coordinates.

Objective evaluation of image quality

The objective evaluation was performed by a physician with seven years of experience in abdominal imaging. In Test group 1, regions of interest (ROIs) were delineated on CI using the medical image segmentation software ITK-SNAP. The ROIs were placed in the following areas: the 8 Couinaud segments of the liver, the head, body, and tail of the pancreas, the spleen, subcutaneous adipose tissue, the abdominal aorta, and the erector spinae muscle. The CT value (mean) and the corresponding standard deviation (SD) were obtained. The areas of the ROIs ranged from 100 to 1000 mm²; ROIs avoided blood vessels and were placed in regions of uniform density. The ROIs were then applied to VMI40keV and Gen-VMI40keV, ensuring consistent ROI sizes across images, and measurements were performed three times to obtain an average value. The SD was considered the noise value, and the signal-to-noise ratio (SNR) was determined for each group of ROIs in the three image types as \(\text{SNR}=\text{CT}/\text{SD}\). In Test groups 2 and 3, the same approach was applied to place ROIs in both HCC and normal liver tissue. The CT value (mean) and the corresponding SD were determined. The areas of the ROIs ranged from 30 to 1000 mm², avoiding necrosis, blood vessels, calcification, etc. The contrast-to-noise ratio (CNR) for liver cancer was assessed as \(\text{CNR}=(\text{CT}_{\text{HCC}}-\text{CT}_{\text{liver}})/\text{SD}_{\text{liver}}\).
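The SNR and CNR definitions above reduce to simple arithmetic on ROI statistics, sketched below. All ROI values are hypothetical and chosen only to make the arithmetic easy to follow; the sign of the CNR follows the formula as written, since the text does not say whether an absolute value was taken.

```python
import statistics


def snr(mean_hu, sd_hu):
    """SNR = CT / SD for one ROI."""
    return mean_hu / sd_hu


def cnr(hcc_mean_hu, liver_mean_hu, liver_sd_hu):
    """CNR = (CT_HCC - CT_liver) / SD_liver."""
    return (hcc_mean_hu - liver_mean_hu) / liver_sd_hu


# Hypothetical ROI readings: each quantity measured three times and averaged.
liver_mean = statistics.mean([119.0, 120.0, 121.0])  # 120.0 HU
liver_sd = statistics.mean([11.0, 12.0, 13.0])       # 12.0 HU (noise)
hcc_mean = statistics.mean([95.0, 96.0, 97.0])       # 96.0 HU

print(snr(liver_mean, liver_sd))            # 10.0
print(cnr(hcc_mean, liver_mean, liver_sd))  # -2.0
```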

Subjective evaluation of image quality

Two physicians each with 7 years of experience in abdominal imaging performed subjective ratings for image quality on Test groups 1, 2, and 3. In case of any discrepancy, a third senior physician with 15 years of experience made the final determination for subsequent analysis. The scoring was performed with a Likert 5-point scale as follows: 1, unidentifiable anatomical structures, extremely severe noise, very high image granularity, and poor image quality; 2, difficult anatomical structures to discern, blurry edges, severe noise, high image granularity, and relatively poor image quality; 3, some unclear anatomical structures, somewhat blurry edges, moderate noise, relatively high image granularity, and fair image quality; 4, quite clear anatomical structures, easily identifiable edges, minimal noise, small image granularity, and good image quality; 5, clear anatomical structures, smooth and clear edges, no apparent noise, minimal image granularity, and excellent image quality.

Statistical analysis

Statistical analysis was performed with R (version 4.1.0, https://www.r-project.org/), and statistical significance was defined as two-sided P < 0.05. The Kolmogorov-Smirnov test was used to assess the normality of continuous variables. Normally distributed data were expressed as mean ± standard deviation (SD), and non-normally distributed data as median (interquartile range) [M (Q1, Q3)]. In Test groups 1 and 2, both quantitative and qualitative indexes derived from CI, VMI40keV and Gen-VMI40keV were compared by the Friedman test. In case of a statistically significant difference, post-hoc pairwise comparisons were performed by Dunn’s test with Bonferroni correction. Pearson’s and Spearman’s correlation analyses were used to examine the correlations of CT values, noise, SNR, and CNR between VMI40keV and Gen-VMI40keV. The agreement of quantitative measurements from VMI40keV and Gen-VMI40keV was assessed with Bland-Altman plots. In Test group 3, the Wilcoxon signed-rank test was applied to compare quantitative measurements and qualitative indexes from CI and Gen-VMI40keV.
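The quantities drawn in a Bland-Altman plot can be computed as in the following sketch. It is written in Python for illustration only (the analysis in the paper was run in R), the measurements are hypothetical, and the difference is taken as Gen-VMI40keV minus VMI40keV, which is an assumption. The bounds at the mean difference ± 1.96 SD are what the paper refers to as the 95% confidence interval; they are commonly called the limits of agreement.

```python
import statistics


def bland_altman(gen_values, ref_values):
    """Return mean difference, lower and upper 95% bounds, and fraction of points inside."""
    diffs = [g - r for g, r in zip(gen_values, ref_values)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return mean_diff, lower, upper, inside


# Hypothetical paired liver CT values (HU) from Gen-VMI40keV and VMI40keV.
gen_hu = [101.0, 104.0, 98.0, 110.0, 95.0]
vmi_hu = [100.0, 100.0, 100.0, 100.0, 100.0]
mean_diff, lower, upper, inside = bland_altman(gen_hu, vmi_hu)
print(mean_diff, lower, upper, inside)
```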

Results

Selection of the best generator model

In the validation set, all PSNR and SSIM values were above 40 and 0.96, respectively. As shown in Fig. 3, the generator model at epoch 192 was selected as the best model, with the lowest MAE (16.407), highest PSNR (44.584), and highest SSIM (0.981).
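The selection rule, choosing the checkpoint that is simultaneously best on all three validation metrics, can be sketched as follows. Only the values for epoch 192 come from the paper; the values for the neighbouring epochs are hypothetical placeholders, and the function name is an assumption.

```python
def select_best_epoch(metrics):
    """Return the epoch with lowest MAE and highest PSNR and SSIM, or None if no single epoch wins all three."""
    best_mae = min(metrics, key=lambda epoch: metrics[epoch][0])
    best_psnr = max(metrics, key=lambda epoch: metrics[epoch][1])
    best_ssim = max(metrics, key=lambda epoch: metrics[epoch][2])
    if best_mae == best_psnr == best_ssim:
        return best_mae
    return None


# epoch -> (MAE, PSNR, SSIM) on the validation set
validation_metrics = {
    190: (16.9, 44.31, 0.9805),    # hypothetical placeholder
    191: (16.7, 44.42, 0.9807),    # hypothetical placeholder
    192: (16.407, 44.584, 0.981),  # values reported in the paper
    193: (16.8, 44.40, 0.9806),    # hypothetical placeholder
}
print(select_best_epoch(validation_metrics))  # 192
```

Returning `None` when the three metrics disagree makes the ambiguous case explicit, so that a tie-breaking rule has to be chosen deliberately.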

Fig. 3

Mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) between VMI40keV and Gen-VMI40keV across epochs 150 to 250 in the validation set. The line graphs depict the mean, and each point is accompanied by an error bar whose lower and upper whiskers represent the mean minus and the mean plus one standard deviation, respectively

Objective evaluation of images

Except for the noise difference in the erector spinae muscle in the Test group 1, there were significant differences in CT value, noise, SNR, and CNR among the various groups for liver, pancreas, spleen, subcutaneous fat, aorta, and erector spinae muscle. Additionally, Gen-VMI40keV and VMI40keV exhibited significant positive correlations in CT value, noise, SNR, and CNR for various organs and HCC (all P < 0.01). The Pearson’s and Spearman’s correlation coefficients ranged from 0.645 to 0.980 (Table 2).

Table 2 Analysis of CT value, noise, SNR, and CNR among CI, VMI40keV, and Gen-VMI40keV in Test group 1 and Test group 2

Intra-group comparisons showed that both Gen-VMI40keV and VMI40keV had significantly higher CT values, SNR, and CNR than CI (all P < 0.01). For noise, no statistically significant differences were found for subcutaneous fat and erector spinae muscle in Gen-VMI40keV vs. CI, or for the aorta and erector spinae muscle in VMI40keV vs. CI. For all other organs, Gen-VMI40keV and VMI40keV had slightly higher noise than CI, with statistically significant differences (P < 0.01).

In Test group 3, there were statistically significant differences in CT value, noise, and CNR for HCC and normal liver parenchyma between Gen-VMI40keV and CI (P < 0.01) (Table 3).

Table 3 Comparison of CT-value, noise and CNR between CI and Gen-VMI40keV in Test group 3

Bland-Altman plots for Test group 1 showed mean differences in CT value between Gen-VMI40keV and VMI40keV for the liver, pancreas, spleen, subcutaneous fat, aorta, and erector spinae muscle of 4.34 HU, 7.05 HU, 6.45 HU, 0.44 HU, 2.71 HU, and 0.26 HU, respectively. Mean differences in noise were −0.40 HU, 1.38 HU, 0.03 HU, −0.91 HU, 4.33 HU, and −0.34 HU, respectively. Mean differences in SNR were 0.32, −0.07, 0.27, −0.57, −2.18, and 0.05, respectively. These measurement data were mostly within the respective 95% confidence intervals (Figs. 4 and 5).

Fig. 4

Bland-Altman plots comparing VMI40keV and Gen-VMI40keV (CT value, noise, and SNR) for the liver, pancreas, and spleen in Test group 1. The middle horizontal line represents the mean difference between VMI40keV and Gen-VMI40keV. The upper and lower horizontal lines mark the bounds of the 95% confidence interval

Fig. 5

Bland-Altman plots comparing VMI40keV and Gen-VMI40keV (CT value, noise, and SNR) for the aorta, subcutaneous fat, and muscle in Test group 1. The middle horizontal line represents the mean difference between VMI40keV and Gen-VMI40keV. The upper and lower horizontal lines mark the bounds of the 95% confidence interval

Bland-Altman plots for Test group 2 showed mean differences in CT value between Gen-VMI40keV and VMI40keV of 1.81 HU for HCC and −0.56 HU for normal liver parenchyma. Mean differences in noise were −1.51 HU and 0.67 HU, respectively. The mean difference in CNR was −0.29. Most of these measurement data were within the respective 95% confidence intervals (Fig. 6).

Fig. 6

Bland-Altman plots comparing VMI40keV and Gen-VMI40keV (CT value, noise, and CNR) for HCC and liver in Test group 2. The middle horizontal line represents the mean difference between VMI40keV and Gen-VMI40keV. The upper and lower horizontal lines mark the bounds of the 95% confidence interval

Subjective evaluation of images

In Test groups 1 and 2, median subjective image quality ratings for CI, Gen-VMI40keV, and VMI40keV were 4, 5, and 5, respectively. The overall differences among groups were statistically significant (P < 0.01). Intra-group comparisons revealed that both Gen-VMI40keV and VMI40keV had significantly higher image quality compared with CI (P < 0.01), with no statistically significant difference between Gen-VMI40keV and VMI40keV. In the Test group 3, statistically significant differences (P < 0.01) were found in image quality indexes between Gen-VMI40keV and CI (Table 4).

Table 4 Comparison of subjective scoring of CI, VMI40keV, and Gen-VMI40keV in Test groups 1, 2, and 3 [M (Q1, Q3)]

Discussion

In recent years, it has been demonstrated that joint training of the generator and discriminator may improve tasks such as image synthesis and cross-modality image transformation in medical imaging [31,32,33]. The present study confirms the feasibility of medical image synthesis. In this study, the Gen-VMI40keV generated from CI by the CI-VMI40keV model were similar to the VMI40keV acquired from DECT. This is consistent with Funama et al., who conducted a similar study on generating pseudo low-monoenergetic CT images of the abdomen from 120-kVp CT images using a cGAN [34]. MAE, PSNR, and SSIM were employed to compare Gen-VMI40keV and VMI40keV in this study. These three metrics are commonly used for image quality assessment in the field of image processing and may help measure the similarity between Gen-VMI40keV and VMI40keV, with SSIM showing a correlation with perceived quality in the context of the human visual system [35,36,37]. The results revealed that all models achieved PSNR and SSIM values above 40 and 0.98, respectively, in the validation dataset. This indicates that the models, after a certain number of training epochs, produced Gen-VMI40keV in the validation dataset with a high degree of similarity to VMI40keV in terms of CT value, noise distribution, and anatomical structure. Subsequently, we selected the model from the 192nd epoch, which had the lowest MAE and the highest PSNR and SSIM values, for further evaluation in the test groups. The PSNR and SSIM results achieved in this study indicate a high level of performance. The difference from previous work may be attributed to our use of an image resolution of 512 × 512 and to the increased depth of the generator network in the CI-VMI40keV model, which was needed to accommodate images of this resolution.
Compared with previous reports [29, 34], the current model was further validated using external test groups (with and without lesions) and CT values were compared between Gen-VMI40keV and original VMI40keV generated from a spectral CT scanner.

Dual-layer detector spectral CT significantly improves the contrast of enhanced tissues in VMI40keV while keeping image noise at a low level. This further leverages the advantages of low-energy VMI in lesion visualization and detection [38, 39]. Table 2 shows that, compared to CI, both VMI40keV and Gen-VMI40keV significantly improved CT values for all organs, as well as SNR and CNR, indicating an enhancement in image quality. Notably, despite the significant increase in CT values, the noise in Gen-VMI40keV was similar to or slightly higher than that of CI, demonstrating the model’s robustness to noise. The CT values, noise, SNR, and CNR of VMI40keV and Gen-VMI40keV were highly correlated, with correlation coefficients ranging from 0.645 to 0.980, indicating that Gen-VMI40keV preserved the advantages of VMI40keV well. Table 3 validates the model’s performance in an external test group, where Gen-VMI40keV continued to exhibit significant improvements in CT values and CNR compared to CI. The current findings suggest that Gen-VMI40keV, similar to VMI40keV, offers increased SNR and CNR, resulting in improved visualization of abdominal organs and HCC lesions. This improvement holds true even when working with images from scanners of different manufacturers. Gen-VMI40keV effectively enhances contrast between abnormal lesions and background tissues, raises vascular enhancement CT values, improves image quality, increases the detection rate for small lesions, and boosts diagnostic confidence.

HCC is typically identified by its hallmark features, namely arterial phase hyperenhancement (wash-in) and hypoenhancement on portal- or delayed-phase images (wash-out) [40, 41]. However, imaging of small HCCs may deviate from the typical pattern due to factors such as well-differentiated HCC, fatty changes, and significant fibrosis within the tumor [42]. Consequently, these variations may complicate the diagnosis of small HCCs by conventional CT. Small HCCs that do not show portal-phase wash-out on dynamic CT appear nearly isodense on conventional images but demonstrate improved delineation on VMI40keV and Gen-VMI40keV (Fig. 7). In this study, the subjective image quality ratings for Test groups 1, 2, and 3 revealed significantly higher image quality for Gen-VMI40keV than for CI. In addition to its application to the data from the three test groups, the best generator model was also applied to other non-spectral CT data. Figure 8 presents a case of HCC in Test group 3, identified as HCC_019.

Fig. 7
Arterial-phase (AP) and portal-venous-phase (PVP) conventional CT images (a, b); PVP virtual monoenergetic image at 40 keV (c); and Gen-VMI40keV (d) produced by the best generative model in a patient with HCC (slice thickness of 3 mm). On the conventional AP contrast-enhanced CT image, no lesion was visible (a). The lesion was faint on the PVP image (arrow, b). The PVP virtual monoenergetic image at 40 keV and Gen-VMI40keV showed the HCC more conspicuously (arrows, c and d)

Fig. 8
Conventional images (a and c) of a patient with HCC, identified as HCC_019 in Test group 3 (slice thickness of 2.5 mm), and the corresponding Gen-VMI40keV produced by the best generative model (b and d). On the axial and coronal portal-venous contrast-enhanced images, the lesion (arrow) was less visible on the conventional images than on Gen-VMI40keV

In this study, no significant structural distortions or artifacts were observed in the Gen-VMI40keV images generated by the pix2pix GAN. Several factors may have contributed to the robustness and fidelity of the model. First, the training dataset was large and diverse, which likely gave the model a representative sample of the imaging task, helped it generalize, and reduced overfitting to specific patterns. Second, the model’s capacity was chosen to match the complexity of the task, so that it was neither underpowered nor overpowered. Third, the relatively low learning rate of 0.0002 likely facilitated a stable, gradual learning process and prevented the model from converging too quickly to suboptimal solutions. Finally, the absence of mode collapse, a common issue in GAN training, indicates that the generator explored the data space effectively rather than producing a limited variety of outputs. Together, these factors may explain the high-quality image synthesis observed in Gen-VMI40keV. However, this study included only cases with no apparent abnormalities and cases with HCC of the upper abdomen, and images of other upper abdominal diseases may still exhibit artifacts or structural distortions.

Image segmentation can divide an image into regions with different semantic information, aiding in the accurate identification of key content within the image, such as the liver, tumors, and other critical areas. These key regions are crucial for image quality assessment because they are typically associated with medical diagnosis and treatment [22, 26]. Additionally, there are other parameters for evaluating image similarity, such as Mean Squared Error (MSE), Normalized Cross-Correlation (NCC), Mutual Information (MI), and the Feature Similarity Index (FSIM) [43]. In future analyses, we plan to incorporate image segmentation to further enhance the specificity of the evaluation. By segmenting the images into different tissue types, we will be able to assess the image quality for each specific region or organ and establish more objective evaluation metrics. This will allow for a more targeted and precise evaluation of the Gen-VMI40keV, potentially providing deeper insights into its clinical utility.
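The region-wise evaluation proposed above can be sketched as follows. This is a minimal, dependency-free Python illustration, not part of the published pipeline: it restricts MAE, MSE, and NCC (taken here as the zero-mean normalized cross-correlation) to the pixels inside a binary segmentation mask. The function names, the nested-list image representation, and the pixel values are all hypothetical.

```python
import math


def masked_pairs(ref, gen, mask):
    """Yield (reference, generated) pixel pairs where the mask is non-zero.

    All three inputs are 2D lists of identical shape.
    """
    for ref_row, gen_row, mask_row in zip(ref, gen, mask):
        for r, g, m in zip(ref_row, gen_row, mask_row):
            if m:
                yield r, g


def region_metrics(ref, gen, mask):
    """Return MAE, MSE and zero-mean NCC computed inside one masked region."""
    pairs = list(masked_pairs(ref, gen, mask))
    n = len(pairs)
    if n == 0:
        raise ValueError("mask selects no pixels")
    mae = sum(abs(r - g) for r, g in pairs) / n
    mse = sum((r - g) ** 2 for r, g in pairs) / n
    mean_r = sum(r for r, _ in pairs) / n
    mean_g = sum(g for _, g in pairs) / n
    num = sum((r - mean_r) * (g - mean_g) for r, g in pairs)
    den = math.sqrt(
        sum((r - mean_r) ** 2 for r, _ in pairs)
        * sum((g - mean_g) ** 2 for _, g in pairs)
    )
    # NCC is undefined for a constant region, so report NaN in that case.
    ncc = num / den if den else float("nan")
    return {"mae": mae, "mse": mse, "ncc": ncc}


# Hypothetical 2x2 example: the mask keeps three "liver" pixels and drops one.
vmi = [[100, 110], [120, 0]]
gen_vmi = [[102, 108], [124, 50]]
liver_mask = [[1, 1], [1, 0]]

print(region_metrics(vmi, gen_vmi, liver_mask))
```

Running the function once per organ or lesion mask would give a per-region table of similarity scores instead of a single whole-image value, which is the kind of targeted assessment described in the paragraph above.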

In conclusion, this work developed a CI-VMI40keV model that generates Gen-VMI40keV from CI and demonstrated its potential clinical utility. In objective and subjective evaluations on three test datasets, Gen-VMI40keV showed image quality comparable to that of VMI40keV and significantly enhanced the detectability of lesions. This approach may reduce the demand for DECT, expand the application scope of advanced imaging technologies, and yield higher diagnostic confidence.

This study has several limitations. First, only VMI40keV was analyzed; VMI at other energy levels and other applications of spectral CT, such as iodine maps and effective atomic number maps, were not examined. Future research should extend the approach to generate spectral images at other desired energy levels from CI. Second, only portal venous phase images were analyzed, and future studies should include images from other phases. Third, validation during testing was performed specifically on hepatocellular carcinoma (HCC); future studies should incorporate image analysis of a wider range of diseases.

Data availability

Test group 3 images can be obtained from The Cancer Imaging Archive at https://www.cancerimagingarchive.net/collection/hcc-tace-seg/. The code for the GAN model used in this study is available at https://github.com/picklesdaddy/VMI40kev.

Abbreviations

VMI40keV:

Virtual monoenergetic images at 40 keV

Gen-VMI40keV:

Generative virtual monoenergetic images at 40 keV

CI:

Conventional images

GAN:

Generative Adversarial Networks

CNN:

Convolutional neural network

HCC:

Hepatocellular Carcinoma

DECT:

Dual-energy CT

MAE:

Mean Absolute Error

MSE:

Mean Squared Error

PSNR:

Peak Signal-to-Noise Ratio

SSIM:

Structural Similarity

ROI:

Region of interest

SD:

Standard deviation

SNR:

Signal-to-noise ratio

CNR:

Contrast-to-noise ratio

References

  1. Xu C, Zhou Y, Zhang R, Chen Z, Zhong W, Gong X et al. Metallic hyperdensity sign on noncontrast CT immediately after mechanical thrombectomy predicts parenchymal hemorrhage in patients with acute large-artery occlusion. AJNR Am J Neuroradiol. 2019;40(4):661–7.

  2. Kaur H, Hindman NM, Al-Refaie WB, Arif-Tiwari H, Cash BD, Chernyak V et al. ACR Appropriateness Criteria® suspected liver metastases. J Am Coll Radiol. 2017;14(5S):S314–25.

  3. Krishna S, Murray CA, McInnes MD, Chatelain R, Siddaiah M, Al-Dandan O et al. CT imaging of solid renal masses: pitfalls and solutions. Clin Radiol. 2017;72(9):708–21.

  4. Hamid S, Nasir MU, So A, Andrews G, Nicolaou S, Qamar SR. Clinical applications of dual-energy CT. Korean J Radiol. 2021;22(6):970.

  5. Parakh A, An C, Lennartz S, Rajiah P, Yeh BM, Simeone FJ et al. Recognizing and minimizing artifacts at dual-energy CT. Radiographics. 2021;41(2):509–23.

  6. Wang T, Han Y, Lin L, Yu C, Lv R, Han L. Image quality enhancement of CT hepatic portal venography using dual energy blending with computer determined parameters. J Xray Sci Technol. 2022;30(2):307–17.

  7. Schroder L, Stankovic U, Rit S, Sonke JJ. Image quality of dual-energy cone-beam CT with total nuclear variation regularization. Biomed Phys Eng Express. 2022;8(2).

  8. Albrecht MH, Vogl TJ, Martin SS, Nance JW, Duguay TM, Wichmann JL et al. Review of clinical applications for virtual monoenergetic dual-energy CT. Radiology. 2019;293(2):260–71.

  9. Nagayama Y, Nakaura T, Oda S, Taguchi N, Utsunomiya D, Funama Y, et al. Dual-layer detector CT of chest, abdomen, and pelvis with a one-third iodine dose: image quality, radiation dose, and optimal monoenergetic settings. Clin Radiol. 2018;73(12):1021–58.

  10. Han D, Chen X, Lei Y, Ma C, Zhou J, Xiao Y et al. Iodine load reduction in dual-energy spectral CT portal venography with low energy images combined with adaptive statistical iterative reconstruction. Br J Radiol. 2019;92(1100):20180414.

  11. Kristiansen CH, Thomas O, Tran TT, Roy S, Hykkerud DL, Sanderud A et al. Halved contrast medium dose in lower limb dual-energy computed tomography angiography—a randomized controlled trial. Eur Radiol. 2023.

  12. Ghandour A, Sher A, Rassouli N, Dhanantwari A, Rajiah P. Evaluation of virtual monoenergetic images on pulmonary vasculature using the dual-layer detector-based spectral computed tomography. J Comput Assist Tomogr. 2018;42(6):858–65.

  13. Sun EX, Wortman JR, Uyeda JW, Lacson R, Sodickson AD. Virtual monoenergetic dual-energy CT for evaluation of hepatic and splenic lacerations. Emerg Radiol. 2019;26(4):419–25.

  14. DiMaso LD, Miller JR, Lawless MJ, Bassetti MF, DeWerd LA, Huang J. Investigating split-filter dual‐energy CT for improving liver tumor visibility for radiation therapy. J Appl Clin Med Phys. 2020;21(8):249–55.

  15. Chandrasekar V, Ansari MY, Singh AV, Uddin S, Prabhu KS, Dash S, et al. Investigating the Use of Machine Learning models to understand the drugs permeability across placenta. IEEE Access. 2023;11:52726–39.

  16. Ansari MY, Chandrasekar V, Singh AV, Dakua SP. Re-routing drugs to blood brain barrier: a comprehensive analysis of machine learning approaches with fingerprint amalgamation and data balancing. IEEE Access. 2023;11:9890–906.

  17. Ansari MY, Qaraqe M, Charafeddine F, Serpedin E, Righetti R, Qaraqe K. Estimating age and gender from electrocardiogram signals: a comprehensive review of the past decade. Artif Intell Med. 2023;146:102690.

  18. Ansari MY, Qaraqe M. MEFood: a large-scale representative benchmark of quotidian foods for the middle east. IEEE Access. 2023;11:4589–601.

  19. Han Z, Jian M, Wang G. ConvUNeXt: an efficient convolution neural network for medical image segmentation. Knowl Based Syst. 2022;253:109512.

  20. Ansari MY, Yang Y, Balakrishnan S, Abinahed J, Al-Ansari A, Warfa M et al. A lightweight neural network with multiscale feature enhancement for liver CT segmentation. Sci Rep. 2022;12(1).

  21. Jafari M, Auer D, Francis S, Garibaldi J, Chen X. DRU-Net: an efficient deep convolutional neural network for medical image segmentation. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI); 2020.

  22. Ansari MY, Yang Y, Meher PK, Dakua SP. Dense-PSP-UNet: a neural network for fast inference liver ultrasound segmentation. Comput Biol Med. 2023;153:106478.

  23. Xie Y, Zhang J, Shen C, Xia Y. CoTr: efficiently bridging CNN and transformer for 3D medical image segmentation. Cham: Springer International Publishing; 2021.

  24. Ansari MY, Abdalla A, Ansari MY, Ansari MI, Malluhi B, Mohanty S et al. Practical utility of liver segmentation methods in clinical surgeries and interventions. BMC Med Imaging. 2022;22(1).

  25. Ansari MY, Changaai Mangalote IA, Meher PK, Aboumarzouk O, Al-Ansari A, Halabi O, et al. Advancements in Deep Learning for B-Mode Ultrasound Segmentation: a Comprehensive Review. IEEE Trans Emerg Top Comput Intell. 2024;8(3):2126–49.

  26. Akhtar Y, Dakua SP, Abdalla A, Aboumarzouk OM, Ansari MY, Abinahed J, et al. Risk assessment of computer-aided diagnostic software for hepatic resection. IEEE Trans Radiation Plasma Med Sci. 2022;6(6):667–77.

  27. Isola P, Zhu JY, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017.

  28. Conte GM, Weston AD, Vogelsang DC, Philbrick KA, Cai JC, Barbera M et al. Generative adversarial networks to synthesize missing T1 and FLAIR MRI sequences for use in a multisequence brain tumor segmentation model. Radiology. 2021;299(2):313–23.

  29. Kawahara D, Ozawa S, Kimura T, Nagata Y. Image synthesis of monoenergetic CT image in dual-energy CT using kilovoltage CT with deep convolutional generative adversarial networks. J Appl Clin Med Phys. 2021;22(4):184–92.

  30. Ma Y, Wang J, Zhang H, Li H, Wang F, Lv P et al. A CT-based radiomics nomogram for classification of intraparenchymal hyperdense areas in patients with acute ischemic stroke following mechanical thrombectomy treatment. Front Neurosci. 2023;16.

  31. Wang B, Pan Y, Xu S, Zhang Y, Ming Y, Chen L et al. Quantitative cerebral blood volume image synthesis from standard MRI using image-to-image translation for brain tumors. Radiology. 2023;308(2):e222471.

  32. Longuefosse A, Raoult J, Benlala I, Denis DSB, Benkert T, Macey J et al. Generating high-resolution synthetic CT from lung MRI with ultrashort echo times: initial evaluation in cystic fibrosis. Radiology. 2023;308(1):e230052.

  33. Shi Z, Li H, Cao Q, Wang Z, Cheng M. A material decomposition method for dual-energy CT via dual interactive Wasserstein generative adversarial networks. Med Phys. 2021;48(6):2891–905.

  34. Funama Y, Oda S, Kidoh M, Nagayama Y, Goto M, Sakabe D, et al. Conditional generative adversarial networks to generate pseudo low monoenergetic CT image from a single-tube voltage CT scanner. Phys Med. 2021;83:46–51.

  35. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600–12.

  36. Kawahara D, Ozawa S, Saito A, Nagata Y. Image synthesis of effective atomic number images using a deep convolutional neural network-based generative adversarial network. Rep Pract Oncol Radiother. 2022;27(5):848–55.

  37. Seibold C, Fink MA, Goos C, Kauczor HU, Schlemmer HP, Stiefelhagen R, Kleesiek J. Prediction of low-keV monochromatic images from polyenergetic CT scans for improved automatic detection of pulmonary embolism. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI); 2021.

  38. Reimer RP, Grosse HN, Fehrmann EA, Krauskopf A, Zopfs D, Kroger JR et al. Virtual monoenergetic images from spectral detector computed tomography facilitate washout assessment in arterially hyper-enhancing liver lesions. Eur Radiol. 2021;31(5):3468–77.

  39. Aricò FM, Trimarchi R, Portaluri A, Barillà C, Migliaccio N, Bucolo GM et al. Virtual monoenergetic dual-layer dual-energy CT images in colorectal cancer: CT diagnosis could be improved? Radiol Med. 2023;128(8):891–9.

  40. Borgheresi A, Gonzalez-Aguirre A, Brown KT, Getrajdman GI, Erinjeri JP, Covey A et al. Does enhancement or perfusion on preprocedure CT predict outcomes after embolization of hepatocellular carcinoma? Acad Radiol. 2018;25(12):1588–94.

  41. Shah S, Shukla A, Paunipagar B. Radiological features of hepatocellular carcinoma. J Clin Exp Hepatol. 2014;4(Suppl 3):S63–6.

  42. Loy LM, Low HM, Choi JY, Rhee H, Wong CF, Tan CH. Variant hepatocellular carcinoma subtypes according to the 2019 WHO classification: an imaging-focused review. AJR Am J Roentgenol. 2022;219(2):212–23.

  43. Mohanty S, Dakua SP. Toward computing cross-modality symmetric non-rigid medical image registration. IEEE Access. 2022;10:24528–39.

Acknowledgements

Not applicable.

Funding

Medical and Health Guidance Project Foundation of Xiamen City (grant no. 01105827).

Author information

Contributions

H.Z. and Q.H. wrote the main manuscript text. H.Z., Q.H., and X.C. reviewed and edited the manuscript. Y.Q., Q.H., X.Z., and H.Z. curated the data. H.Z. and Q.H. conducted formal analysis and prepared Figs. 1, 2, 3, 4, 5 and 6. Y.W., J.W., and S.D. prepared Tables 1, 2, 3 and 4 and Figs. 7 and 8. H.Z. and Q.H. contributed to software and visualization. All authors reviewed the manuscript.

Corresponding author

Correspondence to Hua Zhong.

Ethics declarations

Ethics approval and consent to participate

This retrospective study obtained approval from the Institutional Review Board of Zhongshan Hospital Affiliated to Xiamen University (IRB approval number: XMZSYY-AF-SC-12-03). Written informed consent was not required in accordance with local legislation and institutional requirements.

Competing interests

Xingbiao Chen is an employee of Philips Healthcare, the manufacturer of the scanner. The remaining authors have no conflicts of interest.

Consent for publication

Not Applicable.

Study subjects or cohorts overlap

No study subjects or cohorts have been previously reported.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhong, H., Huang, Q., Zheng, X. et al. Generation of virtual monoenergetic images at 40 keV of the upper abdomen and image quality evaluation based on generative adversarial networks. BMC Med Imaging 24, 151 (2024). https://doi.org/10.1186/s12880-024-01331-3

  • DOI: https://doi.org/10.1186/s12880-024-01331-3

Keywords