
A review on deep learning MRI reconstruction without fully sampled k-space



Magnetic resonance imaging (MRI) is an effective auxiliary diagnostic method in clinical medicine, but it has always suffered from long acquisition times. Compressed sensing and parallel imaging are two common techniques for accelerating MRI. Recently, deep learning has provided a new direction for MRI reconstruction, but most deep learning methods require a large number of paired data for training. However, there are many scenarios in which fully sampled k-space data cannot be obtained, which seriously hinders the application of supervised learning. Therefore, deep learning methods that do not require fully sampled data are indispensable.

Main text

In this review, we first introduce the forward model of MRI as a classic inverse problem and briefly discuss the connection between traditional iterative methods and deep learning. Next, we explain how to train a reconstruction network without fully sampled data from the perspective of obtaining prior information.


Although the reviewed methods are used for MRI reconstruction, they can also be extended to other areas where ground-truth is not available. Furthermore, we may anticipate that the combination of traditional methods and deep learning will produce better reconstruction results.



Magnetic resonance imaging (MRI) plays an important role in clinical medicine: it visualizes human organs and tissues to support subsequent diagnosis. However, MRI has always faced the challenge of long scan times. Before the advent of deep learning, two common methods were used to accelerate MRI: compressed sensing (CS), which exploits image compressibility, and parallel imaging, which exploits redundant information between coils [1,2,3]. Although these methods have been successful, they still suffer from long iteration times and low acceleration rates.

Recently, deep learning has emerged as a method for accelerating MRI. Compared with traditional methods, it not only improves the quality of reconstructed images but also enables real-time imaging. Image quality is commonly measured by the peak signal-to-noise ratio (PSNR) and the mean structure similarity index measure (MSSIM). A higher PSNR means less noise, and a higher MSSIM means greater structural similarity to the ground truth. Meanwhile, real-time imaging is important for some clinical applications. For example, deep learning can enable real-time adaptive MRI-guided radiotherapy by achieving higher acceleration factors that reduce total delays [4], and it can provide a powerful diagnostic tool for dynamic assessment of wrist function [5]. However, most of these methods require a large amount of data to train the network in a supervised manner.
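As a small illustration of the first metric, PSNR can be sketched as follows. This is a minimal sketch, not the evaluation code of any reviewed paper; the function name `psnr` and the choice of taking the peak value from the reference image are assumptions made here for illustration.

```python
import numpy as np

def psnr(reference, reconstruction):
    """Peak signal-to-noise ratio in dB, computed on magnitude images."""
    ref = np.abs(reference)
    rec = np.abs(reconstruction)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")
    peak = ref.max()  # assumption: peak value is taken from the reference image
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.ones((8, 8))
reconstruction = reference - 0.1          # constant error of 0.1 everywhere
value = psnr(reference, reconstruction)   # MSE = 0.01 and peak = 1
```

A smaller mean squared error relative to the peak intensity gives a larger PSNR, which is why higher values are read as better reconstructions.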

Traditional optimization methods and supervised deep learning have been studied extensively, and related reviews can be found in [6, 7]. However, owing to physiological constraints such as organ motion or physical constraints such as signal decay, obtaining fully sampled data is often difficult, impractical, or even impossible. Some researchers have used transfer learning to address this problem [8], but a small amount of fully sampled data is still required to fine-tune the pre-trained network. Hence, how to perform network learning and image reconstruction in the absence of fully sampled data is an active research topic. Here, the main methods for deep learning-based MRI reconstruction without fully sampled data are reviewed.

The remainder of this paper is organized as follows. First, we give a brief overview of the traditional reconstruction model and outline the reconstruction algorithms that appear later in the review. Then, the deep learning model for MRI reconstruction is described in the supervised setting. Next, we review how to train a reconstruction network without fully sampled data from the perspective of obtaining prior information. Finally, we emphasize the necessity of deep learning reconstruction without fully sampled data, discuss the current challenges, and look to the future.

Traditional methods for MRI reconstruction

Reconstruction model

Reconstructing a high-quality image from under-sampled data is a typical inverse problem. A multi-coil imaging model can be expressed as follows,

$${\mathbf{y}} = {\mathbf{Ax}} + {{\varvec{\upeta}}}\;{\text{with}}\;{\mathbf{A}}_{i} { = }{\mathbf{UFS}}_{i} ,$$

where \({\mathbf{x}} \in {\mathbb{C}}^{N}\) is the image to be reconstructed, \({\mathbf{y}} \in {\mathbb{C}}^{M}\) is the noisy measured data \(\left( {M < N} \right)\), \({{\varvec{\upeta}}}\) is the noise, and \({\mathbf{A}}\) denotes a measurement operator consisting of a sampling matrix \({\mathbf{U}} \in {\mathbb{R}}^{M \times N}\), Fourier transform operator \({\mathbf{F}}\) and the sensitivity map matrix \({\mathbf{S}}_{i}\) for the ith coil. A common reconstruction model is to add a regularization term to constrain its solution space,

$$\mathop {\arg \min }\limits_{{\mathbf{x}}} \frac{1}{2}||{\mathbf{y}} - {\mathbf{Ax}}||_{2}^{2} + \lambda \Re ({\mathbf{x}}),$$

where \(||{\mathbf{y}} - {\mathbf{Ax}}||_{2}^{2}\) ensures consistency with the measured data, \(\Re ({\mathbf{x}})\) is a regularization term, and \(\lambda\) is a tradeoff between the data consistency and the regularization terms. In most cases, the methods differ in whether the reconstruction is single-channel or multi-channel [1,2,3]. Meanwhile, many regularization terms can be chosen, such as 2D wavelet [9], total variation (TV) [10], dictionary [11], 3D wavelet [12], 3D k-t sparse, and 3D low-rank (k-t SLR) [13]. Several iterative methods are commonly used to solve the above optimization problem [6].
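The forward model of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: Cartesian sampling, a binary mask that zero-fills unsampled k-space locations in place of the row-selecting matrix \({\mathbf{U}}\), an orthonormal FFT for \({\mathbf{F}}\), and random arrays standing in for real sensitivity maps. The function names `forward` and `adjoint` are illustrative only.

```python
import numpy as np

def forward(x, smaps, mask):
    """A x: weight the image by each coil sensitivity, apply the FFT, keep sampled points."""
    coil_images = smaps * x[None, ...]                  # S_i x, shape (coils, H, W)
    kspace = np.fft.fft2(coil_images, norm="ortho")     # F S_i x
    return mask[None, ...] * kspace                     # U F S_i x (zero where unsampled)

def adjoint(y, smaps, mask):
    """A^H y: mask, inverse FFT, then sensitivity-weighted coil combination."""
    coil_images = np.fft.ifft2(mask[None, ...] * y, norm="ortho")
    return np.sum(np.conj(smaps) * coil_images, axis=0)

rng = np.random.default_rng(0)
C, H, W = 4, 16, 16
x = rng.standard_normal((H, W)) + 1j * rng.standard_normal((H, W))
smaps = rng.standard_normal((C, H, W)) + 1j * rng.standard_normal((C, H, W))
mask = (rng.random((H, W)) < 0.3).astype(float)         # random under-sampling pattern
y = forward(x, smaps, mask)                             # noiseless multi-coil measurement
```

A quick way to check such an operator pair is the adjoint (dot-product) test, which verifies that `adjoint` really implements \({\mathbf{A}}^{H}\) for the chosen `forward`.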

Sparsity or low-rankness constraints are often used as priors to reduce artefacts in the reconstructed image when the acceleration rate is high. Lustig et al. [14, 15] first applied compressed sensing to MRI reconstruction and achieved reliable results. Afterwards, researchers found that the key to MRI reconstruction based on compressed sensing lies in the design of the sparse domain, which mainly includes pre-constructed [9, 14,15,16,17,18] or adaptive [12, 19,20,21] bases and dictionaries [11, 22, 23]. Besides, low-rankness methods are mainly used for dynamic and high-dimensional imaging by exploring the relationship between multiple images [24]. The structured low-rankness of k-space has also been discovered and used for reconstruction [25]. Meanwhile, the low-rankness of the structured matrix is used to reconstruct the image jointly with other properties, including transform-domain weighted k-space [26,27,28,29] and slowly varying image phases [30, 31]. Although these methods have achieved good results, traditional optimization-based reconstruction is iterative and therefore time-consuming.

Optimization algorithm

Practical and effective optimization algorithms are essential, and a large number of algorithms have been studied to solve various optimization problems. Because deep learning networks remain closely connected to iterative reconstruction algorithms, some of these algorithms are reviewed here. We only briefly introduce the algorithms that appear later in the review, namely variable-splitting with the quadratic penalty (VSQP) [32], proximal gradient descent (PGD) [33], the iterative shrinkage-thresholding algorithm (ISTA) [34], and the alternate directions method of multipliers (ADMM) [35].


Applying variable-splitting with the quadratic penalty (VSQP) to Eq. (2) gives the following formulation,

$$\begin{aligned} {\mathbf{z}}^{i} & = \mathop {\arg \min }\limits_{{\mathbf{z}}} \mu ||{\mathbf{x}}^{i - 1} - {\mathbf{z}}||_{2}^{2} + \Re ({\mathbf{z}}) \\ & = prox_{\Re } ({\mathbf{x}}^{i - 1} ), \\ \end{aligned}$$
$${\mathbf{x}}^{i} = \mathop {\arg \min }\limits_{{\mathbf{x}}} \frac{1}{2}||{\mathbf{y}} - {\mathbf{Ax}}||_{2}^{2} + \mu ||{\mathbf{x}} - {\mathbf{z}}^{i} ||_{2}^{2} ,$$

where \({\mathbf{z}}^{i}\) is the auxiliary intermediate variable, \({\mathbf{x}}^{i}\) is the image to be reconstructed in the ith iteration, and \(\mu\) and \(prox_{\Re } ( \cdot )\) are the quadratic penalty parameter and the proximity operator, respectively. In a deep network, this algorithm can be unrolled for a fixed number of iterations to form the network architecture. Eq. (3) is mainly related to the choice of priors and can be interpreted as a denoising operation [50], which is carried out by a neural network. Equation (4) depends on the selection of the forward model and corresponds to the data-consistency (DC) layer in the network, which is usually solved by

$$({\mathbf{A}}^{H} {\mathbf{A}} + \mu {\mathbf{I}}){\mathbf{x}}^{i} = ({\mathbf{A}}^{H} {\mathbf{y}} + \mu {\mathbf{z}}^{i} ),$$

where \(\cdot^{H}\) denotes the conjugate transpose. Eq. (5) can be solved with the conjugate gradient (CG) method [36], which avoids an explicit matrix inversion and copes with multi-coil reconstruction scenarios.
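The data-consistency update of Eq. (5) can be sketched with a hand-written conjugate gradient loop. This is an illustrative sketch only: it assumes a single-coil Cartesian operator (mask times orthonormal FFT) in place of the multi-coil \({\mathbf{A}}\), and the names `A`, `AH` and `dc_conjugate_gradient` are hypothetical.

```python
import numpy as np

def A(x, mask):
    """Single-coil stand-in for the measurement operator: masked orthonormal FFT."""
    return mask * np.fft.fft2(x, norm="ortho")

def AH(y, mask):
    """Adjoint of A: mask followed by the inverse orthonormal FFT."""
    return np.fft.ifft2(mask * y, norm="ortho")

def dc_conjugate_gradient(y, z, mask, mu, n_iter=10, tol=1e-10):
    """Solve (A^H A + mu I) x = A^H y + mu z without forming any matrix."""
    normal_op = lambda v: AH(A(v, mask), mask) + mu * v
    b = AH(y, mask) + mu * z
    x = z.copy()                          # warm start from the denoised image z
    r = b - normal_op(x)                  # initial residual
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(n_iter):
        if rs_old < tol:                  # residual already negligible
            break
        Ap = normal_op(p)
        alpha = rs_old / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

rng = np.random.default_rng(0)
x_true = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
mask = (rng.random((16, 16)) < 0.4).astype(float)
y = A(x_true, mask)
z = AH(y, mask) + 0.1 * rng.standard_normal((16, 16))   # stand-in for a network output
mu = 0.1
x_dc = dc_conjugate_gradient(y, z, mask, mu)
```

Only applications of \({\mathbf{A}}\) and \({\mathbf{A}}^{H}\) are needed, which is what makes CG convenient inside an unrolled network.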


In VSQP, if the following formula is used to update \({\mathbf{x}}^{i}\) instead of Eq. (4), we obtain proximal gradient descent (PGD),

$${\mathbf{x}}^{i} = {\mathbf{z}}^{i} + \rho {\mathbf{A}}^{H} ({\mathbf{y}} - {\mathbf{Az}}^{i} ),$$

where \(\rho\) is the gradient descent step size. Because the update uses only a plain gradient descent step, PGD often requires more iterations to achieve good results.
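The alternation between the prior step of Eq. (3) and the gradient step of Eq. (6) can be sketched as follows. This is a minimal sketch assuming a single-coil Cartesian operator; `soft_threshold_magnitude` is a hypothetical hand-crafted stand-in for the learned network that would play the role of the proximity operator.

```python
import numpy as np

def A(x, mask):
    return mask * np.fft.fft2(x, norm="ortho")

def AH(y, mask):
    return np.fft.ifft2(mask * y, norm="ortho")

def soft_threshold_magnitude(x, kappa=0.01):
    """Hypothetical prior step: shrink pixel magnitudes, keep the phase."""
    mag = np.abs(x)
    return x * np.maximum(mag - kappa, 0.0) / np.maximum(mag, 1e-12)

def pgd(y, mask, prox, rho=1.0, n_iter=10):
    x = AH(y, mask)                                  # zero-filled starting image
    for _ in range(n_iter):
        z = prox(x)                                  # Eq. (3): prior / denoising step
        x = z + rho * AH(y - A(z, mask), mask)       # Eq. (6): gradient step on the data term
    return x

rng = np.random.default_rng(0)
x_true = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
mask = (rng.random((16, 16)) < 0.4).astype(float)
y = A(x_true, mask)
x_rec = pgd(y, mask, prox=soft_threshold_magnitude, rho=1.0, n_iter=5)
```

For this particular operator and \(\rho = 1\), the gradient step overwrites the sampled k-space locations with the measured values, so the output is exactly consistent with \({\mathbf{y}}\); other operators or step sizes only move towards consistency.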


As discussed above, both VSQP and PGD use a network to directly learn an approximate mapping for Eq. (3). Here, the iterative shrinkage-thresholding algorithm (ISTA) is used instead to guide the solution of Eq. (3). ISTA is well known as a solver for \(l_{1}\) norm problems; however, magnetic resonance images tend to be sparse in a transform domain rather than in the image domain itself, hence there is no simple closed-form solution. To explain in more detail, we use the following substitution in Eq. (2),

$$\Re {(}{\mathbf{x}}{)} = \lambda {||}{\mathbf{\Psi x}}||_{1} ,$$

and obtain the final solution by alternately iterating the following sub-problem and Eq. (6).

$$\begin{aligned} {\mathbf{z}}^{i} &= {{\varvec{\Psi}}}^{H} prox_{\rho \Re } ({\mathbf{\Psi x}}^{i - 1} ) \\ & = \widetilde{\hbar }(\Gamma_{\kappa } (\hbar ({\mathbf{x}}^{i - 1} ))), \\ \end{aligned}$$

where \({{\varvec{\Psi}}}\) represents a tight frame and \(\Gamma_{\kappa } (x)\) denotes the shrinkage operator

$$\Gamma_{\kappa} (x) = \operatorname{sign}(x) \cdot \max \{ |x| - \kappa ,0\} .$$

When ISTA meets deep learning, a nonlinear transform operator \(\hbar ( \cdot )\) is used instead of \({{\varvec{\Psi}}}\), and \(\widetilde{\hbar }( \cdot )\) is the corresponding inverse operator. Both \(\hbar ( \cdot )\) and \(\widetilde{\hbar }( \cdot )\) are implemented as neural networks, and \(\kappa\) is a new parameter that absorbs \(\rho\). More details can be found in [37]. Moreover, for the case where tight frame sparsity is enforced, Liu et al. [17] proposed a projected iterative soft-thresholding algorithm (pFISTA) to address the problem that ISTA cannot be directly applied to MRI reconstruction, and Zhang et al. [38] proved the convergence of pFISTA applied to parallel imaging. Subsequently, Lu et al. [39] constructed the pFISTA-SENSE-ResNet network based on pFISTA and achieved better results than traditional parallel imaging in terms of MSSIM and PSNR.
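The shrinkage operator of Eq. (9) and the transform, shrink, inverse-transform pattern of Eq. (8) can be sketched as follows. This is a minimal sketch: for complex values the sign is taken as \(x/|x|\), and an orthonormal FFT is used purely as a stand-in for the tight frame \({{\varvec{\Psi}}}\) (in a learned variant, the two callables would be networks). The names `shrink` and `sparse_prior_step` are illustrative.

```python
import numpy as np

def shrink(x, kappa):
    """Shrinkage operator of Eq. (9); for complex x, sign(x) is taken as x / |x|."""
    mag = np.abs(x)
    scale = np.maximum(mag - kappa, 0.0) / np.where(mag > 0, mag, 1.0)
    return x * scale

def sparse_prior_step(x, psi, psi_h, kappa):
    """z = Psi^H shrink(Psi x): transform, shrink the coefficients, transform back."""
    return psi_h(shrink(psi(x), kappa))

real_example = shrink(np.array([-3.0, -0.5, 0.0, 2.0]), 1.0)
complex_example = shrink(np.array([3.0 + 4.0j]), 1.0)

# Orthonormal 2D FFT as a stand-in transform (an assumption for illustration).
psi = lambda v: np.fft.fft2(v, norm="ortho")
psi_h = lambda v: np.fft.ifft2(v, norm="ortho")
image = np.zeros((8, 8))
image[2, 3] = 1.0
z = sparse_prior_step(image, psi, psi_h, kappa=0.0)   # kappa = 0 leaves the image unchanged
```

Coefficients with magnitude below \(\kappa\) are set to zero and the rest are pulled towards zero by \(\kappa\), which is how the \(l_{1}\) prior promotes sparsity in the transform domain.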


For Eq. (2), we can construct the following augmented Lagrangian function by introducing a new variable \({\mathbf{v}}\),

$$\mathop {\max }\limits_{{\mathbf{u}}} \mathop {\min }\limits_{{{\mathbf{x}},{\mathbf{v}}}} \frac{1}{2}||{\mathbf{y}} - {\mathbf{Ax}}||_{2}^{2} + \Re ({\mathbf{v}}) + \frac{\nu }{2}||{\mathbf{x}} - {\mathbf{v}} + {\mathbf{u}}||_{2}^{2} ,$$

where \({\mathbf{u}}\) and \(\nu\) denote the Lagrangian multiplier and the penalty parameter, respectively. Equation (10) can be solved by alternating between three sub-problems. For simplicity, we define

$$f_{\nu } ({\mathbf{x}},{\mathbf{v}},{\mathbf{u}}) = \frac{1}{2}||{\mathbf{y}} - {\mathbf{Ax}}||_{2}^{2} + \Re ({\mathbf{v}}) + \frac{\nu }{2}||{\mathbf{x}} - {\mathbf{v}} + {\mathbf{u}}||_{2}^{2} ,$$

then the alternate iteration subproblem is as follows:

$$\left\{ \begin{aligned} & {\mathbf{x}} \leftarrow \mathop {\arg \min }\limits_{{\mathbf{x}}} f_{\nu } ({\mathbf{x}},{\mathbf{v}},{\mathbf{u}}) \\ & {\mathbf{v}} \leftarrow \mathop {\arg \min }\limits_{{\mathbf{v}}} f_{\nu } ({\mathbf{x}},{\mathbf{v}},{\mathbf{u}}) \\ & {\mathbf{u}} \leftarrow {\mathbf{u}} + {\mathbf{x}} - {\mathbf{v}} \\ \end{aligned} \right..$$

The alternate directions method of multipliers (ADMM) can be combined with TV [40] or dictionary learning [11] to perform MRI reconstruction. When the ADMM algorithm is unrolled into a network, different network versions can be constructed depending on which parts are learned. For instance, in ADMM-net-I [41] the image transformation is replaced by a network, whereas ADMM-net-II [41] learns the data consistency step in addition to the image transformation.
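The three updates of Eq. (12) can be sketched for one concrete choice of prior. This is a sketch under stated assumptions, not ADMM-net: the regularizer is taken as \(\Re ({\mathbf{v}}) = \lambda ||{\mathbf{v}}||_{1}\) in the image domain, and \({\mathbf{A}}\) is a single-coil masked orthonormal FFT, so the \({\mathbf{x}}\)-update has a closed form in k-space. All function names are hypothetical.

```python
import numpy as np

def shrink(x, kappa):
    mag = np.abs(x)
    return x * np.maximum(mag - kappa, 0.0) / np.where(mag > 0, mag, 1.0)

def admm_l1(y, mask, lam=0.05, nu=1.0, n_iter=200):
    x = np.fft.ifft2(y, norm="ortho")        # zero-filled start (y is zero where unsampled)
    v = x.copy()
    u = np.zeros_like(x)
    for _ in range(n_iter):
        # x-update: (A^H A + nu I) x = A^H y + nu (v - u), diagonal in k-space
        rhs = y + nu * np.fft.fft2(v - u, norm="ortho")
        x = np.fft.ifft2(rhs / (mask + nu), norm="ortho")
        # v-update: proximal step of (lam / nu) * ||.||_1 evaluated at x + u
        v = shrink(x + u, lam / nu)
        # u-update: accumulate the constraint violation x - v
        u = u + x - v
    return x, v, u

def objective(img, y, mask, lam):
    residual = y - mask * np.fft.fft2(img, norm="ortho")
    return 0.5 * np.sum(np.abs(residual) ** 2) + lam * np.sum(np.abs(img))

rng = np.random.default_rng(0)
x_true = np.zeros((16, 16), dtype=complex)
spikes = rng.choice(256, size=10, replace=False)
x_true.flat[spikes] = 1.0 + rng.random(10)   # sparse test image
mask = (rng.random((16, 16)) < 0.5).astype(float)
y = mask * np.fft.fft2(x_true, norm="ortho")
x_zf = np.fft.ifft2(y, norm="ortho")         # zero-filled reconstruction for comparison
x_admm, v_admm, u_admm = admm_l1(y, mask)
```

In an unrolled network, the hand-written `shrink` step (and possibly the data-consistency step) would be replaced by learned modules, while the overall alternation is kept.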

Deep learning with fully sampled k-space data

In the past few years, deep learning has achieved outstanding performance in the medical field, including biological magnetic resonance spectroscopy [42,43,44,45,46] and accelerated MRI [36, 39, 47,48,49,50,51,52]. MRI reconstruction based on deep learning can be roughly divided into two categories, data-driven and model-driven. The former uses the redundant information in the original input to learn the underlying mapping from input to output, including the mapping from zero-filled to artefact-free images [47] and the interpolation rules of k-space [49, 50].

To allow the network to exploit knowledge of the imaging system, researchers proposed physical model-driven deep learning methods [36, 39, 51, 52]. The network unrolls a traditional iterative optimization algorithm for a fixed number of iterations, which not only achieves better reconstruction results but also makes the network more interpretable. When a large number of training sample pairs are available, supervised reconstruction can be expressed as follows,

$$\mathop {\arg \min }\limits_{\theta } \frac{1}{N}\sum\limits_{i = 1}^{N} {{\text{L}} ({\mathbf{x}}_{ref}^{i} ,f({\mathbf{y}}^{i} ,{\mathbf{A}}^{i} ,\theta ))} ,$$

where \({\mathbf{x}}_{ref}^{i}\) is the reference image of the ith subject, \({\text{L}} ( \cdot , \cdot )\) is the loss function between the network output image and the reference image, and N is the number of fully sampled datasets in the training database. \(f({\mathbf{y}}^{i} ,{\mathbf{A}}^{i} ,\theta )\) denotes the output image of the network for the under-sampled k-space data \({\mathbf{y}}^{i}\) and the measurement operator \({\mathbf{A}}^{i}\) of the ith subject, where the network is parameterized by \(\theta\). Equation (13) can be solved using stochastic gradient descent (SGD) [53], whose basic form is

$$\theta^{k + 1} = \theta^{k} - \rho^{k} \frac{1}{N}\sum\limits_{i = 1}^{N} {\nabla_{\theta } {\text{L}} ({\mathbf{x}}_{ref}^{i} ,f({\mathbf{y}}^{i} ,{\mathbf{A}}^{i} ,\theta^{k} ))} ,$$

where \(\rho^{k}\) is the step size (learning rate). Depending on actual needs, it can change with the number of iterations or be a constant. Because basic SGD converges slowly, researchers use variants of SGD to speed up convergence and to avoid converging to saddle points [54,55,56]. Furthermore, unrolled networks based on traditional optimization algorithms are often used to improve the interpretability of the network and the reconstruction quality [36, 39, 57, 58]. A comprehensive review of model-driven MRI deep learning reconstruction can be found in [7].
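The parameter update of Eq. (14) can be illustrated on a toy problem. This is only a sketch: the "network" is a hypothetical one-parameter model \(f(y,\theta ) = \theta y\), the loss is the squared error, and each step averages the gradient over a random mini-batch instead of all N samples, which is the usual stochastic variant.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(256)          # toy stand-in for the under-sampled inputs
x_ref = 2.0 * y                       # toy stand-in for the reference images

def f(y, theta):
    return theta * y                  # one-parameter stand-in for the network

def grad_loss(x_ref, y, theta):
    # gradient w.r.t. theta of the mean squared error between f(y, theta) and x_ref
    return np.mean(2.0 * (f(y, theta) - x_ref) * y)

theta, step, batch = 0.0, 0.1, 8      # constant step size rho^k = 0.1
for k in range(200):
    idx = rng.choice(y.size, size=batch, replace=False)   # random mini-batch
    theta = theta - step * grad_loss(x_ref[idx], y[idx], theta)
```

Here the references are an exact function of the inputs, so `theta` approaches 2; with real data the reference images \({\mathbf{x}}_{ref}^{i}\) come from fully sampled acquisitions, which is exactly what the methods reviewed below try to do without.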

Deep learning without fully sampled k-space data

As discussed above, a network trained in a supervised manner can learn, from fully sampled data, a mapping that fills in the information missing from its input. However, without fully sampled data, it is difficult to find the optimal solution among infinitely many candidate solutions unless other information is available. In traditional optimization algorithms, the regularization term is usually pre-defined by hand so that the optimal solution is obtained by narrowing the solution space. Hence, discovering effective prior information is very important for deep learning without fully sampled k-space data. The flowchart is shown in Fig. 1.

Fig. 1

Flowchart of deep learning for MRI reconstruction with fully sampled data (a) and without fully sampled data (b). The difference between (a) and (b) is that (a) can train the network in a supervised manner. The network takes undersampled data and other priors as inputs and updates its parameters by backpropagation with algorithms such as SGD and its variants. In the reconstruction phase, the trained network can reconstruct high-quality images from the input

Next, we describe deep learning-based MRI reconstruction from the perspective of how the prior is obtained.

Deep image prior

The experiment in [59] showed that a generator network alone can achieve good results in the absence of any other reference data, which illustrates that a convolutional neural network can replace the regularization term in Eq. (2) by capturing an implicit low-level image prior.

Yazdanpanah et al. [60] and Senouf et al. [61] applied this idea to MRI reconstruction: the zero-filled image is the only input of the network, and the network parameters are updated iteratively so that the k-space of the output image approximates the under-sampled k-space data. This idea was further extended to dynamic MRI by Jin et al. [62]. However, the approach has obvious shortcomings: it requires a well-designed network architecture and, because the objective function is based on lossy data, it tends to overfit to noise as the iterations approach the target. Hence a regularization term was introduced to alleviate this problem [63]. Meanwhile, many works have shown that unrolled networks based on the physical model can improve the interpretability of the network and the reconstruction quality; some of the underlying traditional algorithms are briefly introduced in the "Traditional methods for MRI reconstruction" section.

A. Wang et al. [64] proposed to combine a VSQP-based iterative network (as depicted in Fig. 2) with the high robustness of the classic iterative algorithm. The loss function adds a regularization term to that of [60] as follows,

$$L({\mathbf{y}},\theta ) = \frac{1}{N}\sum\limits_{i = 1}^{N} {[||{\mathbf{A}}I_{\theta } ({\mathbf{y}}_{{\mathbf{i}}} ) - {\mathbf{y}}_{{\mathbf{i}}} ||_{2}^{2} + \Re (I_{\theta } ({\mathbf{y}}_{{\mathbf{i}}} ))]} ,$$

where \(I_{\theta } ({\mathbf{y}}_{{\mathbf{i}}} )\) is the output of the network and \(\Re ( \cdot )\) is a pre-defined regularizer such as TV. Although adding \(\Re ( \cdot )\) mitigates the tendency of [60] to overfit to noise, it also biases the solution towards the characteristics of the pre-defined regularization term. The experiments showed that the model in [64] was more robust than supervised learning without the above loss function, and reconstructed better than the classical method using the same loss function.
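The loss of Eq. (15) needs only the network outputs and the measured k-space, as the following sketch shows. The assumptions are a single-coil masked orthonormal FFT for \({\mathbf{A}}\), anisotropic TV as the pre-defined regularizer, and an explicit `weight` on the regularizer, which is a hypothetical parameter not written in Eq. (15).

```python
import numpy as np

def total_variation(img):
    """Anisotropic TV: sum of absolute finite differences along both axes."""
    dx = np.diff(img, axis=0)
    dy = np.diff(img, axis=1)
    return np.sum(np.abs(dx)) + np.sum(np.abs(dy))

def unsupervised_loss(recons, kspaces, mask, weight=1e-3):
    """Mean over samples of ||A I(y_i) - y_i||^2 + weight * TV(I(y_i))."""
    total = 0.0
    for img, y in zip(recons, kspaces):
        residual = mask * np.fft.fft2(img, norm="ortho") - y
        total += np.sum(np.abs(residual) ** 2) + weight * total_variation(img)
    return total / len(recons)

rng = np.random.default_rng(0)
mask = (rng.random((16, 16)) < 0.4).astype(float)
flat = [np.full((16, 16), 1.0 + 0.5j)]                        # constant image: zero TV
kspaces = [mask * np.fft.fft2(flat[0], norm="ortho")]         # its own measurements
noisy = [flat[0] + 0.1 * rng.standard_normal((16, 16))]       # perturbed "network output"
loss_flat = unsupervised_loss(flat, kspaces, mask)
loss_noisy = unsupervised_loss(noisy, kspaces, mask)
```

No reference image appears anywhere in the computation; the data term alone cannot penalize errors at unsampled k-space locations, which is where the regularizer contributes.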

Fig. 2

Unrolled network framework for VSQP. Here, each block consists of a regularization unit R and a data consistency (DC) unit, which correspond to Eq. (3) and Eq. (4) respectively in VSQP

Self-partition k-space data

Yaman et al. [32] proposed to divide the set of measured under-sampled k-space locations \(\Omega\) into two disjoint subsets satisfying \(\Omega = \Lambda \cup \Theta\) to train a VSQP-based unrolled reconstruction network, where \(\Lambda\) is used as the network input for training and \(\Theta\) is used to calculate the loss function. This approach is called self-supervised learning via data under-sampling (SSDU). The experimental results showed that, at moderate acceleration rates, the method achieves performance comparable to supervised learning with fully sampled data and better than traditional compressed sensing and parallel imaging. Because the under-sampled data must be divided, the information provided to the network for learning is further reduced, which degrades reconstruction performance at high acceleration rates. Therefore, Yaman et al. [65] further proposed a multi-mask method that makes fuller use of the under-sampled data to improve reconstruction quality at higher acceleration rates. Here, the under-sampled data are partitioned several times into pairs of disjoint subsets, \(\Omega = \Lambda_{j} \cup \Theta_{j}\) for \(j\) = 1, …, K, where K denotes the number of partitions for each scan. The process is visualized in Fig. 3. The results showed that the multi-mask method outperforms SSDU at high acceleration rates. Even so, because partitioning inevitably reduces the information available to the network, the achievable acceleration rates are still limited. Furthermore, Hosseini et al. [66] fine-tuned a pre-trained reconstruction network in a scan-specific manner using SSDU to reduce the risk of poor generalization to rare pathological conditions.
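The partition \(\Omega = \Lambda_{j} \cup \Theta_{j}\) can be sketched as a random split of the sampled k-space locations. This is a simplified sketch: the split here is uniformly random with a hypothetical `loss_fraction`, whereas the cited works may use a different selection rule; the function names are illustrative.

```python
import numpy as np

def split_mask(omega, loss_fraction=0.4, rng=None):
    """Split sampled locations Omega into disjoint Lambda (input) and Theta (loss)."""
    rng = np.random.default_rng() if rng is None else rng
    idx = np.flatnonzero(omega)                          # indices of acquired samples
    n_loss = int(round(loss_fraction * idx.size))
    loss_idx = rng.choice(idx, size=n_loss, replace=False)
    theta = np.zeros(omega.shape, dtype=bool)
    theta.flat[loss_idx] = True                          # locations used only in the loss
    lam = omega.astype(bool) & ~theta                    # remaining locations feed the network
    return lam, theta

def multi_mask(omega, n_partitions, loss_fraction=0.4, rng=None):
    """K independent partitions of the same Omega, as in the multi-mask idea."""
    rng = np.random.default_rng() if rng is None else rng
    return [split_mask(omega, loss_fraction, rng) for _ in range(n_partitions)]

rng = np.random.default_rng(0)
omega = rng.random((32, 32)) < 0.25                      # acquired under-sampling pattern
partitions = multi_mask(omega, n_partitions=3, rng=rng)
```

During training, the network would see only k-space values on \(\Lambda_{j}\), and the loss would compare its output, mapped back to k-space, with the held-out values on \(\Theta_{j}\).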

Fig. 3

Image reconstruction with self-partitioned undersampled k-space data. Before the network is trained, the acquired undersampled k-space data \(\Omega\) are divided into two subsets satisfying \(\Omega = \Lambda_{j} \cup \Theta_{j}\), where \(j = 1,\ldots, {\text{K}}\) and K denotes the number of partitions for each scan; \(\Lambda_{j}\) is used as the input for training and \(\Theta_{j}\) is used to calculate the loss function. The network is unrolled based on the VSQP algorithm. This figure is reproduced following Fig. 1 in Ref. [65]

Complementary k-space information

The lack of fully sampled data motivates the study of how to use the information available in lossy images [67]. The key reason supervised learning cannot be used for network training is that the missing information is not available as a label for each sample.

Inspired by Noise2Noise (N2N) [67], an artefact-contaminated dataset obtained by sampling the same object multiple times can supply the information missing in each individual sample, which enables the training of imaging priors. However, sampling the same object multiple times runs against the original intention of accelerating MRI and of not requiring fully sampled data, which somewhat restricts the application scenarios.

Different body parts and regions of interest pose their own complications: liver imaging requires breath-holding, and the heart is constantly in motion. Therefore, specific deep learning models need to be adapted to specific tasks [68, 69]. Gan et al. [70] used the adjacent middle slices containing the most relevant brain regions of each subject from the open dataset OASIS-3 [71] to simulate multiple sampling of the same object. In [72], the training data come from different breathing phases within the same slice of the liver, which has obvious shortcomings for patients who do not breathe periodically. Similarly, for moving organs such as the heart, Ke et al. [73] used a time-interleaved acquisition scheme to build a series of fully encoded data as reference images for network training by merging the k-space of several adjacent frames along the time dimension. The rest of this section explains these methods in more detail.

Gan et al. [70] proposed to train two networks simultaneously: one for reconstruction, exploiting the complementary information between different samples, and another for registering the images, because the object may have moved during the actual scan. The experimental results showed that the method was superior to the unregistered Noise2Noise method and the traditional total variation (TV) method in terms of sharpness, contrast and artefact removal.

Additionally, experiments have shown that turning advanced denoisers into regularization terms can achieve good results [63, 74, 75]; regularization by denoising (RED) [74] and the plug-and-play prior (PnP) [76] are two common techniques. Meanwhile, compared with a handcrafted prior, a deep network can flexibly extract useful information from the dataset. Hence Liu et al. [72] proposed to pre-train an artefact-removal network as the imaging prior by exploiting the complementary information between under-sampled data. The regularization function uses the basic framework of RED, where the denoiser is replaced by the pre-trained network \(I_{\theta } ({\mathbf{x}})\) in the following form.

$$\Re ({\mathbf{x}}) = \frac{\tau }{2}{\mathbf{x}}^{T} ({\mathbf{x}} - I_{\theta } ({\mathbf{x}}))$$

where \(\tau\) is a regularization parameter. Substituting Eq. (16) into Eq. (2), \({\mathbf{x}}\) is then updated iteratively with a plain gradient descent algorithm. Because data fidelity information is used, the reconstructions of the proposed method on liver data were better than the results obtained directly through \(I_{\theta } ({\mathbf{x}})\) and outperformed traditional iterative algorithms such as k-t SLR. Moreover, more advanced traditional iterative algorithms, such as ADMM, can be used to update \({\mathbf{x}}\).
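The gradient descent update with the RED regularizer of Eq. (16) can be sketched as follows. Under the conditions assumed by RED [74], the gradient of Eq. (16) reduces to \(\tau ({\mathbf{x}} - I_{\theta } ({\mathbf{x}}))\). In this sketch a 3x3 mean filter is a hypothetical stand-in for the pre-trained network \(I_{\theta }\), \({\mathbf{A}}\) is a single-coil masked orthonormal FFT, and the weight \(\lambda\) of Eq. (2) is folded into `tau`.

```python
import numpy as np

def A(x, mask):
    return mask * np.fft.fft2(x, norm="ortho")

def AH(y, mask):
    return np.fft.ifft2(mask * y, norm="ortho")

def mean_filter(x):
    """3x3 circular mean filter: a hypothetical stand-in for the network I_theta."""
    out = np.zeros_like(x)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out = out + np.roll(x, (di, dj), axis=(0, 1))
    return out / 9.0

def red_gradient_descent(y, mask, denoiser, tau=0.5, step=0.5, n_iter=50, x0=None):
    x = AH(y, mask) if x0 is None else x0.copy()
    for _ in range(n_iter):
        grad_data = AH(A(x, mask) - y, mask)      # gradient of 1/2 ||y - A x||^2
        grad_prior = tau * (x - denoiser(x))      # RED gradient, valid under RED's assumptions
        x = x - step * (grad_data + grad_prior)
    return x

rng = np.random.default_rng(0)
x_true = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
mask = (rng.random((16, 16)) < 0.4).astype(float)
y = A(x_true, mask)
x_red = red_gradient_descent(y, mask, mean_filter)
```

Each iteration balances agreement with the measured k-space against agreement with the denoised image, which is why the result can differ from applying the denoiser once to a zero-filled image.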

In [73], Ke et al. averaged all acquired frames to improve the signal-to-noise ratio (SNR) and relieve memory pressure. It should be noted that the merge operation only occurs in the training sample synthesis stage and is not required in the subsequent testing stage. Therefore, the temporal resolution of the reconstructed images is not reduced. Concretely, a network is built to learn the correlation between the coils instead of estimating it with ESPIRiT [77], and the physical model-based ADMM-Net-III [41] is used as the reconstruction network. The structure of the method is shown in Fig. 4. The experimental results showed that the reconstruction quality was better than that of conventional reconstruction methods, such as k-t SLR [13], L + S [78] and KLR [79], and the reconstruction time was shorter [73]. Nevertheless, because breathing patterns differ between people, the generalization ability of the model will be affected.

Fig. 4

The structure diagram in [73]. In data preparation, the fully encoded k-space is obtained by integrating and averaging the k-space of multiple frames acquired in a time-interleaved manner. It is then undersampled with a designed sampling mask, and operations including the inverse Fourier transform and coil combination are applied to obtain the input and output data pairs. The parallel neural network consists of coil reconstruction and coil combination; see [41] for more details about ADMM-Net-III. This figure is reproduced following Fig. 1 and Fig. 3 in Ref. [73]
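The merging of time-interleaved frames into a fully encoded k-space can be sketched as follows. This is a simplified illustration, not the pipeline of [73]: it assumes a Cartesian acquisition in which frame t samples every R-th phase-encoding line starting at line t mod R, and it averages lines that were acquired more than once. The function names are hypothetical.

```python
import numpy as np

def interleaved_masks(n_frames, n_lines, R):
    """Frame t acquires lines (t mod R), (t mod R) + R, ... (assumed sampling scheme)."""
    masks = np.zeros((n_frames, n_lines), dtype=bool)
    for t in range(n_frames):
        masks[t, (t % R)::R] = True
    return masks

def merge_frames(kspace, masks):
    """kspace: (T, n_lines, n_readout), zero where unsampled; masks: (T, n_lines)."""
    counts = masks.sum(axis=0)                  # how often each line was acquired
    summed = kspace.sum(axis=0)
    merged = np.zeros_like(summed)
    covered = counts > 0
    merged[covered] = summed[covered] / counts[covered, None]   # average repeated lines
    return merged, covered

rng = np.random.default_rng(0)
T, n_lines, n_readout, R = 8, 16, 12, 4
full = rng.standard_normal((n_lines, n_readout)) + 1j * rng.standard_normal((n_lines, n_readout))
masks = interleaved_masks(T, n_lines, R)
kspace = masks[:, :, None] * full[None, :, :]   # static object: every frame sees the same k-space
merged, covered = merge_frames(kspace, masks)
```

For a static object the merged k-space equals the fully sampled one; for a moving object it is only an approximation, which is consistent with using the merged data as training references rather than at test time.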

Assuming known probability distribution as prior

Generative adversarial networks (GAN) [80] have shown strong advantages in unsupervised learning and can estimate the underlying data distribution while producing high image quality [81, 82]. In principle, the purpose of adversarial training is to approximate, in terms of a distance measure, the probability distribution of the label set. Hence, there is no need for the ground truth corresponding to each input. Nevertheless, the distribution of the real samples fed to the discriminator directly affects the usefulness of the final generator output, because the distribution of the generated output approaches it as the generator and discriminator reach equilibrium.

Cole et al. [83] proposed to build an ISTA-based GAN for MRI reconstruction in the absence of fully sampled data. The network framework shown in Fig. 5 is based on the assumption that the randomly undersampled k-space \({\mathbf{y}}^{\prime}\) has approximately the same distribution as the originally measured k-space \({\mathbf{y}}\), where \({\mathbf{y}}^{\prime}\) is obtained by applying the forward measurement operation, including a random undersampling mask, to the output of the generator. Hence, since the distribution of \({\mathbf{A}}\) is known, the true underlying distribution of \({\mathbf{x}}\) can be uniquely determined from \({\mathbf{y}}^{\prime}\) [84].

Fig. 5

Unsupervised GAN learning system. The input of the generator is the measured complex-valued k-space data and its output is a two-dimensional image. The forward measurement operation, including a random undersampling mask, is then applied to the generator output to obtain simulated undersampled k-space data. Finally, the discriminator tries to distinguish between the simulated data and the measured data. This figure is reproduced following Fig. 1 in Ref. [83]

Meanwhile, unlike the Jensen-Shannon divergence, the Wasserstein distance is continuous and differentiable as a measure of the distance between two probability distributions [85]. Its variant WGAN-GP, which has been shown to have the best convergence performance [86], was therefore selected as the loss function of the network. The experiments were carried out on dynamic contrast-enhanced (DCE) data and knee data, and the results showed that the reconstruction quality was competitive with that of the supervised counterpart and better than that of CS.

Besides, experiments have shown that image style transfer tasks, which have no ground truth, can be accomplished by unpaired training with adversarial training alone [87]. This is based on the assumption that there is a latent distribution relationship between unpaired samples, which the network tries to learn. Hence, Lei et al. [88] suggested that 2D images, which are easier to obtain, can be used as training labels for DCE images to train a PGD-based GAN, and Sim et al. [89] combined the cycle-consistent generative adversarial network (cycleGAN) [87] with the forward physics of MRI using optimal transport theory to train on unpaired samples. For this type of method, how closely the probability distributions of the unpaired samples are related directly affects the experimental results. Therefore, the results in [88] can be expected to be no better than CS, because the unpaired samples there are completely disjoint.

In addition to the methods mentioned above, there are other ways to train a network without fully sampled images. For example, reconstructions from traditional parallel compressed sensing imaging can be used as labels for network training [90], but the final reconstruction quality of the network will not be significantly better than that of traditional parallel compressed sensing [83].


In summary, deep learning can complete related tasks by learning an implicit mapping relationship. Although the learning process cannot be explained in detail, many experimental results show that deep learning is effective and feasible. Under normal conditions, supervised learning performs better than network learning without labels. But in many scenarios it is very difficult or even infeasible to obtain labels, which seriously hinders the application of supervised learning and highlights the importance of deep learning without ground truth. Although the reviewed methods are applied to MRI reconstruction, they can also be extended to other areas where it is difficult to obtain ground-truth data, such as dynamic positron emission tomography (PET) [91] or computed tomography (CT) [92].

While deep learning shows strong learning capabilities, it has also been criticized for its poor interpretability. Accordingly, the theoretical research of deep learning has become a new hot research direction. Researchers try to analyze the effective network structure from different angles to guide the construction of new networks, such as differential equations [93] and matrix decomposition [94].

Nevertheless, exploring the effective prior information in lossy data and the inherent characteristics of the network remains a direction that needs further study. In addition, an appropriate sampling strategy for a specific body part is important for deep learning reconstruction performance. For example, fast coronary imaging usually uses a spiral under-sampling scheme [95], and some studies try to learn the sampling strategy and the reconstruction at the same time through the network [96].

Fortunately, unfolding networks based on the physical model offer inspiration. Traditional iterative algorithms and deep learning each have their advantages: the former are computationally expensive but offer strong guidance, while the latter offers real-time imaging and powerful learning capabilities. We may anticipate that a deeper integration of traditional methods and deep learning will not only guide the construction of networks and bring more interpretability but, more importantly, also produce better results. Additionally, complementary information between multi-contrast images may be used as a prior in reconstruction.
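To make the idea of combining an iterative algorithm with a learned component concrete, the following is a minimal sketch and not any reviewed method's implementation. It assumes a 1-D toy signal, a unitary discrete Fourier transform as the encoding operator, and a binary under-sampling mask. The prior step is plain soft-thresholding, used here only as a stand-in for the learned module that an unfolding network would train. All function names (`dft`, `soft_threshold`, `unrolled_recon`, `objective`) and parameter values are hypothetical choices for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Unitary 1-D DFT; O(n^2), adequate for a toy signal."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def soft_threshold(x, thr):
    """Placeholder prior step; an unfolding network would put a learned module here."""
    out = []
    for v in x:
        mag = abs(v)
        out.append(v * (1 - thr / mag) if mag > thr else 0j)
    return out


def unrolled_recon(y, mask, n_iters=30, step=1.0, thr=0.02):
    """Alternate a data-consistency gradient step with a prior step."""
    x = dft(y, inverse=True)  # zero-filled starting point
    for _ in range(n_iters):
        kx = dft(x)
        # Residual M F x - y on the sampled k-space locations.
        residual = [m * kv - yv for m, kv, yv in zip(mask, kx, y)]
        grad = dft(residual, inverse=True)  # F^H (M F x - y)
        x = soft_threshold([xv - step * g for xv, g in zip(x, grad)], step * thr)
    return x


def objective(x, y, mask, thr):
    """Data-fidelity term plus l1 penalty minimized by the iteration above."""
    kx = dft(x)
    data = 0.5 * sum(abs(m * kv - yv) ** 2 for m, kv, yv in zip(mask, kx, y))
    return data + thr * sum(abs(v) for v in x)


# Toy data (assumed for illustration): a sparse signal with three spikes.
random.seed(0)
n = 32
x_true = [0j] * n
for idx in (3, 11, 20):
    x_true[idx] = 1 + 0j

# Keep the lowest frequencies and a random subset of the remaining ones.
mask = [1 if (k < 3 or k > n - 3 or random.random() < 0.4) else 0 for k in range(n)]
y = [m * v for m, v in zip(mask, dft(x_true))]

x_zf = dft(y, inverse=True)   # zero-filled reconstruction
x_rec = unrolled_recon(y, mask)
print(objective(x_zf, y, mask, 0.02), objective(x_rec, y, mask, 0.02))
```

Because the masked unitary transform has a Lipschitz constant of 1, a step size of 1 keeps the regularized objective from increasing between iterations. In an unfolding network the number of iterations would be fixed and the prior step, and possibly the step size, would be learned from data.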

Availability of data and materials

No datasets were generated or analyzed in this review.



Alternating direction method of multipliers


Conjugate gradient


Computed tomography


Compressed sensing


Data consistency


Dynamic contrast-enhanced


Fast Fourier transform


Generative adversarial networks


Iterative shrinkage-thresholding algorithm


Magnetic resonance imaging


Mean structure similarity index measure




Peak signal-to-noise ratio


Positron emission tomography


Projected iterative soft-thresholding algorithm


Proximal gradient descent




Regularization by denoising


Stochastic gradient descent


Self-supervised learning via data under-sampling


Total variation


Variable-splitting with the quadratic penalty


  1. Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P. SENSE: sensitivity encoding for fast MRI. Magn Reson Med. 1999;42(5):952–62.

  2. Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, Haase A. Generalized autocalibrating partially parallel acquisitions (GRAPPA). Magn Reson Med. 2002;47(6):1202–10.

  3. Lustig M, Pauly JM. SPIRiT: Iterative self-consistent parallel imaging reconstruction from arbitrary k-space. Magn Reson Med. 2010;64(2):457–71.

  4. Terpstra ML, Maspero M, D’Agata F, Stemkens B, Intven MP, Lagendijk JJ, van den Berg CA, Tijssen RH. Deep learning-based image reconstruction and motion estimation from undersampled radial k-space for real-time MRI-guided radiotherapy. Phys Med Biol. 2020;65(15):155015.

  5. Radke KL, Wollschläger LM, Nebelung S, Abrar DB, Schleich C, Boschheidgen M, Frenken M, Schock J, Klee D, Frahm J, Antoch G, Thelen S, Wittsack H-J, Lutz AM. Deep learning-based post-processing of real-time MRI to assess and quantify dynamic wrist movement in health and disease. Diagnostics. 2021;11(6):1077.

  6. Fessler JA. Optimization methods for MR image reconstruction (long version); 2019.

  7. Zhang HM, Dong B. A review on deep learning in medical image reconstruction. J Oper Res Soc China. 2020;8:311–40.

  8. Dar SUH, Özbey M, Çatlı AB, Çukur T. A transfer-learning approach for accelerated MRI using deep neural networks. Magn Reson Med. 2020;84(2):663–85.

  9. Qu X, Zhang W, Guo D, Cai C, Cai S, Chen Z. Iterative thresholding compressed sensing MRI based on contourlet transform. Inverse Probl Sci Eng. 2010;18(6):737–58.

  10. Knoll F, Bredies K, Pock T, Stollberger R. Second order total generalized variation (TGV) for MRI. Magn Reson Med. 2011;65(2):480–91.

  11. Zhan Z, Cai JF, Guo D, Liu Y, Chen Z, Qu X. Fast multiclass dictionaries learning with geometrical directions in MRI reconstruction. IEEE Trans Biomed Eng. 2016;63(9):1850–61.

  12. Qu X, Hou Y, Lam F, Guo D, Zhong J, Chen Z. Magnetic resonance image reconstruction from undersampled measurements using a patch-based nonlocal operator. Med Image Anal. 2014;18(6):843–56.

  13. Lingala SG, Hu Y, DiBella E, Jacob M. Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR. IEEE Trans Med Imaging. 2011;30(5):1042–54.

  14. Lustig M, Donoho D, Pauly JM. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn Reson Med. 2007;58(6):1182–95.

  15. Lustig M, Donoho DL, Santos JM, Pauly JM. Compressed sensing MRI. IEEE Signal Process Mag. 2008;25(2):72–82.

  16. Block KT, Uecker M, Frahm J. Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint. Magn Reson Med. 2007;57(6):1086–98.

  17. Liu Y, Zhan Z, Cai JF, Guo D, Chen Z, Qu X. Projected iterative soft-thresholding algorithm for tight frames in compressed sensing magnetic resonance imaging. IEEE Trans Med Imaging. 2016;35(9):2130–40.

  18. Eslahi SV, Dhulipala PV, Shi C, Xie G, Ji JX. Parallel compressive sensing in a hybrid space: application in interventional MRI. In: Annual international conference of the IEEE engineering in medicine and biology society (EMBC); 2017. p. 3260–63.

  19. Lai Z, Qu X, Liu Y, Guo D, Ye J, Zhan Z, Chen Z. Image reconstruction of compressed sensing MRI using graph-based redundant wavelet transform. Med Image Anal. 2016;27:93–104.

  20. Qu X, Guo D, Ning B, Hou Y, Lin Y, Cai S, Chen Z. Undersampled MRI reconstruction with patch-based directional wavelets. Magn Reson Imaging. 2012;30(7):964–77.

  21. Ravishankar S, Bresler Y. Data-driven learning of a union of sparsifying transforms model for blind compressed sensing. IEEE Trans Comput Imaging. 2016;2(3):294–309.

  22. Ravishankar S, Bresler Y. MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE Trans Med Imaging. 2010;30(5):1028–41.

  23. Wang Y, Ying L. Compressed sensing dynamic cardiac cine MRI using learned spatiotemporal dictionary. IEEE Trans Biomed Eng. 2013;61(4):1109–20.

  24. Yu Y, Jin J, Liu F, Crozier S. Multidimensional compressed sensing MRI using tensor decomposition-based sparsifying transform. PLoS ONE. 2014;9(6):e98441.

  25. Shin PJ, Larson PE, Ohliger MA, Elad M, Pauly JM, Vigneron DB, Lustig M. Calibrationless parallel imaging reconstruction based on structured low-rank matrix completion. Magn Reson Med. 2014;72(4):959–70.

  26. Ongie G, Jacob M. Off-the-grid recovery of piecewise constant images from few Fourier samples. SIAM J Imaging Sci. 2016;9(3):1004–41.

  27. Jin KH, Lee D, Ye JC. A general framework for compressed sensing and parallel MRI using annihilating filter based low-rank Hankel matrix. IEEE Trans Comput Imaging. 2016;2(4):480–95.

  28. Hu Y, Liu X, Jacob M. A generalized structured low-rank matrix completion algorithm for MR image recovery. IEEE Trans Med Imaging. 2019;38(8):1841–51.

  29. Zhang X, Guo D, Huang Y, Chen Y, Wang L, Huang F. Image reconstruction with low-rankness and self-consistency of k-space data in parallel MRI. Med Image Anal. 2020;63:101687.

  30. Haldar JP. Low-rank modelling of local k-space neighborhoods (LORAKS) for constrained MRI. IEEE Trans Med Imaging. 2014;33(3):668–81.

  31. Haldar JP, Zhuo J. P-LORAKS: low-rank modelling of local k-space neighborhoods with parallel imaging data. Magn Reson Med. 2016;75(4):1499–514.

  32. Yaman B, Hosseini SAH, Moeller S, Ellermann J, Uğurbil K, Akçakaya M. Self-supervised learning of physics-guided reconstruction neural networks without fully sampled reference data. Magn Reson Med. 2020;84(6):3172–91.

  33. Combettes PL, Pesquet J-C. Proximal splitting methods in signal processing. In: Fixed-point algorithms for inverse problems in science and engineering. Springer; 2011. p. 185–212.

  34. Daubechies I, Defrise M, De Mol C. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun Pure Appl Math. 2004;57(11):1413–57.

  35. Boyd S, Parikh N, Chu E. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn. 2011;3(1):1–122.

  36. Aggarwal HK, Mani MP, Jacob M. MoDL: Model-based deep learning architecture for inverse problems. IEEE Trans Med Imaging. 2019;38(2):394–405.

  37. Zhang J, Ghanem B. ISTA-Net: interpretable optimization-inspired deep network for image compressive sensing. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR); 2018. p. 1828–37.

  38. Zhang X, Lu H, Guo D, Bao L, Huang F, Xu Q. A guaranteed convergence analysis for the projected fast iterative soft-thresholding algorithm in parallel MRI. Med Image Anal. 2021;69:101987.

  39. Lu T, Zhang X, Huang Y, Guo D, Huang F, Xu Q, Hu Y, Yang L, Lin J, Yan Z, Qu X. pFISTA-SENSE-ResNet for parallel MRI reconstruction. J Magn Reson. 2020;318:106790.

  40. Yang J, Zhang Y, Yin W. A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. IEEE J Sel Top Signal Process. 2010;4(2):288–97.

  41. Cheng J, Wang H, Zhu Y, Liu Q, Zhang Q, Su T. Model-based deep medical imaging: the roadmap of generalizing iterative reconstruction model using deep learning; 2019.

  42. Qu X, Huang Y, Lu H, Qiu T, Guo D, Agback T, Orekhov V, Chen Z. Accelerated nuclear magnetic resonance spectroscopy with deep learning. Angew Chem Int Ed. 2020;59(26):10297–300.

  43. Chen D, Wang Z, Guo D, Orekhov V, Qu X. Review and prospect: deep learning in nuclear magnetic resonance spectroscopy. Chem Eur J. 2020;26(46):10391–401.

  44. Wang Z, Guo D, Huang Y, Tu Z, Orekhov V, Qu X. Accelerated NMR spectroscopy: merge optimization with deep learning; 2020.

  45. Hu W, Chen D, Qiu T, Chen H, Chen X, Yang L. Denoising single voxel magnetic resonance spectroscopy with deep learning on repeatedly sampled in vivo data; 2021.

  46. Huang Y, Zhao J, Wang Z, Guo D, Qu X. Complex exponential signal recovery with deep Hankel matrix factorization; 2020.

  47. Wang S, Su Z, Ying L, Peng X, Zhu S, Liang F. Accelerating magnetic resonance imaging via deep learning. In: IEEE 13th international symposium on biomedical imaging (ISBI); 2016. p. 514–17.

  48. Han Y, Yoo J, Kim HH, Shin HJ, Sung K, Ye JC. Deep learning with domain adaptation for accelerated projection-reconstruction MR. Magn Reson Med. 2018;80(3):1189–205.

  49. Akçakaya M, Moeller S, Weingärtner S, Uğurbil K. Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: database-free deep learning for fast imaging. Magn Reson Med. 2019;81(1):439–53.

  50. Han Y, Sunwoo L, Ye JC. K-space deep learning for accelerated MRI. IEEE Trans Med Imaging. 2020;39(2):377–86.

  51. Sun J, Li H, Xu Z. Deep ADMM-Net for compressive sensing MRI. In: Advances in neural information processing systems; 2016. p. 10–18.

  52. Adler J, Öktem O. Learned primal-dual reconstruction. IEEE Trans Med Imaging. 2018;37(6):1322–32.

  53. Bottou L. Large-scale machine learning with stochastic gradient descent. In: Proceedings of the 19th international conference on computational statistics (COMPSTAT), Paris, France. Physica-Verlag HD; 2010. p. 177–86.

  54. Kingma DP, Ba J. Adam: a method for stochastic optimization; 2014.

  55. Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res. 2011;12(7):2121–59.

  56. Hinton G, Srivastava N, Swersky K. Neural networks for machine learning. Coursera, video lectures. 2012;264(1):2146–53.

  57. Hammernik K, Klatzer T, Kobler E, Recht MP, Sodickson DK, Pock T, et al. Learning a variational network for reconstruction of accelerated MRI data. Magn Reson Med. 2018;79(6):3055–71.

  58. Qin C, Schlemper J, Caballero J, Price AN, Hajnal JV, Rueckert D. Convolutional recurrent neural networks for dynamic MR image reconstruction. IEEE Trans Med Imaging. 2019;38(1):280–90.

  59. Ulyanov D, Vedaldi A, Lempitsky V. Deep image prior. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. p. 9446–54.

  60. Yazdanpanah AP, Afacan O, Warfield SK. Non-learning based deep parallel MRI reconstruction (NLDpMRI). In: Medical imaging 2019: image processing, vol. 10949. International Society for Optics and Photonics (SPIE); 2019. p. 1094904.

  61. Senouf O, Vedula S, Weiss T, Bronstein A, Michailovich O, Zibulevsky M. Self-supervised learning of inverse problem solvers in medical imaging. Cham: Springer; 2019. p. 111–9.

  62. Jin KH, Gupta H, Yerly J, Stuber M, Unser M. Time-dependent deep image prior for dynamic MRI; 2019.

  63. Mataev G, Milanfar P, Elad M. DeepRED: deep image prior powered by RED. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2019.

  64. Wang AQ, Dalca AV, Sabuncu MR. Neural network-based reconstruction in compressed sensing MRI without fully-sampled training data. Cham: Springer; 2020. p. 27–37.

  65. Yaman B, Hosseini SAH, Moeller S, Ellermann J, Uğurbil K, Akçakaya M. Multi-mask self-supervised learning for physics-guided neural networks in highly accelerated MRI; 2020.

  66. Hosseini SAH, Yaman B, Moeller S, Akçakaya M. High-fidelity accelerated MRI reconstruction by scan-specific fine-tuning of physics-based neural networks. In: Proceedings of the 42nd annual international conference of the IEEE engineering in medicine and biology society (EMBC); 2020. p. 1481–4.

  67. Lehtinen J, Munkberg J, Hasselgren J, Laine S, Karras T, Aittala M. Noise2Noise: learning image restoration without clean data; 2018.

  68. Sun L, Fan Z, Ding X, Huang Y, Paisley J. Region-of-interest undersampled MRI reconstruction: a deep convolutional neural network approach. Magn Reson Imaging. 2019;63:185–92.

  69. Ramzi Z, Ciuciu P, Starck J-L. Benchmarking MRI reconstruction neural networks on large public datasets. Appl Sci. 2020;10(5):1816.

  70. Gan W, Sun Y, Eldeniz C, Liu J, An H, Kamilov US. Deep image reconstruction using unregistered measurements without groundtruth; 2020.

  71. LaMontagne PJ, Benzinger TL, Morris JC, Keefe S, Hornbeck R, Xiong C, et al. OASIS-3: longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer disease. 2019.

  72. Liu J, Sun Y, Eldeniz C, Gan W, An H, Kamilov US. RARE: Image reconstruction using deep priors learned without ground truth. IEEE J Sel Top Signal Process. 2020;14(6):1088–99.

  73. Ke Z, Cheng J, Ying L, Zheng H, Liang D. An unsupervised deep learning method for multi-coil cine MRI. Phys Med Biol. 2020;65(23):235041.

  74. Romano Y, Elad M, Milanfar P. The little engine that could: regularization by denoising (RED). SIAM J Imaging Sci. 2017;10(4):1804–44.

  75. Reehorst ET, Schniter P. Regularization by denoising: clarifications and new interpretations. IEEE Trans Comput Imaging. 2019;5(1):52–67.

  76. Venkatakrishnan SV, Bouman CA, Wohlberg B. Plug-and-play priors for model based reconstruction. In: Proceeding of the 2013 IEEE global conference on signal and information processing (GlobalSIP); 2013. p. 945–8.

  77. Uecker M, Lai P, Murphy MJ, Virtue P, Elad M, Pauly JM, et al. ESPIRiT: an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA. Magn Reson Med. 2014;71(3):990–1001.

  78. Otazo R, Candes E, Sodickson DK. Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magn Reson Med. 2015;73(3):1125–36.

  79. Nakarmi U, Wang Y, Lyu J, Liang D, Ying L. A Kernel-based low-rank (KLR) model for low-dimensional manifold recovery in highly accelerated dynamic MRI. IEEE Trans Med Imaging. 2017;36(11):2297–307.

  80. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S. Generative adversarial nets. In: Advances in neural information processing systems. New York; 2014. p. 2672–80.

  81. Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks; 2015.

  82. Zhu J-Y, Krähenbühl P, Shechtman E, Efros AA. Generative visual manipulation on the natural image manifold. In: Proceeding of the European conference on computer vision (ECCV); 2016. p. 597–613.

  83. Cole EK, Pauly JM, Vasanawala SS, Ong F. Unsupervised MRI reconstruction with generative adversarial networks; 2020.

  84. Bora A, Price E, Dimakis AG. AmbientGAN: Generative models from lossy measurements. In: Proceeding of the international conference on learning representations (ICLR). 2018.

  85. Villani C. The Wasserstein distances. In: Optimal transport, vol. 338. Springer; 2009. p. 93–111.

  86. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville AC. Improved training of Wasserstein GANs. In: Advances in neural information processing systems; 2017. p. 30.

  87. Zhu J-Y, Park T, Isola P, Efros AA. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision (ICCV); 2017. p. 2223–32.

  88. Lei K, Mardani M, Pauly JM, Vasanawala S. Wasserstein GANs for MR imaging: from paired to unpaired training. IEEE Trans Med Imaging. 2021;40(1):105–15.

  89. Sim B, Oh G, Kim J, Jung C, Ye JC. Optimal transport driven CycleGAN for unsupervised learning in inverse problems. SIAM J Imaging Sci. 2020;13(4):2281–306.

  90. Cheng JY, Chen F, Alley MT, Pauly JM, Vasanawala SS. Highly scalable image reconstruction using deep neural networks with bandpass filtering; 2018.

  91. Gong K, Catana C, Qi J, Li Q. Direct Patlak reconstruction from dynamic PET using unsupervised deep learning. In: Proceeding of the 15th international meeting on fully three-dimensional image reconstruction in radiology and nuclear medicine, vol. 11072. International Society for Optics and Photonics (SPIE); 2019. p. 110720R.

  92. Gallegos IO, Koundinyan S, Suknot AN, Jimenez ES, Thompson KR, Goodner RN. Unsupervised learning methods to perform material identification tasks on spectral computed tomography data. In: Radiation detectors in medicine, industry, and national security XIX, vol. 10763. International Society for Optics and Photonics (SPIE); 2018. p. 107630G.

  93. Lu Y, Zhong A, Li Q, Dong B. Beyond finite layer neural networks: bridging deep architectures and numerical differential equations. In: Proceedings of the international conference on machine learning (ICML); 2018. p. 3276–85.

  94. Ye JC, Han Y, Cha E. Deep convolutional framelets: a general deep learning framework for inverse problems. SIAM J Imaging Sci. 2018;11(2):991–1048.

  95. Irarrazabal P, Nishimura DG. Fast three dimensional magnetic resonance imaging. Magn Reson Med. 1995;33:656–62.

  96. Aggarwal HK, Jacob M. J-MoDL: joint model-based deep learning for optimized sampling and reconstruction; 2019.


The authors thank Zhangren Tu, Chen Qian, and Haoming Fang for valuable discussions.


This work is supported in part by the National Natural Science Foundation of China (61871341, 61901188, 61672335), in part by the Natural Science Foundation of Fujian Province of China under Grant (2021J011184), Health-Education Joint Research Project of Fujian Province (2019-WJ-31), and the science and technology fund of Fujian education department (JT180280).

Author information

Authors and Affiliations



GZ and DG conceived the idea of the manuscript and were major contributors in writing the manuscript. YG and JZ prepared some figures and polished the English. XQ and ZW revised the manuscript. ZL and XD responded to the review comments and revised the manuscript accordingly. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Di Guo.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

Xiaobo Qu works as a Senior Editor for BMC Medical Imaging. The other authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zeng, G., Guo, Y., Zhan, J. et al. A review on deep learning MRI reconstruction without fully sampled k-space. BMC Med Imaging 21, 195 (2021).
