Compressed sensing MRI prior art
In CS MRI, \(\textbf {x}\in \mathbb {C}^{p}\) denotes the vector form of the 2D MR image to be reconstructed and y=F
u
x denotes the sensing procedure in k-space, where \(\textbf {F}_{u}\in \mathbb {C}^{m\times p}\left (m \ll p\right)\) means undersampled Fourier Encoding matrix and \(\textbf {y}\in \mathbb {C}^{m}\) represents k-space measurements. Assume Ψ represents an analytical sparse transform operator or the inverse of a set of signals learnt from image patches, the sparse representation is defined as α=Ψ x. Reconstructing the unknown MR image x from measurements y by using CS is to solve the constrained CS MRI optimization problem (3)
$$ \min_{\textbf{x}} \left\|\boldsymbol{\Psi} \textbf{x}\right\|_{1} \ \ \text{s.t.} \ \left\|\textbf{F}_{u} \textbf{x}-\textbf{y}\right\|^{2}_{2} \leq \boldsymbol{\varepsilon} $$
((3))
where \(\boldsymbol {\varepsilon }\in \mathbb {C}^{m}\) represents the allowed noise vector in reconstructed image, l 1 regularization enforces sparsity of Ψ x and error constraint fits the reconstruction to the sampled k-space measurements. Finite difference referred to as TV is typically added into (3) for noise reduction and spatial homogeneity enhancement. Then the formulation is
$$ \min_{\textbf{x}} \left\|\boldsymbol{\Psi} \textbf{x}\right\|_{1}+ \lambda TV\left(\textbf{x}\right) \ \ \text{s.t.} \ \left\|\textbf{F}_{u} \textbf{x}-\textbf{y}\right\|^{2}_{2} \leq \boldsymbol{\varepsilon} $$
((4))
where λ>0 denotes the weight of TV regularization. This becomes problem formulation settled by classical LDP when Ψ represents DWT. Besides, most state-of-the-art AL based reconstruction methods consider one unconstrained problem rather than (4), such as TVCMRI, FCSA, iterative thresholding CS MRI based on SFLCT and SALSA [50], etc
$$ \min_{\textbf{x}} \lambda_{1}\left\|\boldsymbol{\Psi} \textbf{x}\right\|_{1}+ \lambda_{2} TV\left(\textbf{x}\right)+\frac{1}{2}\left\|\textbf{F}_{u} \textbf{x}-\textbf{y}\right\|^{2}_{2} $$
((5))
Additionally, reconstruction based on explicit DL from image patches has been explored
$$ \begin{aligned} &\min_{\textbf{x},\textbf{D},\boldsymbol{\Gamma}} \sum\limits_{ij}\left\|\textbf{R}_{ij}\textbf{x}-\textbf{D}\boldsymbol{\alpha}_{ij}\right\|^{2}_{2}+ \lambda \left\|\textbf{F}_{u} \textbf{x}-\textbf{y}\right\|^{2}_{2} \ \text{s.t.} \ \left\|\boldsymbol{\alpha}_{ij}\right\|_{0} \\ &\quad\leq T_{0}, \forall i,j \end{aligned} $$
((6))
(6) is the formulation settled by DLMRI integrating DL with image reconstruction simultaneously (refer to [35]), where matrix \(\textbf {R}_{\textit {ij}}\in \mathbb {C}^{n \times p}\) is an operator that extracts maximum overlapped patch x
ij
as a column vector from x, denoted as x
ij
=R
ij
x. \(\sum _{\textit {ij}}\textbf {R}_{\textit {ij}}^{T} \textbf {R}_{\textit {ij}}\in \mathbb {C}^{p \times p}\) represents a diagonal matrix, in which the diagonal elements correspond to pixel locations of x. Γ is sparse representation set {α
ij
}
ij
of all training patches of image. T 0 denotes sparse threshold and D the explicit dictionary. Here DL problem is depicted in detail as a foundation of the proposed sparsity structure. The DL problem is
$$ \min_{\textbf{D}, \boldsymbol{\Gamma}} \sum\limits_{ij}\left\|\textbf{R}_{ij}\textbf{x}-\textbf{D}\boldsymbol{\alpha}_{ij}\right\|^{2}_{2} \ \ \text{s.t.} \ \left\|\boldsymbol{\alpha}_{ij}\right\|_{0} \leq T_{0}, \forall i,j $$
((7))
Formulation (7) is nonconvex and NP-hard [51, 52] because it comes down to sparse encoding problem for fixed D and x. K-SVD [53, 54] is a simple but efficient approach to attack (7). It simultaneously implements dictionary update step where each atom of D renews sequentially and corresponding sparse encoding for image patches that currently use it. The dictionary atom update involves computing K singular value decompositions(SVDs), once for each atom.
Proposed method
In this section, a composite regularization CS MRI method established on a novel composite sparsity structure is presented. In the sparsity structure, UDCT decomposes spatial image into one lowpass sub-band and several highpass sub-bands. Patch-based dictionary is learnt from the lowpass sub-band coefficients patches via Sparse K-SVD [25]. Then a novel composite regularization reconstruction model is thereby established and solved via VS and ADMM-2. The reconstruction model involves spatial image and transform coefficients regularization and k-space data fitting. The framework in Fig. 1 shows clearly the implementation process of the proposed method, in which the unknown MR image x is initialized with a zero-filling reconstructed image via direct inverse Fourier transform to k-space measurements, denoted as \(\textbf {x}_{0}=\textbf {F}^{H}_{u}\textbf {y}\). UDCT decomposes both the real and imaginary parts of x 0 into S levels, each level possessing 2κ
s
directional sub-bands. The real and imaginary parts of complex-valued MR image are handled separately because UDCT can only perfectly deal with real-valued data. Take the real part of zero-filling reconstructed image for example, the lowpass UDCT sub-band is divided into maximum overlapped patches (for dividing method, refer to [35]) as training database for DL to enhance its sparsity. The obtained dictionary D r (r=1(0) denotes result over real (imaginary) part) is the result of Sparse K-SVD to the training database. The sparse encodings set are referred to as the double sparse coefficients for all training lowpass UDCT sub-band patches over learnt D r. For imaginary part, the same procedure is implemented. Let x 0 be the initial intermediate image and (D r)† the pseudo-inverse of D r. The reconstruction step starts afterwards. All nonoverlapping vector form patches (n×1 sized) are arrayed to produce a matrix from lowpass UDCT sub-band of intermediate image. Results of (D r)† multiplying with the above matrix are the representation coefficients \(\boldsymbol {\Gamma }^{r}_{R}\) (differing from double sparse coefficients) of lowpass UDCT sub-band coefficients over the dictionary. They are generally not sparse but easier to handle in our reconstruction approach. The composite regularization reconstruction formulation is solved by using VS and ADMM-2 based on C-SALSA thoughts in an iterative procedure (an updated intermediate image for once iteration), which involves modifying the representation coefficients, UDCT sub-bands coefficients and k-space measurements. The proposed method is named as local sparsity enhanced CS MRI(LSECSMRI). Formulations and implementations of the proposed sparsity structure and relevant reconstruction model are described in detail in the following content.
Composite sparsity structure
To the best of our knowledge, DWT is not applicable for 2D image sparse representation [55, 56] because of its very limited directions and incapacity to capture geometric regularity along singularities of surfaces. Multi-scale geometric analysis(MGA) methods like contourlet transform [56], nonsubsampled contourlet transform(NSCT) [57], SFLCT, FDCT and DST, etc, conquer the defects of DWT. Contourlet transform lacks shift-invariance though, which causes pseudo-Gibbs phenomena around singularities. The resulting contourlets cannot ensure good frequency localization and exhibit some fuzzy artifacts inevitably with a low redundancy of 4/3. NSCT is overcompletely designed with better frequency selectivity, regularity and fully shift-invariance. But it possesses very high redundancy and large time consumption. SFLCT is a semi-redundant contourlet transform providing flexible redundancy. It only increases redundancy in the lowpass filter l 0(ω) to reduce pseudo-Gibbs phenomena, which are mainly induced by down-sample in the lowpass filter. The redundancy can be 2.33 if we don’t down sample l 0(ω) and 1.33 if we down sample it by setting ↓(d,d)=↓(2,2), where d is the down-sample parameter that determines the redundancy of contourlet. But SFLCT performs negatively in capturing clear directional features. Similarly, DST provides low redundancy less than or equal to 2. The needle-shaped elements of FDCT allow very high directional sensitivity and anisotropy and are thus very efficient in representing line-like edges. Nevertheless, FDCT possesses too high redundancy, which makes it sub-optimal in sparse representation, either.
UDCT [46] is recently proposed as an innovative implementation of discrete curvelet transform disposing real-valued signals. Utilizing the ideas of FFT-based discrete curvelet transform and filter-bank based contourlet transform, UDCT is designed as a perfect multi-resolution reconstruction filter bank(FB) and executed by FFT algorithm, possessing advantages of the both. The discrete curvelet functions in UDCT are defined by a parameterized family of smooth windowed functions that satisfy two conditions: 1) 2π periodic; 2) their squares form a partition of unity and the centers of the curvelet functions at each resolution are positioned on a uniform lattice. Moreover, the lattices of lower scales are subset of those at higher scales, guaranteeing clear parent-children relationship. UDCT can provide flexible instead of fixed number of clear directions at each scale to capture various directional geometrical structures of image accurately. Besides, the forward and inverse transform form a tight and self-dual frame with an acceptable redundancy of 4, allowing the input real-valued images to be perfectly reconstructed. In terms of the implementation of UDCT, once all the curvelet windows are computed, the actual forward and inverse UDCT computations are straightforward. UDCT has asymptotic approximation properties: for image x with C 2 (C is a constant) singularities, the best N-term approximation x
N
(N is the number of most important coefficients allowing reconstruction) in the curvelet expansion is given by [12, 55, 58]
$$ \left\|\textbf{x}-\textbf{x}_{N}\right\|^{2}_{2} \leq CN^{-2}\left(\text{logN}\right)^{3} \ \ N\rightarrow{} \infty $$
((8))
This property is known as the optimal sparsity. Therefore, UDCT is considered as the optimal predefined sparse method and is introduced into CS MRI in this paper.
UDCT belongs to predefined sparse priors, which implies that it lacks adaptivity. Explicit dictionary representations in spatial domain gain a higher degree of freedom in the training but sacrifice efficiency of the result. Besides, this kind of explicit dictionary cannot describe hierarchical structures. Incorporating the advantages of UDCT with DL and compensating for the defects of each other, an efficient composite sparsity structure is proposed for local sparsity enhancement. This structure learns adaptive dictionary over lowpass UDCT sub-band coefficients patches. For real (imaginary) part of image x, the DL problem is deduced as
$$ \min_{\textbf{D}^{r}, \boldsymbol{\Gamma}^{r}} \sum\limits_{ij}\left\|\textbf{R}_{ij}(\boldsymbol{\Psi} \textbf{x}^{r})_{l}-\textbf{D}^{r}\boldsymbol{\alpha}_{ij}\right\|^{2}_{2} \ \ \text{s.t.} \ \left\| \boldsymbol{\alpha}_{ij}\right\|_{0} \leq T_{0}, \forall i,j $$
((9))
In (9), Ψ represents UDCT operator. x r denotes real (imaginary) part if r=1(0) and the subscript l denotes lowpass sub-band. The resulting D r is thus dictionary for real (imaginary) part. Γ r is double sparse representation set of {α
ij
}
ij
. The proposed composite sparsity structure has some advantages. Compared with predefined sparse priors, it provides adaptivity and sparser representations for lowpass sub-band coefficients according to the different structural features of lowpass and highpass UDCT sub-bands coefficients. Compared with explicit dictionary, it allows hierarchical sparsity (depending mostly on UDCT) and reduces overfitting and instability in handling noise. Contrast with double sparsity model, it reduces the amount of calculations for not training dictionaries over highpass UDCT sub-bands, which are generally sparse enough. Therefore, the proposed sparse method can fine fit and hierarchically sparsify MR images with important characteristics preserved and avoid wasting time, making a big difference for improved MR image reconstruction performance. In Fig. 2, an example of dictionary trained by using Sparse K-SVD according to the proposed sparsity structure is exhibited. The used image is complex-valued T2-weighted brain image of slice 27 (MR T2wBrain_slice_27 of 256×256 sized). It is acquired from a healthy volunteer at a 3T Siemens Trio Tim MRI scanner, using the T2-weighted turbo spin echo sequence (T R/T E=6100/99 ms, 220×220 mm field of view, 3 mm slice thickness) [37]. The training maximum overlapped patches are 8×8 sized and obtained from lowpass UDCT sub-band coefficients after S=1 level’s UDCT decomposition for real part of MR T2wBrain_slice_27. One can see that the resulting dictionary is a highly structured matrix, implying several favorable properties.
Compressed sensing MRI reconstruction
The proposed reconstruction model and involved reconstruction procedure are demonstrated in this section. It has been proved that composite regularization performs better than either spatial image TV regularization or sparse coefficients l 1 regularization in reconstruction [5, 21, 33]. Besides, the lowpass UDCT sub-band α
l
is the rough approximation of spatial image to be reconstructed and contains large amounts of information, which indicates that TV regularization on lowpass UDCT sub-band coefficients may further improve edge details and suppress noise, promoting the reconstruction quality. According to the different structural features of spatial image, lowpass sub-band α
l
and highpass sub-bands α
h
, a new composite regularization reconstruction model is thereby proposed to handle various regularization efficiently in this paper. The reconstruction model can be denoted as
$$\begin{array}{@{}rcl@{}} \widetilde{\boldsymbol{\alpha}}&=&\min_{\boldsymbol{\Gamma}^{r}_{R},\boldsymbol{\alpha}} TV\left(\boldsymbol{\Psi}^{-1}\boldsymbol{\alpha}\right)+\left\|{\boldsymbol{\alpha}_{h}}\right\|_{1}+TV\left(\textbf{D}^{r}\boldsymbol{\Gamma}^{r}_{R}\right)+\left\|\textbf{D}^{r}\boldsymbol{\Gamma}^{r}_{R}\right\|_{1}\\ \textbf{x}&=&\boldsymbol{\Psi}^{-1}\widetilde{\boldsymbol{\alpha}} \ \ \text{s.t.} \ \left\|\textbf{F}_{u} \textbf{x}-\textbf{y}\right\|^{2}_{2} \leq \boldsymbol{\varepsilon} \end{array} $$
((10))
where \(\boldsymbol {\varepsilon }\in \mathbb {C}^{m}\) is proportional to the noise standard deviation and controls the allowed noise level. The subscript h means highpass UDCT sub-bands coefficients. Spatial image TV regularization T V(Ψ −1 α) and lowpass UDCT sub-band coefficients TV regularization \(TV\left (\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right)\) regularize image without smoothing the boundaries, guaranteeing clear edge details in the reconstructed result. UDCT sub-bands coefficients l 1 regularization realizes sparsity and automatic selection of the most important characteristics. Undersampled k-space data fidelity term makes the reconstruction fitting the measurements. This reconstruction model is effective for embedding efficient composite regularization according to the different structural features of spatial image, lowpass and highpass UDCT sub-bands coefficients. Since TV regularization is capable of maintaining the boundaries of the objects, the reconstructed edge details can be further strengthened by two level of TV regularization. Besides, integrating \(\boldsymbol {\Gamma }^{r}_{R}\) modification into lowpass UDCT sub-band coefficients TV and l 1 regularization can guarantee the accuracy of updated \(\boldsymbol {\Gamma }^{r}_{R}\), reduce the complexity of the reconstruction model and overfitting significantly.
Basic thought to settle this reconstruction model is based on C-SALSA. And it is slightly different from C-SALSA because the designed composite regularization contains more than one regularization terms. Since parameter ε is clearly defined and easy to set, the previous AL based methods ignore it and introduce other regularization parameters λ 1(2). They are thus inefficient [47], such as TVCMRI, FCSA and SALSA, etc. While in the proposed method, the constrained problem (10) is equivalent to an unconstrained one with a discontinuous objective based on the thoughts of C-SALSA
$$ {\begin{aligned} &\min_{\boldsymbol{\Gamma}^{r}_{R},\boldsymbol{\alpha},\textbf{x}} \lambda_{1}\left[TV\left(\boldsymbol{\Psi}^{-1}\boldsymbol{\alpha}\right)+\left\|{\boldsymbol{\alpha}}_{h}\right\|_{1}+TV\left(\textbf{D}^{r}\boldsymbol{\Gamma}^{r}_{R}\right)+\left\|\textbf{D}^{r}\boldsymbol{\Gamma}^{r}_{R}\right\|_{1}\right]\\ &\quad+\lambda_{2}\mathscr{L}_{E(\boldsymbol{\varepsilon},\textbf{I},\textbf{y})}\left(\textbf{F}_{u} \textbf{x}\right) \end{aligned}} $$
((11))
where λ 1 measures the weight of TV and l 1 regularization, λ 2 measures the weight of k-space data fidelity and E (ε,I,y) is simply a closed ε-radius Euclidean ball centered at y. As is shown, (11) can be treated as a special case of E q.30 in [47] by defining \(\Phi \left (\cdot \right)=TV\left (\boldsymbol {\Psi }^{-1}\boldsymbol {\alpha }\right)+\left \|\boldsymbol {\alpha }_{h}\right \|_{1}+TV\left (\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right)+\left \|\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right \|_{1}\). It has the form of E q.13 in [47] with J=5 number of functions. These functions in (11) are closed, proper and convex (for details, refer to [47]). Minimization problem (11) is allowed to be decoupled into 5 independent and resoluble ones by a particular mapping way. These 5 independent subproblems include spatial image TV regularization T V(Ψ −1 α), lowpass UDCT coefficients TV regularization \(TV\left (\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right)\), UDCT sub-bands coefficients l 1 regularization \(\left \|\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right \|_{1}\) and ∥α
h
∥1 and k-space data fidelity \(\mathscr {L}_{E(\boldsymbol {\varepsilon },\textbf {I},\textbf {y})}\left (\textbf {F}_{u} \textbf {x}\right)\). Since all the functions in (11) are closed, proper, convex and [I Ψ F
u
]T has full column rank (Ψ itself is a full column rank matrix), convergence of ADMM-2 involving problem (11) is guaranteed according to Theorem1 (Eckstein-Bertsekas, [59]). The solution of (11) is enforced to approach that of (10) via slowly taking λ 1(2) to very large values by multiplying with a continuous factor ρ (a continuation process with initial values as λ 10, λ 20 and ρ>1). Introducing ADMM-2 to this particular case requires the definition of the Moreau proximal maps associated with l 1 regularization, TV regularization and \(\mathscr {L}_{E(\boldsymbol {\varepsilon }, \textbf {I}, \textbf {y})}\). Since the input of regularizer can be spatial image, UDCT sub-bands coefficients and representation coefficients, μ is introduced to represent the input of regularization uniformly. The Moreau proximal map function of regularization \(\Phi \left (\boldsymbol {\mu }\right):\mathbb {C}^{p}\rightarrow {}\mathbb {C}^{p}\) is denoted as
$$ \boldsymbol{\Theta}_{\Phi}\left(\hat{\boldsymbol{\mu}}\right)=\arg \min_{\boldsymbol{\mu}}\frac{1}{2}\left\|\boldsymbol{\mu}-\hat{\boldsymbol{\mu}}\right\|^{2}_{2}+\Phi\left(\boldsymbol{\mu}\right) $$
((12))
where \(\hat {\boldsymbol {\mu }}\) is the result of mapping to μ according to the mapping relation \(\mathbb {C}^{p}\rightarrow {}\mathbb {C}^{p}\). A simple soft threshold method \(\boldsymbol {\Theta }_{\lambda _{1}\Phi }(\cdot)=soft(\cdot, \lambda _{1})\) solves the l 1 regularization relevant to ∥α
h
∥1 and \(\left \|\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right \|_{1}\). The available fast Chambolle’s algorithm [60] handles TV regularization efficiently. Define ν=F
u
x via VS technique, then the Moreau proximal map of \(\mathscr {L}_{E(\boldsymbol {\varepsilon },\textbf {I},\textbf {y})}\) is simply the orthogonal projection of ν on the closed ε-radius ball centered at y, which can be attacked by
$$ \boldsymbol{\Theta}_{\lambda_{2}\mathscr{L}_{E(\boldsymbol{\varepsilon},\textbf{I},\textbf{y})}}\left(\boldsymbol{\nu}\right)=\left\{ \begin{array}{ll} \textbf{y}+\boldsymbol{\varepsilon}\frac{\boldsymbol{\nu}-\textbf{y}}{\left\|\boldsymbol{\nu}-\textbf{y}\right\|_{2}} &\text{if} \left\|\boldsymbol{\nu}-\textbf{y}\right\|_{2} \textgreater \boldsymbol{\varepsilon}\\ \boldsymbol{\nu} &\text{if} \left\|\boldsymbol{\nu}-\textbf{y}\right\|_{2} \leq \boldsymbol{\varepsilon} \end{array} \right. $$
((13))
In ADMM-2, error minimization with respect to ν aims at fitting measurements of k-space. Regularization terms with respect to spatial image, α
h
and \(\boldsymbol {\Gamma }^{r}_{R}\) avoid overfitting. The fidelity and regularization terms optimization are implemented alternatively. \(\boldsymbol {\Gamma }^{r}_{R}\) is modified from the weighted average between results of \(TV\left (\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right)\) and \(\left \|\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\right \|_{1}\). Then lowpass UDCT sub-band coefficients are obtained as the weighted average between results of T V(Ψ −1 α) and \(\textbf {D}^{r}\boldsymbol {\Gamma }^{r}_{R}\). The modified highpass UDCT sub-bands coefficients are obtained as the weighted average between results of T V(Ψ −1 α) and ∥α
h
∥1. Then image in spatial domain is acquired as the result of inverse UDCT to UDCT coefficients. The subproblem with respect to E(ε,I,y) can be solved via (13) efficiently. The ultimate reconstructed image x
i
(i as counter of iterations for ADMM-2) is the result of the reweighted average between the above spatial domain image through regularization penalty and result of (13) for once iteration. Such process reduces reconstruction error and brings about a high-quality reconstructed image efficiently. The flowchart for LSECSMRI reconstruction in Fig. 3 exhibits clearly the reconstruction process based on the proposed sparsity structure.
Summary of LSECSMRI
In LSECSMRI procedure, Ψ is uniform discrete curvelet decomposition operator of S levels (s=1,2,…,S) with 2κ
s
directions for each level, performing on real and imaginary parts of complex-valued MR data, respectively. The obtained UDCT sub-bands coefficients, representation coefficients of lowpass UDCT sub-band coefficients over D r and measurements y are sent into the composite regularization reconstruction model for MR image reconstruction. The proposed reconstruction method has some advantages as follows. Firstly, the reconstruction method handles the lowpass sub-band and highpass directional sub-bands coefficients regularization separately with different regularization methods, allowing effective reconstruction. Besides, adding the indicator function of ε into the objective makes the resulting problem equivalent to the original problem (10) based on the thoughts of C-SALSA. The resulting problem can be decoupled into several subproblems, which are easy to solve with fast convergent algorithms. Additionally, \(\boldsymbol {\Gamma }^{r}_{R}\) is modified by using the result of TV and l 1 regularization on lowpass UDCT sub-band coefficients, which reduces the computational complexity and overfitting significantly. Hence, the proposed reconstruction approach provides accurate solution along with rapid convergence speed. Superiorities of LSECSMRI are confirmed experimentally later.