Automated lesion detection on MRI scans using combined unsupervised and supervised methods

Background Accurate and precise detection of brain lesions on MR images (MRI) is paramount for accurately relating lesion location to impaired behavior. In this paper, we present a novel method to automatically detect brain lesions from a T1-weighted 3D MRI. The proposed method combines the advantages of both unsupervised and supervised methods. Methods First, unsupervised methods perform a unified segmentation normalization to warp images from the native space into a standard space and to generate probability maps for different tissue types, e.g., gray matter, white matter and fluid. This allows us to construct an initial lesion probability map by comparing the normalized MRI to healthy control subjects. Then, we perform non-rigid and reversible atlas-based registration to refine the probability maps of gray matter, white matter, external CSF, ventricle, and lesions. These probability maps are combined with the normalized MRI to construct three types of features, with which we use supervised methods to train three support vector machine (SVM) classifiers for a combined classifier. Finally, the combined classifier is used to accomplish lesion detection. Results We tested this method using T1-weighted MRIs from 60 in-house stroke patients. Using leave-one-out cross validation, the proposed method can achieve an average Dice coefficient of 73.1 % when compared to lesion maps hand-delineated by trained neurologists. Furthermore, we tested the proposed method on the T1-weighted MRIs in the MICCAI BRATS 2012 dataset. The proposed method can achieve an average Dice coefficient of 66.5 % in comparison to the expert annotated tumor maps provided in MICCAI BRATS 2012 dataset. In addition, on these two test datasets, the proposed method shows competitive performance to three state-of-the-art methods, including Stamatakis et al., Seghier et al., and Sanjuan et al. Conclusions In this paper, we introduced a novel automated procedure for lesion detection from T1-weighted MRIs by combining both an unsupervised and a supervised component. In the unsupervised component, we proposed a method to identify lesioned hemisphere to help normalize the patient MRI with lesions and initialize/refine a lesion probability map. In the supervised component, we extracted three different-order statistical features from both the tissue/lesion probability maps obtained from the unsupervised component and the original MRI intensity. Three support vector machine classifiers are then trained for the three features respectively and combined for final voxel-based lesion classification.


Background
Accurate detection of lesions in the brain is critical to both clinical practice and neuropsychological research. For example, every year more than 795,000 people in the United States suffer a new or recurrent stroke (http:// www.strokeassociation.org) and the identification and analysis of the brain lesions resulting from a stroke can help understand the lesion-deficit relationship [1,2], predict patient diagnosis and prognosis [3], and chart the development of brain pathology over time [3]. In the past two decades, Magnetic Resonance Imaging (MRI) has become a reliable and increasingly popular technique for identifying brain damage and pathologies [4]. To study brain lesions using MRI, the first step is to accurately detect the lesion from different-modality MRIs (e.g. Diffusion-weighted imaging, T1-MRI, FLAIR, or T2-MRI). Here we focus on T1-weighted images in chronic stroke, as this modality is generally available, provides high spatial resolution and good contrast between gray matter (GM), white matter (WM) and cerebral spinal fluid (CSF). The goal of the research presented here is to develop a fully automatic algorithm for accurately detecting lesions from T1-MRI scans in chronic stroke patients.
Traditionally, the gold standard for lesion detection relied on manual delineation by one or more trained neurologists/radiologists creating a binary lesion mask [5]. These methods show high reliability (e.g. 0.86-0.95) between raters [4,6]. However, manual labeling is laborious and subjective. In recent years, several automated methods have been developed for brain lesion detection [7][8][9][10][11][12][13]. However, automated brain lesion detection from MR images is still a very challenging problem, particularly when only the T1-MRI is available. First, it is sometimes difficult to separate the lesion from the surrounding tissues that is relatively structurally intact based only on image intensity since the intensities of the lesion and healthy tissues may be similar, not to mention that the intensity of the lesion may not be homogeneous. Second, lesions are often non-rigid and complex in shape and vary greatly in size and position across different patients. Therefore it is difficult to construct a compact and informative geometric 'prior' to guide lesion detection.
For unsupervised methods, general knowledge or assumptions (priors) derived from healthy controls, such as spatial locations of different brain tissues, are used to guide the brain segmentation and the lesion detectionthe presence of a lesion usually breaks such general priors derived from healthy controls. In [13], a statistical method based on the Markov random field is developed to identify the lesion by comparing the image of a patient and the images of a group of healthy controls. Shen et al. [18] found that, in the lesion regions, the intensity-based segmentation and the location of the tissues are inconsistent with these in healthy controls. They use a fuzzy c-means algorithm to quantify such inconsistencies and then apply a threshold to obtain a binary lesion segmentation. This method was further improved in [19] by introducing 1) a normalized inconsistency measurement, which enables the use of a better threshold, and 2) an extra step that can better distinguish the ventricles from the lesions. Seghier et al. [11] extend a widely used brainsegmentation algorithm [30] to handle lesion detection by introducing a new tissue class for lesions. This extended segmentation algorithm iteratively performs nonrigid atlas-based registration to refine the probability maps of gray matter (GM), white matter (WM), cerebrospinal fluid (CSF), and subject-specific lesions. After that, the refined probability maps of GM and WM are fed into a fuzzy clustering procedure [31] for final lesion classification. This algorithm was further improved by Sanjuan et al. [20] by introducing a new procedure for updating the probability map of subject-specific lesions. Without using lesion samples manually annotated by experts, these unsupervised methods may not accurately capture the subtle difference between the lesion and its surrounding healthy tissues.
For supervised methods, a set of (training) image samples, on which lesions have been manually annotated by experts, are used to train a classifier, which can then be used for lesion detection on a new image by classifying each voxel on this image as lesion or healthy tissue. In Anbeek et al. [7], a K-Nearest Neighbor (KNN) classifier is trained for white matter lesion detection by combining intensity information from T1-weighted, T2weighted, proton density-weighted (PD), and fluid attenuation inversion recovery (FLAIR) MR scans. Based on the same features, Lao et al. [26] suggest the use of support vector machine (SVM) instead of KNN to improve the accuracy of white matter lesion detection. Quddus et al. [28] combine SVM with a boosting technique to detect lesions in PD scans. Morra et al. [27] train a SVM classifier for multiple sclerosis (MS) lesion detection by using Haar-like wavelet features, which are extracted from T1-weighted, T2-weighted, FLAIR, mean diffusivity, and fractional anisotropy MR scans. Schneell et al. [29] extract voxel-based features by measuring each voxel's likelihood to be located in the five selected brain areas and the background from a high angular resolution diffusion imaging (HARDI) scan. Based on these features, a SVM classifier is then trained to distinguish voxels from lesion and those from healthy tissues. Geremia et al. [25] propose a random decision forest method [32] for MS lesion detection by using local and context-rich features [33] from T1weighted, T2-weighted and FLAIR MR scans. Jiang et al. [34] extract Gabor features [35] from multimodal MRIs and then use a distance metric learning [36] to train a classifier to classify each voxel, followed by a graph-cut operation for final segmentation of tumors. To achieve accurate lesion detection, these supervised methods usually require the use of multimodal MRIs, especially the T2-weighted, PD and FLAIR MRIs on which the lesion and healthy tissues show higher intensity difference than on the T1-weighted MRIs [37,38].
In this paper, we propose a new lesion detection method based only on T1-weighted MR images by combining the advantages of unsupervised and supervised methods. Specifically, we first conduct a modified enantiomorphic normalization [39] to warp the native T1-weighted MRI into the MNI standard space [40] and construct an initial lesion probability map (LPM) by comparing the normalized MR image with the healthy controls. Then we perform non-rigid and reversible atlas-based registration to refine the template of gray matter probability map, the template of white matter probability map, the template of external CSF probability map, the template of ventricle probability and initial LPM. These probability maps are combined with the original MR image intensity to construct high-dimensional voxel-based features, with which we supervisedly train a combined SVM classifier for final lesion detection. The unsupervised lesion probability map endues more anatomic/spatial information to the voxel-based features, which increases the discriminations of the lesion and the healthy tissues on the T1-MRIs. For the final supervised lesion detection, we formulate the problem as a feature-set classification where different SVM classifiers are learned for different features. Using the leave-one-out cross-validation, the proposed method can achieve an average Dice coefficient of 73.1 % on inhouse stroke patients and an average Dice coefficient of 66.5 % on MICCAI BRATS 2012 dataset, when compared to expert annotated lesion maps.

Study design
This study adhered to human experimentation guidelines of the U.S. Department of Health and Human Services and Helsinki Declaration. The CDC Institutional Review Board approved the study protocols (IRB # Pro00005458). All participants were volunteers who gave a written and verbal consent to be included in the study and were not required to give informed consent to allow us to show the images in published figures.

Method overview
The proposed method detects lesions on T1-weighed MR images by two sequential components: an unsupervised component to construct an LPM, followed by a supervised component to detect the final lesion areas. As shown in Fig. 1, the unsupervised component consists of four sequential steps: 1) detection of lesion-hemisphere and enantiomorphic normalization that normalizes the input T1-weighted MRI into a standard space, 2) comparing the normalized MRI to a group of normalized healthy controls and identifying an initial probability map of the lesion (LPM) by using the fuzzy clustering pipeline (FCP), 3) amending the template probability maps of GM, WM, external CSF and ventricle using the initial LPM, and 4) refining the LPM and constructing the probability maps of GM, WM, external CSF and ventricle from the input (normalized) MRI by mapping to the amended template [30]. Note that, in Step 3) a new amended template, including four tissue probability maps, needs to be constructed for each input T1-weighted MRI separately.
As shown in Fig. 2, the supervised component also consists of four sequential steps: 5) at each voxel, extracting zero-order, 1st-order, and 2nd-order statistical features from the probability maps of GM, WM and external CSF and the refined LPM resulted from the unsupervised component, and the original T1-weighted MRI; 6) training three binary SVM classifiers that recognize each voxel to be part of lesion or not, by using the three types of statistical features, respectively; 7) combining the three SVM classifiers into a single SVM classifier and applying the combined SVM classifier to detect the lesion voxels; and 8) performing inverse normalization to get the lesions in the original T1-weighted MRI. In this paper, we normalize the brain to the MNI standard space for lesion estimation, feature extraction and final lesion classification to make the parameter setting of proposed method insensitive to the different-size brains and different-resolution images.
Besides the basic idea of combining the unsupervised and supervised components, we also make technical contributions in several of the steps. In Step 1), we automatically detect the lesion hemisphere and only use the healthy hemisphere for enantiomorphic normalization. This step improves the accuracies of enantiomorphic normalization and the final lesion detection. We will discuss it in detail in Section "Step 1): Lesion hemisphere detection and enantiomorphic Normalization". In Step 2), we extend the FCP [31] to construct the initial LPM. This step will be briefly discussed in Section "Step 2): Initial estimation of LPM". In Step 3), we amend the template probability maps of the four brain tissues and refine the initial LPM, which will be elaborated in Section "Step 3): Amendment to normalization-segmentation template tissue maps". From Step 5) to Step 7), we apply statistical feature based supervised learning algorithm to further improve the lesion detection accuracy. We briefly discuss this algorithm in Section "Step 5): Feature extraction". In this paper, Step 4) simply follows the six-tissue unified segmentation normalization (USN) pipeline [30] in SPM8 [41] ('new segment' in SPM8, referred to as 'segment' in SPM12) and Step 8) first follows the inverse normalization in SPM8 [41] and then binarizes the lesion likelihood map into a binary lesion mask.

Unsupervised component
As mentioned in Section "Method overview", in the unsupervised components, we focus on Step 1) detecting the lesioned hemisphere and enantiomorphic normalization, Step 2) constructing initial LPM using FCP and Step 3) amending the template tissue probability maps and refining the initial LPM.

Step 1): Lesion hemisphere detection and enantiomorphic Normalization
To spatially align the brains of a patient and a set of healthy controls and make them comparable, we normalize all the brain MRIs (i.e., both the patient and the healthy For the healthy controls, we can use the USN pipeline [30] in SPM8 [41] for this purpose. USN is a probabilistic framework based on a mixture of Gaussian distributions that enables nonlinear image registration, tissue classification,and bias correction. However, note that the patient's T1 scan has an unusual appearance at the location of the lesion (the feature we are trying to detect), while the template images used for normalization are based on healthy individuals. The difference in intensity between the lesion and corresponding locations in the template can disrupt the automated normalization process [42]. This step is inspired by the previously developed enantiomorphic normalization procedure [39], where the lesion region is manually delineated and replaced by its mirror region from the counter hemisphere prior to normalization. Because the lesion region is not priorly known here, we estimate the healthy brain underlying a lesioned brain by first identifying the healthy hemisphere and then using its mirror copy to replace the lesioned hemisphere.
It is important to note that brains are by nature asymmetric, even the healthy brains are not perfectly symmetric [43]. Without knowing the ground-truth lesion mask, here we assume the brain symmetry to construct a rough estimate of the healthy brain underlying a lesioned brain, for enantiomorphic brain normalization to the standard space. While we know both the estimated healthy brain and the enantiomorphic normalization are not perfect, we only use them to estimate an initial LPM and expect that the later steps will further refine the initial LPM.
In Section "Discussion", we conduct experiment to analyze the impact of the brain asymmetry to the proposed method.
As illustrated in Fig. 3, the proposed normalization procedure consists of the following steps. a) Estimate the symmetry axis/plane, i.e., the middle plane that separates the two hemispheres, as shown by the red line in Fig. 3. In this paper, we use the tool developed by Rorden [44] for the symmetry axis estimation. Specifically, let I s be the source MRI of the patient, with size x max × y max × z max , and we construct a reference image I r by flipping I s over the x − z plane, i.e., I r (x, y, z) = I s (x, y max − y, z) for all the voxels (x, y, z). We then use SPM8 coregistration routine [41,45] to rigidly register I s and I r using an affine transformation T sr , i.e., T sr (I s ) is spatially aligned with I r . This registration is achieved by maximizing the mutual information metric. By applying half of this transformation to the source image, we construct a half-way image, as illustrated in Fig. 3. I h = 1 2 T sr (I s ) and the symmetry axis of I h is the plane y = 1 2 y max . This plane can be mapped to the source image by using the inverse transform 2T −1 sr , which is the desired symmetry axis in the source image, as shown by the red line in Fig. 3.

Fig. 3
An illustration of the symmetry axis estimation and enantiomorphic normalization b) Identify the lesioned hemisphere. Since the intensity of a lesion on T1-MRI is usually lower than its counter region in the other hemisphere without a lesion, we calculate the intensity-weighted center of mass of I s as . ( This center of mass would be biased towards the healthy hemisphere. Therefore, by assuming that the lesioned hemisphere is darker than the healthy hemisphere we can detect the lesioned hemisphere. Note that this step could be modified to process other modality MRIs. For example, in the T2-weighted MRIs, lesion regions tend to show higher intensities than the healthy tissues. c) Estimate the underlying healthy brain. This can be achieved by replacing the lesioned hemisphere using the mirror of the healthy hemisphere, flipped over the symmetry axis. d) Normalize the patient MRI. We first use the nonlinear USN in SPM8 to warp the estimated healthy brain to the MNI standard space and then apply the same transformation to the native patient image for normalization.

Step 2): Initial estimation of LPM
In this step, our goal is to construct an initial lesion probability map (LPM) based on the normalized patient image. In the standard space, we compare the intensity of each voxel between the patient and the healthy controls to estimate the lesion likelihood of the voxel in the patient image. For each brain T1-MRI, we take the healthy hemisphere for estimating the intensity distribution and then normalize this distribution [1,46,47] to calculate the Z-score for all the voxels in the brain.
Similarly, each healthy control T1-MRI is converted to Z-scores based on the whole brain mean and standard deviation. To suppress image noise, we further perform a Gaussian smoothing, with a kernel of 8 mm full-widthhalf-maximum (FWHM), to each image [13,48]. Let I s (x, y, z) and I h (x, y, z) be the intensity of the patient image and the average intensity over all the healthy controls at voxel (x, y, z) -be reminded that both patient image and healthy-control images have been normalized to the standard space and their intensities have been unified to z-scores.
First, we calculate the dissimilarity (distance in intensities) [31] between the patient and healthy controls at voxel (x, y, z) by where α controls the sensitivity in detecting the lesion voxels. As mentioned in Step 1): Lesion hemisphere detection and enantiomorphic Normalization Step b), on T1weighted MRIs, the intensity of lesions is usually lower compared to the intensity of the healthy tissues at the mirrored locations across the symmetry axis. This way, it is expected that the lesion voxels show negative I(x, y, z). We therefore define an initial lesion probability P L (x, y, z) as where the exponent λ > 0 is the weighting parameter that comprises the fuzzy set [31]. As shown in Table 1, we empirically select α = 0.4 and λ = 5 to achieve the best performance in our experiment.

Step 3): Amendment to normalization-segmentation template tissue maps
As shown in Fig. 1, the initial LPM estimated based only on voxel intensity is inaccurate -some healthy tissues may show high probability in LPM. To alleviate this problem, we further derive the probability maps of the non-lesion tissues, such as GM, WM, and external CSF of the patient. Typically, these tissue maps can be derived by applying USN pipeline [30]. However, this typical routine requires the input MRI to be from a healthy control subject, as the template does not include a lesion probability map. In this section, we propose a method to amend the templates by introducing an additional lesion probability map -based on the initial LPM -for each patient MRI. Particularly, we amend the templates of GM, WM, and external CSF probability map and the initial LPM. Note that this amendment is conducted independently for each patient image. Let P T GM (x, y, z), P T WM (x, y, z), P T eCSF (x, y, z) be the template probability maps of GM, WM and external CSF respectively -we do not amend the template probability maps of soft tissue, bone, and air because they are usually not affected by lesion. The basic idea of the amendment is to reduce the template probability value of GM, WM, external CSF at voxels with high lesion probability in terms of the initial LPM estimated in Step 2) and introduce a new template probability map P T L (x, y, z) for the lesion. Specifically, at a voxel (x, y, z), if P L (x, y, z) > γ , we simply set P T GM (x, y, z), P T WM (x, y, z), P T eCSF (x, y, z) all to be zero and the new template lesion probability P T L (x, y, z) = 1. Otherwise (i.e.,P L (x, y, z) ≤ γ ), we update the template probability maps by where γ indicates the confidence to the initial LPMthe smaller the value of γ , the more confident to the initial LPM P L (x, y, z). In this paper, as shown in Section "Results", we empirically select γ with the highest Dice coefficient score, γ = 5 6 . Figure 1 shows the amended template probability maps for a patient image.

Step 4): LPM refinement
Based on the amended template, we apply the USN pipeline [30,41] to process the corresponding patient image. This way, we can obtain the probability maps of GM, WM, external CSF and lesion for this patient and we denote them by P GM (x, y, z), P WM (x, y, z), P eCSF (x, y, z), and P L (x, y, z) respectively. We further apply a Gaussian smoothing using a kernel with FWHM of 8 mm to help reduce the noise in the lesion probability map. A sample result of these probability maps is shown in Fig. 4, where the amended template leads to better-estimated probability maps than the original template. In this step, we also follow the pipeline to derive P V (x, y, z), the probability map of ventricle, which is part of the CSF. Later, after both unsupervised and supervised components, we will use the derived P V (x, y, z) to exclude ventricle.

Supervised component
In the supervised components, we take a set of training images of the patients, where experts have manually labeled what we defined as the "ground-truth" for brain lesions. We then extract image and structure features from these training images and use them to train two-class classifiers where each voxel is recognized as either lesion or non-lesion tissues. With the trained classifiers, we can extract the same features from new test images and classify their voxels to get the final lesion detection. Please note that both training images and testing images were acquired using the same MR scanner.
In the feature extraction, we take advantage of both its T1-MRI intensity information, the probability maps of the each brain tissue and lesion derived in the unsupervised component. Specifically, based on the 5 × 5 × 5 sliding blocks in the normalized T1-MRI, the probability maps of GM, WM, external CSF and lesion (LPM), we extract three types of features. Based on each of these three types of features, we train three linear kernel SVM classifiers for lesion detection. We combine these three SVM classifiers to achieve the lesion detection. Finally, we automatically select an optimal threshold t * to the SVM output (taking values in [ −1, 1]) for the binary lesion classification. We will discuss the selection of optimal threshold in Section "Experiment results on in-house dataset and analysis".

Step 5): Feature extraction
To capture the subtle differences between lesion and healthy tissues, at each voxel we consider a 5 × 5 × 5 sliding block centered at this voxel for extracting the features of this voxel. For each voxel, features are extracted from five equal-size maps -the T1 image, and the probability maps of GM, WM, external CSF, and lesions (LPM) independently. Specifically, we extract three types of features:  x, y, z). b) 1st-order feature : At voxel (x, y, z),we extract its 1st-order features as The 1st-order feature is also a 690-dimensional feature vector. c) 2nd-order feature: At voxel (x, y, z), we extract its 2nd-order features as which is a 690 × 690 = 476, 100-dimensional feature vector. Compared with the traditional zero-order features, the 1st-order features are less sensitive to false-positive detection of the lesions and the 2nd-order features emphasize the lesion boundary features by taking the local covariance of the zero-order features, as illustrated in Fig. 2.

Step 6): Individual SVM classifiers
SVM [50] is one of the most popular machine-learning methods used in neuroimaging applications for lesion detection [26][27][28][29]. It is a kernel-based machine-learning method designed to construct a hyper-plane as the decision plane, which separates two different classes with the largest margin [50]. The use of SVM consists of two phases: training and testing. During the training phase, the algorithm finds statistical properties in the training data that discriminate between healthy tissues and lesion tissues. After the training, during the test phase, the algorithm can classify the lesions in a test data. In this paper, for each of the above three types of features, we train a linear SVM classifier that estimates the probability of each voxel to be lesion. Here, we use the linear-kernel SVM for these three individual classifiers. Based on these three trained classifiers, we can estimate three lesion probability maps p 0 (x, y, z), p 1 (x, y, z), and p 2 (x, y, z) for a voxel (x, y, z) in a test patient image (normalized to standard space), based on the three types of features respectively. These three probability maps take value in the range of [ −1, 1].
Note that, besides the benefits of the supervised learning (with manually labeled lesion by one expert, the subtle difference between healthy tissue and lesion are further described), features extracted from the T1-MRI, and the probability maps of GM, WM, external CSF, and lesion (LPM) can increase statistical power on classifying lesion from healthy tissue, as shown in Tables 2 and 3.

Step 7): Combined classification
In this step, we combine the lesion probability maps p 0 , p 1 , and p 2 generated in Section "Step 6): Individual SVM classifiers" estimated from three individual classifiers into a unified lesion probability map. Specifically, at voxel (x, y, z) of a patient image, the final unified lesion likelihood is computed by where ω 1 , ω 2 and ω 3 are the weights for the likelihood estimated by three individual classifiers and P V is the ventricle probability map derived in Section "Step 4): LPM refinement". Table 4 provides the details of choosing parameters ω 1 , ω 2 and ω 3 . Based on the results, we empirically select ω 1 = 0.1, ω 2 = 0.3 and ω 3 = 0.6. Additionally, as shown in Table 3, the combined classification results increase the accuracy, precision, recall and dice coefficient in comparison to lesion detection results using single type of features. Further detailed discussion will be held in Section "Experiment results on in-house dataset and analysis". Note that, as shown in Fig. 2, this lesion likelihood map is estimated in the standard space, we perform inverse normalization transform to map this lesion likelihood map back to the original native space. Final binary lesion classification is then achieved by thresholding this  lesion likelihood map and the selection of optimal threshold will be discussed in Section "Experiment results on in-house dataset and analysis". An alternative way is to combine all three features and train a single SVM classifier. We choose not to do this because the three features are of different orders and substantially different dimensions. Simply combing them and feeding them into a single SVM may make one feature to dominate the others. In particular, the 2nd-order feature has a dimension of 476,100 and both the zero-order feature and 1st-order feature only have dimensions of 690. In such cases, the use of multiple SVMs for different-order features has been shown to be more effective in other applications [51].
To justify the proposed method, in this paper we compare its performance with several other state-of-the-art lesion detection methods. Stamatakis et al. [13] compare the patient brain to a set of normal controls without segmentation for lesion detection. The T1-MRIs of normal controls and patients are normalized to a standard space. After a spatial smoothing, they are statistically compared voxel-by-voxel to identify regions outside the normal range established by the controls. Instead of using whole brain, Seghier et al. [11] compare the GM/WM between the patient image and normal controls for lesion detection. This algorithm [11] is performed recursively in Sanjuan et al. [20] for enhancing the lesion detection.

Experiment data and setting
For our experiments, we compare the proposed method with state-of-the-art lesion detection methods on: 1) inhouse dataset, and 2) MICCAI brain tumor image segmentation (BRATS) challenge 2012 dataset. The in-house data and BRATS-2012 data were produced by different scanners with different image size.
For in-house dataset: MRI was acquired with a 3T Siemens Trio system fitted with a 12-channel head-coil. All subjects were scanned with the same 3D T1-MRI sequence using a MP-RAGE (TFE) sequence: FOV = 256 mm × 256 mm, 160 sagittal slices, 9-degree flip angle, TR = 2250 ms, TI = 900 ms, TE = 4.2 ms, 1 mm resolution image. We collected T1-MRI for 60 stroke patients (29 females and 31 males) with an age mean of 61.6 years and a standard deviation of 12.27 years. The average time post-stroke was 39.86 months with a standard deviation of 49.24 months. For the healthy controls used in the unsupervised component, in total we collected 115 brain images from normal subjects without brain damage using the same hardware and sequence. The average age of healthy controls was 70.2 years with a standard deviation of 10.8 years.
For constructing the ground-truth lesion delineation, all patients were also scanned with a high-resolution T2-MRI, yielding a 1 mm isotropic image. This sequence used

Evaluation metrics
In this paper, we evaluate the performance of a lesiondetection method using a leave-one-out cross validation strategy. In each testing round, all but one patient images are taken for training while the remaining one is used for testing. All the performance measures are averaged over all the 60 testing rounds. For the evaluation criteria, we follow [11,13,20] by using the Accuracy, Precision, Recall and Dice coefficient.
where TP, FP, and FN represent the number of true positives, false positives, and false negatives respectively. For example, a voxel is classified as lesion, but not lesion in the ground truth segmentation, is counted as a false positive. We apply different threshold t to the lesion likelihood (from the combined classification, see Eq. (8)) to obtain a two-class classification, with which we can draw curves of the Dice's coefficient. Figure 6 shows the Dice coefficient at different threshold t when testing the proposed method on the 60 patient images -each curve corresponds to one patient (and one round of the leave-one-out testing). We can see that, for most patients, we need to select t < 0.5 for binary classification to better separate the lesions from the remaining tissues. Without specific claims, in the remainder of the paper all the reported Accuracy, Precision, Recall and Dice coefficient values are obtained by selecting an optimal t * Fig. 6 Curves of Dice coefficient (by varying the threshold t) for the 60 stroke patients Table 5 Directly taking the refined LPM, i.e., up to Section "Step 4): LPM refinement") of the proposed method, for performance evaluation, under different γ γ Accuracy Precision Recall Dice 1 6 0.976 ± 0.012 0.470 ± 0.197 0.611 ± 0.154 0.516 ± 0.178 2 6 0.978 ± 0.012 0.510 ± 0.194 0.624 ± 0.151 0.550 ± 0.174 3 6 0.979 ± 0.011 0.550 ± 0.189 0.631 ± 0.148 0.582 ± 0.169 4 6 0.980 ± 0.011 0.600 ± 0.186 0.634 ± 0.143 0.612 ± 0.166 5 6 0.981 ± 0. for each MRI that leads to the highest Dice coefficient, for both the proposed method and the comparison methods.

Experiment results on in-house dataset and analysis
First, we conduct experiments to show the usefulness of each step proposed in this paper. Table 2 shows the performance (averaged over all rounds of the leave-oneout tests) of the proposed method after each main step. Specifically, the initial LPM is the result after Section " Step 2): Initial estimation of LPM" and we threshold this initial LPM, using the optimal threshold t for each image that leads to the highest Dice coefficient as described above, to derive the performance after this step. Similarly, the refined LPM is the result in Section "Step 4): LPM refinement" and we use the similar technique to threshold it for a performance evaluation. The rows of 'Only zero-order features' , 'Only 1st-order features' and 'Only 2nd-order features' show the performance of the proposed method by using only one of the three types of the statistical features described in Section "Step 5): Feature extraction", respectively. The Accuracy of the lesion detection is always high (above 90 %) because brain MRIs always contain large areas of dark background outside the brain and these areas can be easily classified correctly as non-lesions. From Table 2, we can see that each main step of the proposed method contributes further performance improvement, in terms of Accuracy, Precision, Recall and Dice coefficient. Additionally, the combination of different order statistical features leads to better lesiondetection performance than using only one type of these features. In particular, from Table 2, we can see that that the SVM with the 2nd-order feature shows lower performance than the SVMs with the zero-order and 1st-order features in terms of Precision, Recall, Dice and Accuracy. However, if we only combine zero-order and 1st-order features, the Dice coefficient is 0.694 ± 0.126, which is still lower than the combination of all three classifiers (0.731 ± 0.106). This shows that the 2nd-order features still provide complementary information to the other features and help improve the final lesion classification.
Two main free parameters in the initial LPM estimation are α in Eq. (2) and λ in Eq. (3). Table 1 shows the lesion-detection performance based on the initial LPM after Step 2) under different α and λ. Based on these results, we select α = 0.4 and λ = 5. Table 5 shows the lesion-detection performance based on the refined LPM after Step 4) under different γ . Based on these results, we Table 6 The performance of the proposed method and the three comparison methods on the in-house data  Table 4 shows the lesion-detection performance based on the combined classification after Step 7) under different ω 1 , ω 2 and ω 3 . Based on these results, we select ω 1 = 0.1, ω 2 = 0.3 and ω 3 = 0.6.
Second, we conduct experiments to show the effectiveness of combining both unsupervised and supervised components. From Table 3, we can see that, if we directly feed the MRI intensity into the supervised classification, the lesion detection performance is very poor. On the other hand, if we feed the probability maps of brain tissues and lesions estimated from the unsupervised component, i.e., the results after Step 4) in Section "Step 4): LPM refinement", into the supervised classification, we achieve substantially improved performance. If we combine both intensity and probability map features, we achieve further performance improvements. This shows that the combination of both unsupervised and supervised components boosts the lesion detection performance. In this experiment, we also try the use of each type of the features, i.e., zero-order, 1st-order and 2nd-order statistical features, and the results in Table 3 further confirms that the combination of all three types of features leads to better performance than using only one of them. We also perform a paired-sample one tailed t-test to compare lesion detection results after the unsupervised and supervised components. Based on their Dice coefficients, the onetailed p-value for this t-test is p = 6.4e − 11, and t = 8. 19. This shows that there is a significant performance improvement by including the supervised component.
Finally, we compare the performance of the proposed method with three state-of-the-art lesion detection methods [11,13,20] on our collected 60 patient images. From Table 6a, we can see that, the proposed method achieves a substantially better lesion detection performance than these three comparison methods in terms of all the four evaluation criteria, each of which is the average over all 60   Figure 7 qualitatively shows the detected lesions on selected 2D slices of different patients (from three views), where red contours denote the detected lesion boundaries and the green contours denote the ground-truth lesion boundaries. Figure 8 qualitatively shows the lesion-detection results produced by the proposed method and the three comparison methods on selected 2D slices, where red and green contours indicate the detected lesion boundaries and the ground-truth lesion boundaries, respectively.
The performances reported in Table 6a, for both the proposed methods and the comparison methods, are based on the threshold t optimized for each test image (corresponding to a round of leave-one-out cross validation) in terms of the Dice coefficient with the ground truth. In practice, we do not have ground truth lesion detection for new test images and cannot find such an optimal threshold t for each image. Instead, we propose a simple strategy to select a threshold t for each test image such that it maximizes the structural consistency of the detected lesion. Specifically, considering the 2D serialsections along the horizontal direction from the top to the bottom, denote the lesion detected (using threshold t) on slice i to be R t i . We calculate the overlap between neighboring slices as with | · | to be the area. With this, we select the threshold t * as Table 6b shows the performance of the proposed method using this strategy of threshold selection which is independent of the ground truth. For fairer comparison, in this experiment, ground-truth independent thresholds are also used for the three comparison methods when binarizing the lesion likelihood map for lesion regions. For the three comparison methods, the thresholds are chosen to be the ones as suggested in their original papers and their performances are also reported in Table 6b. We can see that while the performance is lower than the one when using the optimal threshold t in terms of the ground truth, the proposed strategy of theshold selection still leads to a Dice coefficient of 69.8 %, which is much higher than all three comparison methods.

Experiment results on MICCAI BRATS 2012 dataset
We also test the proposed method on MICCAI BRATS 2012 dataset. The dataset contains 30 glioma patients (both 24 low-grade MRIs and 6 high-grade T1-MRIs) along with expert annotations for "active tumor" and "edema". As in the in-house dataset, we use leave-one-out cross validation to train and test the proposed method on MICCAI BRATS 2012 dataset. The values of the parameters α, γ , λ, w 1 , w 2 , and w 3 are the same as those used for the in-house dataset.
Quantitative comparisons are shown in Table 7a and b. Some examples of lesion/tumor detection from the database are given in Fig. 8. Table 7a reports the performance where the threshold t is selected to maximize the Dice coefficients against the ground truth. Table 7b reports the performance where the threshold t as in Eq. (8) without using the ground truth.

Discussion
In this section, we further discuss several issues and limitations of the proposed method. In Step 1), we assume brain symmetry to construct a rough estimate of a healthy brain underlying the input lesioned brain for enantiomorphic normalization. However, in practice, brain is by nature asymmetric and the assumption of perfect brain symmetry is not held. In this section, we first conduct an experiment to assess  the impact of brain asymmetry to the proposed method. Specifically, we assume that the ground-truth lesion mask is known and we use the USN pipeline with cost function masking in SPM8 [42] for brain normalization. We then use this normalization result to replace the proposed enantiomorphic normalization result in the proposed method and the results on the in-house dataset are reported in Table 8. We can see, by using USN, the Dice of the initial LPM increases by 0.047, the Dice of refined LPM (after unsupervised component) increases by 0.014, and the Dice of the final lesion detection increases by only 0.012. These results show that the assumption of the brain symmetry in the proposed method does not introduce too much impact to final performance of the proposed method. Note that, in practice, the ground-truth lesion mask is unknown and it is impossible to achieve such USN.
Another issue in the proposed method is the selection of the block size for feature extraction. In all the previous experiments, we choose block size of 5 × 5 × 5. We further conduct an experiment to examine the lesion-classification performance using different block sizes and the result on the in-house data is shown in Table 9. We can see that the use of block size 5 × 5 × 5 leads to a better performance than the use of block size 3 × 3 × 3. Furthermore, the use of block size 7 × 7 × 7 can further improve a little the performance (from 0.731 to 0.733 in terms of Dice coefficient). However, we found that the use of 7 × 7 × 7 block size takes much more computation time, because of the substantial increase of the feature dimensions. The most time consuming step is to generate the different order features. When increasing the block size from 5 × 5 × 5 to 7 × 7 × 7, the CPU time to generate features on an image from the in-house data will increase from 133 min to 388 min on a MAC 10.10.5 computer with a 2.2GHz Intel Core i7 CPU and 16GB memory.
The presence of brain atrophy will further break the symmetry of the brain and increase the difficulty of brain normalization. The proposed method assumes the brain symmetry for brain normalization and initial lesion estimation. Brain normalization was also important for accurate tissue segmentation, which provides the input to the supervised component. Therefore, the presence of brain atrophy may seriously affect the performance of the final lesion classification. An example is shown in the first row (patient 11) of Fig. 9, where ventricle of the patient is dilated and twisted -the structures surrounding the left occipital lobe is warped towards its counterparts on the right. For this patient, the proposed method only achieves a lesion classification with a Dice coefficient of 0.491. Partial volume effect changes the voxel intensity and makes it more difficult to capture the tissue boundaries in the segmentation. The proposed method takes tissue segmentation for feature extraction and supervised classification. Therefore, partial volume effect may also affect the performance of the proposed method. Many methods and tools have been developed to identify and correct the partial volume effect [43,54]. Using these methods and tools to preprocess the images may help reduce its effect to the proposed method.
Another issue may fail the proposed method is that SPM may confuse lesion and CSF since they share similar intensities. But in most cases, this is not a serious issue when lesion is in GM and WM and can be separated with CSF in the normalized standard space. However, when lesion is near CSF or ventricle is substantially deformed, SPM may have troubles in distinguishing the lesion and the CSF. As a result, the proposed method may confuse the lesion and CSF in the classification. An example is shown in the row 2 (patient 12) of Fig. 9, where part of the left lateral ventricle adjacent to the lesion is misclassified as the lesion by the proposed method. For this patient, the proposed method achieves a Dice coefficient of 0.532.  Bias field inhomogeneity may fail the proposed method because it will affect the correct identification of the lesioned hemisphere. In addition, the symmetric axis estimation is very sensitive to the bias field inhomogeneity. An example is shown in Fig. 10, where an artificial bias field is added to each slice of a T1-MRI, following the strategies in [55,56]. Let X and Y be the size of row and column. Centered at X 10 , Y 10 , the added bias field at each 2D slice is a 2D Gaussian with 40 mm FWHM and the center intensity of I B is 10 times of the maximum intensity in the original 3D MRI. Following [55,56], we first evaluate the estimated symmetry axis against the ground truth axis using the yaw and roll angles between them. Without the added bias field, the yaw and roll angles are 1.788 ± 0.137 and 2.281 ± 0.244 respectively over all the in-house data. With the added bias field, the yaw and roll angles are 2.211 ± 0.154 and 14.899 ± 4.647, respectively, which are much larger than those without the added bias field.
Another failure case of the proposed method is when both hemispheres contain lesions. In this case, the underlying healthy brain cannot be estimated using a hemisphere for brain normalization in Step 1).

Conclusions
In this paper, we introduced a novel automated procedure for lesion detection from T1-weighted MRIs by combining both an unsupervised and a supervised component. In the unsupervised component, we developed new approaches to identify the lesioned hemisphere and used it to help normalize the patient MRI with lesions and initialize/refine a lesion probability map. In the supervised component, we combined different-order statistical features extracted from both the tissue/lesion probability maps obtained from the unsupervised component and the original MRI intensity and applied three SVM classifiers for final voxel-based lesion classification. In the experiments, we evaluated each main step of the proposed method and verified its effectiveness. Lesion detection on a set of 60 stroke patient MRIs showed that the proposed method could achieve superior performance compared to three state-of-the-art lesion detection methods in terms of four different evaluation criteria. In particular, using the leave-one-out cross validation, the proposed method achieved an average Dice coefficient of 73.1 % on the 60 patient MRIs against the ground truth lesions that were manually labeled by trained neurologists. Also, we used the proposed method to detect brain tumor on 30 real tumor T1-MRIs on MICCAI BRATS 2012 dataset and the proposed method achieved an average Dice coefficient of 66.5 %.