Skip to main content

Application of mean-shift clustering to Blood oxygen level dependent functional MRI activation detection



Functional magnetic resonance imaging (fMRI) analysis is commonly done with cross-correlation analysis (CCA) and the General Linear Model (GLM). Both CCA and GLM techniques, however, typically perform calculations on a per-voxel basis and do not consider relationships neighboring voxels may have. Clustered voxel analyses have then been developed to improve fMRI signal detections by taking advantages of relationships of neighboring voxels. Mean-shift clustering (MSC) is another technique which takes into account properties of neighboring voxels and can be considered for enhancing fMRI activation detection.


This study examines the adoption of MSC to fMRI analysis. MSC was applied to a Statistical Parameter Image generated with the CCA technique on both simulated and real fMRI data. The MSC technique was then compared with CCA and CCA plus cluster analysis. A range of kernel sizes were used to examine how the technique behaves.


Receiver Operating Characteristic curves shows an improvement over CCA and Cluster analysis. False positive rates are lower with the proposed technique. MSC allows the use of a low intensity threshold and also does not require the use of a cluster size threshold, which improves detection of weak activations and highly focused activations.


The proposed technique shows improved activation detection for both simulated and real Blood Oxygen Level Dependent fMRI data. More detailed studies are required to further develop the proposed technique.

Peer Review reports


Functional Magnetic Resonance Imaging (fMRI) is a technique that is used to identify regions of activations in the brain. Techniques based on cross-correlation analysis (CCA) and the General Linear Model (GLM) is commonly used for fMRI data analysis [17], however these techniques are not without drawbacks. Both techniques typically perform its calculations on a per voxel basis. This means that each calculation does not take into consideration any relationship that neighboring voxels may have with each other. This has the effect of lowering the sensitivity of the technique when looking for activations. This is especially true in low contrast to noise ratio (CNR) situations.

There has been interest in enhancing fMRI data analysis with cluster size tests. Various techniques have been examined with that intention in mind. Cluster analysis (based on random field theory) is commonly used to help isolate activations [810]. K-means clustering [11] is a method where observations are partitioned into "k" number of clusters where each observation belongs to the cluster with the closest mean. Fuzzy clustering [12] is similar to k-means clustering, except that fuzzy clustering takes into consideration that a single observation can belong to more than one cluster. Both K-means and Fuzzy clustering have been examined for improving fMRI data analysis [1316]. Mean-shift clustering (MSC) is another technique to consider for the same purpose.

The MSC technique was first introduced by Fukunaga et al. [17] for examining pattern recognition, but the technique was mostly unexplored until more recently [18, 19]. The technique has found uses in image processing and vision tasks. Image segmentation has also been explored with this technique on brain images [20]. MSC revolves around a density estimation that is done on a predetermined feature space. Intuitively, the technique works by calculating the mean shift vector, then shifting the kernel as dictated by the mean shift vector. This process is repeated as appropriate until convergence at which time a cluster in the feature space can be identified. By selectively choosing the features used for the feature space, it would be possible to incorporate characteristics of the data that normally would not be part of the analysis with CCA and GLM based techniques. MSC also offers some other advantages with regards to implementation. The technique does not require assumptions to be made about noise distribution. Compared to typical cluster analysis, no hard cut-off in cluster size is required with MSC. Since MSC is based on density estimation of a feature space, it does not make any assumption on the shape of the clusters either as well as allowing different feature spaces to be used to incorporate different characteristics of the data into analysis. These advantages with MSC may allow higher sensitivity when detecting activations for improved results.

To the best knowledge of the authors, MSC has not been closely explored in the application of fMRI activation analysis. As a first step, this study investigates a straight forward application of MSC to fMRI data analysis.


The proposed MSC technique was evaluated using both simulated and real fMRI data. The data analysis was performed using Cross Correlation Analysis (CCA) to generate a statistical parametric image (SPI). The mean-shift clustering was then applied to a feature space constructed using selected characteristics of the SPI. Comparisons were made among CCA, CCA plus cluster analysis (CCA + CA), and CCA + MSC to examine the application of MSC.

Simulated data

The simulated data was designed to emulate fMRI data using one hundred images, 128 × 128 matrix size, and with a block design of two and a half off/on cycles (20 images per off or on cycle). Activations of various sizes (20 × 20, 10 × 10, 2 × 2 voxels) were inserted onto the data for analysis. Gaussian noise at different CNRs (0.20, 0.40, 0.06, and 0.80) was generated and inserted into the simulated data.

fMRI data

Ten subjects (5 females, 5 males, age 22–32) gave informed written consent with the approval of the University of Iowa’s (USA) Institutional Review Board. All subjects reported that they were right-handed, not using medications at the time of scanning, healthy, and had no history of any mental or psychiatric conditions. All ten subjects were scanned at the University of Iowa's Medical Education and Research Facility.

Blood Oxygen Level Dependent (BOLD) fMRI data were acquired on a Siemens 3 T Trio scanner (Siemens Medical Solutions, Erlangen, Germany). A gradient echo EPI pulse sequence was used with the following parameters: TR = 2000 ms, flip angle = 90 degrees, TE = 30 ms, matrix = 64 × 64, FOV = 220 mm, slice thickness = 5 mm with 20% gap, 180 images per run. Each scanning session was composed of seven six-minute runs, though only the first run was used for the purpose of this study. A T1 anatomical scan was also performed with the following parameters: TR = 1590 ms, flip angle = 10 degrees, TE = 3.39 ms, matrix = 128 × 128, FOV = 220 mm, slice thickness = 2 mm.

Unilateral electrical stimulation was delivered to the subject’s right median nerve using a Grass S48 stimulator (Grass Technologies, West Warwick, Rhode Island, USA). The stimulation voltage used was 15 volts above the motor threshold, which was individually defined as the minimum voltage required to obtain a thumb twitch. The delivered stimulations were square wave pulses with 0.2 ms duration. A block design of four and a half off/on cycles (40 seconds off, 40 seconds on) with a randomized inter-stimulation interval (ISI) between 1.0-2.0 seconds was used. A randomized ISI was used to reduce any effect that expecting a stimulation occurring with a fixed inter stimulation interval might have on the resulting BOLD signal. The volunteers were asked to passively feel the stimulation, stay still, stay awake, and not actively perform anything else for the duration of the scan.

Mean shift clustering

MSC is based on density estimation of a predetermined multi-modal feature space of image characteristics. Previously used feature spaces, such as perceived color [21, 22], are generally not applicable to fMRI analysis since color is not a feature that would be of interest. For this study, a feature space of the estimated Z values of the SPI and the mean voxel values surrounding a voxel (eight neighboring voxels in 2D and twenty six neighboring voxels in 3D) was used as they can be features of interest and incorporating them in the analysis may help with activation detection. The estimated Z values were used because it relates directly to statistical significance, and the mean voxel value of the surrounding voxels were used to take into consideration neighboring effects. The image features are mapped into a point in a multi-dimension space. The density is calculated within a defined kernel on the feature space. The kernel is moved based on the density gradient in the feature space until the local maximum is found. Points in the feature space associated with the same local maximum are considered to belong to the same cluster, and the calculation is repeated until all points are assigned to a cluster.

Using the Parzen window technique [23], the kernel density estimation at point x can be described by:

f ^ x = c k n h d i = 1 n k x - x i h 2

where n is the number of data points, c k is a constant, k is the kernel, h is the kernel size, and d is the number of dimensions in the feature space. The local maximum density is identified at f ^ x = 0 by moving the kernel based on the gradient ascent in the feature space. Equation 1 can be rewritten as:

f ^ = 2 c k n h d + 2 i = 1 n x - x i k ' x - x i h 2

Assuming g(x) = -k’(x), the gradient density estimator can then be described as:

f ^ = 2 c k n h d + 2 i = 1 n x i - x g x - x i h 2
f ^ = 2 c k n h d + 2 i = 1 n g x - x i h 2 × i = 1 n x i g x - x i h 2 i = 1 n g x - x i h 2 - x

The first term of Equation 4 is proportional to the density estimate computed with the kernel. The second term:

m x = i = 1 n x i g x - x i h i = 1 n g x - x i h - x

is the mean shift vector where g is the kernel, h is the kernel size, x is the mean estimate inside the kernel, and xi is the element inside the kernel. The mean shift vector, m(x), defines how the kernel will move along the density gradient towards the local maximum which corresponds with dense regions in the feature space. This calculation is performed at each data point, shifted by m(x) along the density gradient, and repeated until convergence is reached when local maximum is found. This procedure allows the mean shift clustering technique to identify such locations without having to estimate the probability density function of the associated data. Points associated with the same local maximum belong to the same cluster.

Within the mean-shift vector equation, the parameter that likely has the largest effect on the analysis is the kernel size, h, as differences in kernel sizes can change the density estimates calculated which the MSC technique is based on. While adaptive techniques do exist, a range of kernel sizes will be used to examine how the technique will behave.

Data analysis

The general approach to the proposed MSC analysis method is applying MSC to a feature space constructed using selected characteristics of the SPI generated using CCA. CCA was chosen over GLM because (1) which technique used to generate the SPI is less important for the purposes of this study; (2) the CCA technique allows easy control over the significance level while it is more difficult to do so with GLM.

The real fMRI images were processed using Analysis of Functional NeuroImages (AFNI) [24] and custom Matlab software. As part of the CCA analysis, three-dimensional motion correction was performed to minimize motion effects. All images were normalized to Talairach space ( Constant, linear, and quadratic trends were removed. To investigate the effect of a Gaussian filter on activation detection with MSC, no Gaussian filter and a Gaussian filter with full width half maximum (FWHM) of 4 mm was applied. SPIs were generated for individual subjects.


The proposed method (CCA + MSC) was compared with typical CCA and CCA plus cluster analysis (CCA + CA) procedure using the same simulated data. Activations of sizes 20 × 20, 10 × 10, and 2 × 2 were inserted onto the data to identify how the techniques behave with different sized activations. The total area of all test patterns were maintained to be the same by varying the number of inserted activations. The 2 × 2 activation size can be considered to be a simulation for highly focused activations. Gaussian white noise was generated and inserted into the simulated data at several CNRs (0.20, 0.40, 0.60, 0.80) for examination on how the technique reacts to noise. No additional smoothing filter was applied to the simulated data.

The proposed technique was assessed based on sensitivity and specificity and compared with the performances of the aforementioned techniques (CCA, CCA + CA, CCA + MSC) on simulated data. True positive rate comparisons were used to examine how the different kernel sizes affect the outcomes at various CNRs, and false positive rate comparisons were used to examine the amount of noise that appear in the results of each technique. Since simulated data was used, the ground truth is known, so identifying true and false positive rates is a simple task by comparing detected activations with true activations. Receiver operating characteristic (ROC) curves were drawn to allow a direct comparison of performance between the techniques.

The real fMRI data from ten subjects was analyzed for evaluating CCA, CCA + CA, and CCA + MSC. The three techniques are evaluated using both individual fMRI data and averaged fMRI data while controlling significance level at p = 0.01. Different statistical thresholds were determined and applied in order to set the significance level to p = 0.01 for comparison purposes. The threshold for CCA was calculated to be z = 4.8 for the filtered and z = 4.9 for the unfiltered (Bonferroni corrected). The CCA + CA used a threshold of z = 2.6 which was calculated based on a study performed by Xiong et al. [10] with a cluster of threshold of 6 voxels for FWHM = 4 mm and 4 voxels for the unfiltered dataset. The CCA + MSC used a threshold of z = 2 for both filtered and unfiltered data. The z threshold for CCA + MSC was selected based on the results of simulations to achieve an approximate significance level of p = 0.01 (see Results section).


True positive rate comparison

Using the proposed CCA + MSC technique, true positive rates were plotted to examine the behavior of the technique at various kernel sizes. The statistical threshold was applied at Z = 3. The kernel sizes used were in the range of 0.05 to 0.50. The CNRs of the simulated data are 0.20, 0.40, 0.60, and 0.80. The results show the true positive rate is dependent on the kernel size (Figure 1). For low CNRs (0.20, 0.40), the true positive rates start to decline at a kernel size of 0.25. For very high CNR (0.80), the true positive rate remains high across the range of kernel sizes used. The true positive rates also fluctuates as the kernel size is changed, but this is expected as the noise is being randomly generated and changing the kernel size likely has an effect on the cluster assignment of the voxels. Figure 1 indicates that the kernel size can be optimized for enhanced activation detection. A kernel size of 0.20 was used for the rest of the simulated data comparisons as it seems to be an appropriate choice for all CNR and activation sizes.

Figure 1
figure 1

Effect of kernel size on true positive rates for various activation sizes with simulated data. The statistical threshold was held constant at z = 3. A range of kernel sizes was used from 0.05 to 0.50. CNRs used are 0.20, 0.40, 0.60, and 0.80. A: 20 × 20 activation size. B: 10 × 10 activation size. C: 2 × 2 activation size.

False positive rate comparison

False positive rates of the proposed CCA + MSC technique were examined while varying the z threshold (0.1 to 5.0) using simulated data. The same CNRs were used. A kernel size of 0.20 was selected based on the true positive rate comparisons (Figure 1). The false positive rate shows significant improvements over CCA at all simulated activation sizes, especially at lower z thresholds (Figure 2). The false positive rate for all activation sizes except 2 × 2 are relatively flat at low z thresholds. It slightly increases as z threshold decreases until z = 0.05 where it increases beyond the figure cap of 0.05 and thus not shown. The 2 × 2 case follows the CCA curve, but an improvement was still seen. The comparison was also made without an activation map (only Gaussian noise), and the result is similar with the 20 × 20 case. Figure 3 shows representative activation maps generated by CCA and CCA + MSC with a z = 1 threshold at CNR = 0.80 with a 0.20 kernel size. It can be visually observed that there are less false activations in the case with CCA + MSC, which agrees with the results from Figure 2.

Figure 2
figure 2

Change of false positive rates at different z thresholds using various activation sizes with simulated data. Kernel size was held constant at 0.20. The z thresholds were varied from 0 to 5. CNRs used are 0.20, 0.40, 0.60, and 0.80. A: 20 × 20 activation size. B: 10 × 10 activation size. C: 2 × 2 activation size. D: Activation map consisting of noise only.

Figure 3
figure 3

Activation map of CCA and CCA + MSC. A threshold of Z = 1 was applied. CNR of 0.80 was used with kernel size of 0.20. A: CCA activation map. B: CCA + MSC activation map

ROC comparison of different kernel sizes

A comparison was made with ROC curves between CCA, CCA + CA, and the proposed CCA + MSC at the same CNRs as before, and on the same simulated data set (Figure 4). Simulated data was used for this comparison with the statistical threshold varied from Z = 0.1 to 5.0 and the kernel size being set at 0.10, 0.20, and 0.50. The ROC curves indicate that a properly chosen kernel size, in this case 0.20, show an improvement over both CCA and CCA + CA. If the kernel size (e.g., 0.50) used is too large, essentially no activations are detected using CCA + MSC. If the kernel size (e.g., 0.10) is too small, the CCA + MSC technique is inferior to CCA + CA and performs similarly with CCA. This would be consistent with the true positive rate comparison (Figure 1).

Figure 4
figure 4

ROC curves for CCA, CCA + CA, CCA + MSC with different kernel sizes and different activation sizes using simulated data. Kernel sizes are 0.10, 0.20, and 0.50. Activation sizes used are 20 × 20, 10 × 10, and 2 × 2. CNRs used are 0.20, 0.40, 0.60, and 0.80. A: 20 × 20 activation size, kernel size = 0.10 B: 20 × 20 activation size, kernel size = 0.20 C: 20 × 20 activation size, kernel size = 0.50 D: 10 × 10 activation size, kernel size = 0.10 E: 10 × 10 activation size, kernel size = 0.20 F: 10 × 10 activation size, kernel size = 0.50 G: 2 × 2 activation size, kernel size = 0.10 H: 2 × 2 activation size, kernel size = 0.20 I: 2 × 2 activation size, kernel size = 0.50.

The relative performance of CCA + MSC and CCA + CA shows a rather complex relationship. In the 20 × 20 and 10 × 10 cases at kernel size = 0.20, CCA + MSC is better than CCA + CA when the false positive rates range from 0.01 and 0.05. It is inferior to CCA + CA when the false positive rate is below 0.01. Considering that significance levels of p = 0.05 and p = 0.01 are commonly used for activation detection, CCA + MSC should show improvement over CCA + CA in a practical situation. In the 2 × 2 case, CCA + MSC appears to be better than CCA + CA up to a false positive rate of about 0.01, then becomes worse until about 0.02, and the two techniques perform similarly after that.

CCA vs CCA + MSC vs CCA + CA on real data

The real fMRI data was analyzed using CCA, CCA + CA, and CCA + MSC (Figure 5). Activations are expected to be seen on the left M1/S1 region due to the right median nerve stimulation. The CCA + MSC was applied with different kernel sizes of 0.05, 0.10, 0.15, and 0.20. Based on Figure 2, a threshold of z = 2 was applied to achieve a significance of p = 0.01. As shown in Figure 4, the expected activations can be detected using CCA + CA and CCA + MSC with or without a filter in the expected areas while no activation can be seen with standard CCA. Table 1 summarizes the activation volume and average z-scores of the averaged data of both filtered and unfiltered fMRI data. The optimal kernel size for real fMRI is 0.05 and 0.10, which is smaller than the kernel size of 0.20 for the simulated data. Both kernel sizes of 0.05 and 0.10 cases show activations with the 0.10 case showing slightly smaller activations than CCA + CA and the 0.05 case showing larger activations when compared to CCA + CA. In the cases with no filter applied, the detected activations are smaller than their filtered counterparts (Table 1) likely due to the filter enhancing CNR of the SPI, but the non-filtered results generally show the same trends as the filtered results. Individual results are summarized in Table 2 and essentially follow the same trends seen in Table 1. The results show that CCA, CCA + CA, and CCA + MSC (kernel size = 0.10) are statistically different when unfiltered (ANOVA, F = 15.4, p < 0.05) or filtered (ANOVA, F = 10.9, p < 0.05). The Tukey test further reveals that CCA + MSC is significantly better than CCA in both filtered and unfiltered cases (p = 0.05). The performance of CCA + MSC is statistically similar with CCA + CA (p = 0.05).

Figure 5
figure 5

Activation of median nerve stimulation detected with CCA, CCA + CA, CCA + MSC. Significance levels were controlled at p = 0.01 for all images. Z thresholds were changed for each technique based on the significance level. A: CCA, Z = 4.8, FWHM = 4 mm B: CCA + CA, Z = 2.6, cluster size threshold = 6 voxels, FWHM = 4 mm C: CCA + MSC, Z = 2, kernel size = 0.05, FWHM = 4 mm D: CCA + MSC, Z = 2, kernel size = 0.10, FWHM = 4 mm E: CCA + MSC, Z = 2, kernel size = 0.15, FWHM = 4 mm F: CCA + MSC, Z = 2, kernel size = 0.20, FWHM = 4 mm G: CCA, Z = 4.8, no filter applied H: CCA + CA, Z = 2.6, cluster size threshold = 4 voxels, no filter applied I: CCA + MSC, Z = 2, kernel size = 0.05, no filter applied J: CCA + MSC, Z = 2, kernel size = 0.10, no filter applied K: CCA + MSC, Z = 2, kernel size = 0.15, no filter applied L: CCA + MSC, Z = 2, kernel size = 0.20, no filter applied.

Table 1 Activation volume and average Z-scores for averaged real fMRI data
Table 2 Activation volume and average Z-score for individual subjects on real fMRI data


In this study, the adoption of MSC into fMRI analysis was examined by comparing it to CCA and CCA + CA. The ROC curves (Figure 4) indicate that the proposed MSC technique show an improvement over CCA and an improvement over CCA + CA when the false positive rate is above 0.01 in most cases. The false positive rate comparisons (Figure 2) showed significant improvement over CCA which indicates that CCA + MSC controls noise very well. This allows a lower statistical threshold to be used when identifying activations (Figure 3). Another potential benefit of MSC is for highly focused activation detection since no cluster threshold is applied.

The performance of CCA + MSC depends on the kernel size being used (Figure 1), but determining an optimal range of kernel sizes is not a trivial issue. If the kernel size used is too large, no activations would be detected. Conversely, if the kernel size used is too small, the proposed technique does not show an improvement when compared to CCA and CCA + CA. A proper kernel size needs to be used in order to see an improvement over CCA and CCA + CA. When tested with simulated data, the ROC curves and true positive comparison indicate that a kernel size of roughly 0.20 should show an improvement over CCA (Figure 4), but no activations were detected with real fMRI data if that kernel size is used. Real fMRI data showed activations at kernel size of 0.05 and 0.10 (Figure 5). The difference in optimal kernel sizes between the real and simulated data may be explained by the different noise characteristics of the data. The simulated data is the ideal situation with only Gaussian white noise. Real fMRI data will have multiple noise types such as movement artifacts, physiological noise, noise from the MRI machine itself, etc. It is likely that the optimal kernel size depends on the structure of noise in the data and needs optimized for each individual data set. Future studies are required to examine this issue in more detail.

Significant improvement is seen with CCA + MSC over CCA on the simulated data in the ROC curves in the 20 × 20 and 10 × 10 cases at a kernel size of 0.20 (Figure 4B and E), but this improvement is sudden and is concentrated in the region of false positive rates up to 0.02. While this type of improvement is typically unexpected, it is consistent with Figure 2 where false positive rates are shown to be well controlled (lower than 0.02) at a large range of z thresholds for all cases except the 2 × 2 case. This explains why the data is concentrated in the region of false positive rate up to 0.02.

The false positive rate comparison for the 2 × 2 test pattern showed a curve that is similar to the CCA curve, which does not follow the 10 × 10 and 20 × 20 cases (Figure 2). This may be due to the 2 × 2 case having many more neighboring voxels adjacent the test pattern than the other cases. Four 20 × 20 test patterns were inserted into simulated fMRI images which results in 320 neighboring voxels. To maintain the same total area of the test pattern, four hundred 2 × 2 activations were used, resulting in 3200 neighboring voxels. The 2 × 2 test image has ten times the number of voxels that are directly adjacent the inserted activations when compared to the 20 × 20 test image. It is more likely for falsely activated voxels to be detected by cluster analysis techniques if it is attached to the test pattern than when isolated. More falsely activated voxels are expected to be detected for the 2 × 2 test pattern, thus increasing the false positive rate. Regardless, the proposed technique still shows improvement when compared to CCA in the 2 × 2 case.

CCA + MSC can show an improvement over CCA + CA at a false positive rate of greater than about 0.01 with a kernel size of 0.20 (Figure 4), but this improvement is not seen in the 2 × 2 case where CCA + CA was either superior or about the same as CCA + MSC. A cluster threshold of 4 voxels was used to set the significance value at p = 0.01, which also happens to be the exact size of the simulated activations in the 2 × 2 case. If the activation size was three voxels or a different significance level was used and the voxel cluster threshold was increased, the activations would have been removed by the cluster analysis. The CCA + MSC does not have a cluster threshold and has the potential of better detecting highly focused activations.

The selection of image characteristics used in the feature space should be examined in future studies as the feature space will likely have a large effect on the results as well as the range of acceptable kernel sizes. The image characteristics selected for the featured space used in this study was the estimated Z values found in the SPI and the mean voxel values surrounding a voxel. While there are many methods of constructing a feature space, the feature space used did show an improvement over CCA in the simulated and real fMRI data at the same significance level, it is unknown what image features or what combination of image feature will produce the best feature space for fMRI analysis. The feature space used in this study does not incorporate temporal features in the data or positional features for example. There may be other types and combinations of image features that can be used, and it would certainly be an area of further examination.

The proposed technique has the limitation that the significance levels cannot be easily theoretically calculated (or at least we have not been able to come up with a method of doing so). This presents a particular drawback when doing comparisons. Significance levels can be approximated using simulations as was done in this study, but still may present some challenges when very accurate comparisons are required.


The experiment performed in this study examines the application of MSC to CCA in fMRI activation detection. The results show that an improvement with CCA + MSC can be seen over the typical CCA and CCA + CA analysis technique. The proposed technique maintains a low false positive rate which allows the use of lower statistical thresholds while controlling for noise and helps activation detection in low CNR situations. This also helps in detecting small highly focused activations especially considering that CCA + MSC does not require the application of a cluster size threshold, which is required by most cluster analysis techniques. By nature, CCA + MSC also has the ability to incorporate different image characteristics into a feature space for analysis. These benefits can help to improve activation detection in fMRI data. However, studies in the optimization in kernel size and feature space are needed to further develop the proposed technique. Despite the aforementioned limitations, the proposed technique shows promise in improving fMRI activation detection.



Blood oxygen level dependent


Cross correlation analysis


Contrast to noise ratio


Functional magnetic resonance imaging


General linear model


Mean shift clustering


Receiver operating characteristic


Statistical parametric image.


  1. Bandettini P, Jesmanowicz A, Wong E, Hyde J: Processing strategies for time-course data sets in functional MRI of the human brain. Magn Reson Med. 1993, 30 (2): 161-173. 10.1002/mrm.1910300204.

    CAS  Article  PubMed  Google Scholar 

  2. Boynton G, Engel S, Glover G, Heeger D: Linear systems analysis of functional magnetic resonance imaging in human. J Neurosci. 1996, 16 (13): 4207-4221.

    CAS  PubMed  Google Scholar 

  3. Bullmore E, Brammer M, Williams SCR, Rabe-Hesketh S, Janot N, David A, Mellers J, Howard R, Sham P: Statistical methods of estimation and inference for functional MR image analysis. Magn Reson Med. 1996, 35 (2): 261-277. 10.1002/mrm.1910350219.

    CAS  Article  PubMed  Google Scholar 

  4. Calhoun V, Adali T, McGinty V, Pekar J, Watson T, Pearlson G, fMRI Activation in a Visual-Perception Task: Network of Areas Detected Using the General Linear Model and Independent Components Analysis. NeuroImage. 2001, 14 (5): 1080-1088. 10.1006/nimg.2001.0921.

    CAS  Article  PubMed  Google Scholar 

  5. Cohen M: Parametric Analysis of fMRI Data Using Linear Systems Methods. NeuroImage. 1997, 6 (2): 93-103. 10.1006/nimg.1997.0278.

    CAS  Article  PubMed  Google Scholar 

  6. Friston K, Holmes A, Worsley K, Poline J, Frith C, Frackowaik R: Statistical parametric maps in functional imaging: a general linear approach. Hum Brain Mapp. 2004, 2 (4): 189-210.

    Article  Google Scholar 

  7. Penny W, Friston K: Mixtures of general linear models for functional neuroimaging. IEEE Trans Med Imag. 2003, 22 (4): 504-514. 10.1109/TMI.2003.809140.

    Article  Google Scholar 

  8. Foreman S, Cohen J, Fitzgerald M, Eddy W, Mintun M, Noll D: Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold. Magn Res Med. 1995, 33 (5): 636-647. 10.1002/mrm.1910330508.

    Article  Google Scholar 

  9. Worsley K, Evans A, Marrett S, Neelin P: Three-dimensional statistical analysis for CBF activation studies in human brain. J Cereb Blood Flow Metab. 1992, 12 (6): 900-918. 10.1038/jcbfm.1992.127.

    CAS  Article  PubMed  Google Scholar 

  10. Xiong J, Gao J, Lancaster J, Fox P: Clustered Pixel Analysis for Functional MRI Activation Studies of the Human Brain. Hum Brain Mapp. 1995, 4 (4): 287-301.

    Article  Google Scholar 

  11. MacQueen J: Some Methods for Classification and Analysis of Multivariate Observatoins. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967, Berkeley, USA: University of California Press, 281-297.

    Google Scholar 

  12. Bezdek J, Ehrlich R, Full W: FCM: The fuzzy C-Means Clustering Algorithm. Comput Geosci. 1984, 10 (2): 191-203.

    Article  Google Scholar 

  13. Baumgartner R, Scarth G, Teichtmeister C, Somorjai R, Moser E: Fuzzy Clustering of Gradient-Echo Functional MRI in the Human Visual Cortex Part I: reproducibility. J. Magn Reson Imag. 1997, 7 (6): 1094-1101. 10.1002/jmri.1880070623.

    CAS  Article  Google Scholar 

  14. Baumgartner R, Windischberger C, Moser E: Quantification in functional magnetic resonance imaging: fuzzy clustering vs correlation analysis. Magn Reson Imag. 1998, 16 (2): 115-125. 10.1016/S0730-725X(97)00277-4.

    CAS  Article  Google Scholar 

  15. Moser E, Diemling M, Baumgartner R: Fuzzy clustering of gradient-echo functional MRI in the human visual cortex. Part II: quantification. J Magn Reson Imag. 1997, 7 (6): 1102-1108. 10.1002/jmri.1880070624.

    CAS  Article  Google Scholar 

  16. Singh M, Patel P, Khosla D, Kim T: Segmentation of functional MRI by K-Means Clustering. IEEE Trans Nucl Sci. 1996, 43 (3): 2030-2036. 10.1109/23.507264.

    Article  Google Scholar 

  17. Fukunaga K, Hostetler L: The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition. IEEE Trans Inf Theory. 1975, 21 (1): 32-40. 10.1109/TIT.1975.1055330.

    Article  Google Scholar 

  18. Cheng Y: Mean shift, Mode seeking, and Clustering. IEEE Trans Pattern Anal Mach Intell. 1995, 17 (8): 790-799. 10.1109/34.400568.

    Article  Google Scholar 

  19. Dorin C, Peter M: Mean shift: a robust approach towards feature space analysis. IEEE Trans Pattern Anal Mach Intell. 2002, 24 (5): 603-619. 10.1109/34.1000236.

    Article  Google Scholar 

  20. Mayer A, Greenspan H: An Adaptive Mean-Shift Framework for MRI Brain Segmentation. IEEE Trans Med Imag. 2009, 28 (8): 1238-1250.

    Article  Google Scholar 

  21. Connolly C: The relationship between colour metrics and the appearance of three-dimensional coloured objects. Color Res Appl. 1996, 21: 331-337. 10.1002/(SICI)1520-6378(199610)21:5<331::AID-COL2>3.0.CO;2-Z.

    Article  Google Scholar 

  22. Wyszecki G, Stilles W: Color Science: Concepts and Methods, Quantitative Data, and Formulae. 1982, New York: J Wiley

    Google Scholar 

  23. Parzen E: On Estimation of a Probability Density Function and Mode. Ann Math Statist. 1962, 33 (3): 1065-1076. 10.1214/aoms/1177704472.

    Article  Google Scholar 

  24. Cox R: AFNI: Software for Analysis and Visualition of Functional Magnetic Resonance Neuroimages. Comput Biomed Res. 1996, 29 (3): 169-173.

    Article  Google Scholar 

Pre-publication history

Download references


Funding for this study was provided by National Institutes of Health (R21 MH 082187–01, R01 DC004290-11), the National Natural Science Foundation of China (81371640), and the International Science and Technology Cooperation Plan of Suzhou, China (SH201210). None of these organizations had a further role in study design, data collection, data analysis, data interpretation, the writing of this manuscript, or the decision to submit the manuscript for publication.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Leo Ai.

Additional information

Competing interests

None of the authors have any conflict of interest or competing interests with regard to the findings presented in this manuscript, financial, or otherwise.

Authors’ contributions

Study design; LA, GX, JX; Data collection: LA, JX; Analysis; LA, JX; Interpretation; LA, GX, JX. All authors read and approved the final manuscript.

Authors’ original submitted files for images

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Cite this article

Ai, L., Gao, X. & Xiong, J. Application of mean-shift clustering to Blood oxygen level dependent functional MRI activation detection. BMC Med Imaging 14, 6 (2014).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Mean-shift
  • fMRI
  • BOLD
  • Clustering