
Table 4 Median Spearman correlation coefficients of visual scores and performance measure rankings

From: An evaluation of performance measures for arterial brain vessel segmentation

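Column groups: (1) overall correlation results (visual scores 1–10); (2) correlation results of good quality simulated segmentation variations (visual scores 1–5); (3) correlation results of bad quality simulated segmentation variations (visual scores 6–10).

| Rank | Performance measure (overall) | rho | Rank | Performance measure (good quality) | rho | Rank | Performance measure (bad quality) | rho |
|---|---|---|---|---|---|---|---|---|
| 1 | bAHD | 0.956 | 1 | bAHD | 0.817 | 1 | bAHD | 0.894 |
| 2 | AHD | 0.950 | 2 | AHD | 0.800 | 2 | AHD | 0.880 |
| 3 | RI | 0.936 | 3 | VOI | 0.758 | 3 | VOI | 0.872 |
| 3 | ACC | 0.936 | 4 | GCE | 0.757 | 3 | GCE | 0.872 |
| 3 | GCE | 0.936 | 5 | ACC | 0.754 | 5 | ARI | 0.865 |
| 3 | VOI | 0.936 | 5 | RI | 0.754 | 5 | ACC | 0.865 |
| 7 | ARI | 0.932 | 7 | KAP | 0.742 | 5 | RI | 0.865 |
| 7 | KAP | 0.932 | 7 | ARI | 0.742 | 8 | KAP | 0.864 |
| 7 | PBD | 0.932 | 7 | PBD | 0.742 | 8 | PBD | 0.864 |
| 7 | DICE | 0.932 | 7 | DICE | 0.742 | 8 | DICE | 0.864 |
| 7 | ICC | 0.932 | 7 | ICC | 0.742 | 8 | JAC | 0.864 |
| 7 | JAC | 0.932 | 7 | JAC | 0.742 | 8 | CNF | 0.864 |
| 7 | CNF | 0.932 | 7 | CNF | 0.742 | 8 | ICC | 0.864 |
| 14 | PRC | 0.858 | 14 | PRC | 0.709 | 14 | PRC | 0.802 |
| 15 | SP | 0.820 | 15 | SP | 0.683 | 15 | SP | 0.714 |
| 15 | SB | 0.820 | 15 | SB | 0.683 | 15 | SB | 0.714 |
| 17 | MI | 0.755 | 17 | MHD | 0.621 | 17 | VS | 0.532 |
| 18 | MHD | 0.728 | 18 | MI | 0.595 | 18 | MI | 0.426 |
| 19 | VS | 0.722 | 19 | VS | 0.555 | 19 | MHD | 0.343 |
| 20 | HD95 | 0.418 | 20 | HD95 | 0.359 | 20 | HD95 | 0.259 |
| 21 | AUC | 0.378 | 21 | AUC | 0.271 | 21 | AUC | 0.142 |
| 22 | SNS | 0.314 | 22 | SNS | 0.212 | 22 | SNS | 0.104 |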
  1. The median correlations between visual scores and performance measure rankings are given for the 10 patients. In addition to the overall results analysed over all visual scores from 1 to 10 (column 1), the results of two subsets based on the lower (1–5) and upper (6–10) ranges of the visual scores are reported (columns 2 and 3, respectively). The performance measures are sorted by their Spearman correlation coefficient from highest to lowest. Balanced average Hausdorff distance (bAHD) and average Hausdorff distance (AHD) perform best in the overall analysis as well as in the good and bad quality subsets. In the good quality subset, the difference between the average distance-based measures (bAHD and AHD) and the overlap-based measures is more prominent than in the bad quality subset. This can be explained by the relative inability of overlap-based measures to distinguish between certain types of errors, as shown in Fig. 4; this inability becomes more evident in segmentations of good quality. The overlap-based measures Dice, Jaccard and Conformity have the same correlation in all analyses. Note that the overall correlation results are inherently higher than those of the two subsets because the score range underlying all segmentations (1–10) is wider than the score ranges of the subsets (1–5 and 6–10, respectively). rho: median Spearman correlation coefficient
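
The statistic reported in the table can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`average_ranks`, `spearman_rho`, `median_rho`) and all data values are hypothetical, and it assumes that a lower visual score and a lower measure value both indicate a better segmentation. It computes one Spearman rho per patient and takes the median, then shows why a narrower score range tends to give a lower rho: the same single rank swap costs more among 5 segmentations than among 10.

```python
from statistics import median

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def median_rho(visual_scores, values_per_patient):
    """Median over patients of the per-patient Spearman rho."""
    return median(spearman_rho(visual_scores, v) for v in values_per_patient)

# Hypothetical data: visual scores 1..10 for ten simulated segmentations,
# and one made-up distance-like measure value per segmentation and patient.
visual_scores = list(range(1, 11))
perfect = [0.1 * s for s in visual_scores]   # ranks agree exactly
one_swap = [0.2, 0.1] + perfect[2:]          # the two best segmentations swapped
reversed_order = perfect[::-1]               # ranks fully disagree

overall = median_rho(visual_scores, [perfect, one_swap, reversed_order])

# Range restriction: the same swap evaluated on scores 1-5 only.
rho_full_range = spearman_rho(visual_scores, one_swap)
rho_good_subset = spearman_rho(visual_scores[:5], one_swap[:5])

print(overall, rho_full_range, rho_good_subset)
```

In this toy data the subset rho is lower than the full-range rho for the same ranking error, which mirrors the footnote's remark that the overall results are inherently higher than the subset results.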