Table 4 Median Spearman correlation coefficients of visual scores and performance measure rankings

From: An evaluation of performance measures for arterial brain vessel segmentation

Column groups:
(1) Overall correlation results (visual scores 1–10)
(2) Correlation results of good quality simulated segmentation variations (visual scores 1–5)
(3) Correlation results of bad quality simulated segmentation variations (visual scores 6–10)

| Rank (1) | Performance measure (1) | rho (1) | Rank (2) | Performance measure (2) | rho (2) | Rank (3) | Performance measure (3) | rho (3) |
|---|---|---|---|---|---|---|---|---|
| 1 | bAHD | 0.956 | 1 | bAHD | 0.817 | 1 | bAHD | 0.894 |
| 2 | AHD | 0.950 | 2 | AHD | 0.800 | 2 | AHD | 0.880 |
| 3 | RI | 0.936 | 3 | VOI | 0.758 | 3 | VOI | 0.872 |
| 3 | ACC | 0.936 | 4 | GCE | 0.757 | 3 | GCE | 0.872 |
| 3 | GCE | 0.936 | 5 | ACC | 0.754 | 5 | ARI | 0.865 |
| 3 | VOI | 0.936 | 5 | RI | 0.754 | 5 | ACC | 0.865 |
| 7 | ARI | 0.932 | 7 | KAP | 0.742 | 5 | RI | 0.865 |
| 7 | KAP | 0.932 | 7 | ARI | 0.742 | 8 | KAP | 0.864 |
| 7 | PBD | 0.932 | 7 | PBD | 0.742 | 8 | PBD | 0.864 |
| 7 | DICE | 0.932 | 7 | DICE | 0.742 | 8 | DICE | 0.864 |
| 7 | ICC | 0.932 | 7 | ICC | 0.742 | 8 | JAC | 0.864 |
| 7 | JAC | 0.932 | 7 | JAC | 0.742 | 8 | CNF | 0.864 |
| 7 | CNF | 0.932 | 7 | CNF | 0.742 | 8 | ICC | 0.864 |
| 14 | PRC | 0.858 | 14 | PRC | 0.709 | 14 | PRC | 0.802 |
| 15 | SP | 0.820 | 15 | SP | 0.683 | 15 | SP | 0.714 |
| 15 | SB | 0.820 | 15 | SB | 0.683 | 15 | SB | 0.714 |
| 17 | MI | 0.755 | 17 | MHD | 0.621 | 17 | VS | 0.532 |
| 18 | MHD | 0.728 | 18 | MI | 0.595 | 18 | MI | 0.426 |
| 19 | VS | 0.722 | 19 | VS | 0.555 | 19 | MHD | 0.343 |
| 20 | HD95 | 0.418 | 20 | HD95 | 0.359 | 20 | HD95 | 0.259 |
| 21 | AUC | 0.378 | 21 | AUC | 0.271 | 21 | AUC | 0.142 |
| 22 | SNS | 0.314 | 22 | SNS | 0.212 | 22 | SNS | 0.104 |

  1. The median correlations between visual scores and performance measure rankings across the 10 patients are given. In addition to the overall results analysed over all visual scores from 1 to 10 (column 1), results are reported for two subsets based on the lower (1–5) and upper (6–10) ranges of the visual scores (columns 2 and 3, respectively). The performance measures are sorted by their Spearman correlation coefficient from highest to lowest. Balanced average Hausdorff distance (bAHD) and average Hausdorff distance (AHD) perform best in the overall analysis as well as in the good- and bad-quality subsets. In the good-quality subset, the difference between the average distance-based measures (bAHD and AHD) and the overlap-based measures is more prominent than in the bad-quality subset. This can be explained by the relative inability of overlap-based measures to distinguish between certain types of errors, as shown in Fig. 4; this inability becomes more evident in segmentations of good quality. The overlap-based measures (Dice, Jaccard, Conformity) have identical correlations to one another in every analysis. Note that the overall correlation results are inherently higher than those of the two subsets because the underlying score range of all segmentations (1–10) is wider than the score ranges of the subsets (1–5 and 6–10, respectively). rho: median Spearman correlation coefficient.
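
The statistic described in the footnote can be illustrated with a minimal sketch. It assumes that, for each patient, a performance measure produces a ranking of the ten segmentation variations, that this ranking is correlated with the visual scores using Spearman's rho, and that the median is then taken across patients. The function names and all rankings below are hypothetical and chosen only for illustration; they are not data from the study, and the way the subsets are re-ranked here is an assumption rather than the authors' documented procedure.

```python
from statistics import mean, median


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def median_rho(visual_scores, rankings_per_patient):
    """Median over patients of the per-patient Spearman correlation."""
    return median(spearman_rho(visual_scores, r) for r in rankings_per_patient)


# Hypothetical example: visual scores 1-10 and one measure's rankings
# of the same ten segmentation variations for three made-up patients.
visual_scores = list(range(1, 11))
rankings_per_patient = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [2, 1, 3, 4, 5, 6, 7, 8, 10, 9],
    [1, 3, 2, 5, 4, 6, 8, 7, 9, 10],
]

overall = median_rho(visual_scores, rankings_per_patient)

# Restricting the second patient to the lower (1-5) or upper (6-10) score
# range: the same neighbouring swaps weigh more heavily in a shorter range.
patient = rankings_per_patient[1]
rho_full = spearman_rho(visual_scores, patient)
rho_good = spearman_rho(visual_scores[:5], patient[:5])
rho_bad = spearman_rho(visual_scores[5:], patient[5:])
```

In this toy case a single swap of neighbouring ranks lowers rho more within a five-item range than within the full ten-item range, which is consistent with the footnote's remark that the overall correlations are inherently higher than those of the two subsets.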