Table 3 Detailed network configuration of U-Net GAN discriminator and U-Net GAN + SA discriminator

From: A method for improving semantic segmentation using thermographic images in infants

Layers

Output size

U-Net GAN

U-Net GAN + SA

Input

320 × 256 × 1

  

Convolution

320 × 256 × 8

3 × 3, 8 d

3 × 3, 8 d

Downscale

160 × 128 × 16

5 × 5, 16 d, CBR

3 × 3, 16 d, CBR

1 × 1, 16 d

7 × 7, 16 d, SA

1 × 1, 16 d

Downscale

80 × 64 × 32

5 × 5, 32 d, CBR

3 × 3, 32 d, CBR

1 × 1, 32 d

7 × 7, 32 d, SA

1 × 1, 32 d

Downscale

40 × 32 × 64

5 × 5, 64 d, CBR

3 × 3, 64 d, CBR

1 × 1, 64 d

7 × 7, 64 d, SA

1 × 1, 64 d

Encoder out (\(D_{enc} \left( x \right)\))

5

ReLU

Average Pooling

Linear, 5 d

ReLU

Average Pooling

Linear, 5 d

Upscale

80 × 64 × 32

5 × 5, 32 d, CBR

3 × 3, 32 d, CBR

1 × 1, 32 d

7 × 7, 32 d, SA

1 × 1, 32 d

Upscale

160 × 128 × 16

5 × 5, 16 d, CBR

3 × 3, 16 d, CBR

1 × 1, 16 d

7 × 7, 16 d, SA

1 × 1, 16 d

Upscale

320 × 256 × 8

5 × 5, 8 d, CBR

3 × 3, 8 d, CBR

1 × 1, 8 d

7 × 7, 8 d, SA

1 × 1, 8 d

Convolution (\(D_{dec} \left( x \right)\))

320 × 256 × 2

3 × 3, 2 d

3 × 3, 2 d
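
The output sizes in the table follow a regular pattern, which can be checked with a short sketch. It assumes that each Downscale halves both spatial dimensions and doubles the channel count, and that each Upscale reverses this. The function name `discriminator_shapes` and its parameters are hypothetical, introduced only for illustration. The sketch traces sizes only: it does not implement the CBR or SA blocks, and it leaves out the 5-dimensional encoder output branch.

```python
def discriminator_shapes(size=(320, 256), base_channels=8, depth=3, out_channels=2):
    """Trace feature-map sizes through a symmetric encoder-decoder.

    Assumption (not stated explicitly in the table): every Downscale halves
    both spatial dimensions and doubles the channels; every Upscale does the
    opposite. Sizes are written in the same order as the table.
    """
    d1, d2 = size
    ch = base_channels
    # Single-channel input followed by the first 3 x 3 convolution.
    stages = [("Input", (d1, d2, 1)), ("Convolution", (d1, d2, ch))]

    # Encoder path: three Downscale stages.
    for _ in range(depth):
        d1, d2, ch = d1 // 2, d2 // 2, ch * 2
        stages.append(("Downscale", (d1, d2, ch)))

    # Decoder path: three Upscale stages mirroring the encoder.
    for _ in range(depth):
        d1, d2, ch = d1 * 2, d2 * 2, ch // 2
        stages.append(("Upscale", (d1, d2, ch)))

    # Final 3 x 3 convolution producing the two-channel decoder output.
    stages.append(("Convolution", (d1, d2, out_channels)))
    return stages


if __name__ == "__main__":
    for name, shape in discriminator_shapes():
        print(f"{name:<12} {shape[0]} x {shape[1]} x {shape[2]}")
```

Under these assumptions the trace reproduces every output size listed in the table, including a final Upscale stage of 320 × 256 × 8.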