How to implement multi-class semantic segmentation?

Tags: python, machine-learning, deep-learning, keras, image-segmentation

I'm able to train a U-net with labeled images that have a binary classification.

But I'm having a hard time figuring out how to configure the final layers in Keras/Theano for multi-class classification (4 classes).

I have 634 images and 634 corresponding masks that are uint8 and 64 x 64 pixels.

My masks, instead of being black (0) and white (1), have color-labeled objects in 3 categories plus background, as follows:

black (0), background
red (1), object class 1
green (2), object class 2
yellow (3), object class 3

Before training runs, the array containing masks is one-hot encoded as follows:

mask_train = to_categorical(mask_train, 4)

This makes mask_train.shape go from (634, 1, 64, 64) to (2596864, 4).

My model closely follows the U-Net architecture. However, the final layers seem problematic: I'm unable to flatten the structure so that it matches the one-hot encoded array.

[...]
up3 = concatenate([UpSampling2D(size=(2, 2))(conv7), conv2], axis=1)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(up3)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv8)

up4 = concatenate([UpSampling2D(size=(2, 2))(conv8), conv1], axis=1)
conv9 = Conv2D(64, (3, 3), activation='relu', padding='same')(up4)
conv10 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv9)

# here I used number classes = number of filters and softmax although
# not sure if a dense layer should be here instead
conv11 = Conv2D(4, (1, 1), activation='softmax')(conv10)

model = Model(inputs=[inputs], outputs=[conv11])

# here categorical cross entropy is being used but may not be correct
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

return model

Do you have any suggestions on how to modify the final portions of the model so this trains successfully? I get a variety of shape mismatch errors, and the few times I managed to make it run, the loss did not change throughout epochs.

asked May 10 '17 18:05

pepe

2 Answers

Your target should have shape (634, 4, 64, 64) if you're using channels_first, or (634, 64, 64, 4) if you're using channels_last.

Each channel of your target should be one class. Each channel is an image of 0's and 1's, where 1 means that pixel is that class and 0 means that pixel is not that class.

So your target is 634 groups, each containing four images of 64x64 pixels, where a pixel value of 1 indicates the presence of that class.
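
As an illustration of that layout, here is a minimal NumPy sketch on a toy batch (the 2 x 3 x 3 label array and the variable names are made up for the example, not taken from the question). It builds the one-hot target by indexing an identity matrix with the integer labels, then moves the class axis for channels_first:

```python
import numpy as np

NUM_CLASSES = 4  # background + 3 object classes, as in the question

# Hypothetical toy batch: 2 masks of 3x3 integer labels in {0, 1, 2, 3}.
labels = np.array([
    [[0, 1, 1],
     [0, 2, 2],
     [3, 3, 0]],
    [[0, 0, 0],
     [1, 1, 1],
     [2, 3, 2]],
], dtype=np.uint8)

# Indexing an identity matrix with the labels gives channels_last one-hot
# targets: shape (samples, height, width, classes).
target_last = np.eye(NUM_CLASSES, dtype=np.float32)[labels]

# Move the class axis to position 1 for channels_first:
# shape (samples, classes, height, width).
target_first = np.moveaxis(target_last, -1, 1)

print(target_last.shape)   # (2, 3, 3, 4)
print(target_first.shape)  # (2, 4, 3, 3)
```

Each pixel has exactly one channel set to 1, so summing over the class axis gives 1 everywhere, and taking argmax over that axis recovers the original labels.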

I'm not sure the result will be ordered correctly, but you can try:

import numpy as np
from keras.utils import to_categorical

mask_train = to_categorical(mask_train, 4)
# to_categorical puts the classes last, (2596864, 4), so reshape to channels_last first
mask_train = mask_train.reshape((634, 64, 64, 4))

# then move the class axis to position 1 for channels_first
mask_train = np.moveaxis(mask_train, -1, 1)
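
One way to check the ordering concern is to take the argmax over the class axis and compare it with the original integer mask. The sketch below does this on small random data. It does not call Keras: one_hot_flat is a hypothetical NumPy stand-in that mimics the flattening behaviour reported in the question, and the array sizes are made up.

```python
import numpy as np

def one_hot_flat(y, num_classes):
    """Hypothetical stand-in for the flattening behaviour described in the
    question: returns an array of shape (y.size, num_classes)."""
    flat = np.asarray(y, dtype=int).ravel()
    out = np.zeros((flat.size, num_classes), dtype=np.float32)
    out[np.arange(flat.size), flat] = 1.0
    return out

rng = np.random.default_rng(0)
# Small random masks shaped like the question's data: (samples, 1, H, W).
mask = rng.integers(0, 4, size=(5, 1, 8, 8)).astype(np.uint8)

encoded = one_hot_flat(mask, 4)          # (5 * 8 * 8, 4)
encoded = encoded.reshape((5, 8, 8, 4))  # channels_last
encoded = np.moveaxis(encoded, -1, 1)    # channels_first: (5, 4, 8, 8)

# argmax over the class axis should give back the original labels.
recovered = encoded.argmax(axis=1)
ordering_ok = np.array_equal(recovered, mask[:, 0])
print(ordering_ok)
```

Under these assumptions the reshape preserves the pixel order, because the flattening is row-major and the singleton channel axis does not change it. Running the same argmax comparison on the real mask_train is a cheap sanity check before training.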

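If the ordering doesn't work properly, you can build the target manually from the original integer masks of shape (634, 1, 64, 64), before to_categorical is applied:

newMask = np.zeros((634, 4, 64, 64))

for samp in range(len(mask_train)):
    im = mask_train[samp, 0]          # 64 x 64 image of class labels
    for x in range(len(im)):
        row = im[x]
        for y in range(len(row)):
            y_val = int(row[y])       # class label of this pixel (0..3)
            newMask[samp, y_val, x, y] = 1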
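
The nested loops visit every pixel in Python, which can be slow on larger datasets. A possible vectorized alternative is to compare the integer mask against the class indices with broadcasting. The sketch below uses hypothetical toy data (mask_int and its sizes are invented for the example) and checks that both approaches agree:

```python
import numpy as np

NUM_CLASSES = 4
rng = np.random.default_rng(1)
# Hypothetical toy stand-in for the integer masks: (samples, 1, H, W).
mask_int = rng.integers(0, NUM_CLASSES, size=(3, 1, 4, 4)).astype(np.uint8)

# Loop version, following the manual approach above.
loop_mask = np.zeros((3, NUM_CLASSES, 4, 4), dtype=np.float32)
for samp in range(len(mask_int)):
    im = mask_int[samp, 0]
    for x in range(im.shape[0]):
        for y in range(im.shape[1]):
            loop_mask[samp, im[x, y], x, y] = 1

# Vectorized version: compare every pixel against each class index.
# classes has shape (1, 4, 1, 1) and broadcasts against (3, 1, 4, 4),
# giving a channels_first result of shape (3, 4, 4, 4).
classes = np.arange(NUM_CLASSES).reshape(1, NUM_CLASSES, 1, 1)
vec_mask = (mask_int == classes).astype(np.float32)

print(np.array_equal(loop_mask, vec_mask))
```

The broadcasting form relies on the mask keeping its singleton channel axis at position 1; with a different layout the reshape of classes would need to change accordingly.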

answered Sep 20 '22 16:09

Daniel Möller

Bit late but you should try

mask_train = to_categorical(mask_train, num_classes=None)

That will result in (634, 4, 64, 64) for mask_train.shape and a binary mask for each individual class (one-hot encoded).

The last conv layer, activation, and loss look fine for multi-class segmentation.

answered Sep 17 '22 16:09

Daniel
