I would like a canonical answer on the best way to convert input RGB images to grayscale in Keras. This answer hints that such a thing would be best achieved with a Lambda layer, but that feels inefficient to me. It seems to me that average pooling layers should be able to do the trick, but I can't figure out how. Is there an RGB-to-grayscale layer that I am just missing in the docs? It seems like a pretty commonplace operation.
RGB images are converted to grayscale using the formula gray = (red + green + blue) / 3, or gray = 0.299 * red + 0.587 * green + 0.114 * blue if "Weighted RGB to Grayscale Conversion" is checked in Edit > Options > Conversions.
There are a few formulas for transforming a color image into a grayscale image. They're well defined, and the choice often depends on whether you'd like brighter or darker results, better contrast, etc.
Three common formulas are here. Let's take the "luminosity" formula.
result = 0.21 R + 0.72 G + 0.07 B
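As a quick worked example of the luminosity formula above, here is a minimal plain-Python sketch. The function names and the sample pixel are made up for illustration; the plain average is included only for comparison.

```python
def luminosity(r, g, b):
    """Weighted gray value using the luminosity weights quoted above."""
    return 0.21 * r + 0.72 * g + 0.07 * b


def average(r, g, b):
    """Unweighted mean of the three channels, for comparison."""
    return (r + g + b) / 3


# A pure green pixel comes out much brighter under the luminosity
# weights than under a plain average, because green carries the
# largest weight.
pure_green = (0, 255, 0)
gray_lum = luminosity(*pure_green)  # 0.72 * 255
gray_avg = average(*pure_green)     # 255 / 3
```

Because the three weights sum to 1, a neutral pixel (equal R, G and B) keeps its value under either formula.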
The straightforward way to do this is a Lambda layer. And it's not inefficient, it's just necessary math.
def converter(x):
    # x has shape (batch, height, width, channels), channels ordered R, G, B
    return (0.21 * x[:,:,:,:1]) + (0.72 * x[:,:,:,1:2]) + (0.07 * x[:,:,:,-1:])
Add this lambda layer to the model:
Lambda(converter)
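For context, here is a minimal sketch of how such a layer could sit at the front of a model. It assumes TensorFlow's bundled Keras (`tf.keras`) and channels-last RGB input; the 32x32 input size and the all-ones test batch are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf


def converter(x):
    # x has shape (batch, height, width, channels), channels ordered R, G, B.
    # Slicing with ranges (e.g. 1:2) keeps the channel axis, so the result
    # has shape (batch, height, width, 1).
    return 0.21 * x[:, :, :, :1] + 0.72 * x[:, :, :, 1:2] + 0.07 * x[:, :, :, -1:]


inputs = tf.keras.Input(shape=(32, 32, 3))
gray = tf.keras.layers.Lambda(converter)(inputs)
# Further layers (Conv2D, Dense, ...) would be stacked on `gray` here.
model = tf.keras.Model(inputs, gray)

batch = np.ones((2, 32, 32, 3), dtype="float32")
out = np.asarray(model(batch))
```

The layer has no trainable weights, so it adds nothing to the parameter count; it only changes the channel dimension from 3 to 1.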
Although AveragePooling may look like the way to go, these layers are meant to reduce the "spatial" dimensions, not the "channels". You'd need a lot of workarounds and reshaping to make one of these pooling layers apply to the channels.
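To make the shape argument concrete, here is a small NumPy sketch. NumPy stands in for the Keras layers, and `avg_pool_2x2` is a hypothetical helper written for this illustration, not a Keras function. Spatial average pooling shrinks height and width but keeps all three channels, whereas grayscale conversion needs the opposite.

```python
import numpy as np


def avg_pool_2x2(x):
    """Non-overlapping 2x2 average pooling over the spatial axes of a
    (batch, height, width, channels) array with even height and width."""
    b, h, w, c = x.shape
    # Split each spatial axis into (blocks, 2) and average inside each block.
    return x.reshape(b, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))


images = np.random.rand(4, 8, 8, 3)

# Pooling reduces height and width; the channel axis is untouched.
pooled = avg_pool_2x2(images)

# Grayscale conversion keeps height and width and collapses the channels
# (an unweighted mean here, just to show the shape).
gray = images.mean(axis=-1, keepdims=True)
```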
If you prefer to use a ready-made function from TensorFlow, again use a Lambda layer, now with this function, based on the answer you linked:
Lambda(lambda x: tf.image.rgb_to_grayscale(x))
Other options for converter:
# perhaps faster? perhaps slower?
from keras import backend as K

def converter(x):
    # the weights broadcast against (batch, height, width, channels)
    weights = K.constant([[[[0.21, 0.72, 0.07]]]])
    return K.sum(x * weights, axis=-1, keepdims=True)
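The slicing version and the weighted-sum version of `converter` should produce the same values. A quick way to check that is to replay both in NumPy, which follows the same slicing and broadcasting rules. This is only a sketch: the function names below are made up, and it says nothing about which version is faster inside Keras.

```python
import numpy as np


def converter_slices(x):
    # Weighted sum built from three single-channel slices.
    return 0.21 * x[:, :, :, :1] + 0.72 * x[:, :, :, 1:2] + 0.07 * x[:, :, :, -1:]


def converter_sum(x):
    # Same weights, broadcast over (batch, height, width, channels)
    # and summed along the channel axis.
    weights = np.array([0.21, 0.72, 0.07]).reshape(1, 1, 1, 3)
    return np.sum(x * weights, axis=-1, keepdims=True)


x = np.random.rand(2, 5, 5, 3)
a = converter_slices(x)
b = converter_sum(x)
same = np.allclose(a, b)
```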
As Stepan Novikov commented, if your idea is simply to preprocess images, you can use other tools and avoid the trouble.
You only need to do this inside the model if it's important to you to keep track of the gradients in this operation.
There's a much easier way in Keras >= 2.1.6 to convert between RGB and grayscale. When you are augmenting your image data with the ImageDataGenerator class, you can use its flow_from_directory method to create a generator object, which can then be used to train your model with the fit_generator method.
What is great about the flow_from_directory method is that it has several parameters to do more image processing, one of which is color_mode which can be set to 'rgb' or 'grayscale'. I'm not sure why this parameter is included in the generator object and not in the ImageDataGenerator object parameters but it does the trick.
If you are willing to put in a small effort to set up a generator (docs: https://keras.io/preprocessing/image/#imagedatagenerator-methods), this and several other useful preprocessing parameters become available.