Converting Images to Black and White for Image Recognition in R

I'm trying to gain some experience with automatic text recognition, and I'm using the tesseract package to perform OCR on some images (i.e. some screenshots I took).

To improve recognition of the prices in the image below, I preprocessed the image with the magick package, increasing its contrast by adjusting the brightness and saturation parameters.

However, I think the performance could be further increased by converting to a black and white image.

How can this be efficiently achieved in R?

Original image: [original image]

After preprocessing: [image after my preprocessing]
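
For reference, preprocessing of this kind might look like the following minimal sketch. It assumes magick's image_modulate and image_contrast; the parameter values and the synthetic stand-in image are illustrative only, not the settings actually used on the screenshot.

```r
library(magick)

# Hypothetical stand-in for the screenshot: a plain coloured canvas
img <- image_blank(width = 100, height = 40, color = "steelblue")

# Raise brightness and saturation (both are percentages, 100 = unchanged),
# then apply one step of contrast enhancement
pre <- img %>%
  image_modulate(brightness = 120, saturation = 150) %>%
  image_contrast(sharpen = 1)

image_info(pre)
```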

Francesco Dal Pont asked Jan 31 '18 22:01




1 Answer

You can convert the colorspace with magick::image_quantize:

library(magick)
#> Linking to ImageMagick 6.9.9.25
#> Enabled features: cairo, fontconfig, freetype, fftw, lcms, pango, rsvg, webp
#> Disabled features: ghostscript, x11

i <- image_read('https://i.stack.imgur.com/nn9k0.png')

i

i %>% image_quantize(colorspace = 'gray')

Depending on your desired image structure, you could also use image_convert to do the same thing:

i %>% image_convert(colorspace = 'gray')
# or
i %>% image_convert(type = 'Grayscale')

or to convert to true black and white (not grayscale),

i %>% image_convert(type = 'Bilevel')

which in this case returns an image with salt and pepper noise, which may or may not be useful.
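
If the salt-and-pepper noise from Bilevel is a problem, one alternative worth trying is an explicit threshold with image_threshold. The following is a minimal sketch, not a tuned recipe: it uses a synthetic two-tone image in place of the screenshot so it runs without network access, and the 50% cutoff is an assumed value that would need adjusting for real input.

```r
library(magick)

# Synthetic stand-in for the screenshot: a dark half next to a light half
dark  <- image_blank(width = 50, height = 50, color = "gray30")
light <- image_blank(width = 50, height = 50, color = "gray80")
img   <- image_append(c(dark, light))

# Convert to grayscale, then threshold from both sides:
# values below 50% are set to black, values above 50% are set to white
bw <- img %>%
  image_convert(type = "Grayscale") %>%
  image_threshold(type = "black", threshold = "50%") %>%
  image_threshold(type = "white", threshold = "50%")

bw
```

With the real screenshot, replace img with i from above. The thresholded result can be written out with image_write() and passed to tesseract::ocr(); whether it actually improves recognition depends on the image.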

Note, however, that while this might be good practice for OCR, it would be a lot simpler to get this data by web scraping, e.g. with rvest, should that be permissible (presumably the same issues apply to grabbing these images). Better still, if it contains the information you need, use the appropriate Ryanair API.

alistaire answered Sep 20 '22 14:09
