
Pandas - image to DataFrame

I want to convert an RGB image into a DataFrame, so that I have the coordinates of each pixel and its RGB value.

         x   y   red  green  blue
0        0   0   154      0     0
1        1   0   149    111     0
2        2   0   153      0     5
3        0   1   154      0     9
4        1   1   154     10    10
5        2   1   154      0     0

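I can extract the RGB values into a DataFrame quite easily:

import numpy as np
import pandas as pd
from PIL import Image

colourImg = Image.open("test.png")
colourPixels = colourImg.convert("RGB")
# getdata() yields one (R, G, B) tuple per pixel, row by row
colourArray = np.array(colourPixels.getdata())

df = pd.DataFrame(colourArray, columns=["red","green","blue"])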

But I don't know how to get the X & Y coordinates in there. I could write a loop, but on a large image that takes a long time.

Terence Eden asked Apr 04 '18 10:04




1 Answer

Try using np.indices. Unfortunately, it returns an array where the coordinate is the first dimension, but you can use np.moveaxis to fix that.
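As a quick illustration of that axis order, here is a sketch on a made-up 2x3 grid (the names height, width, grid and coords are illustrative only and do not come from the answer):

```python
import numpy as np

height, width = 2, 3  # hypothetical grid size, chosen only for illustration

# np.indices returns shape (2, height, width): axis 0 selects row or column index
grid = np.indices((height, width))

# Move the coordinate axis to the end so each cell holds its own (row, col) pair
coords = np.moveaxis(grid, 0, 2)

print(grid.shape)    # (2, 2, 3)
print(coords.shape)  # (2, 3, 2)
print(coords[1, 2])  # [1 2]
```

After the move, the coordinate array has the same leading (rows, columns) layout as an image array, which is what lets the two be stacked side by side.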

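import numpy as np
import pandas as pd
from PIL import Image

colourImg = Image.open("test.png")
colourPixels = colourImg.convert("RGB")
# Image.size is (width, height), but getdata() is row-major, so reshape to (height, width, 3)
colourArray = np.array(colourPixels.getdata()).reshape(colourImg.size[::-1] + (3,))
# np.indices puts the coordinate axis first; move it last to get (height, width, 2)
indicesArray = np.moveaxis(np.indices(colourImg.size[::-1]), 0, 2)
# Stack (y, x) next to (R, G, B), then flatten to one row per pixel
allArray = np.dstack((indicesArray, colourArray)).reshape((-1, 5))

df = pd.DataFrame(allArray, columns=["y", "x", "red","green","blue"])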

It's not the prettiest, but it seems to work (edit: fixed x,y being the wrong way around).
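If the x, y column order from the question matters, one alternative is np.meshgrid. This is only a sketch, not the answerer's method: pixels_to_frame and demo are hypothetical names, and a small synthetic array stands in for the result of np.array(Image.open("test.png").convert("RGB")), which is assumed to have shape (height, width, 3).

```python
import numpy as np
import pandas as pd


def pixels_to_frame(rgb):
    """Flatten a (height, width, 3) RGB array into one row per pixel."""
    height, width = rgb.shape[:2]
    # Default "xy" indexing gives two (height, width) arrays; x varies along each row
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    return pd.DataFrame({
        "x": xs.ravel(),
        "y": ys.ravel(),
        "red": rgb[..., 0].ravel(),
        "green": rgb[..., 1].ravel(),
        "blue": rgb[..., 2].ravel(),
    })


# Synthetic image with 2 rows and 3 columns, used in place of a real file
demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
frame = pixels_to_frame(demo)
print(frame)
```

Building the frame from a dict of columns keeps the colour channels as uint8 instead of upcasting everything to the integer type of the coordinates, which can matter for memory on large images.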

davidsheldon answered Oct 02 '22 23:10
