I want to convert an RGB image into a DataFrame, so that I have the co-ordinates of each pixel and their RGB value.
   x  y  red  green  blue
0  0  0  154      0     0
1  1  0  149    111     0
2  2  0  153      0     5
3  0  1  154      0     9
4  1  1  154     10    10
5  2  1  154      0     0
I can extract the RGB values into a DataFrame quite easily:
from PIL import Image
import numpy as np
import pandas as pd

colourImg = Image.open("test.png")
colourPixels = colourImg.convert("RGB")
colourArray = np.array(colourPixels.getdata())
df = pd.DataFrame(colourArray, columns=["red","green","blue"])
But I don't know how to get the X & Y coordinates in there. I could write a loop, but on a large image that takes a long time.
Try using np.indices. Unfortunately, it returns an array whose first dimension is the coordinate axis, but you can use np.moveaxis to move that axis to the end.
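from PIL import Image
import numpy as np
import pandas as pd

colourImg = Image.open("test.png")
colourPixels = colourImg.convert("RGB")
# PIL's size is (width, height), but getdata() yields pixels row by row,
# so the array has to be shaped (height, width, 3).
shape = colourImg.size[::-1]
colourArray = np.array(colourPixels.getdata()).reshape(shape + (3,))
indicesArray = np.moveaxis(np.indices(shape), 0, 2)
allArray = np.dstack((indicesArray, colourArray)).reshape((-1, 5))
df = pd.DataFrame(allArray, columns=["y", "x", "red","green","blue"])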
It's not the prettiest, but it seems to work (edit: fixed x,y being the wrong way around).
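For anyone who wants to check the coordinate layout without an image file, here is a minimal sketch of the same np.indices idea on a small synthetic array. The helper name pixels_to_dataframe and the demo array are made up for illustration, and the sketch assumes the input is already a NumPy array of shape (height, width, 3). The demo is deliberately non-square, since that is where swapped axes would show up.

```python
import numpy as np
import pandas as pd


def pixels_to_dataframe(rgb):
    """Flatten a (height, width, 3) RGB array into one row per pixel.

    Hypothetical helper for illustration; not part of any library.
    """
    height, width = rgb.shape[:2]
    # np.indices returns shape (2, height, width):
    # the row numbers (y) first, then the column numbers (x).
    ys, xs = np.indices((height, width))
    return pd.DataFrame({
        "x": xs.ravel(),
        "y": ys.ravel(),
        "red": rgb[..., 0].ravel(),
        "green": rgb[..., 1].ravel(),
        "blue": rgb[..., 2].ravel(),
    })


# Synthetic image with 2 rows and 3 columns, so width != height.
demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
demo_df = pixels_to_dataframe(demo)
print(demo_df)
```

With a real file, something like pixels_to_dataframe(np.array(Image.open("test.png").convert("RGB"))) should give the same layout, because converting a PIL image with np.array yields a (height, width, 3) array.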