
How to produce a single thumbnail for a billion PNG images?

The application has about 1 billion PNG images (1024×1024, roughly 1MB each). They need to be combined into one huge image, from which a single 1024×1024 thumbnail is then produced. Or perhaps we don't actually need to combine the images into one huge image, but can instead use some clever algorithm to produce the thumbnail directly in memory? Either way, the process needs to be as fast as possible — ideally seconds, or at worst a few minutes. Does anyone have an idea?


asked Feb 14 '17 by Suge


1 Answer

The idea of loading a billion images into a single montage process is a non-starter. Your question is unclear, but your approach should be to work out how many pixels each original image will amount to in the final image, extract that number of pixels from each image in parallel, and then assemble those pixels into the final image.
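To make that first step concrete, here is a back-of-envelope sizing sketch. The square-grid layout and the numbers are my assumptions, not part of the answer — the point is that each source image ends up contributing less than one output pixel, so its mean colour is all that survives:

```shell
# Hypothetical sizing: 1e9 images in a square grid is ~31622 x 31622 tiles;
# scaled down to a 1024 x 1024 thumbnail, each final pixel covers roughly
# 31 x 31 source images.
awk 'BEGIN {
  n = 1e9                          # number of source images (assumed)
  side = int(sqrt(n))              # tiles per side of the square grid
  per_pixel = side / 1024          # tiles covered by one thumbnail pixel
  printf "grid: %d x %d tiles, ~%.1f tiles per final pixel\n", side, side, per_pixel
}'
```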

So, if each image will be represented by one pixel in your final image, you need to get the mean of each image which you can do like this:

convert image1.png image2.png ... -format "%[fx:mean.r],%[fx:mean.g],%[fx:mean.b]:%f\n" info:

Sample Output

0.423529,0.996078,0:image1.png
0.0262457,0,0:image2.png

You can then do that very fast in parallel with GNU Parallel, using something like:

find . -name \*.png -print0 | parallel -0 convert {} -format "%[fx:mean.r],%[fx:mean.g],%[fx:mean.b]:%f\n" info:

Then you can make a final image and put the individual pixels in.
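As a sketch of that assembly step — assuming the per-image means were collected into a file I'm calling `means.txt` (my name, not the answer's) — ImageMagick's plain-text `txt:` image format lets you write one pixel per line and render the result:

```shell
# means.txt holds one mean-colour line per image, in the format shown
# in the Sample Output above:
printf '0.423529,0.996078,0:image1.png\n0.0262457,0,0:image2.png\n' > means.txt

# Emit a txt: image header plus one pixel per input line. This builds a
# 2x1 image for the two sample lines; substitute your real grid geometry
# and x,y layout for a billion images.
awk -F'[,:]' 'BEGIN { print "# ImageMagick pixel enumeration: 2,1,255,srgb" }
  { printf "%d,0: (%.0f,%.0f,%.0f)\n", NR-1, $1*255, $2*255, $3*255 }' \
  means.txt > pixels.txt

# Render it to PNG if ImageMagick is installed:
command -v convert >/dev/null && convert pixels.txt final.png || true
```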

Scanning even 1,000,000 PNG files is likely to take many hours...

Your images are around 1MB each, so with 1,000,000,000 of them you need to do a petabyte of I/O just to read them; even with an ultra-fast SSD sustaining 500MB/s, you will be there for 23 days.

answered Sep 28 '22 by Mark Setchell