I need to write a web app where users can search for images by color. My question is how to store the color data. I think the best solution is to reduce each image's colors and prepare a histogram for each of the R, G and B channels, but I have no idea how to design the database. I want to use MySQL as the DBMS. Could someone point me in the right direction?
Regards
A couple of ideas come to mind for storing histogram data. The obvious choice is to have one table (or three, for separate R/G/B channels) that represents the (normalized) histogram, with a column for each bin. If you're in 24-bit color (8 bits/channel), you could break each channel into 16 bins ([0-15], ..., [240-255]), and in each column store the fraction of pixels that fell into that bin.
Something like this:
id  imgID  R_0_15  ...  R_240_255  G_0_15  ...  G_240_255  B_0_15  ...  B_240_255
1   1234   0.1     ...  0.23       0.023   ...  0.234      0.11    ...  0.01
With this design, the entire (normalized) histogram for each image would be represented as a single row in the table.
Queries would be a bit challenging--you'd have to generate them dynamically to plug in the right column names for the value range of interest.
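To illustrate what "generating them dynamically" might look like, here is a minimal sketch that maps a channel and an intensity value to the matching column name and builds the query text around it. The table name `Histograms` and both helper functions are hypothetical, and the `%s` placeholder is the style used by common Python MySQL drivers. Because the column name is spliced into the SQL, it is derived only from validated input.

```python
BIN_WIDTH = 16  # 256 intensity levels split into 16 bins per channel


def bin_column(channel: str, value: int) -> str:
    """Return the column name for the bin containing `value`, e.g. R_0_15."""
    if channel not in ("R", "G", "B"):
        raise ValueError("channel must be 'R', 'G' or 'B'")
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    low = (value // BIN_WIDTH) * BIN_WIDTH
    return f"{channel}_{low}_{low + BIN_WIDTH - 1}"


def build_query(channel: str, value: int) -> str:
    """Build the SQL text; the threshold is still passed as a parameter."""
    column = bin_column(channel, value)  # validated, so safe to interpolate
    # 'Histograms' is a hypothetical table name for the one-row-per-image design.
    return (
        f"SELECT imgID FROM Histograms "
        f"WHERE {column} >= %s ORDER BY {column} DESC"
    )
```

The threshold still goes through the driver as a bound parameter; only the column name has to be interpolated, which is the awkward part of this design.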
Perhaps a better way would be a HistogramBins table with a row entry for each image and each bin:
id   imgID  component  bin_min  bin_max  percentage
1    1234   R          0        15       0.1
...omitted rows...
16   1234   R          240      255      0.23
...etc...
With that storage format, queries could be prepared rather than dynamically computed. It's not clear to me whether the components should be broken out as I did or if you should store one row for "bin 1" of all three color components. I'd probably want to write some queries and see what felt best for your application.
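As a rough sketch of the row-per-bin layout, the snippet below builds the table and runs one fixed, parameterized query. It uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; with MySQL the column types and the placeholder style (`%s` instead of `?` in common drivers) would differ. Image 5678 and the `find_images` helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE HistogramBins (
        id         INTEGER PRIMARY KEY,
        imgID      INTEGER NOT NULL,
        component  TEXT    NOT NULL,  -- 'R', 'G' or 'B'
        bin_min    INTEGER NOT NULL,
        bin_max    INTEGER NOT NULL,
        percentage REAL    NOT NULL   -- fraction of the image's pixels
    )
    """
)

# Sample rows; image 5678 is invented for the example.
rows = [
    (1234, "R", 0, 15, 0.10),
    (1234, "R", 240, 255, 0.23),
    (5678, "R", 0, 15, 0.40),
    (5678, "R", 240, 255, 0.05),
]
conn.executemany(
    "INSERT INTO HistogramBins (imgID, component, bin_min, bin_max, percentage) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# One fixed query text: only the bound values change between searches.
QUERY = """
    SELECT imgID, SUM(percentage) AS share
    FROM HistogramBins
    WHERE component = ? AND bin_min >= ? AND bin_max <= ?
    GROUP BY imgID
    HAVING SUM(percentage) >= ?
    ORDER BY share DESC
"""


def find_images(component, low, high, min_share):
    """Return imgIDs whose pixel share in [low, high] is at least min_share."""
    cursor = conn.execute(QUERY, (component, low, high, min_share))
    return [row[0] for row in cursor]
```

Calling `find_images("R", 240, 255, 0.2)` asks for images where at least 20% of the pixels have a very high red value, without touching the SQL text.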
Also, the reason I keep saying 'normalized' is that this scheme would make your binning independent of image size.
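Here is a minimal sketch of how such a normalized histogram could be computed, assuming the pixels are already available as a list of 8-bit `(r, g, b)` tuples (loading and color-reducing the image is left out). The function name and the dictionary layout are illustrative choices, not part of any library.

```python
BIN_WIDTH = 16                 # 256 levels / 16 bins per channel
BIN_COUNT = 256 // BIN_WIDTH


def normalized_histogram(pixels):
    """Return {'R': [...], 'G': [...], 'B': [...]} with 16 fractions each.

    Each fraction is the share of pixels whose channel value falls in that
    bin, so the values for one channel always sum to 1.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("image has no pixels")

    counts = {channel: [0] * BIN_COUNT for channel in "RGB"}
    for r, g, b in pixels:
        counts["R"][r // BIN_WIDTH] += 1
        counts["G"][g // BIN_WIDTH] += 1
        counts["B"][b // BIN_WIDTH] += 1

    # Dividing by the pixel count is what makes the result size-independent.
    return {
        channel: [count / total for count in bins]
        for channel, bins in counts.items()
    }
```

Because of the division by the pixel count, the same picture scaled to twice as many pixels produces the same histogram, so images of different sizes can be compared directly.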
Hope this helps get you started. Let us know what you end up with!
RGB values have no direct meaning to human perception, but they can easily be converted to hue, saturation, and luminance (HSL), which make more sense to people. Saturation and luminance are pretty intuitive (richer vs. paler and lighter vs. darker), but unfortunately we have no natural ordering for colors, so hue is expressed as an arbitrary number of degrees around a circle. In practice, asking people to make fine hue discriminations, especially when searching for something not yet seen, is pretty hard. Therefore, you might want to limit your categories to the vertices of the hexagon in figure "a".
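A small sketch of that conversion, using Python's standard `colorsys` module, might look like this. It assumes the six hexagon vertices are the usual red, yellow, green, cyan, blue, and magenta at 60-degree steps (the figure itself is not reproduced here), and both function names are made up for the example.

```python
import colorsys

# Assumed hexagon vertices, 60 degrees apart, starting at red = 0 degrees.
HUE_NAMES = ["red", "yellow", "green", "cyan", "blue", "magenta"]


def rgb_to_hsl(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, luminance)."""
    # colorsys works on 0..1 floats and returns the order (h, l, s).
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return h * 360, s, l


def hue_category(r, g, b):
    """Snap the hue to the nearest of the six assumed hexagon vertices.

    Greys have no meaningful hue (colorsys reports 0), so a real search
    would probably treat low-saturation colors as a separate category.
    """
    hue, _, _ = rgb_to_hsl(r, g, b)
    return HUE_NAMES[round(hue / 60) % 6]
```

Storing one of six category names per image (or per bin) keeps the search interface coarse enough that users do not have to pick an exact hue angle.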
Then you run into the question of what the representative color of a photograph is. Is an image that is half blue sky and half tan sand blue or tan? Are you picking a dominant hue? You might want to apply a huge Gaussian blur and then average the resultant hues. You probably need to refine your question and goals further.
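To make the sky-and-sand problem concrete, here is a simplified sketch that skips the blur and just averages the RGB channels over all pixels (roughly what a blur with a very large radius converges to), then converts that mean color to HSL. This is a stand-in for the approach described above, not the same thing, and the two sample colors are invented for the example.

```python
import colorsys


def mean_color_hsl(pixels):
    """Average the RGB channels of all pixels and return (hue_deg, sat, lum)."""
    count = len(pixels)
    if count == 0:
        raise ValueError("image has no pixels")
    r = sum(p[0] for p in pixels) / count
    g = sum(p[1] for p in pixels) / count
    b = sum(p[2] for p in pixels) / count
    # colorsys expects 0..1 floats and returns the order (h, l, s).
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return h * 360, s, l


# Hypothetical flat colors standing in for the two halves of the photo.
SKY = (30, 100, 220)
SAND = (210, 180, 140)
image = [SKY] * 50 + [SAND] * 50

hue, saturation, luminance = mean_color_hsl(image)
```

For this made-up image the mean comes out as a muted blue (hue around 220 degrees, saturation below 0.3), which describes neither half well. That is a hint that a single average color may be too lossy, and that per-region or per-bin data might serve a search better.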
Even HSL has its descriptive limitations. I mentioned "tan" above as the color of sand. Most readers probably have no problem at all perceiving or naming it, but unless you have a lot of experience playing with color, it is pretty non-obvious that the hue of tan is orange, just pale (less saturated) and bright (higher value). And about a third of the hue circle is devoted to greens, etc.
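The claim about tan is easy to check with the standard `colorsys` module. The sketch below assumes the CSS named color `tan`, RGB (210, 180, 140), as the reference value; other definitions of "tan" would give slightly different numbers.

```python
import colorsys

TAN = (210, 180, 140)  # CSS named color "tan" (assumed reference value)

# colorsys expects 0..1 floats and returns the order (h, l, s).
h, l, s = colorsys.rgb_to_hls(*(channel / 255 for channel in TAN))

tan_hue_degrees = h * 360  # position on the hue circle
tan_saturation = s
tan_luminance = l

print(f"hue={tan_hue_degrees:.1f} deg, "
      f"saturation={tan_saturation:.2f}, luminance={tan_luminance:.2f}")
```

The hue lands in the mid-30s of degrees, close to pure orange at 30, with saturation under 0.5 and luminance well above 0.5, which matches the "pale, bright orange" description.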