
Possible to store images in Elasticsearch?

Is it possible to store images in an Elasticsearch cluster? If so, is there a resource describing the workflow? I looked at the following plugin: https://github.com/kzwang/elasticsearch-image

Since we have to handle large image files (over 500GB), we are planning to use HDFS.

prem asked May 25 '15 at 14:05


1 Answer

Storing whole images in Elasticsearch is not very useful for search: if a scaled or cropped version of an image is used as the query, it will give incorrect results. What you need depends on why you want to index these images.

In my case, I need to find out whether an image, after some scaling or cropping, has a close match in my database. I extract local descriptors (SIFT/SURF) from the images and use them to build an Elasticsearch index. This reduces the index size because only a few features are stored instead of the whole image. For now I store the images themselves on S3, and Elasticsearch stores the ids of these images along with the features extracted from them.
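To make the split between S3 and Elasticsearch concrete, here is a minimal sketch of what one indexed document could look like. The field names (`image_id`, `s3_key`, `descriptors`), the helper `build_index_document`, and the sample values are all hypothetical; the only thing taken from the setup above is that the document carries an id, a pointer to the image, and extracted features, and no pixel data.

```python
import json
from typing import Dict, List


def build_index_document(
    image_id: str, s3_key: str, descriptors: List[List[float]]
) -> Dict[str, object]:
    """Build the JSON body for one image (hypothetical field names).

    The image bytes stay in the object store; the document only holds
    a pointer to them plus the extracted feature vectors.
    """
    return {
        "image_id": image_id,
        "s3_key": s3_key,  # assumed pointer format, not a real bucket
        "descriptor_count": len(descriptors),
        "descriptors": [{"vector": d} for d in descriptors],
    }


# Two placeholder 128-dimensional vectors (SIFT descriptors have 128 values).
doc = build_index_document(
    "img-0001", "images/img-0001.jpg", [[0.0] * 128, [1.0] * 128]
)
body = json.dumps(doc)  # this string is what would be sent to the index
```

A document like this stays small no matter how large the original image is, which is the point of indexing features rather than files.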

Regarding elasticsearch-image: this plugin has not been updated in a while, and the most recent responses to issues were from last year. It integrates LIRE with Elasticsearch, where LIRE provides extraction of several kinds of image fingerprints.

Possible solutions:

  1. Integrate the OpenCV library (to compute feature vectors for an image) with Elasticsearch and build your own index from these image features instead of storing whole images. For the product architecture, you can get some hints here.

  2. Use an older version of Elasticsearch with a compatible version of elasticsearch-image.

  3. Upgrade elasticsearch-image to work with the latest version of Elasticsearch.

  4. You can also use Solr with the LireSolr plugin, which integrates the LIRE library.
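For option 1, the feature extraction step might look roughly like the sketch below. It assumes OpenCV 4.4 or later (where `cv2.SIFT_create` is available) and keeps only the strongest keypoints so that just a few features per image reach the index. The function names and the default limit of 200 are my own choices for illustration, not something prescribed by OpenCV or Elasticsearch.

```python
from typing import List, Sequence


def strongest(
    responses: Sequence[float],
    descriptors: Sequence[Sequence[float]],
    limit: int,
) -> List[List[float]]:
    """Keep the `limit` descriptors whose keypoints have the highest response."""
    ranked = sorted(
        zip(responses, descriptors), key=lambda pair: pair[0], reverse=True
    )
    return [[float(v) for v in desc] for _, desc in ranked[:limit]]


def extract_sift(path: str, limit: int = 200) -> List[List[float]]:
    """Read an image and return up to `limit` SIFT descriptors as plain lists."""
    import cv2  # imported here so `strongest` is usable without OpenCV installed

    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:  # imread returns None when the file cannot be read
        raise FileNotFoundError(path)
    keypoints, descriptors = cv2.SIFT_create().detectAndCompute(image, None)
    if descriptors is None:  # no keypoints were found in the image
        return []
    return strongest([kp.response for kp in keypoints], descriptors, limit)
```

The returned lists are JSON-serializable, so they can go straight into whatever document you send to Elasticsearch. How you then match a query image against them (exact fields, scoring, quantization) is a separate design decision that this sketch does not cover.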

UPDATE: This is an update on the task of image retrieval, where you need to search for close image matches. I recommend going through https://paperswithcode.com/task/image-retrieval. The best solution, Deep Local Features (DELF), is already integrated in TensorFlow.

saurabheights answered Sep 28 '22 at 07:09