Is it possible to store images in an Elasticsearch cluster? If so, is there a resource that describes the workflow? I checked the following link: https://github.com/kzwang/elasticsearch-image
Since we have to handle large image files (over 500GB), we are planning to use HDFS.
Elasticsearch is a distributed document store. Instead of storing information as rows of columnar data, Elasticsearch stores complex data structures that have been serialized as JSON documents.
Elasticsearch stores data as JSON documents. Each document correlates a set of keys (names of fields or properties) with their corresponding values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).
In Couchbase, store a metadata JSON document for each object, with at most a small thumbnail image. That document holds the data your application needs to read quickly about the object, plus a pointer to a purpose-built object store such as S3, a file system, or HDFS. You will get the best of all worlds.
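As a rough illustration of the "metadata plus pointer" idea, here is a minimal Python sketch that builds such a document. The field names (`image_id`, `storage_uri`, `thumbnail_b64`), the helper `build_metadata_doc`, and the HDFS path are all hypothetical and chosen only for this example; the same shape would work for any JSON document store.

```python
import base64
import json


def build_metadata_doc(image_id, storage_uri, content_type, size_bytes, thumbnail=None):
    """Build a small JSON-serializable metadata document for one image.

    The full image stays in the object store; only a pointer is kept here.
    All field names are assumptions made for this sketch.
    """
    doc = {
        "image_id": image_id,
        "storage_uri": storage_uri,  # pointer to S3, a file system, or HDFS
        "content_type": content_type,
        "size_bytes": size_bytes,
    }
    if thumbnail is not None:
        # JSON cannot hold raw bytes, so the thumbnail is base64-encoded.
        doc["thumbnail_b64"] = base64.b64encode(thumbnail).decode("ascii")
    return doc


doc = build_metadata_doc(
    image_id="img-0001",
    storage_uri="hdfs://namenode/images/img-0001.jpg",  # hypothetical location
    content_type="image/jpeg",
    size_bytes=1048576,
    thumbnail=b"\xff\xd8\xff",  # placeholder bytes, not a real thumbnail
)
body = json.dumps(doc)  # this string is what would be sent to the document store
```

The document stays small no matter how large the image is, which is the point of keeping the binary data in a separate store.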
Good use cases for a key-value database include cache management, blockchain implementation, and storage of multimedia or other large objects (video, images, audio, etc.).
Storing whole images in Elasticsearch will not be very useful for search: if a scaled or cropped copy of an image is used as the query, matching against the stored image data will give incorrect results. What you need depends on why you want to index these images.
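One way to see the problem, assuming the comparison is done on raw image bytes: even a small crop changes the byte content completely, so an exact match on the stored data fails. The sketch below uses a synthetic 8x8 grayscale grid in place of a real image; the helpers `digest` and `crop` are hypothetical names introduced only for this example.

```python
import hashlib


def digest(pixels):
    """Hash the raw bytes of a grayscale image given as rows of 0-255 ints."""
    raw = bytes(value for row in pixels for value in row)
    return hashlib.sha256(raw).hexdigest()


def crop(pixels, top, left, height, width):
    """Return the height x width sub-image starting at (top, left)."""
    return [row[left:left + width] for row in pixels[top:top + height]]


# Synthetic 8x8 gradient standing in for a real photo (values stay below 256).
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
cropped = crop(original, top=1, left=1, height=6, width=6)

# The cropped copy shows the same content to a human, but its bytes differ,
# so a byte-level comparison does not recognize it as a match.
same_bytes = digest(original) == digest(cropped)
```

This is why the rest of the answer indexes features extracted from the image rather than the image itself.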
In my case, I need to find out whether an image, after some scaling or cropping, has a close match in my database. I am extracting local descriptors (SIFT/SURF) from the images and using them to build an Elasticsearch index. This reduces the index size, because only a few features are stored instead of the whole image. For now I will store all the images on S3, and Elasticsearch will store the ids of these images along with the features extracted from them.
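A minimal sketch of what one such index document might look like, assuming the descriptors have already been extracted (for example with OpenCV) and are available as plain lists of numbers. The helper `build_feature_doc`, the field names, the S3 key, and the `max_features` cap are assumptions for illustration; the index mapping and the query side are deliberately left out.

```python
import json


def build_feature_doc(image_id, s3_key, descriptors, max_features=100):
    """Pair an image's id and S3 key with a capped list of descriptor vectors.

    `descriptors` is a list of numeric vectors (e.g. 128 values each for SIFT).
    Field names are hypothetical; only the pointer and the features are kept.
    """
    kept = descriptors[:max_features]
    return {
        "image_id": image_id,
        "s3_key": s3_key,  # where the full image lives
        "num_features": len(kept),
        # One object per descriptor so each vector keeps its own boundaries.
        "features": [{"vector": [float(x) for x in vec]} for vec in kept],
    }


# Two tiny made-up descriptors; real SIFT descriptors are much longer.
fake_descriptors = [[0, 12, 255, 3], [7, 7, 0, 41]]
feature_doc = build_feature_doc("img-0001", "images/img-0001.jpg", fake_descriptors)
feature_body = json.dumps(feature_doc)  # body that would be indexed under the image id
```

Capping the number of descriptors per image is one simple way to keep the index small; which descriptors to keep is a design choice this sketch does not address.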
Regarding elasticsearch-image: this plugin has not been updated in a while, and the most recent responses to issues are from last year. The plugin integrates LIRE with Elasticsearch; LIRE provides extractors for several kinds of image fingerprints.
Possible solutions:
Integrate the OpenCV library (to compute feature vectors for an image) with Elasticsearch and build your own index from these image features instead of storing whole images. For the product architecture, you can get some hints here.
Use an older version of Elasticsearch with a compatible version of elasticsearch-image.
Upgrade elasticsearch-image to work with the latest version of Elasticsearch.
You can also use Solr with the LireSolr plugin, which integrates the LIRE library.
UPDATE: This is an update on the image retrieval task, where you need to search for close image matches. I recommend going through this link: https://paperswithcode.com/task/image-retrieval. The best solution, Deep Local Features (DELF), is already integrated in TensorFlow.