I am using elasticsearch as a document database and each record I create has a guid id that the system uses for the record id. Business people want to offer a feature to let the user have their own auto file name convention based on date and how many records were created so far this day/month.
What I need is to prevent duplicate user file names. Is there a way to setup an indexed field to be unique? Like a sql unique constraint?
One solution will be to use uniqueId field value for specifying document ID and use op_type=create while storing the documents in ES. With this you can make sure your uniqueId field will have unique value and will not be overridden by another same valued document.
To find the name of a constraint in PostgreSQL, use the view pg_constraint in the pg_catalog schema. Join the view pg_catalog. pg_constraint with the view pg_class ( JOIN pg_class t ON t. oid = c.
Because Elasticsearch is not a relational database, joins do not exist as a native functionality like in an SQL database. It focuses more on search efficiency as opposed to storage efficiency. The stored data is practically flattened out or denormalized to drive fast search use cases.
Every Elasticsearch index made up of one or more shards which are internally Lucene index which has a hard limit of ~2 billion(precisely equal to INT_MAX value-128) on maximum no of the document as explained in this link and this link.
You'd need to use the field that is supposed to be unique as id for your documents. By default a new document with existing id would override the existing document with same id, but you can switch to op_type=create
in order to get back an error if a document with same id already exists.
There's no way to have the same behaviour with arbitrary fields though, only the _id
field works that way. I would probably consider handling this logic in the application layer instead of within elasticsearch.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With