
ElasticSearch: compare dotted version strings

I'm looking for a way to store a dotted version as a string (e.g. "1.2.23") in Elasticsearch and then run a range query on this field, e.g.:

{
  "query": {
    "range": {
      "version": {"gte": "1.2.3", "lt": "1.3"}
    }
  }
}

I have only 3 components (major, minor, build). I need to be able to determine that

  • 1.20.3 > 1.2.3
  • 1.02.4 > 1.2.3
  • 1.3 > 1.2.3

I thought about the following approaches:

  1. Pad with zeros (e.g. "1.2.3" -> "000001.000002.000003"). This assumes I know the max length of each component.
  2. Split into 3 different integer fields (i.e. "major", "minor", "build"). Writing queries for this seems to be a pain, but I'd be happy to get suggestions for this.
  3. Perhaps some sort of custom analyser? I saw this: Elasticsearch analysis plugin for natural sort, which might be a good start.
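
To make approach 1 concrete, here is a minimal Python sketch of the padding step. The function name pad_version and the fixed width of 6 digits are assumptions for illustration only, and the padded value would have to be indexed as an exact (non-analyzed) string for a range query to compare it lexicographically.

```python
def pad_version(version: str, width: int = 6, parts: int = 3) -> str:
    """Zero-pad each component so that string order matches numeric order.

    `width` and `parts` are illustrative assumptions, not fixed requirements.
    """
    components = [int(c) for c in version.split(".")]
    if len(components) > parts:
        raise ValueError(f"too many components: {version!r}")
    # Missing trailing components (e.g. "1.3") are treated as 0.
    components += [0] * (parts - len(components))
    if any(c < 0 or c >= 10 ** width for c in components):
        raise ValueError(f"component out of range: {version!r}")
    return ".".join(f"{c:0{width}d}" for c in components)
```

With this, pad_version("1.2.3") gives "000001.000002.000003", and the padded forms of 1.20.3, 1.02.4 and 1.3 all sort after it as plain strings.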

Any other ideas or recommendations?

Eldad asked Jun 11 '16 12:06



1 Answer

If you have some latitude in your indexing code to massage those semantic versions into something else, I would suggest transforming each version into a unique integer. It is then very easy to compare those numbers with a single range query.

The algorithm is simple, as long as the minor and build numbers each stay below 1000:

  1. You split the version on dots: 1.2.34 => 1, 2, 34
  2. You multiply the major version by 1000000: 1 => 1000000
  3. You multiply the minor version by 1000: 2 => 2000
  4. You sum all three numbers up: 1000000 + 2000 + 34 => 1002034
  5. You store that resulting number into your ES documents and use it to compare versions in range queries
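
The steps above can be sketched in Python as follows. This is only an illustration: the function names and the field name "version_int" are hypothetical, missing components (as in "1.3") are assumed to be 0, and the minor and build numbers are assumed to stay below 1000 so that the three parts never overlap.

```python
def version_to_int(version: str) -> int:
    """Encode 'major.minor.build' as major * 1000000 + minor * 1000 + build."""
    parts = [int(p) for p in version.split(".")]
    if len(parts) > 3:
        raise ValueError(f"too many components: {version!r}")
    # Assumption: a missing minor or build component counts as 0.
    parts += [0] * (3 - len(parts))
    major, minor, build = parts
    if major < 0 or not (0 <= minor < 1000 and 0 <= build < 1000):
        raise ValueError(f"component out of range: {version!r}")
    return major * 1_000_000 + minor * 1_000 + build


def version_range_query(field: str, gte: str, lt: str) -> dict:
    """Build a range query body over the encoded integer field."""
    return {
        "query": {
            "range": {
                field: {"gte": version_to_int(gte), "lt": version_to_int(lt)}
            }
        }
    }
```

For example, version_to_int("1.2.34") yields 1002034, and version_range_query("version_int", "1.2.3", "1.3") produces a range from 1002003 (inclusive) to 1003000 (exclusive). The same encoding has to be applied at indexing time so that each document carries the integer alongside the original version string.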

Val answered Sep 28 '22 14:09
