How to do a wildcard or regex match on _id in elasticsearch?

Given the sample Elasticsearch data below, I want to apply a wildcard such as *.000ANT.* on _id so that I fetch all docs whose _id contains 000ANT. Please help.

"hits": [
  {
    "_index": "data_collector",
    "_type": "agents",
    "_id": "Org000LAN_example1.com",
    "_score": 1,
    "fields": {
      "host": [
        "example1.com"
      ]
    }
  },
  {
    "_index": "data_collector",
    "_type": "agents",
    "_id": "000BAN_example2.com",
    "_score": 1,
    "fields": {
      "host": [
        "example2.com"
      ]
    }
  },
  {
    "_index": "data_collector",
    "_type": "agents",
    "_id": "000ANT_example3.com",
    "_score": 1,
    "fields": {
      "host": [
        "example3.com"
      ]
    }
  }
]
AabinGunz asked Jun 15 '15 11:06


2 Answers

This is just an extension on Andrei Stefan's answer

{
  "query": {
    "script": {
      "script": "doc['_id'][0].indexOf('000ANT') > -1"
    }
  }
}

Note: I do not know the performance impact of such a query; it is most probably a bad idea. Use it with caution and avoid it if possible.
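As a rough local illustration of what the script checks (this does not talk to Elasticsearch), the same substring test can be emulated in Python against the sample _id values from the question. The function name contains_token is hypothetical and only used for this sketch.

```python
# Sample _id values copied from the question.
sample_ids = [
    "Org000LAN_example1.com",
    "000BAN_example2.com",
    "000ANT_example3.com",
]


def contains_token(doc_id: str, token: str) -> bool:
    """Mirror of indexOf(token) > -1: True when token occurs anywhere in doc_id."""
    return doc_id.find(token) > -1


# Keep only the ids that contain the token, as the script query would.
matching_ids = [doc_id for doc_id in sample_ids if contains_token(doc_id, "000ANT")]
print(matching_ids)
```

The check has to be run against each document's _id individually, which is consistent with the caution above about performance.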

mido answered Oct 18 '22 17:10


You can use a wildcard query like this, though starting a wildcard term with * is not advised because performance will suffer.

{
  "query": {
    "wildcard": {
      "_uid": "*000ANT*"
    }
  }
}

Also note that if the wildcard term you're searching for also matches the type name of your documents, using _uid will not work, as _uid is simply the concatenation of the type and the id: type#id
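To make the caveat concrete, here is a small Python sketch that emulates the matching locally (it does not talk to Elasticsearch). It assumes the type#id form described above; the helper build_uid is hypothetical, and fnmatchcase from the standard library stands in for the * semantics of the wildcard query.

```python
from fnmatch import fnmatchcase

# Type name and _id values copied from the question's sample data.
DOC_TYPE = "agents"
sample_ids = [
    "Org000LAN_example1.com",
    "000BAN_example2.com",
    "000ANT_example3.com",
]


def build_uid(doc_type: str, doc_id: str) -> str:
    """Assumed layout of the uid value: type and id joined by '#'."""
    return f"{doc_type}#{doc_id}"


uids = [build_uid(DOC_TYPE, doc_id) for doc_id in sample_ids]

# The pattern from the answer only hits the one id containing 000ANT.
id_matches = [uid for uid in uids if fnmatchcase(uid, "*000ANT*")]

# A pattern that overlaps with the type name hits every document,
# because the type is part of the string being matched.
type_matches = [uid for uid in uids if fnmatchcase(uid, "*agent*")]

print(id_matches)
print(len(type_matches))
```

In this sketch the second pattern matches all three documents even though none of the ids contain "agent", which is the situation the note warns about.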

Val answered Oct 18 '22 16:10