I want to check for documents that have media_url == '' || media_url == null. I have a query:
{
  "engagements": [
    "blah"
  ],
  "query": {
    "from": 0,
    "size": 2,
    "sort": [
      {
        "bookmarked": {
          "order": "desc"
        }
      },
      {
        "created_at": {
          "order": "desc"
        }
      }
    ],
    "facets": {},
    "query": {
      "filtered": {
        "query": {
          "match_all": {}
        },
        "filter": {
          "bool": {
            "must": [
              {
                "term": {
                  "car_id": "78778"
                }
              },
              {
                "range": {
                  "created_at": {
                    "gte": "2015-04-12T04:00:00.000Z",
                    "lte": "2015-05-13T03:59:59.999Z"
                  }
                }
              },
              {
                "term": {
                  "media_url": ""
                }
              }
            ],
            "should": [
              {
                "term": {
                  "bookmarked": false
                }
              }
            ]
          }
        }
      }
    },
    "aggregations": {
      "word_frequencies": {
        "terms": {
          "field": "text",
          "size": 150
        }
      }
    },
    "highlight": {
      "fields": {
        "text": {
          "fragment_size": 1500
        }
      }
    }
  },
  "api": "_search"
}
However, with the query above, records whose media_url is null are not returned. What should I do to return records with either '' or null as their media_url value?
A null value cannot be indexed or searched. When a field is set to null (or to an empty array, or an array of null values), it is treated as though that field has no values. The null_value mapping parameter lets you replace explicit null values with a specified value (for example, the term NULL) so that they can be indexed and searched. An empty array does not contain an explicit null, and so won't be replaced with the null_value.
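As a rough illustration of the null_value parameter described above, a mapping along these lines should index an explicit null in media_url as the placeholder term NULL. This is only a sketch: it uses the typeless mapping syntax of recent Elasticsearch releases (the filtered query in the question suggests a much older version, where the mapping body is nested under a type name and the field would be a not_analyzed string), and the placeholder "NULL" is an arbitrary choice that must not collide with a real URL value.

```json
{
  "mappings": {
    "properties": {
      "media_url": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}
```

With such a mapping, a terms filter on media_url with the values "" and "NULL" would match both cases. Note that null_value only affects how the value is indexed, so existing documents would need to be reindexed, and documents where the field is absent altogether still get no value.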
Term query. Returns documents that contain an exact term in a provided field. You can use the term query to find documents based on a precise value such as a price, a product ID, or a username. Avoid using the term query for text fields.
Elasticsearch provides a full Query DSL (Domain Specific Language) based on JSON to define queries. Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses: leaf query clauses and compound query clauses.
Perhaps you can try using the "or" filter. http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-or-filter.html
{
  "or": [
    {
      "term": {
        "media_url": ""
      }
    },
    {
      "term": {
        "media_url": null
      }
    }
  ]
}
Edit: Here's the full query (untested since I don't have an example document/index template)
{
  "engagements": [
    "blah"
  ],
  "query": {
    "from": 0,
    "size": 2,
    "sort": [
      {
        "bookmarked": {
          "order": "desc"
        }
      },
      {
        "created_at": {
          "order": "desc"
        }
      }
    ],
    "facets": {},
    "query": {
      "filtered": {
        "query": {
          "match_all": {}
        },
        "filter": {
          "bool": {
            "must": [
              {
                "term": {
                  "car_id": "78778"
                }
              },
              {
                "range": {
                  "created_at": {
                    "gte": "2015-04-12T04:00:00.000Z",
                    "lte": "2015-05-13T03:59:59.999Z"
                  }
                }
              },
              {
                "or": [
                  {
                    "term": {
                      "media_url": ""
                    }
                  },
                  {
                    "term": {
                      "media_url": null
                    }
                  }
                ]
              }
            ],
            "should": [
              {
                "term": {
                  "bookmarked": false
                }
              }
            ]
          }
        }
      }
    },
    "aggregations": {
      "word_frequencies": {
        "terms": {
          "field": "text",
          "size": 150
        }
      }
    },
    "highlight": {
      "fields": {
        "text": {
          "fragment_size": 1500
        }
      }
    }
  },
  "api": "_search"
}
You can use the missing filter to handle the case where the value is null or the field itself is missing. You can combine it with a term filter on the empty string to achieve what you want.
{
  "or": [
    {
      "term": {
        "media_url": ""
      }
    },
    {
      "missing": {
        "field": "media_url"
      }
    }
  ]
}
Use the above in place of the single term filter for "media_url" in the must clause of your bool filter.
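For anyone on a newer Elasticsearch: the missing and or filters (like the filtered query) were deprecated in 2.x and removed in 5.0. A rough equivalent there is a bool query that combines the empty-string term with a must_not exists clause. The following is a minimal sketch, untested against your mapping, and it assumes media_url is indexed as an exact-value field (keyword, or a not_analyzed string) so that the empty string is actually stored as a term:

```json
{
  "bool": {
    "should": [
      {
        "term": {
          "media_url": ""
        }
      },
      {
        "bool": {
          "must_not": {
            "exists": {
              "field": "media_url"
            }
          }
        }
      }
    ],
    "minimum_should_match": 1
  }
}
```

This clause would sit alongside the car_id and created_at conditions, in the filter section of an outer bool query, taking the place of the or block.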