Filtering search results by operating hour with elasticsearch

I'm using Elasticsearch to index and search locations, and I'm running into one particular issue that I don't know how to work out: filtering by operating hours.

Basically, each location has operating hours for every day of the week, and each day may have more than one set of operating hours (we use two for now).

For example, Monday: open 9am / close 12pm, then open 1pm / close 9pm.

Given the current time and the current day of the week, I need to search for the "open" locations.

I don't yet know how I should index these operating hours together with the location details, or how to use them to filter the results. Any help or suggestions would be really appreciated.

Regards

asked Aug 17 '11 15:08 by mr1031011




1 Answer

A better way to do this would be to use nested documents.
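To see why nesting matters here, the following is a minimal Python sketch using plain dictionaries (no Elasticsearch involved). The function names `matches_flattened` and `matches_nested` are made up for illustration. They only mimic the two behaviours: a plain array of objects is matched field by field across all entries, whereas a nested filter requires one single entry to satisfy every condition.

```python
def matches_flattened(hours, day, hour):
    """Mimic a plain (non-nested) object array: each condition may be
    satisfied by a different entry, which can give false positives."""
    return (
        any(h["day"] == day for h in hours)
        and any(h["open"] <= hour for h in hours)
        and any(h["close"] >= hour for h in hours)
    )


def matches_nested(hours, day, hour):
    """Mimic a nested filter: one entry must satisfy all conditions."""
    return any(
        h["day"] == day and h["open"] <= hour <= h["close"] for h in hours
    )


# Illustrative data: open Monday morning and Tuesday afternoon only.
hours = [
    {"day": "monday", "open": 9, "close": 12},
    {"day": "tuesday", "open": 13, "close": 17},
]

# Monday at 14:00 the location is closed.
print(matches_flattened(hours, "monday", 14))  # True (a false positive)
print(matches_nested(hours, "monday", 14))     # False
```

In the flattened case, "monday" comes from the first entry while the close time of 17 comes from the second, so the location wrongly appears open.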

First: set up your mapping to specify that the hours document should be treated as nested:

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1'  -d '
{
   "mappings" : {
      "location" : {
         "properties" : {
            "hours" : {
               "include_in_root" : 1,
               "type" : "nested",
               "properties" : {
                  "open" : {
                     "type" : "short"
                  },
                  "close" : {
                     "type" : "short"
                  },
                  "day" : {
                     "index" : "not_analyzed",
                     "type" : "string"
                  }
               }
            },
            "name" : {
               "type" : "string"
            }
         }
      }
   }
}
'

Add some data (note the multiple values for opening hours):

curl -XPOST 'http://127.0.0.1:9200/foo/location?pretty=1'  -d '
{
   "name" : "Test",
   "hours" : [
      {
         "open" : 9,
         "close" : 12,
         "day" : "monday"
      },
      {
         "open" : 13,
         "close" : 17,
         "day" : "monday"
      }
   ]
}
'
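If the opening times live in your application as a per-day schedule, a document body like the one above could be produced along these lines. This is only a sketch: the `schedule` layout and the helper name `to_hours_docs` are assumptions for illustration, not part of Elasticsearch. Day names are lowercased because the `day` field is `not_analyzed`, so a term filter has to match the stored value exactly.

```python
import json


def to_hours_docs(schedule):
    """Turn {day: [(open, close), ...]} into a list of hours objects,
    one object per opening period."""
    docs = []
    for day, periods in schedule.items():
        for open_hour, close_hour in periods:
            docs.append(
                {"open": open_hour, "close": close_hour, "day": day.lower()}
            )
    return docs


# Hypothetical schedule: two opening periods on Monday, in 24-hour time.
schedule = {"Monday": [(9, 12), (13, 17)]}

location = {"name": "Test", "hours": to_hours_docs(schedule)}
print(json.dumps(location, indent=3))
```

The printed JSON can then be sent as the body of the index request.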

Then run your query, filtering by the current day and time:

curl -XGET 'http://127.0.0.1:9200/foo/location/_search?pretty=1'  -d '
{
   "query" : {
      "filtered" : {
         "query" : {
            "text" : {
               "name" : "test"
            }
         },
         "filter" : {
            "nested" : {
               "path" : "hours",
               "filter" : {
                  "and" : [
                     {
                        "term" : {
                           "hours.day" : "monday"
                        }
                     },
                     {
                        "range" : {
                           "hours.close" : {
                              "gte" : 10
                           }
                        }
                     },
                     {
                        "range" : {
                           "hours.open" : {
                              "lte" : 10
                           }
                        }
                     }
                  ]
               }
            }
         }
      }
   }
}
'
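The day and hour in the filter are hard-coded above. A rough sketch of deriving them from the current time might look like the following. The helper name `open_now_filter` and the `DAYS` list are assumptions for illustration; the returned dictionary simply mirrors the nested filter from the query above and would be placed under `"filter"` in the request body. The day name is taken from `weekday()` rather than `strftime` so that it does not depend on the locale.

```python
from datetime import datetime

# Index matches datetime.weekday(): Monday is 0, Sunday is 6.
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def open_now_filter(now):
    """Build the nested filter for the given moment (whole hours only)."""
    day = DAYS[now.weekday()]
    hour = now.hour
    return {
        "nested": {
            "path": "hours",
            "filter": {
                "and": [
                    {"term": {"hours.day": day}},
                    {"range": {"hours.close": {"gte": hour}}},
                    {"range": {"hours.open": {"lte": hour}}},
                ]
            },
        }
    }


# 15 Aug 2011 was a Monday; 10:30 falls in the 9-12 period.
example = open_now_filter(datetime(2011, 8, 15, 10, 30))
print(example["nested"]["filter"]["and"][0])
```

Because the sample data stores whole hours, minutes are dropped here, so 12:30 is treated as hour 12 and still matches a period that closes at 12. In practice you would pass `datetime.now()` in the locations' time zone.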

This should work.

Unfortunately, in 0.17.5, it throws an NPE - it is likely to be a simple bug which will be fixed shortly. I have opened an issue for this here: https://github.com/elasticsearch/elasticsearch/issues/1263

UPDATE: Bizarrely, I now can't replicate the NPE. This query seems to work correctly on version 0.17.5 and above, so it must have been some temporary glitch.

clint

answered Nov 15 '22 06:11 by DrTech