I am new on elasticsearch. trying to use it on analytics calculations. I don't know, does it possible to do that but, I am trying to find customers who has 0 purchases. I stored, orders as nested object arrays for each customer. Here you may found an example mapping properties of customers index:
"first_name" => [
"type" => "text"
],
"last_name" => [
"type"=> "text"
],
"email" => [
"type"=> "text"
],
"total_spent" => [
"type"=> "text"
],
"aov" => [
"type"=> "float"
],
"orders_count" => [
"type"=> "integer"
],
"orders" => [
"type" => "nested",
"properties" => [
"order_id" => [
"type"=>"text"
],
"total_price" => [
"type"=>"float"
]
]
]
Example Customer indexes:
[
{
"_index":"customers_index",
"_type":"_doc",
"_id":"1",
"_score":1,
"_source":{
"first_name":"Stephen",
"last_name":"Long",
"email":"[email protected]",
"total_spent":"0.00",
"aov":0,
"orders":[]
}
},
{
"_index":"customers_index",
"_type":"_doc",
"_id":"2",
"_score":1,
"_source":{
"first_name":"Reece",
"last_name":"Dixon",
"email":"[email protected]",
"total_spent":"0.10",
"aov":"0.1",
"orders":[
{
"total_price":"0.10",
"placed_at":"2020-09-24T20:08:35.000000Z",
"order_id":2723671867546
}
]
}
},
{
"_index":"customers_index",
"_type":"_doc",
"_id":"3",
"_score":1,
"_source":{
"first_name":"John",
"last_name":"Marshall",
"email":"[email protected]",
"total_spent":"0.10",
"aov":"0.04",
"orders":[
{
"total_price":"0.10",
"placed_at":"2020-09-24T20:10:52.000000Z",
"order_id":2723675930778
},
{
"total_price":"0.30",
"placed_at":"2020-09-24T20:09:45.000000Z",
"order_id":2723673899162
},
{
"total_price":"0.10",
"placed_at":"2020-09-16T09:55:22.000000Z",
"order_id":2704717414554
}
]
}
}
]
First of all, I want to ask do you think this kind of mapping fits with nature of elastic search? As an example, I am able to group customers by specific date range and sum total_spent as aggregated data. However, What I want to learn is, does it possible to find customers who has no order with the filtering nested orders array for the specific date range? Do you think, this kind of queries, has some performance issues?
I have not familiar with nosql databases. I am an RDBMS guy. So, I am trying to understand the concept of Elastic Search as an analytics db.
Thanks for responses
Edit:
I am trying to count nested objects within filter of a specified date range between the objects. Does it possible and making sense to do that on elasticsearch? Simply, I want to view, customers who has 1 orders or multiple orders depending to input within a specified dates.
I know how to get, daily count of customers but What If I want to count customers who has 1 orders on the specified date range with in a group of daily statement?
Possible response that I expected:
{
...
"aggregations":[
{
"date":"2020-09-01",
"total_customers_zero_purchased":15
}
...
]
}
There are a bunch of questions raised here so I'll focus on the most important part.
First off, it's customary to make certain text fields of type .keyword
so we can later aggregate on them. This means:
PUT customers_index
{
"mappings": {
"properties": {
"email": {
"type": "keyword" <--
}
}
}
}
After that we can proceed to our query but have to note that when we iterate over date ranges, we need to specify a date field. Meaning:
filter
to limit its scope)Practically speaking, we cannot get a per-day rolling aggregation (because we don't know what we don't know) but only a single-day metric. For example
GET customers_index/_search
{
"size": 0,
"aggs": {
"multibucket_simulator": {
"filters": {
"filters": {
"all": {
"match_all": {}
}
}
},
"aggs": {
"all_customers": {
"cardinality": {
"field": "email"
}
},
"customers_who_purchased_at_date": {
"filter": {
"nested": {
"path": "orders",
"query": {
"range": {
"orders.placed_at": {
"gte": "2020-09-16T00:00:00.000000Z",
"lt": "2020-09-26T00:00:00.000000Z"
}
}
}
}
},
"aggs": {
"customer_count": {
"cardinality": {
"field": "email"
}
}
}
},
"total_customers_zero_purchased": {
"bucket_script": {
"buckets_path": {
"all": "all_customers.value",
"filtered": "customers_who_purchased_at_date>customer_count.value"
},
"script": "params.all - params.filtered"
}
}
}
}
}
}
yielding
"aggregations" : {
"multibucket_simulator" : {
...
"buckets" : {
"all" : {
...
"customers_who_purchased_at_date" : {
...
},
"all_customers" : {
...
},
"total_customers_zero_purchased" : { <---
"value" : 1.0
}
}
}
}
}
thereby answering the question:
How many customers did not purchase anything between 09/16 and 09/25 inclusive?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With