
Exclude _id and _index fields from Elasticsearch result data

Each document has 5 fields if I simply hit the API, but I want only two of them (user_id and loc_code), so I listed them in fields. The response still contains unnecessary data such as _shards, hits, timed_out, etc.

I am making a POST request from the Postman plugin in Chrome using the query below:

<:9200>/myindex/mytype/_search
{
    "fields" : ["user_id", "loc_code"],
    "query":{"term":{"group_id":"1sd323s"}}
}   

// output

 {
        "took": 17,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "failed": 0
        },
        "hits": {
            "total": 323,
            "max_score": 8.402096,
            "hits": [
                {
                    "_index": "myindex",
                    "_type": "mytype",
                    "_id": "<someid>",
                    "_score": 8.402096,
                    "fields": {
                        "user_id": [
                            "<someuserid>"
                        ],
                        "loc_code": [
                            768
                        ]
                    }
                },
               ...
            ]
        }
    }

But I want only the document fields (the two mentioned above), without _id, _index, or _type. Is there any way to do so?

user3628682 asked May 31 '14 08:05





1 Answer

A solution that may not be complete but helps a lot is to use filter_path. For example, suppose we have the following content in an index:

PUT foods/_doc/_bulk
{ "index" : { "_id" : "1" } }
{ "name" : "chocolate cake", "calories": "too much" }
{ "index" : { "_id" : "2" } }
{ "name" : "lemon pie", "calories": "a lot!"  }
{ "index" : { "_id" : "3" } }
{ "name" : "pizza", "calories": "oh boy..."  }

A search like this...

GET foods/_search
{
  "query": {
    "match_all": {}
  }
}

...will yield a lot of metadata:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "lemon pie",
          "calories" : "a lot!"
        }
      },
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "chocolate cake",
          "calories" : "too much"
        }
      },
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "pizza",
          "calories" : "oh boy..."
        }
      }
    ]
  }
}

But if we give the search URL the parameter filter_path=hits.hits._source...

GET foods/_search?filter_path=hits.hits._source
{
  "query": {
    "match_all": {}
  }
}

...it will only return the source (although still deeply nested):

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "name" : "lemon pie",
          "calories" : "a lot!"
        }
      },
      {
        "_source" : {
          "name" : "chocolate cake",
          "calories" : "too much"
        }
      },
      {
        "_source" : {
          "name" : "pizza",
          "calories" : "oh boy..."
        }
      }
    ]
  }
}
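Since the filtered response is still nested under hits.hits, the remaining unwrapping has to happen on the client. The following is a minimal Python sketch of that step, not part of Elasticsearch itself: extract_sources is a hypothetical helper name, and the response dict is simply the JSON shown above after parsing.

```python
from typing import Any, Dict, List


def extract_sources(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the _source object of every hit, dropping all other metadata."""
    # Walk hits -> hits defensively so a response without hits yields [].
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits if "_source" in hit]


# Parsed form of the filtered response shown above.
response = {
    "hits": {
        "hits": [
            {"_source": {"name": "lemon pie", "calories": "a lot!"}},
            {"_source": {"name": "chocolate cake", "calories": "too much"}},
            {"_source": {"name": "pizza", "calories": "oh boy..."}},
        ]
    }
}

documents = extract_sources(response)
print(documents[0])  # {'name': 'lemon pie', 'calories': 'a lot!'}
```

The helper uses .get with defaults so that a response missing the hits key produces an empty list instead of raising KeyError.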

You can even filter for fields:

GET foods/_search?filter_path=hits.hits._source.name
{
  "query": {
    "match_all": {}
  }
}

...and you will get this:

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "name" : "lemon pie"
        }
      },
      {
        "_source" : {
          "name" : "chocolate cake"
        }
      },
      {
        "_source" : {
          "name" : "pizza"
        }
      }
    ]
  }
}
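filter_path also accepts several comma-separated paths, which is how more than one field can be kept at once. As a rough sketch, the URL could be assembled in Python like this; build_search_url is a hypothetical helper, and http://localhost:9200 is an assumed host, not something taken from the question.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_search_url(base_url: str, index: str, paths: Iterable[str]) -> str:
    """Build a _search URL whose filter_path lists the given response paths."""
    # safe="," keeps the commas between paths readable instead of %2C.
    query = urlencode({"filter_path": ",".join(paths)}, safe=",")
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"


# Assumed host; replace with the address of your own cluster.
url = build_search_url(
    "http://localhost:9200",
    "foods",
    ["hits.hits._source.name", "hits.hits._source.calories"],
)
print(url)
# http://localhost:9200/foods/_search?filter_path=hits.hits._source.name,hits.hits._source.calories
```

For the index in the question, the same approach would use paths pointing at whichever key holds the two wanted fields in that response.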

And you can do a lot more if you wish: just check the documentation.

brandizzi answered Sep 22 '22 09:09
