
How to get duplicate field values in Elasticsearch by field name without knowing the value

I have a field "EmployeeName" in an Elasticsearch index, and I would like to run a query that returns all cases where "EmployeeName" has duplicate values. Can this be done?

I found more_like_this, but it requires a field value for "like_text". My requirement is to get a list of employees who have duplicate names without knowing the value in advance.

{
    "more_like_this" : {
        "fields" : ["EmployeeName"],
        "like_text" : "Mukesh",
        "min_term_freq" : 1,
        "max_query_terms" : 12
    }
}

Thanks in Advance

Regards Mukesh

Mukesh asked Jun 17 '15 05:06


1 Answer

You can use a terms aggregation with min_doc_count set to 2 for this.

POST <index>/<type>/_search?search_type=count
{
    "aggs": {
        "duplicateNames": {
            "terms": {
                "field": "EmployeeName",
                "size": 0,
                "min_doc_count": 2
            }
        }
    }
}

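This returns every value of the EmployeeName field that occurs in at least 2 documents. The aggregation counts indexed terms, so EmployeeName should be indexed as a single term (not_analyzed, or keyword in newer versions); on an analyzed field the buckets would be individual tokens rather than full names.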
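As a rough sketch of how this might look from client code, the Python below builds the same aggregation body and pulls the duplicated names out of a response. It assumes a newer Elasticsearch release, where, as far as I know, search_type=count and "size": 0 inside terms are no longer accepted, so it uses a top-level "size": 0 and an explicit bucket cap instead. The helper names build_duplicates_query and extract_duplicates, the max_buckets parameter, and the sample response values are all made up for illustration; no cluster is contacted.

```python
import json


def build_duplicates_query(field, max_buckets=1000):
    """Build a search body that aggregates values occurring in 2+ documents.

    max_buckets is a hypothetical cap chosen for this sketch; raise it if
    the index may contain more distinct duplicated values.
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "duplicateNames": {
                "terms": {
                    "field": field,
                    "size": max_buckets,
                    "min_doc_count": 2,  # keep only values seen at least twice
                }
            }
        },
    }


def extract_duplicates(response):
    """Map each duplicated value to the number of documents containing it."""
    buckets = response["aggregations"]["duplicateNames"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Invented response fragment, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "duplicateNames": {
            "buckets": [
                {"key": "Mukesh", "doc_count": 3},
                {"key": "Anita", "doc_count": 2},
            ]
        }
    }
}

if __name__ == "__main__":
    # The printed body can be sent as the JSON payload of a _search request.
    print(json.dumps(build_duplicates_query("EmployeeName"), indent=2))
    print(extract_duplicates(sample_response))
```

To list the employee documents themselves rather than just the names, one option is a second query that filters on the returned names, or a top_hits sub-aggregation under duplicateNames.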

bittusarkar answered Nov 17 '22 17:11