
How to search in arrays in Elasticsearch using C#

Hello, I am a newbie on Elasticsearch and need help. I'm working with C# (though I think I could use a QueryRaw with a string...). Here is the scenario:

Data

{
    "id": "1",
    "title": "Small cars",
    "tagsColours": ["grey", "black", "white"],
    "tagsCars": ["Suzuki", "Ford"],
    "tagsKeywords": []
},
{
    "id": "2",
    "title": "Medium cars",
    "tagsColours": [],
    "tagsCars": ["VW", "Audi", "Peugeot"],
    "tagsKeywords": ["Sedan"]
},
{
    "id": "3",
    "title": "Big cars",
    "tagsColours": ["red", "black"],
    "tagsCars": ["Jeep", "Dodge"],
    "tagsKeywords": ["Van", "Big"]
}

Objective

I'd like to apply filters on the tags columns based on the user's selection. The selected values will be populated in the tagsXXX array columns.

  • If a parameter array is not empty, each result should contain at least one of its values. The same goes for every parameter array: the more parameters have values, the more specific the search should be.
  • If at least one value from a parameter array matches any value in a document's corresponding tag array, return that document. If another tagsXXX parameter array also has values, it must be taken into account as well.
  • If a tag parameter array has no values, disregard that filter.
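The three rules above can be sketched in plain C#, without Elasticsearch, to pin down the intended semantics. This is only an illustration of the matching logic: TaggedDoc, TagFilter, Passes and Matches are hypothetical names, and the comparison is an exact, case-sensitive string match.

```csharp
using System;
using System.Linq;

// Hypothetical shape shared by the documents and the filter parameters.
public class TaggedDoc
{
    public string Id { get; set; } = "";
    public string[] TagsColours { get; set; } = Array.Empty<string>();
    public string[] TagsCars { get; set; } = Array.Empty<string>();
    public string[] TagsKeywords { get; set; } = Array.Empty<string>();
}

public static class TagFilter
{
    // An empty (or null) selection is ignored; otherwise the document
    // must contain at least one of the selected values.
    private static bool Passes(string[] selected, string[] docValues) =>
        selected == null
        || selected.Length == 0
        || (docValues != null && docValues.Intersect(selected).Any());

    // Every non-empty selection must pass, so more selections narrow the result.
    public static bool Matches(TaggedDoc doc, TaggedDoc filter) =>
        Passes(filter.TagsColours, doc.TagsColours)
        && Passes(filter.TagsCars, doc.TagsCars)
        && Passes(filter.TagsKeywords, doc.TagsKeywords);
}
```

In query terms this is an OR inside each tag array and an AND across the arrays that have values.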

Desired responses

A) If the user selects only one colour tag (e.g. black), formatted as below:

{
    id: "",
    title: "",
    tagsColours: ["black"],
    tagsCars: [],
    tagsKeywords: []
}

I'd like to get the documents with id=2 and id=3, since they have black in their tagsColours, and disregard tagsCars and tagsKeywords, since those parameters have no values.

B) If the user selects two different kinds of tags (e.g. colour = black, cars = Audi and Mercedes Benz), formatted as below:

{
    id: "",
    title: "",
    tagsColours: ["black", "yellow"],
    tagsCars: ["Audi", "Mercedes Benz"],
    tagsKeywords: []
}

I'd like to get the document with id=2, since black was found in tagsColours and Audi was found in tagsCars. It should not pull document id=1 because, even though black is in its tagsColours, none of the parameter values (Audi, Mercedes Benz) is in its tagsCars.

I'm having issues searching array fields in Elasticsearch, both when the parameters have values and when they have none. If anyone could help me with this I'd appreciate it. I did this:

termsQuery = Query<StructuredData>.Terms(t => t.Field(f =>f.TagsColours).Terms(dataToSearch.TagsColours));
termsQuery = termsQuery && Query<StructuredData>.Terms(t => t.Field(f =>f.TagsCars).Terms(dataToSearch.TagsCars));

I stopped here (I did not add the third filter) because I could not combine the two filters. dataToSearch holds the values from the parameters (it has the same structure as the document object, because .Search makes me do that). Here is the .Search() call:

var settings = new ConnectionSettings(node);

var response = new ElasticClient(settings)
.Search<StructuredData>(
s => s.AllIndices()
.AllTypes()
.From(0)
.Size(50)
.Query(_ => termsQuery)
);

But I'm having problems when using more than one filter. Any ideas? Is .Terms the correct method?

marianolp asked Jul 25 '17 13:07




1 Answer

If you are using the default mappings on ES 5 or later, this will get you the results you want. If not, you will need to change the mapping.

QueryContainer query = null;

// Add a terms filter only when the corresponding parameter array has values.
if (dataToSearch.TagsColours != null && dataToSearch.TagsColours.Length > 0)
{
    query = Query<StructuredData>.Terms(t => t.Field("tagsColours.keyword").Terms(dataToSearch.TagsColours));
}

if (dataToSearch.TagsCars != null && dataToSearch.TagsCars.Length > 0)
{
    var q = Query<StructuredData>.Terms(t => t.Field("tagsCars.keyword").Terms(dataToSearch.TagsCars));
    // && combines the queries so that both must match.
    query = query == null ? q : query && q;
}

if (dataToSearch.TagsKeywords != null && dataToSearch.TagsKeywords.Length > 0)
{
    var q = Query<StructuredData>.Terms(t => t.Field("tagsKeywords.keyword").Terms(dataToSearch.TagsKeywords));
    query = query == null ? q : query && q;
}

The problem you are having is that a terms query is not analyzed: it looks for the exact values you pass in, while text fields use the standard analyzer by default. As of ES 5, the default mapping for strings adds a keyword sub-field that stores the value as is, so you can search by the raw value. The standard analyzer tokenizes the text into words and lowercases every term, so the query could not find Audi because the indexed term was audi.

Just lowercasing the input string will not solve the Mercedes Benz problem, because the standard analyzer turns it into two terms, mercedes and benz, instead of one. In other words, a terms query will return results for mercedes or for benz, but not for mercedes benz. If you want to do a case-insensitive search with the match query, you will need to add a custom analyzer.
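To make the difference concrete, here is a rough simulation in plain C#. It is not how Elasticsearch is implemented: StandardLike only approximates the standard analyzer by splitting on spaces and lowercasing, KeywordLike stands in for the keyword sub-field, and all of the names are made up for this sketch.

```csharp
using System;
using System.Linq;

public static class AnalyzerSketch
{
    // Rough stand-in for the standard analyzer: split into words, lowercase each one.
    public static string[] StandardLike(string value) =>
        value.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
             .Select(token => token.ToLowerInvariant())
             .ToArray();

    // Stand-in for a keyword field: the whole value is kept as one unchanged term.
    public static string[] KeywordLike(string value) => new[] { value };

    // A terms query compares the search value to the indexed terms as is.
    public static bool TermMatches(string[] indexedTerms, string searchValue) =>
        indexedTerms.Contains(searchValue);
}
```

With this sketch, TermMatches(StandardLike("Audi"), "Audi") is false because the indexed term is audi, and TermMatches(StandardLike("Mercedes Benz"), "mercedes benz") is also false because the indexed terms are mercedes and benz. Against KeywordLike, the original values match unchanged.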

Filip Cordas answered Sep 30 '22 06:09
