
Elasticsearch C# NEST [5.x] attributes

I'm struggling a bit with the field attributes in Elasticsearch, especially since things have changed somewhat in 5.x (to which I'm porting our code).

An example is this:

    [Text(Index = false)]
    public string Id { get; set; }
    [Keyword]
    public string Tags { get; set; }
    [Text]
    public string Title { get; set; }

I have a bunch of fields like this, and I'm trying to figure out the best attributes for the following kinds of fields:

  • A text field to be searchable AS-IS, not interpreted (a string ID for example). I want to be able to search the exact string, nothing else
  • An English text in which I want to be able to do a full search for words and proximity.
  • An enum where values may be stored as a finite list of strings and I need to use that as a search criterion
  • Tags which are a list of words but don't form sentences; I need to be able to search through those
  • Numbers that are to be stored and not searchable
  • Dates that are to be stored and searchable
  • Dates that are to be stored but not searchable

A lot of posts refer to the Elasticsearch documentation, but I don't find the attribute documentation clear; it seems to be written for people who already understand the system. If anyone has a spreadsheet-like breakdown of the attributes and their effects (stored, searchable, analyzed, etc.), that would be fantastic.

asked Jan 16 '17 16:01 by Thomas



1 Answer

The documentation will only get better over time; contributions are most appreciated :)

To answer your questions:

  • A text field to be searchable AS-IS, not interpreted (a string ID for example). I want to be able to search the exact string, nothing else

Use the KeywordAttribute, which creates a field with the Keyword data type.
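
As a minimal sketch, an exact-match string ID could be mapped as follows. The `Document` class name is hypothetical; the `Id` property is taken from the question. Note that the `[Text(Index = false)]` shown on `Id` in the question would leave the field unindexed and therefore not searchable at all.

```csharp
using Nest;

// Hypothetical POCO, for illustration only
public class Document
{
    // Indexed as a single, un-analyzed term, so only the exact string matches
    // (for example through a term query).
    [Keyword]
    public string Id { get; set; }
}
```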

  • An English text in which I want to be able to do a full search for words and proximity.

Use the TextAttribute, which creates a field with the Text data type. By default, the analyzer used will be the Standard Analyzer. Depending on your domain and search criteria, you may want a different analyzer, either preconfigured or custom.
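
For English text, one option is the built-in `english` language analyzer, selected through the attribute's `Analyzer` property. This is only a sketch: the `Article` class is hypothetical, and whether `english` or the default standard analyzer is the better fit depends on your data.

```csharp
using Nest;

// Hypothetical POCO, for illustration only
public class Article
{
    // Analyzed with the built-in "english" analyzer (stemming, English stop words).
    // Omit Analyzer to fall back to the standard analyzer.
    [Text(Analyzer = "english")]
    public string Title { get; set; }
}
```

Proximity searches against such a field are typically expressed with a match_phrase query and a slop value.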

  • An enum where values may be stored as a finite list of strings and I need to use that as a search criterion

You may use a KeywordAttribute here if you want exact matches. If you want to search case-insensitively, however, you could use a TextAttribute with a custom analyzer made up of a Keyword tokenizer and a Lowercase token filter.
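
The custom analyzer has to be defined in the index settings before the attribute can refer to it by name. The following is a sketch under several assumptions: the analyzer name `lowercase_keyword`, the `Ticket` class, the `tickets` index name, and the node address are all made up for illustration.

```csharp
using System;
using Nest;

// Hypothetical POCO, for illustration only
public class Ticket
{
    // "lowercase_keyword" is an assumed name; it must match the analyzer
    // registered in the index settings below.
    [Text(Analyzer = "lowercase_keyword")]
    public string Status { get; set; }
}

public static class IndexSetup
{
    public static void CreateTicketsIndex()
    {
        // Assumed local single-node cluster
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
        var client = new ElasticClient(settings);

        var response = client.CreateIndex("tickets", c => c
            .Settings(s => s
                .Analysis(a => a
                    .Analyzers(an => an
                        .Custom("lowercase_keyword", ca => ca
                            .Tokenizer("keyword")   // keep the whole value as one token
                            .Filters("lowercase")   // lowercase that token
                        )
                    )
                )
            )
            .Mappings(m => m
                .Map<Ticket>(mm => mm.AutoMap())    // picks up the attributes on Ticket
            )
        );

        Console.WriteLine(response.IsValid);
    }
}
```

With this mapping, a match query analyzes its input with the same analyzer, so "OPEN" and "open" find the same documents; a term query is not analyzed and would need the lowercased value.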

  • Tags which are a list of words but don't form sentences; I need to be able to search through those

If you're looking for unstructured search, then use the TextAttribute.

  • Numbers that are to be stored and not searchable

Use the NumberAttribute, which maps to the numeric data types, with a NumberType that corresponds to the numeric type of the POCO property; for example, use NumberType.Integer for Int32 (int). For the number to be stored in _source but not searchable, set Index = false:

    [Number(NumberType.Integer, Index = false)]
    public int MyNumber { get; set; }

Index corresponds to the index mapping parameter on numeric types.

  • Dates that are to be stored and searchable

Use the DateAttribute, which corresponds to the Date data type.

  • Dates that are to be stored but not searchable

Use the DateAttribute with Index = false.
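
A short sketch of both date cases side by side; the `Event` class and its property names are hypothetical.

```csharp
using System;
using Nest;

// Hypothetical POCO, for illustration only
public class Event
{
    // Stored in _source and indexed, so it can be used in range queries and sorting.
    [Date]
    public DateTime PublishedOn { get; set; }

    // Stored in _source but not indexed, so it cannot be queried.
    [Date(Index = false)]
    public DateTime ImportedOn { get; set; }
}
```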

Take a look at the documentation for the mapping parameters that are available to field mappings. The parameter names in the Elasticsearch documentation are exposed in NEST with Pascal-cased names; for example, doc_values becomes DocValues.

answered Sep 18 '22 10:09 by Russ Cam