Find substring with special chars in Elasticsearch

I'm new to Elasticsearch. I want to search by a substring that consists of digits and symbols such as "/" and "-". For example, I create an index with default settings and one indexed field:

curl -XPUT "http://localhost:9200/test/" -d '{
    "mappings": {
        "properties": {
            "test_field": {
                "type": "string"
            }
        }
    }
}'

Then, I add some data into my index:

curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "14/21-35" }'
curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "1/1-35" }'
curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "1/2-25" }'

After refreshing the index I run a search. I want to find documents in which "test_field" begins with "1/1". My request:

curl -X GET "http://localhost:9200/test/_search?pretty=true" -d '{"query":{"query_string":{"query":"1/1*"}}}'

returns no hits. If I remove the star, the response contains two hits: "1/1-35" and "1/2-25". If I escape the slash with a backslash ("1\/1*"), the results are the same in both cases.

When the query contains a "-", I have to escape it, because it is a Lucene special character. So I send the following search request:

curl -X GET "http://localhost:9200/test/_search?pretty=true" -d '{"query":{"query_string":{"query":"*1\-3*"}}}'

and it fails with a parsing error. If I double-escape the minus ("\\"), I get no results.

I have no idea how the search works when the query contains these characters. Maybe I'm doing something wrong.

I tried using an nGram filter in a custom analyzer, but it doesn't suit the requirements of the search engine.

If anyone has run into this problem, please answer.

Denis Tataurov asked Sep 28 '12 09:09


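1 Answer

The default (standard) analyzer splits your data on special characters such as "/" and "-" at indexing time, so a value like "1/1-35" is indexed as the separate tokens "1", "1" and "35" rather than as one string. You could use the keyword analyzer, or simply not analyze the field at indexing time: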

curl -XPUT "http://localhost:9200/test/" -d '{
    "mappings": {
        "properties": {
            "test_field": {
                "type": "string",
                "index": "not_analyzed"
            }
        }
    }
}'
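Once the field is not analyzed, each value is stored as a single term, so a prefix or wildcard query should be able to match it directly. These query types do not go through the query_string parser, so "/" and "-" need no escaping. A minimal sketch, assuming the index and field from the question; the helper names prefix_query and wildcard_query are made up for illustration, and the values are assumed to contain no double quotes or backslashes:

```shell
# Hypothetical helper: print a prefix-query body for a field and a prefix.
# Assumption: neither argument contains double quotes or backslashes.
prefix_query() {
    printf '{"query":{"prefix":{"%s":"%s"}}}' "$1" "$2"
}

# Hypothetical helper: print a wildcard-query body.
# In a wildcard query, * matches any run of characters and ? matches one.
wildcard_query() {
    printf '{"query":{"wildcard":{"%s":"%s"}}}' "$1" "$2"
}

# Usage against the index from the question (needs a running node):
#   curl -X GET "http://localhost:9200/test/_search?pretty=true" \
#        -d "$(prefix_query test_field '1/1')"
#   curl -X GET "http://localhost:9200/test/_search?pretty=true" \
#        -d "$(wildcard_query test_field '*1-3*')"
```

The prefix form covers the "begins with 1/1" case; the wildcard form covers the substring case, but a leading * forces a scan over all terms of the field and can be slow on a large index.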

A21z answered Oct 06 '22 22:10