Unable to search a query with symbols in elasticsearch

I have been trying to match a query using the Elasticsearch Python client, but nothing matches, even after escaping characters and setting up a custom analyzer and mapping it to the fields. I want to search for text containing & and I get no results.

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])


doc1 = {
    'name': 'numb',
    'band': 'linkin_park',
    'year': '2006'
}

doc2 = {
    'name': 'Powerless &',
    'band': 'linkin_park',
    'year': '2006'
}
doc3 = {
    'name': 'Crawling !',
    'band': 'linkin_park',
    'year': '2006'
    }

doc =[doc1, doc2, doc3]
'''
create_index = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "filter": [
                        "lowercase"
                    ],
                    "tokenizer": "whitespace"
                }
            }
        }
    }
}

es.indices.create(index="idx_temp", body=create_index)
'''
for i in range(3):
    es.index(index="idx_temp", doc_type='_doc', id=i, body=doc[i])


my_mapping = {
  "properties": {
      "name": {
          "type": "text",
          "fields": {
              "keyword": {
                  "type": "keyword",
                  'ignore_above': 256
              }
          },
          "analyzer": "my_analyzer",
          "search_analyzer": "my_analyzer"
      },
      "band": {
          "type": "text",
          "fields": {
              "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
              }
          },
          "analyzer": "my_analyzer",
          "search_analyzer": "my_analyzer"
      },
      "year": {
          "type": "text",
          "fields": {
              "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
              }
          },
          "analyzer": "my_analyzer",
          "search_analyzer": "my_analyzer"
      }
  }
}

es.indices.put_mapping(index='idx_temp', body=my_mapping, doc_type='_doc', include_type_name=True)

res = es.search(index='idx_temp', body={
    "query": {
        "match": {
            "name": {
                "query": "powerless &",
                "fuzziness": 3

            }
        }
    }
})

for hit in res['hits']['hits']:
    print(hit['_source'])

The expected output was 'name': 'Poweeerless &', but I got 0 hits and nothing was returned.

asked Jul 04 '19 10:07

Yaboku

2 Answers

I fixed the problem by adding another parameter,

 "search_quote_analyzer": "my_analyzer"

to the field mapping, after

"analyzer": "my_analyzer",
"search_analyzer": "my_analyzer",

Searching with & in the query now returns the expected output:

'name': 'Poweeerless &'
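
For reference, here is a minimal sketch of how the analyzer and all three analyzer parameters could be declared together when the index is created, before any documents are indexed. This is not the exact code used above. The helper names `text_field` and `build_index_body` are hypothetical, and the typeless `mappings` layout assumes an Elasticsearch 7.x cluster; the `es.indices.create` call is left commented out because it needs a running cluster.

```python
def text_field(analyzer):
    """Build a fresh mapping for one text field (hypothetical helper)."""
    return {
        "type": "text",
        "analyzer": analyzer,
        "search_analyzer": analyzer,
        "search_quote_analyzer": analyzer,
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }


def build_index_body(fields=("name", "band", "year"), analyzer="my_analyzer"):
    """Combine analysis settings and field mappings in one create-index body."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer: {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        # Assumption: typeless mappings, as accepted by Elasticsearch 7.x.
        "mappings": {"properties": {name: text_field(analyzer) for name in fields}},
    }


index_body = build_index_body()
# Requires a running cluster, so it is not executed here:
# es.indices.create(index="idx_temp", body=index_body)
```

Creating the index this way means the documents are analyzed with my_analyzer from the first write, rather than with whatever mapping Elasticsearch infers dynamically.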

answered Sep 27 '22 16:09

Yaboku

I tried your index settings, mapping, and query and was able to get results. Here are the two things I did differently.

First, I escaped the special character &. When I indexed the document directly through the ES REST API, using the body below in Postman:

{ "content": "Powerless \&" }

ES returned an Unrecognized character escape '&' exception, and Postman, a popular REST client, also warned me that the string was not valid.

I then changed the payload to the following and was able to index the document:

{
    "content": "Powerless \\&"
}

Notice that I added another `\` to escape the `&`.
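
As a side check on the escaping step, the short sketch below uses Python's standard `json` module (not Elasticsearch itself) to show how the three payload variants behave. It suggests that `&` needs no escaping in JSON at all, and that the doubled backslash parses but leaves a literal backslash in the stored value. The variable names are only illustrative.

```python
import json

# "&" is an ordinary character in JSON, so it serializes unchanged.
plain = json.dumps({"content": "Powerless &"})
print(plain)

# A single backslash before "&" is not a valid JSON escape sequence.
try:
    json.loads(r'{"content": "Powerless \&"}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False
print(single_backslash_ok)

# A doubled backslash parses, but the value now contains a backslash character.
doubled = json.loads(r'{"content": "Powerless \\&"}')["content"]
print(doubled)
```

With the Python client, the document dictionary is serialized for you, so a name such as 'Powerless &' can be passed as-is.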

Second, I changed the query to target the same field that holds the value containing & (in your case the name field, not the content field), because a match query is analyzed and by default uses the same analyzer that was applied at index time. With that, I was able to get the result.

PS: I also verified your analyzer with the _analyze API; it generates the tokens below for the text Powerless \\&:

{
    "tokens": [
        {
            "token": "powerless",
            "start_offset": 0,
            "end_offset": 9,
            "type": "word",
            "position": 0
        },
        {
            "token": "\\&",
            "start_offset": 10,
            "end_offset": 12,
            "type": "word",
            "position": 1
        }
    ]
}
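
To make the token output above easier to reason about, here is a rough pure-Python approximation of the two analysis behaviours involved. It is only a sketch and does not call Elasticsearch: `whitespace_lowercase` imitates a whitespace tokenizer followed by a lowercase filter (as in my_analyzer), while `word_chars_only` is a simplified stand-in for an analyzer that discards punctuation, which is broadly what the default standard analyzer does. Both function names are hypothetical.

```python
import re


def whitespace_lowercase(text):
    """Split on whitespace only, then lowercase each token."""
    return [token.lower() for token in text.split()]


def word_chars_only(text):
    """Keep runs of word characters; punctuation-only tokens disappear."""
    return [token.lower() for token in re.findall(r"\w+", text)]


kept = whitespace_lowercase("Powerless &")
dropped = word_chars_only("Powerless &")
print(kept)
print(dropped)

# The real analyzer can be inspected on a running cluster with the _analyze API,
# for example with a request body such as:
# {"analyzer": "my_analyzer", "text": "Powerless &"}
```

If the field is analyzed without the whitespace tokenizer, the & never reaches the index as a token, so a query for it has nothing to match.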

answered Sep 27 '22 15:09

Amit

Tags: indexing, python-3.x, lucene, elasticsearch