
JSONField workaround on elasticsearch : MapperParsingException

How can I map a Postgres JSONField of a Django model to an Elasticsearch index? Is there a workaround to make it work?

Reference: https://github.com/sabricot/django-elasticsearch-dsl/issues/36

  • models.py
class Web_Technology(models.Model):
    web_results = JSONField(blank=True, null=True, default=dict)
  • web_results field format
{"http://google.com": {"Version": "1.0", "Server": "AkamaiGHost"}}
  • documents.py
from elasticsearch_dsl import Index
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry

from .models import Web_Technology

@registry.register_document
class WebTechDoc(Document):

    web_results = fields.ObjectField()

    def prepare_web_results(self, instance):
        return instance.web_results

    class Index:
        name = 'webtech'

    class Django:
        model = Web_Technology
        fields = []

→ python3 manage.py search_index --create -f
Creating index '<elasticsearch_dsl.index.Index object at 0x7f5f7b07ed30>'
Traceback (most recent call last):
  File "manage.py", line 15, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.5/dist-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 128, in handle
    self._create(models, options)
  File "/usr/local/lib/python3.5/dist-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 84, in _create
    index.create()
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch_dsl/index.py", line 254, in create
    self._get_connection(using).indices.create(index=self._name, body=self.to_dict(), **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/client/indices.py", line 105, in create
    "PUT", _make_path(index), params=params, body=body
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/transport.py", line 350, in perform_request
    timeout=timeout,
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/connection/http_urllib3.py", line 252, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/connection/base.py", line 181, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'MapperParsingException[mapping [properties]]; nested: MapperParsingException[Root type mapping not empty after parsing! Remaining fields:   [web_results : {type=object}]]; ', 'MapperParsingException[mapping [properties]]; nested: MapperParsingException[Root type mapping not empty after parsing! Remaining fields:   [web_results : {type=object}]]; ')

If there is no workaround to make it work, please suggest another fast search indexer that supports JSONField.

ElasticSearch Logs:

[2019-09-10 19:41:22,399][DEBUG][action.admin.indices.create] [cimexnode] [webtech] failed to create
org.elasticsearch.index.mapper.MapperParsingException: mapping [properties]
        at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:394)
        at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:374)
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:204)
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:167)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: Root type mapping not empty after parsing! Remaining fields:   [web_results : {type=object}]
        at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:278)
        at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:192)
        at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:449)
        at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:307)
        at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:391)
        ... 6 more
Arbazz Hussain asked Sep 10 '19 07:09


1 Answer

If the method mentioned in the link you posted works (I haven't tested it on JSONField), then you're overriding the wrong method: the method the elasticsearch app uses to prepare a field is prepare_FOO, where FOO is the field name.

So you need to name your method prepare_web_results() instead of prepare_content_json(), since your field is web_results. As written, prepare_content_json() is useless because it will never be called.
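
To make the naming convention concrete, here is a toy sketch in plain Python. It is not the code of django-elasticsearch-dsl; SketchDocument, WebTechSketch and FakeInstance are hypothetical names used only to illustrate how a hook named prepare_<field name> is found, and why a hook with any other suffix is silently ignored.

```python
class SketchDocument:
    """Toy stand-in for a document class (assumption: not the library's real code)."""

    field_names = ["web_results"]

    def prepare(self, instance):
        data = {}
        for name in self.field_names:
            # Look for a hook called prepare_<field name>;
            # fall back to the raw attribute when there is none.
            hook = getattr(self, "prepare_" + name, None)
            data[name] = hook(instance) if hook else getattr(instance, name)
        return data


class WebTechSketch(SketchDocument):
    def prepare_web_results(self, instance):
        # Found: the suffix matches the field name "web_results".
        return {"prepared": instance.web_results}

    def prepare_content_json(self, instance):
        # Never called: no field is named "content_json".
        raise AssertionError("unreachable")


class FakeInstance:
    web_results = {"http://google.com": {"Version": "1.0", "Server": "AkamaiGHost"}}


prepared = WebTechSketch().prepare(FakeInstance())
```

In this sketch only prepare_web_results() runs; prepare_content_json() is defined but never looked up.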

If your JSONField has a fixed structure, you should return an object field with the corresponding structure:

class WebTechDoc(Document):

    web_results = fields.ObjectField(properties={
        "url": fields.TextField(),
        "version": fields.TextField(),
        "server": fields.TextField()})

    def prepare_web_results(self, instance):
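        results = instance.web_results
        # The field is nullable, so return an empty object when nothing is stored.
        if not results:
            return {}
        # dict.keys() is not indexable on Python 3; take the first key this way.
        url = next(iter(results))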
        return {
            "url": url,
            "version": results[url]["Version"],
            "server": results[url]["Server"]
        }
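
The mapping above indexes only one URL per record. If web_results can hold several URLs (an assumption; the question shows a single one), the same flattening can be applied to every key. A minimal sketch in plain Python, where flatten_web_results is a hypothetical helper name and the "Version" and "Server" keys are taken from the sample in the question:

```python
def flatten_web_results(results):
    """Turn {url: {"Version": ..., "Server": ...}} into a list of flat dicts."""
    flattened = []
    # Treat a null field as an empty dict so the loop simply does nothing.
    for url, details in (results or {}).items():
        details = details or {}
        flattened.append({
            "url": url,
            # .get() avoids a KeyError when a URL lacks one of the keys.
            "version": details.get("Version"),
            "server": details.get("Server"),
        })
    return flattened


sample = {"http://google.com": {"Version": "1.0", "Server": "AkamaiGHost"}}
rows = flatten_web_results(sample)
```

A list like this would likely pair with a field type that accepts multiple objects, such as fields.NestedField with the same properties, rather than a single-object mapping; I haven't tested that combination on this model.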

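Or, if you're less concerned about where exactly a search hit comes from, you could just map the dictionary to a string and put it in a TextField() instead of an ObjectField(): return str(instance.web_results). An f-string such as f"{instance.web_results}" produces the same text, but f-strings need Python 3.6 or later and the traceback in the question shows Python 3.5.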
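
For the string approach, serializing with json.dumps is a possible alternative to str(): it yields valid JSON (double quotes) instead of a Python repr, and sort_keys=True keeps the output stable between runs. A minimal sketch, where web_results_as_text is a hypothetical helper name:

```python
import json


def web_results_as_text(results):
    """Serialize the JSONField value to one string suitable for a text field."""
    # A null field becomes "{}" instead of the string "null".
    return json.dumps(results or {}, sort_keys=True)


sample = {"http://google.com": {"Version": "1.0", "Server": "AkamaiGHost"}}
text = web_results_as_text(sample)
```

prepare_web_results() could then return web_results_as_text(instance.web_results), with web_results declared as fields.TextField().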

dirkgroten answered Oct 25 '22 04:10