
Unable to rebuild_index elasticsearch with Django Haystack: 'Connection refused'

I'm facing a strange issue with Django Haystack and Elasticsearch that prevents me from indexing content.
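
My Haystack connection uses the standard Elasticsearch backend, configured roughly like this (a sketch rather than a verbatim copy of my settings file; the index name is just a placeholder):

# settings.py (sketch)
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'myindex',
    },
}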

ElasticSearch is properly running on the machine:

# curl -X GET 127.0.0.1:9200
{
  "status" : 200,
  "name" : "White Pilgrim",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.4",
    "build_hash" : "0d3159b9fc8bc8e367c5c40c09c2a57c0032b32e",
    "build_timestamp" : "2015-12-15T11:25:18Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

But when I try to build my index using Haystack:

# python manage.py rebuild_index

I get a ProtocolError(('Connection aborted.', error(111, 'Connection refused'))) error:

ERROR:root:Error updating auth using default
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 188, in handle_label
    self.update_backend(label, using)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 233, in update_backend
    do_update(backend, index, qs, start, end, total, verbosity=self.verbosity, commit=self.commit)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 96, in do_update
    backend.update(index, current_qs, commit=commit)
  File "/usr/local/lib/python2.7/dist-packages/haystack/backends/elasticsearch_backend.py", line 193, in update
    bulk(self.conn, prepped_docs, index=self.index_name, doc_type='modelresult')
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 89, in _process_bulk_chunk
    raise e
ConnectionError: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 345, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/rebuild_index.py", line 26, in handle
    call_command('update_index', **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 119, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 183, in handle
    return super(Command, self).handle(*items, **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 548, in handle
    label_output = self.handle_label(label, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 188, in handle_label
    self.update_backend(label, using)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 233, in update_backend
    do_update(backend, index, qs, start, end, total, verbosity=self.verbosity, commit=self.commit)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 96, in do_update
    backend.update(index, current_qs, commit=commit)
  File "/usr/local/lib/python2.7/dist-packages/haystack/backends/elasticsearch_backend.py", line 193, in update
    bulk(self.conn, prepped_docs, index=self.index_name, doc_type='modelresult')
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 89, in _process_bulk_chunk
    raise e
elasticsearch.exceptions.ConnectionError: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))

The strange part is that after the error, Elasticsearch stays unreachable, even with the same curl request that worked earlier:

# curl -v -X GET 127.0.0.1:9200
* Rebuilt URL to: 127.0.0.1:9200/
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
* connect to 127.0.0.1 port 9200 failed: Connexion refusée
* Failed to connect to 127.0.0.1 port 9200: Connexion refusée
* Closing connection 0
curl: (7) Failed to connect to 127.0.0.1 port 9200: Connexion refusée

If I want it to be up and running again, I need to restart it:

# sudo service elasticsearch restart

This occurs on a Raspbian distribution. On my Mac OS X development machine everything is fine and indexing works well.

Do you have any ideas?

Memory information

$ free

             total       used       free     shared    buffers     cached
Mem:        883148     667632     215516      17312      61468     421536
-/+ buffers/cache:     184628     698520
Swap:       102396          0     102396

$ cat /proc/meminfo

MemTotal:         883148 kB
MemFree:          215492 kB
MemAvailable:     675152 kB
Buffers:           61472 kB
Cached:           421536 kB
SwapCached:            0 kB
Active:           366556 kB
Inactive:         253416 kB
Active(anon):     128720 kB
Inactive(anon):    25556 kB
Active(file):     237836 kB
Inactive(file):   227860 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        102396 kB
SwapFree:         102396 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        136864 kB
Mapped:            30956 kB
Shmem:             17312 kB
Slab:              33560 kB
SReclaimable:      24684 kB
SUnreclaim:         8876 kB
KernelStack:        1696 kB
PageTables:         1948 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      543968 kB
Committed_AS:    1287328 kB
VmallocTotal:    1171456 kB
VmallocUsed:        1140 kB
VmallocChunk:     965220 kB
CmaTotal:           8192 kB
CmaFree:              36 kB

$ vmstat -s

       883148 K total memory
       667624 K used memory
       366456 K active memory
       253416 K inactive memory
       215524 K free memory
        61476 K buffer memory
       421536 K swap cache
       102396 K total swap
            0 K used swap
       102396 K free swap
       764663 non-nice user cpu ticks
            0 nice user cpu ticks
       403552 system cpu ticks
    592012076 idle cpu ticks
       394292 IO-wait cpu ticks
           96 IRQ cpu ticks
         8609 softirq cpu ticks
            0 stolen cpu ticks
      3905593 pages paged in
     13767628 pages paged out
            0 pages swapped in
            0 pages swapped out
    503949992 interrupts
    198893627 CPU context switches
   1460307293 boot time
        61556 forks

Accessing ES via raw Python

I also tried to access ES with raw Python (without Haystack), and it seems to work:

$ sudo service elasticsearch restart
$ python

Python 2.7.9 (default, Mar  8 2015, 00:52:26)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from elasticsearch import Elasticsearch
>>> es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
>>> es
<Elasticsearch([{'host': 'localhost', 'port': 9200}])>
>>> es.get(index='myindex', doc_type='topic', id=1)
elasticsearch.exceptions.NotFoundError: TransportError(404, u'{"_index":"myindex","_type":"topic","_id":"1","found":false}')

So OK, it doesn't find anything (because nothing has been indexed, hey!), but it seems to be able to query the engine.
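
Since the crash happens inside the bulk() helper call that Haystack makes (see the traceback above), a closer reproduction without Haystack would be to call the helper directly, for example (a rough sketch with dummy documents, reusing the index name and doc_type that show up in the logs):

# bulk_test.py (sketch, dummy data)
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

# Three throwaway documents, shaped like Haystack's 'modelresult' entries.
actions = [
    {
        '_index': 'myindex',
        '_type': 'modelresult',
        '_id': 'auth.user.%d' % i,
        '_source': {'text': 'dummy document %d' % i},
    }
    for i in range(1, 4)
]

print(bulk(es, actions))  # returns (number of successes, list of errors)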

ES logs

The default log file doesn't tell me anything useful:

[2016-04-25 17:38:14,503][INFO ][node                     ] [White Pilgrim] version[1.7.4], pid[23455], build[0d3159b/2015-12-15T11:25:18Z]
[2016-04-25 17:38:14,506][INFO ][node                     ] [White Pilgrim] initializing ...
[2016-04-25 17:38:15,208][INFO ][plugins                  ] [White Pilgrim] loaded [], sites []
[2016-04-25 17:38:15,542][INFO ][env                      ] [White Pilgrim] using [1] data paths, mounts [[/ (/dev/root)]], net usable_space [8.6gb], net total_space [14.4gb], types [ext4]
[2016-04-25 17:38:32,689][INFO ][node                     ] [White Pilgrim] initialized
[2016-04-25 17:38:32,691][INFO ][node                     ] [White Pilgrim] starting ...
[2016-04-25 17:38:33,241][INFO ][transport                ] [White Pilgrim] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.50:9300]}
[2016-04-25 17:38:33,365][INFO ][discovery                ] [White Pilgrim] elasticsearch/NWl6pT-_QemJtTvKYkSGCQ
[2016-04-25 17:38:37,286][INFO ][cluster.service          ] [White Pilgrim] new_master [White Pilgrim][NWl6pT-_QemJtTvKYkSGCQ][raspberrypi][inet[/192.168.1.50:9300]], reason: zen-disco-join (elected_as_master)
[2016-04-25 17:38:37,449][INFO ][http                     ] [White Pilgrim] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.50:9200]}
[2016-04-25 17:38:37,451][INFO ][node                     ] [White Pilgrim] started
[2016-04-25 17:38:37,664][INFO ][gateway                  ] [White Pilgrim] recovered [1] indices into cluster_state
[2016-04-25 17:47:28,176][INFO ][cluster.metadata         ] [White Pilgrim] [myindex] deleting index
[2016-04-25 17:47:28,964][INFO ][cluster.metadata         ] [White Pilgrim] [myindex] creating index, cause [api], templates [], shards [5]/[1], mappings []
[2016-04-25 17:47:30,412][INFO ][cluster.metadata         ] [White Pilgrim] [myindex] create_mapping [modelresult]

So I tried to get more information by editing the /etc/elasticsearch/logging.yml configuration file and raising the log level to TRACE (es.logger.level: TRACE).

Everything still looks OK for ES in its log file, even after it crashes!

Extended logs (last lines, copied just after ES crash):

[2016-04-28 11:15:24,817][DEBUG][cluster.service          ] [Franklin Hall] publishing cluster state version 11
[2016-04-28 11:15:24,818][DEBUG][cluster.service          ] [Franklin Hall] set local cluster state to version 11
[2016-04-28 11:15:24,819][TRACE][cluster.routing          ] [Franklin Hall] no need to schedule reroute due to delayed unassigned, next_delay_setting [0], registered [9223372036854775807]
[2016-04-28 11:15:24,827][DEBUG][river.cluster            ] [Franklin Hall] processing [reroute_rivers_node_changed]: execute
[2016-04-28 11:15:24,827][DEBUG][river.cluster            ] [Franklin Hall] processing [reroute_rivers_node_changed]: no change in cluster_state
[2016-04-28 11:15:24,827][TRACE][gateway.local.state.meta ] [Franklin Hall] [posepartage] writing state, reason [version changed from [1] to [2]]
[2016-04-28 11:15:24,849][TRACE][cluster.service          ] ack received from node [[Franklin Hall][cOT-S342Q7OMtv4z2ZkS9Q][raspberrypi][inet[/192.168.1.50:9300]]], cluster_state update (version: 11)
[2016-04-28 11:15:24,850][TRACE][cluster.service          ] all expected nodes acknowledged cluster_state update (version: 11)
[2016-04-28 11:15:24,852][DEBUG][cluster.service          ] [Franklin Hall] processing [put-mapping [modelresult]]: took 240ms done applying updated cluster_state (version: 11)
[2016-04-28 11:15:24,852][TRACE][indices.breaker          ] [Franklin Hall] [REQUEST] Adjusted breaker by [16432] bytes, now [16432]
[2016-04-28 11:15:24,863][TRACE][indices.breaker          ] [Franklin Hall] [REQUEST] Adjusted breaker by [-16432] bytes, now [0]
[2016-04-28 11:15:25,427][TRACE][index.shard              ] [Franklin Hall] [myindex][2] index [modelresult][auth.user.1][org.elasticsearch.index.mapper.ParseContext$Document@1108529]
[2016-04-28 11:15:25,427][TRACE][index.shard              ] [Franklin Hall] [myindex][0] index [modelresult][auth.user.3][org.elasticsearch.index.mapper.ParseContext$Document@1108529]
[2016-04-28 11:15:25,427][TRACE][index.shard              ] [Franklin Hall] [myindex][1] index [modelresult][auth.user.2][org.elasticsearch.index.mapper.ParseContext$Document@1108529]

System logs

I also tried to explore the system logs to get information about the ES crash (as ES doesn't seem to log anything about it by itself…).

/var/log/daemon.log:

Apr 28 11:37:48 raspberrypi systemd[1]: Starting Elasticsearch...
Apr 28 11:37:48 raspberrypi systemd[1]: Started Elasticsearch.
Apr 28 11:38:52 raspberrypi systemd[1]: elasticsearch.service: main process exited, code=killed, status=6/ABRT
Apr 28 11:38:52 raspberrypi systemd[1]: Unit elasticsearch.service entered failed state.

/var/log/messages:

Apr 28 11:38:52 raspberrypi rsyslogd-2007: action 'action 17' suspended, next retry is Thu Apr 28 11:39:52 2016 [try http://www.rsyslog.com/e/2007 ]

ARGH!

Going one step further, I changed StandardOutput=null to StandardOutput=journal in /usr/lib/systemd/system/elasticsearch.service, ran systemctl daemon-reload, and then journalctl -u elasticsearch.service -f. This way I got more details on the error:

avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:04,569][TRACE][indices.breaker          ] [Supreme Intelligence] [REQUEST] Adjusted breaker by [-16432] bytes, now [0]
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,337][TRACE][index.shard              ] [Supreme Intelligence] [myindex][2] index [modelresult][auth.user.1][org.elasticsearch.index.mapper.ParseContext$Document@a2c034]
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,337][TRACE][index.shard              ] [Supreme Intelligence] [myindex][1] index [modelresult][auth.user.2][org.elasticsearch.index.mapper.ParseContext$Document@1fcee4a]
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,337][TRACE][index.shard              ] [Supreme Intelligence] [myindex][0] index [modelresult][auth.user.3][org.elasticsearch.index.mapper.ParseContext$Document@80719c]
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,488][TRACE][index.engine.lucene.iw   ] [Supreme Intelligence] [myindex][1] elasticsearch[Supreme Intelligence][scheduler][T#1] IW: nrtIsCurrent: infoVersion matches: false; DW changes: false; BD changes: false
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,495][TRACE][index.engine.lucene.iw   ] [Supreme Intelligence] [myindex][1] elasticsearch[Supreme Intelligence][refresh][T#1] IW: nrtIsCurrent: infoVersion matches: false; DW changes: false; BD changes: false
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,496][TRACE][index.shard              ] [Supreme Intelligence] [myindex][1] refresh with source: schedule
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,500][TRACE][index.engine.lucene.iw   ] [Supreme Intelligence] [myindex][1] elasticsearch[Supreme Intelligence][refresh][T#1] IW: nrtIsCurrent: infoVersion matches: false; DW changes: false; BD changes: false
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,503][TRACE][index.engine.lucene.iw   ] [Supreme Intelligence] [myindex][1] elasticsearch[Supreme Intelligence][refresh][T#1] IW: flush at getReader
avril 28 12:09:05 raspberrypi elasticsearch[3481]: [2016-04-28 12:09:05,504][TRACE][index.engine.lucene.iw   ] [Supreme Intelligence] [myindex][1] elasticsearch[Supreme Intelligence][refresh][T#1] DW: startFullFlush
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # A fatal error has been detected by the Java Runtime Environment:
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #  Internal Error (cppInterpreter_arm.S:2625), pid=3481, tid=625996896
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #  fatal error:         *** Unimplemented opcode: 232 = <unknown>
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # JRE version: OpenJDK Runtime Environment (7.0_95) (build 1.7.0_95-b00)
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # Java VM: OpenJDK Zero VM (24.95-b01 mixed mode linux-arm )
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # Derivative: IcedTea 2.6.4
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # Distribution: Raspbian GNU/Linux 8.0 (jessie), package 7u95-2.6.4-1~deb8u1+rpi1
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # An error report file with more information is saved as:
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # /tmp/hs_err_pid3481.log
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # If you would like to submit a bug report, please include
avril 28 12:09:05 raspberrypi elasticsearch[3481]: # instructions on how to reproduce the bug and visit:
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #   http://icedtea.classpath.org/bugzilla
avril 28 12:09:05 raspberrypi elasticsearch[3481]: #
avril 28 12:09:05 raspberrypi systemd[1]: elasticsearch.service: main process exited, code=killed, status=6/ABRT
avril 28 12:09:05 raspberrypi systemd[1]: Unit elasticsearch.service entered failed state.
asked Apr 25 '16

user650108


2 Answers

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (cppInterpreter_arm.S:2625), pid=3481, tid=625996896
#  fatal error:         *** Unimplemented opcode: 232 = <unknown>
#
# JRE version: OpenJDK Runtime Environment (7.0_95) (build 1.7.0_95-b00)
# Java VM: OpenJDK Zero VM (24.95-b01 mixed mode linux-arm )
# Derivative: IcedTea 2.6.4
# Distribution: Raspbian GNU/Linux 8.0 (jessie), package 7u95-2.6.4-1~deb8u1+rpi1
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid3481.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#

That seems to be a bug in OpenJDK 1.7. Other people have encountered the same issue: https://discuss.elastic.co/t/openjdk-on-arm-crashes-elastic/27283

The solution is to upgrade your JVM.

answered Sep 21 '22

alpert


The fact that Elasticsearch stops listening on its port strongly suggests that the problem lies on the Elasticsearch side (it might not be the root cause, but it's the current lead). You need to get more information about what is happening there.
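
For example, you could poll the port from a second terminal while rebuild_index runs, to pinpoint the exact moment Elasticsearch stops accepting connections (a quick stdlib sketch, not specific to your setup):

# es_watch.py (sketch): report once a second whether 127.0.0.1:9200 accepts TCP connections
import socket
import time

while True:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(2)
    try:
        s.connect(('127.0.0.1', 9200))
        state = 'up'
    except socket.error:
        state = 'DOWN'
    finally:
        s.close()
    print('%s elasticsearch is %s' % (time.strftime('%H:%M:%S'), state))
    time.sleep(1)

Combined with journalctl -f on the elasticsearch unit, that should tell you whether the process dies or merely stops listening.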

answered Sep 21 '22

ivan_pozdeev