
ElasticSearch disappearing / crashing -- cannot find log files or any other information

I'm running ElasticSearch on my EC2 servers. I'm having trouble understanding the logging / debugging of ElasticSearch.

Occasionally the service disappears after having run fine for days (I expect that it's crashing). For example:

# curl -XGET http://localhost:9200/
curl: (7) couldn't connect to host

Per this ElasticSearch page, I expected that my logs would be available in /var/log/elasticsearch... however...

# ls /var/data/elasticsearch
ls: /var/data/elasticsearch: No such file or directory

I have not changed the /config/logging.yml file in any way. ElasticSearch is running as root, so there should be no problem with permissions.

What am I missing?


Per imotov's response, I found my log files at /root/elasticsearch/elasticsearch-0.20.5/logs... but as he predicted, they were very unhelpful.

I checked my Java version:

java version "1.6.0_14"
Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)

It looks like this is a very old version, and I'm working on upgrading it.

I also had a look at my memory graph

[image: memory usage graph]

We can see that there was a weird spike just before ES died. I'm trying to find its source. The only other process should be node; ES and node are capped at 4 GB and 1.2 GB, respectively (once ES died and just node was left, the RAM dropped to 1.2 GB, as you can see). So either ElasticSearch exceeded its 4 GB, or something else caused a spike in memory...

FWIW, I'm using m1.large instances (8GB RAM).

asked Mar 05 '13 00:03 by Zane Claes


2 Answers

If you run a relatively new version of elasticsearch, you should be able to find the location of the log files by running the following command:

curl -s "localhost:9200/_nodes/_local/settings?pretty=true" | fgrep path.logs

But if elasticsearch suddenly disappears, chances are you will not find anything in these logs. If you are running an older version of Java (a year old or more), it's quite possible that your Java process is crashing. When that happens, the JVM typically leaves files named hs_err_pidXXXX.log on your hard drive (in the directory the process was started from, or in /tmp/ on Linux if that directory is not writable). See if you can find any, and try upgrading Java to the latest version.
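A minimal sketch of that search, assuming GNU or BSD `find`. The function name `find_hs_err_logs` is made up for illustration, and the default directories are guesses: /tmp plus the current directory. If you know which directory elasticsearch was started from, pass it as an argument.

```shell
#!/bin/sh
# List JVM fatal-error logs (hs_err_pid<pid>.log) found directly inside
# the given directories. With no arguments, look in /tmp and the
# current directory (assumed defaults, adjust to your setup).
find_hs_err_logs() {
    [ "$#" -gt 0 ] || set -- /tmp .
    find "$@" -maxdepth 1 -type f -name 'hs_err_pid*.log' 2>/dev/null
}

find_hs_err_logs "$@"
```

For the install described in the question, something like `find_hs_err_logs /tmp /root/elasticsearch/elasticsearch-0.20.5` would be a reasonable first try. An empty result only means no crash log was found in those places.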

The second possible reason for the sudden disappearance of elasticsearch is the oom-killer. It's a Linux kernel mechanism that kicks in when the system runs out of memory and kills a process, usually the biggest memory consumer (elasticsearch in your case). Check syslog and kern.log around the time elasticsearch last disappeared. If you see messages like this one

Out of memory: Kill process 1234 (java) score 567 or sacrifice child

you might need to reduce the elasticsearch heap size to leave more memory to the OS, or upgrade to a bigger instance.
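A rough sketch of that log check. The helper name `oom_kills` is hypothetical, the log paths are assumptions (Debian/Ubuntu write syslog and kern.log, Red Hat style systems usually write /var/log/messages), and the exact kernel wording differs between versions, so the pattern is deliberately loose.

```shell
#!/bin/sh
# Print lines that look like OOM-killer activity from the given log files.
# Unreadable or missing files are skipped silently.
oom_kills() {
    grep -h -i -E 'out of memory|oom-killer|killed process' "$@" 2>/dev/null
}

# Assumed log locations; reading them usually requires root.
oom_kills /var/log/syslog /var/log/kern.log /var/log/messages \
    || echo "no OOM-killer messages found (or logs not readable)"
```

If a matching line names the java process around the time elasticsearch vanished, the oom-killer is the likely culprit.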

answered Nov 18 '22 04:11 by imotov


I experienced the same troubles:

  • Updating Java did not help
  • Moving from OpenJDK to Oracle JDK changed nothing
  • Memory consumption turned out to be OK
  • No error message anywhere, just the JVM crashing randomly

My solution was to remove the sigar lib:

# Move the sigar library out of elasticsearch's lib directory (kept in $HOME as a backup)
cd
sudo mv /usr/share/elasticsearch/lib/sigar .
sudo service elasticsearch restart

Since then, everything has run smoothly. The only impact is that I lost CPU monitoring. ElasticSearch 1.5.1 is running on a VPS with Ubuntu 14.04 (OpenVZ).

UPDATE 1: I still have to remove the sigar lib with elasticsearch 1.5.2

UPDATE 2: to figure out what is going on, add these JVM options when starting elasticsearch:

-XX:HeapDumpPath=./java_pid.hprof

Path to directory or filename for heap dump. Manageable.

-XX:+HeapDumpOnOutOfMemoryError

Dump heap to file when java.lang.OutOfMemoryError is thrown.

This will help with debugging.
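One way to pass such options, sketched under assumptions: the stock startup script of the 0.x/1.x releases appends the ES_JAVA_OPTS environment variable to the JVM command line, and /var/tmp is just a placeholder dump location. Note the sign: in HotSpot, `-XX:+Name` turns a boolean flag on and `-XX:-Name` turns it off, so the heap dump needs the `+` form.

```shell
#!/bin/sh
# Hypothetical dump location; pick a directory with enough free space
# to hold a file roughly the size of the heap.
HEAP_DUMP_DIR="${HEAP_DUMP_DIR:-/var/tmp}"

# "+" enables the dump on java.lang.OutOfMemoryError.
ES_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$HEAP_DUMP_DIR"
export ES_JAVA_OPTS
echo "ES_JAVA_OPTS=$ES_JAVA_OPTS"

# Start elasticsearch from this same shell so it inherits the variable,
# e.g. bin/elasticsearch from the install directory.
```

If elasticsearch is started through `service`, the variable has to be set wherever the init script reads its environment instead. These options only help when the JVM throws an OutOfMemoryError; they produce nothing if the process is killed from outside.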

answered Nov 18 '22 04:11 by David