I am having trouble running Hadoop jobs in both pseudo-distributed and cluster mode under Ubuntu 16.04.
While running a vanilla Hadoop/HDFS installation, my hadoop user gets logged out and all of the processes run by this user are killed. I don't see anything in the logs (/var/log/systemd, journalctl or dmesg) that explains why the user gets logged out.
It seems I am not the only one who has this or a similar issue:
https://stackoverflow.com/questions/38288162/in-ubuntu-16-04-running-hadoop-jar-laptop-gets-rebooted
Note: creating a dedicated hadoop user did not actually solve the problem in my case, but it did limit the logouts to that user.
https://askubuntu.com/questions/784591/ubuntu-16-04-kills-session-when-resource-usage-is-extremely-high
Is it possible that a problem around the UserGroupInformation class (which can cause a logout under some circumstances), perhaps combined with systemd changes in Ubuntu 16.04, causes this behavior?
The last lines of the Hadoop log that I get before the logout:
...
16/07/13 16:45:37 DEBUG ipc.ProtobufRpcEngine: Call: getJobReport took 4ms
16/07/13 16:45:37 DEBUG security.UserGroupInformation: PrivilegedAction as:hduser (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
16/07/13 16:45:37 DEBUG ipc.Client: IPC Client (1360814716) connection to laptop/127.0.1.1:37339 from hduser sending #375
16/07/13 16:45:37 DEBUG ipc.Client: IPC Client (1360814716) connection to laptop/127.0.1.1:37339 from hduser got value #375
16/07/13 16:45:37 DEBUG ipc.ProtobufRpcEngine: Call: getJobReport took 2ms
Terminated
hduser@laptop:~$ 16/07/13 16:45:37 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@4e7ab839
exit
journalctl:
Jul 12 16:06:44 laptop systemd-logind[978]: Removed session 7.
Jul 12 16:06:44 laptop systemd-logind[978]: Removed session 6.
Jul 12 16:06:44 laptop systemd-logind[978]: Removed session 5.
Jul 12 16:06:44 laptop systemd-logind[978]: Removed session 8.
syslog:
Jul 12 16:06:43 laptop systemd[4172]: Stopped target Default.
Jul 12 16:06:43 laptop systemd[4172]: Reached target Shutdown.
Jul 12 16:06:44 laptop systemd[4172]: Starting Exit the Session...
Jul 12 16:06:44 laptop systemd[4172]: Stopped target Basic System.
Jul 12 16:06:44 laptop systemd[4172]: Stopped target Sockets.
Jul 12 16:06:44 laptop systemd[4172]: Stopped target Paths.
Jul 12 16:06:44 laptop systemd[4172]: Stopped target Timers.
Jul 12 16:06:44 laptop systemd[4172]: Received SIGRTMIN+24 from PID 10101 (kill).
Jul 12 16:06:44 laptop systemd[1]: Stopped User Manager for UID 1001.
Jul 12 16:06:44 laptop systemd[1]: Removed slice User Slice of hduser.
I also had the problem. It took me time, but I found the solution here: https://unix.stackexchange.com/questions/293069/all-services-of-a-user-are-killed-when-running-multiple-services-under-this-user
Basically, some Hadoop processes just stop, for whatever reason. But systemd seems to kill all of the user's processes when it sees one of a service's processes die.
The fix is to add
[Login]
KillUserProcesses=no
to /etc/systemd/logind.conf
and reboot.
I had several Ubuntu versions available to debug the problem, and the fix seems to work only on Ubuntu 16.04.
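To confirm that the logind.conf change described above is actually in place, something like the following sketch could be used. The helper name kill_user_processes_setting is made up for illustration, and it assumes the setting lives directly in /etc/systemd/logind.conf under the section systemd spells [Login]; drop-in files under logind.conf.d are not considered.

```shell
# Hypothetical helper: print the last KillUserProcesses value found in the
# [Login] section of a logind.conf-style file. Prints nothing if it is unset.
kill_user_processes_setting() {
    awk '
        # Track which section we are in; only [Login] is of interest.
        /^[ \t]*\[/ { in_login = ($0 ~ /^[ \t]*\[Login\]/); next }
        # Commented-out lines start with "#" and therefore do not match.
        in_login && /^[ \t]*KillUserProcesses[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            sub(/[ \t]+$/, "")
            value = $0
        }
        END { if (value != "") print value }
    ' "$1"
}

# Only inspect the real file if it exists and is readable.
if [ -r /etc/systemd/logind.conf ]; then
    kill_user_processes_setting /etc/systemd/logind.conf
fi
```

If this prints nothing, the line is missing, commented out, or sitting under a differently named section.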
I had the same issue. I was using Apache Apex, which is Hadoop-native. Whenever I killed an Apex application, my system logged me out.
Solution: replace the kill binary of Ubuntu 16 (/bin/kill) with the kill binary from Ubuntu 14.
For me, everything works as smoothly as it did before the upgrade.
I had the same problem too. Finally, I found that /bin/kill in Ubuntu 16.04 has a bug when killing a process group, and that replacing it solves this problem.
The expected behavior, as documented for kill(2): "If pid is less than -1, then sig is sent to every process in the process group whose ID is -pid."
Because of a bug in procps-ng-3.3.10, killing a process group whose ID starts with 1, as done by bin/yarn application -kill AppID, causes the user to be logged out.
The problem is solved after replacing /bin/kill with a new kill compiled from procps-ng-3.3.12:
tar xJf procps-ng-3.3.12.tar.xz
cd procps-ng-3.3.12
./configure
make
sudo cp .libs/kill /bin/kill
sudo chown root:root /bin/kill
sudo cp proc/.libs/libprocps.so.6.0.0 /lib/x86_64-linux-gnu/
sudo chown root:root /lib/x86_64-linux-gnu/libprocps.so.6.0.0
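Before replacing /bin/kill, it may be worth checking which version is installed. The sketch below assumes that /bin/kill accepts --version and prints a banner of the form "kill from procps-ng 3.3.10"; the function names are made up for illustration, and only 3.3.10 is treated as affected because that is the release named above.

```shell
# Hypothetical helper: extract the procps-ng version from a banner such as
# "kill from procps-ng 3.3.10". Prints nothing for other kill implementations.
procps_kill_version() {
    printf '%s\n' "$1" | sed -n 's/^.*procps-ng \([0-9][0-9.]*\).*$/\1/p'
}

# Succeeds only for 3.3.10, the release reported as buggy in this thread.
is_affected_kill() {
    [ "$(procps_kill_version "$1")" = "3.3.10" ]
}

if [ -x /bin/kill ]; then
    # Call the binary explicitly; a bare "kill" would be the shell builtin.
    banner=$(/bin/kill --version 2>/dev/null || true)
    if is_affected_kill "$banner"; then
        echo "/bin/kill reports procps-ng 3.3.10; consider replacing it"
    else
        echo "/bin/kill does not report procps-ng 3.3.10: $banner"
    fi
fi
```

The check only reads the version banner and never sends a signal, so it should be safe to run from the affected user's session.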