
Stop systemd from killing user slices on reboot

Tags: systemd, sles

My solution (so far) was to comment out pam_systemd.so in common-session. Everything now runs in system.slice, with no per-user control groups. I am not sure of the impact of that yet, but at least things run, stay running, and get shut down cleanly.
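
For anyone who wants to confirm whether pam_systemd is still active in a given PAM file, a rough check could look like the sketch below. The function name is made up for illustration, and the /etc/pam.d/common-session path in the usage comment is an assumption based on the file name mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if FILE contains an uncommented line
# that references pam_systemd.so.
pam_systemd_enabled() {
    local file="$1"
    # Only match lines whose first non-blank character is not '#'.
    grep -Eq '^[[:space:]]*[^#[:space:]].*pam_systemd\.so' "$file"
}

# Usage (path is an assumption; adjust for your distribution):
#   pam_systemd_enabled /etc/pam.d/common-session && echo "pam_systemd active"
```

If the function reports success, su will most likely still register a session with logind, and the processes it starts can be expected to land in user.slice.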

Our software is developed in-house and runs on SLES. It consists of Java, Oracle, a Tomcat web page for sysadmin tasks, etc. We have a script that starts all these processes, and it worked great until systemd.

The "env" script gathers info from config files and then calls other scripts to start Java, Oracle, etc. These other scripts "su" to a user such as "oracle".

I have a unit for this "env" script, and start works. Stop also works if I run "systemctl stop env".

My issue is that on reboot the first thing that happens is that ALL user processes are killed, including all the DBs, Java processes, etc. This basically crashes the DBs, since they aren't stopped cleanly. THEN the unit's stop tries to run and can't, because everything is already down.

I have tried KillUserProcesses=no, enable-linger, KillExcludeUsers=, and systemd-run --scope, and the processes still get killed.

Is there any way to have systemd NOT insta-kill user processes on reboot, or am I stuck having to figure out units for all the sub-scripts?

The scripts below are just to replicate the issue; they are not the actual scripts we run.

I was able to replicate it with the scripts below on SLES12SP2 (systemd 228). I built an Arch machine, and it didn't do the kills there.

One difference I noticed is that the sleep 600 was in a user slice on SLES12 but in the system slice on Arch.

systemd-cgls on SLES12:

`-user.slice
  |-user-1000.slice
  | |-user@1000.service
  | | `-init.scope
  | |   |-1362 /usr/lib/systemd/systemd --user
  | |   `-1371 (sd-pam)
  | `-session-c1.scope
  |   `-1383 sleep 600

and on Arch:

└─system.slice
  ├─env.service
  │ └─276 sleep 600

A user slice and session aren't even created with the su on Arch.
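
A quick way to make the same comparison for a single process, without reading the whole systemd-cgls tree, is to look at /proc/PID/cgroup. The function below is only a sketch: its name is invented here, and it assumes the usual line layouts ("N:name=systemd:PATH" on cgroup v1, "0::PATH" on cgroup v2).

```shell
#!/bin/bash
# Hypothetical helper: read the contents of a /proc/PID/cgroup file on stdin
# and print the top-level slice of the systemd hierarchy.
top_slice() {
    # Field 2 is "name=systemd" (cgroup v1) or empty (cgroup v2);
    # field 3 is the cgroup path, whose first component is the slice.
    awk -F: '$2 == "name=systemd" || $2 == "" {
        split($3, parts, "/")
        print parts[2]
        exit
    }'
}

# Usage, with the PID taken from the SLES12 listing above:
#   top_slice < /proc/1383/cgroup
```

Going by the two listings above, this would print user.slice for the sleep on SLES12 and system.slice for the one on Arch.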

My service file:

[Unit]
Description=Starts and stops applications needed for an environment
Wants=network.target httpd.service
After=network.target httpd.service sshd.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/pro/bin/sys/services/envStart.sh start
ExecStop=/pro/bin/sys/services/envStart.sh stop
ExecReload=/pro/bin/sys/services/envStart.sh restart
TimeoutSec=3600

[Install]
WantedBy=multi-user.target

The envStart script:

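#!/bin/bash
# Dispatch start/stop to the per-application script.
# Any other argument (e.g. "restart" from ExecReload) matches no case and does nothing.

case "$1" in
    start)
        /pro/bin/sys/services/sleep.sh start
    ;;
    stop)
        /pro/bin/sys/services/sleep.sh stop
    ;;
esac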

and the sleep script:

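#!/bin/bash

case "$1" in
    start)
        echo "starting sleep"
        # su switches to sleepuser; on SLES12 this put the sleep in user.slice.
        su sleepuser -c "sleep 600 &"
    ;;
    stop)
        echo "stopping sleep"
        # Stands in for a slow shutdown; it does not stop the sleep started above.
        sleep 300
    ;;
esac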

Cade Robinson asked Mar 20 '17 20:03


1 Answer

I had the same (or a similar) problem. The user switch was the cause for me: it made all processes start in user.slice instead of system.slice. Apparently nothing "important" is supposed to run in user.slice, and systemd just kills all(?) processes there at shutdown/reboot. I solved it by removing all user switches (su/sudo) from my start scripts and using the User= directive in the unit file (User=xxx).
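
As a rough illustration of that approach applied to the sleep example from the question, the script below writes a unit that lets systemd switch users via User= instead of calling su. The unit name, the scratch directory, and the /bin/sleep path are assumptions made for this sketch; it is not the unit I actually use.

```shell
#!/bin/bash
# Sketch: generate a unit that runs the workload as sleepuser through User=,
# so no su (and no PAM session) is involved.
# UNIT_DIR defaults to a scratch directory so the file can be reviewed
# before being copied to /etc/systemd/system by hand.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
UNIT_FILE="$UNIT_DIR/sleep-as-user.service"   # hypothetical unit name

cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=Example workload run as an unprivileged user (sketch)
After=network.target

[Service]
Type=simple
# systemd changes to this user itself before executing ExecStart.
User=sleepuser
# Path to sleep is an assumption; adjust for your system.
ExecStart=/bin/sleep 600

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $UNIT_FILE"
```

With a unit like this the process should stay in the service's own control group under system.slice, so it is stopped through the unit at shutdown rather than killed along with the user sessions. A real database or Java service would probably also want its own ExecStop= for a clean shutdown.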

Tomas answered Dec 21 '22 03:12