I need a systemd service that runs continuously. The system in question is an embedded Linux built with Yocto. If the service stops for any reason (either failure or normal completion), it should be restarted automatically. If it is restarted more than X times, the system should reboot.
What options are there for achieving this? I can think of the following two, but both seem suboptimal:
1) having a cron job which will literally do the check above and keep the number of retries somewhere in /tmp or another tmpfs
2) having the service itself track the number of times it has been started (again in some tmpfs location) and reboot if necessary. Systemd would just have to keep trying to start the service whenever it is not running.
Edit: as suggested by an answer, I modified the service to use StartLimitAction as given below. It causes the unit to restart correctly, but at no point does it reboot the system, even if I continuously kill the script:
[Unit]
Description=myservice system
[Service]
Type=simple
WorkingDirectory=/home/root
ExecStart=/home/root/start_script.sh
Restart=always
StartLimitAction=reboot
StartLimitIntervalSec=600
StartLimitBurst=5
[Install]
WantedBy=multi-user.target
It depends on which process runs as the service. If the script above runs as a systemd service, the logic can live in the process itself, much like a self-updater mechanism; the process could even restart itself.
Excerpt from https://www.freedesktop.org/software/systemd/man/systemd.service.html: "If set to always, the service will be restarted regardless of whether it exited cleanly or not, got terminated abnormally by a signal, or hit a timeout." So if you set Restart=on-failure instead, the service won't get restarted after a clean exit.
To check a service's status, use the systemctl status service-name command. I like systemd's status output because of the detail it gives: the full path to the unit file, the current status, the start command, and the latest status changes.
From the systemd documentation for unit files: "StartLimitIntervalSec=, StartLimitBurst=: Configure unit start rate limiting. By default, units which are started more than 5 times within 10 seconds are not permitted to start any more times until the 10 second interval ends. With these two options, this rate limiting may be modified."
This in your service file should do something very close to your requirements:
[Service]
Restart=always
[Unit]
StartLimitAction=reboot
StartLimitIntervalSec=60
StartLimitBurst=5
It will restart the service if it stops, except if there are more than 5 restarts in 60 seconds: in that case it will reboot.
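Combined with the unit from the question, the whole file might look roughly like the sketch below. It assumes a systemd version that reads the StartLimit* options from [Unit]; some older versions expect them in [Service] under the name StartLimitInterval=, so check systemd.unit(5) for the version on your image. One plausible reason the unit in the question never reboots is that StartLimitIntervalSec= placed in [Service] is not picked up, which leaves the default 10-second window in effect. RestartSec=1 is an assumed value for illustration, not something from the question.

```ini
[Unit]
Description=myservice system
# Start rate limiting: more than 5 starts within 60 seconds trips the limit.
StartLimitIntervalSec=60
StartLimitBurst=5
# Action taken once the limit is hit: reboot instead of just giving up.
StartLimitAction=reboot

[Service]
Type=simple
WorkingDirectory=/home/root
ExecStart=/home/root/start_script.sh
# Restart after a clean exit, a failure, a signal, or a timeout alike.
Restart=always
# Assumed value for illustration: delay between restart attempts.
RestartSec=1

[Install]
WantedBy=multi-user.target
```

After editing the file, run systemctl daemon-reload and restart the unit. Keep RestartSec multiplied by StartLimitBurst well below StartLimitIntervalSec, otherwise the restarts are spaced too far apart for the limit to ever be reached.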
You may also want to look at the WatchdogSec= option, but this software watchdog functionality requires support from the service itself (very easy to add though, see the documentation for WatchdogSec=).
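A rough sketch of the watchdog settings, assuming the start script can call systemd-notify (or that the program itself calls sd_notify(3)) to send WATCHDOG=1 at regular intervals. The 30-second timeout and NotifyAccess=all are illustrative assumptions, not values from the question.

```ini
[Service]
ExecStart=/home/root/start_script.sh
Restart=always
# Assumed timeout: if no WATCHDOG=1 keep-alive arrives within 30 seconds,
# systemd considers the service failed and terminates it.
WatchdogSec=30
# Assumption: the keep-alive is sent by a systemd-notify child process of the
# script, so messages from any process in the service's control group are accepted.
NotifyAccess=all
```

The script would then run something like systemd-notify WATCHDOG=1 about every 15 seconds, that is, at half the timeout, which is the interval the sd_watchdog_enabled(3) documentation recommends. With Restart=always, a missed keep-alive leads to a restart, which in turn counts toward the start limit.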