My system includes a task which opens a network socket, receives pushed data from the network, processes it, and writes it out to disk or pings other machines depending on the messages. This task is intended to run forever, and the service is designed to have this task always running. But sometimes it crashes.
What's the best practice for keeping a task like this alive? Assume it's okay for the task to be dead for up to 30 seconds before we restart it.
One obvious idea is a watchdog process that checks whether the task is still running; the watchdog could be triggered by cron. But how does the watchdog know whether the process is alive? Should the task write a pidfile, or touch a heartbeat file? An ideal solution wouldn't keep spinning up more processes if the machine gets bogged down to the point where the watchdog runs faster than the heartbeat.
Are there standard Linux tools for this? I can imagine a solution that uses a message queue, but I'm not sure whether that's a good idea.
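For concreteness, here is a rough sketch of the heartbeat check I have in mind. The path /var/run/mytask.heartbeat, the restart command, and the 30-second threshold are just placeholders, not part of any real setup:

```c
/* Watchdog sketch: if the heartbeat file is missing or stale, assume the
 * task has died and restart it. Paths and the threshold are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

int main(void)
{
    const char *heartbeat = "/var/run/mytask.heartbeat"; /* hypothetical path */
    struct stat st;

    /* Stale means: no heartbeat file, or last touch more than 30 seconds ago. */
    if (stat(heartbeat, &st) != 0 || time(NULL) - st.st_mtime > 30) {
        /* Placeholder restart command; cron would run this watchdog periodically. */
        system("/usr/local/bin/mytask &");
    }
    return 0;
}
```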
Depending on the nature of the task you wish to monitor, one method is to write a simple wrapper that starts your task in a child created with fork().
The wrapper can then call waitpid() on the child and restart it whenever it terminates.
This does depend on modifying the source for the task that you wish to run.
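If the task can instead be run as a separate program, the same pattern works by exec'ing it in the child. A minimal sketch, assuming a hypothetical /usr/local/bin/mytask executable and an arbitrary 5-second back-off:

```c
/* Wrapper sketch: fork the task, wait on it with waitpid(), and restart it
 * whenever it exits. The executable path and the delay are assumptions. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    for (;;) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {
            /* Child: replace this process with the monitored task. */
            execl("/usr/local/bin/mytask", "mytask", (char *)NULL);
            perror("execl");          /* only reached if exec fails */
            _exit(EXIT_FAILURE);
        }

        /* Parent: block until the child terminates, then loop and restart it. */
        int status;
        if (waitpid(pid, &status, 0) < 0) {
            perror("waitpid");
            exit(EXIT_FAILURE);
        }
        fprintf(stderr, "task exited (status %d), restarting in 5s\n", status);
        sleep(5);                     /* crude back-off to avoid a tight restart loop */
    }
}
```

The sleep between restarts is a crude guard against the task crashing immediately in a loop; it still satisfies the "dead for up to 30 seconds" allowance in the question.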