
APScheduler shut down randomly

The scheduler was running fine in production, then it suddenly shut down. The database appears to have been unreachable for a moment (the web apps never missed a beat, so the outage was transient).

Log reported...

[2019-11-25 07:59:14,907: INFO/ercscheduler] Scheduler has been shut down
[2019-11-25 07:59:14,908: DEBUG/ercscheduler] Looking for jobs to run
[2019-11-25 07:59:14,909: WARNING/ercscheduler] Error getting due jobs from job store 'default': (psycopg2.OperationalError) could not connect to server: Network is unreachable
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 6432?

(Background on this error at: http://sqlalche.me/e/e3q8)
[2019-11-25 07:59:14,909: DEBUG/ercscheduler] Next wakeup is due at 2019-11-25 13:59:24.908318+00:00 (in 10.000000 seconds)
[2019-11-25 07:59:14,909: INFO/ercscheduler] listener closed
[2019-11-25 07:59:14,909: INFO/ercscheduler] server has terminated
[2019-11-25 08:00:10,747: INFO/ercscheduler] Adding job tentatively -- it will be properly scheduled when the scheduler starts
[2019-11-25 08:00:10,797: INFO/ercscheduler] Adding job tentatively -- it will be properly scheduled when the scheduler starts
[2019-11-26 15:27:48,392: INFO/ercscheduler] Adding job tentatively -- it will be properly scheduled when the scheduler starts
[2019-11-26 15:27:48,392: INFO/ercscheduler] Adding job tentatively -- it will be properly scheduled when the scheduler starts

How do I make the scheduler more fault tolerant? I have to restart the daemon again to get it going.

LiteWait asked Nov 26 '19 21:11


1 Answer

I found something very similar to your issue on the APScheduler GitHub repo: https://github.com/agronholm/apscheduler/issues/109

The fix for that issue seems to have been merged in version 3.3.

All you have to do is upgrade to at least version 3.3. If you would like to change the default 10-second retry interval, set jobstore_retry_interval when you create the scheduler instance.

If you cannot upgrade, then I would try monkey patching the corresponding function in APScheduler.
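As a minimal sketch of that setting, assuming APScheduler 3.3 or later in the 3.x series: the helper names `scheduler_options` and `make_scheduler` and the 30-second value are made up for illustration; only `jobstore_retry_interval` and `BackgroundScheduler` come from APScheduler.

```python
def scheduler_options(retry_seconds=30.0):
    """Build keyword options for the scheduler (hypothetical helper)."""
    if retry_seconds <= 0:
        raise ValueError("retry_seconds must be positive")
    # jobstore_retry_interval: seconds to wait before asking a failed
    # job store for due jobs again (the library default is 10).
    return {"jobstore_retry_interval": float(retry_seconds)}


def make_scheduler(retry_seconds=30.0):
    """Create a BackgroundScheduler with a custom retry interval."""
    # Imported lazily so scheduler_options() works without APScheduler installed.
    from apscheduler.schedulers.background import BackgroundScheduler
    return BackgroundScheduler(**scheduler_options(retry_seconds))
```

Calling `make_scheduler(30).start()` would then start a scheduler that retries a failing job store every 30 seconds instead of every 10.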

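from apscheduler.schedulers.background import BackgroundScheduler

def monkey_patched_process_jobs(self):
    # Placeholder: change how job processing is done here, for example
    # by catching job store errors and retrying later.
    pass

# Replace the original method with the patched one
BackgroundScheduler._process_jobs = monkey_patched_process_jobs

scheduler = BackgroundScheduler()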

Keep in mind that this is not ideal; I would only resort to monkey patching if I were unable to upgrade due to breaking changes.


How this functionality works under the hood

This is a snippet from the APScheduler Git repo

try:
    due_jobs = jobstore.get_due_jobs(now)
except Exception as e:
    # Schedule a wakeup at least in jobstore_retry_interval seconds
    self._logger.warning('Error getting due jobs from job store %r: %s',
                         jobstore_alias, e)
    retry_wakeup_time = now + timedelta(seconds=self.jobstore_retry_interval)
    if not next_wakeup_time or next_wakeup_time > retry_wakeup_time:
        next_wakeup_time = retry_wakeup_time

    continue

self.jobstore_retry_interval is set in the following manner:

self.jobstore_retry_interval = float(config.pop('jobstore_retry_interval', 10))
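To make the effect of that `except` branch concrete, here is a stand-alone sketch of the wakeup calculation. It is not APScheduler code: the function name `wakeup_after_jobstore_error` is hypothetical, and it only mirrors the comparison shown in the snippet above.

```python
from datetime import datetime, timedelta, timezone


def wakeup_after_jobstore_error(now, next_wakeup_time, retry_interval=10.0):
    """Return the wakeup time to use after a job store lookup failed."""
    retry_wakeup_time = now + timedelta(seconds=retry_interval)
    # Retry at the latest after retry_interval seconds; an earlier wakeup
    # that was already planned for another job store is kept.
    if not next_wakeup_time or next_wakeup_time > retry_wakeup_time:
        return retry_wakeup_time
    return next_wakeup_time


now = datetime(2019, 11, 25, 13, 59, 14, tzinfo=timezone.utc)
no_other_wakeup = wakeup_after_jobstore_error(now, None)
later_wakeup = wakeup_after_jobstore_error(now, now + timedelta(hours=1))
earlier_wakeup = wakeup_after_jobstore_error(now, now + timedelta(seconds=5))
```

With no other wakeup planned, the result is 10 seconds after `now`, which corresponds to the "Next wakeup is due ... (in 10.000000 seconds)" line in the question's log.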
Kristof Gilicze answered Nov 14 '22 17:11