
Behavior of Python's time.sleep(0) under linux - Does it cause a context switch?

I'd never thought about this, so I wrote this script:

import time

while True:
    print("loop")  # parenthesized form runs on both Python 2 and Python 3
    time.sleep(0.5)

It is just a test. Running it with strace -o isacontextswitch.strace -s512 python test.py gives this output for the loop:

write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)  

select() is a system call, so yes, each call takes you into the kernel. Strictly speaking, entering kernel space does not by itself require a context switch, but a select() with a non-zero timeout puts your process to sleep, so if other processes are runnable they get the CPU until the timeout expires. Interestingly, the delay is implemented with select() rather than sleep(). The call passes no file descriptors at all (the first argument, nfds, is 0), so only the timeout can end it normally, while a signal such as the one sent by Ctrl+C interrupts it straight away. That lets Python respond without waiting for the delay to run out, which I think is quite neat.
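To try the same trick from Python itself, here is a minimal sketch that sleeps by calling select() with nothing to watch. select_sleep is a hypothetical helper name, and the example assumes a Unix-like system, since Windows rejects three empty sets.

```python
import select
import time


def select_sleep(seconds):
    """Sleep by calling select() with no file descriptors and a timeout."""
    start = time.monotonic()
    # Empty lists mean nothing is watched, so only the timeout ends the call.
    ready = select.select([], [], [], seconds)
    return ready, time.monotonic() - start


ready, elapsed = select_sleep(0.05)
print(ready, round(elapsed, 2))
```

Running this under strace should show a select-family call with the timeout filled in, although the exact syscall name can vary with the Python and libc versions.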

I should note that the same applies to time.sleep(0), except that the timeout passed in is {0, 0}. Also, spin locking is not really ideal for anything but very short delays; the multiprocessing and threading modules let you wait on event objects instead.
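As a rough illustration of waiting on an event object instead of spinning, here is a small sketch using threading.Event. The worker function and the results list are made up for the example.

```python
import threading

ready = threading.Event()
results = []


def worker():
    # Block until the event is set; no polling loop with sleep(0) is needed.
    ready.wait()
    results.append("worker saw the event")


t = threading.Thread(target=worker)
t.start()
ready.set()  # wake the waiting thread
t.join()
print(results)
```

The waiting thread uses no CPU while it is blocked, which is the point of preferring this over a spin loop.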

Edit: So I had a look to see exactly what Linux does. The implementation in do_select (fs/select.c) makes this check:

if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
    wait = NULL;
    timed_out = 1;
}

if (end_time && !timed_out)
    slack = select_estimate_accuracy(end_time);

In other words, if an end time is provided and both of its fields are zero (!0 is 1, which is true in C), wait is set to NULL and the select is treated as already timed out. That does not mean the function returns to you immediately, though: it still loops over all the file descriptors you passed and calls cond_resched, which potentially allows another process to run. What happens is therefore entirely up to the scheduler. If your process has been hogging CPU time compared to other processes, chances are a context switch will take place; if not, the task you are in (the kernel's do_select function) might just carry on until it completes.
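If you would rather measure than reason about it, the following sketch counts the context switches your process incurs while calling time.sleep(0) in a loop. It uses resource.getrusage, which is Unix only; context_switches and measure are hypothetical helper names, and the numbers depend on the Python version, the kernel, and how busy the machine is.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(delay, iterations=1000):
    """Call time.sleep(delay) repeatedly and return the change in both counters."""
    before = context_switches()
    for _ in range(iterations):
        time.sleep(delay)
    after = context_switches()
    return after[0] - before[0], after[1] - before[1]


voluntary, involuntary = measure(0)
print("sleep(0): voluntary=%d involuntary=%d" % (voluntary, involuntary))
```

Comparing the result with measure(0.001, 100) on the same machine gives a feel for how differently a real sleep and a zero-length sleep are treated.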

I would reiterate, however, that the best way to be nicer to other processes generally involves mechanisms other than a spin lock.


I think you already have the answer from @Ninefingers, but in this answer we will try to dive into the Python source code.

First, the Python time module is implemented in C; to see the implementation of time.sleep, take a look at Modules/timemodule.c. As you can see (without getting into all the platform-specific details), this function delegates the call to the floatsleep function.

floatsleep is designed to work on different platforms, with its behavior kept as similar as possible across them. Since we are only interested in Unix-like platforms, let's look at that part only:

...
Py_BEGIN_ALLOW_THREADS
sleep((int)secs);
Py_END_ALLOW_THREADS

As you can see, floatsleep calls the C sleep function, and from the sleep man page:

The sleep() function shall cause the calling thread to be suspended from execution until either the number of realtime seconds specified by the argument seconds has elapsed or ...

But wait a minute, didn't we forget about the GIL?

Well, this is where the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros come into play (check Include/ceval.h if you are interested in the definition of these two macros). Using them, the C code above translates to:

Save the thread state in a local variable.
Release the global interpreter lock.
... Do some blocking I/O operation ... (call sleep in our case)
Reacquire the global interpreter lock.
Restore the thread state from the local variable.
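The effect of releasing the GIL around the sleep can be observed from Python. In this small sketch, four threads each sleep for 0.2 seconds; if the lock were held during the sleep they would run one after another and take about 0.8 seconds, whereas in practice the total should stay close to 0.2 seconds. The nap function and the timings are made up for the example.

```python
import threading
import time


def nap(seconds):
    # time.sleep releases the GIL while blocked, so other threads can proceed.
    time.sleep(seconds)


start = time.monotonic()
threads = [threading.Thread(target=nap, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
print("4 threads sleeping 0.2s each took %.2fs" % elapsed)
```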

More information about these two macros can be found in the C API documentation.

Hope this was helpful.


You are basically attempting to usurp the job of the OS CPU scheduler. It would likely be much better to simply call os.nice(100) to inform the scheduler that you're very low priority so it can do its job properly.
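A minimal sketch of that suggestion follows. os.nice adds its argument to the current niceness and returns the new value, so an increment of 0 simply reports where you are. On Linux the niceness is capped at 19, so a large increment such as 100 is expected to be clamped rather than rejected; treat that as an assumption on other Unix systems.

```python
import os

before = os.nice(0)    # increment of 0 just reports the current niceness
after = os.nice(100)   # ask for the lowest priority; the kernel clamps the value
print("niceness went from %d to %d" % (before, after))
```

Note that an unprivileged process can raise its niceness but normally cannot lower it again, so this is a one-way change for the life of the process.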