
How can I asyncio schedule a filesystem stat operation?

While converting some code to asyncio, I'd like to give control back to the asyncio.BaseEventLoop as quickly as possible, which means avoiding blocking waits.

Without asyncio I'd use os.stat() or pathlib.Path.stat() to obtain, for example, the file size. Is there a way to do this efficiently with asyncio?

Can I just wrap the stat() call so it is a future similar to what's described here?

cfi asked Jun 24 '16 07:06



1 Answer

os.stat() translates to a stat syscall:

$ strace python3 -c 'import os; os.stat("/")'
[...]
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[...]

which is blocking, and there's no way to get a non-blocking stat syscall.

asyncio provides non-blocking I/O by using non-blocking system calls, which already exist (see man fcntl, with its O_NONBLOCK flag, or ioctl). So asyncio does not make syscalls asynchronous; it exposes already asynchronous syscalls in a nice way.

It's still possible to use the nice ThreadPoolExecutor abstraction to make your blocking stat calls in parallel using a pool of threads.
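A minimal sketch of that idea, assuming Python 3.7+ (for asyncio.get_running_loop() and asyncio.run()). The helper names stat_async and stat_many are made up for illustration and are not part of asyncio:

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


async def stat_async(path, executor=None):
    """Run the blocking os.stat() in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # executor=None means "use the event loop's default executor".
    return await loop.run_in_executor(executor, os.stat, path)


async def stat_many(paths, max_workers=4):
    """Stat several paths concurrently using a dedicated pool of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return await asyncio.gather(*(stat_async(p, pool) for p in paths))


if __name__ == "__main__":
    # Hypothetical usage: stat the current directory and the filesystem root.
    for result in asyncio.run(stat_many([os.getcwd(), os.sep])):
        print(result.st_size)
```

The event loop stays free while the worker threads sit in the stat syscall; the syscall itself is still blocking, it just blocks a different thread.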

But you may first want to consider a few other factors:

  • According to strace -T, stat is fast: stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000007>, probably faster than starting and synchronizing threads.
  • stat is probably I/O bound in many cases, so using more CPUs won't help.
  • Doing parallel I/O may turn a nice sequential access pattern into random access, which can be slower on a physical hard drive.
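One way to check the first point on your own machine is to time both approaches. This is only a rough sketch: time_direct and time_via_pool are hypothetical helper names, and the numbers depend heavily on the file system, the OS cache and the Python version.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def time_direct(path, repeat=1000):
    """Average seconds per direct os.stat() call."""
    start = time.perf_counter()
    for _ in range(repeat):
        os.stat(path)
    return (time.perf_counter() - start) / repeat


def time_via_pool(path, repeat=1000):
    """Average seconds per os.stat() call submitted to a one-thread pool."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        for _ in range(repeat):
            # submit() hands the call to the worker; result() waits for it.
            pool.submit(os.stat, path).result()
        return (time.perf_counter() - start) / repeat


if __name__ == "__main__":
    print("direct:   ", time_direct(os.getcwd()))
    print("via pool: ", time_via_pool(os.getcwd()))
```

The pool variant measures the hand-off and synchronization cost on top of the syscall, which is the overhead the first bullet refers to.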

But there are also plenty of situations where a thread pool makes your stat calls faster, for example when you're hitting a distributed file system.

You may also take a look at functools.lru_cache: if you're doing multiple stat calls on the same file or directory, and you're sure it has not changed, caching the result avoids a syscall.
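A small sketch of the caching idea. The wrapper name cached_stat and the maxsize value are arbitrary choices for illustration, and the cache is only safe while the files really don't change:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_stat(path):
    """Return os.stat(path), reusing an earlier result for the same path."""
    return os.stat(path)


if __name__ == "__main__":
    first = cached_stat(os.getcwd())   # performs the stat syscall
    second = cached_stat(os.getcwd())  # served from the cache, no syscall
    print(first is second)
    print(cached_stat.cache_info())
    # Drop all cached results once files may have changed.
    cached_stat.cache_clear()
```

Note that lru_cache keys on the argument exactly as passed, so "./a" and "a" are cached separately even though they name the same file.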

To conclude, "keep it simple": os.stat is the efficient way to get a file size.

Julien Palard answered Sep 26 '22 12:09
