While converting some code to asyncio, I'd like to give control back to the asyncio.BaseEventLoop as quickly as possible. This means avoiding blocking waits.
Without asyncio I'd use os.stat() or pathlib.Path.stat() to obtain e.g. the file size. Is there a way to do this efficiently with asyncio?
Can I just wrap the stat() call so it is a future, similar to what's described here?
os.stat() translates to a stat syscall:
$ strace python3 -c 'import os; os.stat("/")'
[...]
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[...]
which is blocking, and there's no way to get a non-blocking stat syscall.
asyncio provides non-blocking I/O by using non-blocking system calls, which already exist (see man fcntl, with its O_NONBLOCK flag, or ioctl). So asyncio is not making syscalls asynchronous; it exposes already asynchronous syscalls in a nice way.
It's still possible to use the nice ThreadPoolExecutor abstraction to make your blocking stat
calls in parallel using a pool of threads.
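As a minimal sketch of that approach, the blocking call can be handed to a thread pool through loop.run_in_executor(), which returns an awaitable. The helper names async_stat and file_sizes, and the pool size of 4, are illustrative assumptions rather than anything from asyncio itself:

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


async def async_stat(path, executor=None):
    """Run the blocking os.stat() in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # Passing executor=None would use the loop's default thread pool.
    return await loop.run_in_executor(executor, os.stat, path)


async def file_sizes(paths):
    """Stat several paths concurrently and map each path to its size."""
    # Hypothetical pool size; tune it for the file system being hit.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = await asyncio.gather(*(async_stat(p, pool) for p in paths))
    return {p: r.st_size for p, r in zip(paths, results)}


if __name__ == "__main__":
    print(asyncio.run(file_sizes([os.curdir, os.pardir])))
```

The event loop stays free while the worker threads sit in the syscall. On Python 3.9 and later, asyncio.to_thread(os.stat, path) is a shorter way to do the same thing with the default pool.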
But you may first want to consider a few other points:
- Measured with strace -T, stat is fast: stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000007>, probably faster than starting and synchronizing threads.
- stat is probably I/O bound in many cases, so using more CPUs won't help.
- But there are also plenty of situations where your stat calls will be faster with a thread pool, for example if you're hitting a distributed file system.
You may also take a look at functools.lru_cache: if you're doing multiple stat calls on the same file or directory, and you're sure it has not changed, caching the result avoids a syscall.
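A rough sketch of the caching idea, assuming the files really do not change between calls. The wrapper name cached_stat and the maxsize value are hypothetical choices:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_stat(path):
    """Return os.stat(path), reusing an earlier result for the same path."""
    return os.stat(path)


size = cached_stat(os.curdir).st_size        # first call performs the syscall
size_again = cached_stat(os.curdir).st_size  # second call is served from the cache
print(cached_stat.cache_info())

# If a file may have changed, drop the stale entries before asking again.
cached_stat.cache_clear()
```

The cache never notices changes on disk by itself, so this only pays off when stale results are acceptable or when cache_clear() is called at known points.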
To conclude: keep it simple. A plain os.stat() call is the efficient way to get a file size.