
Is there a way to check what part of my code leaves file handles open?

Is there a way to trace a Python process to find out where a file is being opened? When I run lsof on my running process I see too many open files, but I'm not sure where they are being opened.

ls /proc/$pid/fd/ | wc -l

I suspect one of the libraries I'm using is not closing its files properly. Is there a way to isolate exactly which line in my Python code opens the files?

My code uses third-party libraries to process thousands of media files, and because the files are left open I get the error

OSError: [Errno 24] Too many open files

after running for a few minutes. I know that raising the open-file limit is an option, but that would only postpone the error.

Anand C U asked Dec 30 '22 17:12


1 Answer

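The easiest way to trace the open calls is to use an audit hook (available since Python 3.8). Note that this method only traces opens made through Python, which raise the "open" audit event; files opened directly via system calls, for example by C extension code, are not reported.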
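To illustrate which calls the hook sees, here is a minimal sketch. The names `seen` and `collect_open_events` are made up for this illustration. It relies on the documented behaviour that both the built-in open() and os.open() raise the "open" audit event with the arguments (path, mode, flags).

```python
import os
import sys

# Hypothetical list collecting the (path, mode, flags) tuple of every "open" event.
seen = []

def collect_open_events(event, args):
    # Audit hooks receive the event name and a tuple of event arguments.
    if event == "open":
        seen.append(args)

# Note: an audit hook cannot be removed once it has been added.
sys.addaudithook(collect_open_events)

# The built-in open() raises the "open" event ...
with open(os.devnull, "w"):
    pass

# ... and so does the lower-level os.open(), which returns a raw descriptor.
fd = os.open(os.devnull, os.O_RDONLY)
os.close(fd)

for path, mode, flags in seen:
    print(path, mode, flags)
```

A file opened by native code that bypasses these Python functions would not show up in `seen`.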

Let fdmod.py be a module file with a single function foo:

def foo():
    return open("/dev/zero", mode="r")

The main code in fd_trace.py, which imports fdmod and traces all open calls, is defined as follows:

import sys
import inspect
import fdmod

def open_audit_hook(name, *args):
    if name == "open":
        print(name, *args, "was called:")
        caller = inspect.currentframe()
        while caller := caller.f_back:
            print(f"\tFunction {caller.f_code.co_name} "
                  f"in {caller.f_code.co_filename}:"
                  f"{caller.f_lineno}"
            )
sys.addaudithook(open_audit_hook)

# main code
fdmod.foo()
with open("/dev/null", "w") as dev_null:
    dev_null.write("hi")
fdmod.foo()

Running fd_trace.py prints the call stack each time some component calls open:

% python3 fd_trace.py
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:17
open ('/dev/null', 'w', 524865) was called:
        Function <module> in fd_trace.py:18
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:20

See sys.addaudithook and inspect.currentframe for details.
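Since the goal is to find files that are left open, the trace can be combined with a look at /proc/self/fd. The following is only a sketch under a few assumptions: it is Linux-only, the names `_open_sites`, `record_open_hook` and `report_still_open` are hypothetical, and matching by path is approximate (if the same path is opened in several places, only the most recent call site is kept). The hook walks the frames by hand, as above, rather than formatting a traceback, so that it does not open source files from inside the hook.

```python
import os
import sys

# Hypothetical registry: resolved path -> call stack of the most recent open.
_open_sites = {}

def record_open_hook(event, args):
    if event != "open":
        return
    path = args[0]
    # open() may also be given an integer descriptor; skip those.
    if not isinstance(path, (str, bytes, os.PathLike)):
        return
    lines = []
    frame = sys._getframe(1)  # first frame outside this hook
    while frame is not None:
        code = frame.f_code
        lines.append(f"\t{code.co_name} in {code.co_filename}:{frame.f_lineno}")
        frame = frame.f_back
    _open_sites[os.path.realpath(os.fsdecode(path))] = "\n".join(lines)

def report_still_open():
    """Return {path: stack} for recorded paths that still have an open descriptor."""
    still_open = {}
    for fd in os.listdir("/proc/self/fd"):
        try:
            target = os.readlink(f"/proc/self/fd/{fd}")
        except OSError:
            continue  # descriptor was closed while we were listing
        if target in _open_sites:
            still_open[target] = _open_sites[target]
    return still_open

sys.addaudithook(record_open_hook)
```

Calling report_still_open() periodically, or just before the descriptor limit is reached, and printing each path with its stack should point at the code that opened the files that were never closed.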

tkrennwa answered Jan 02 '23 07:01