I need to dynamically load several potentially unsafe modules for testing purposes.
Regarding security, my script is executed by a low-access user.
However, I still need a way to elegantly make the import process time out, as I have no guarantee that the module script will terminate. For example, it could contain a call to input or an infinite loop.
I am currently using Thread.join with a timeout, but this does not fully solve the issue, since the script is then still alive in the background and there is no way to kill a thread.
from threading import Thread
import importlib.util

class ReturnThread(Thread):
    """Thread whose join() returns the target's return value."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._return = None

    def run(self):
        if self._target is not None:
            self._return = self._target(*self._args, **self._kwargs)

    def join(self, *args, **kwargs):
        super().join(*args, **kwargs)
        return self._return

def loader(name, path):
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # This may run into an infinite loop
    return module

# Thread's first positional parameter is `group`, so pass target and args by keyword
module_loader = ReturnThread(target=loader, args=('module_name', 'module/path'))
module_loader.start()
module = module_loader.join(timeout=0.1)
# The thread might still be alive here
if module is None:
    ...
else:
    ...
How can I import a module, but return None if the script times out?
You can't reliably kill importing a module. You are essentially executing live code in your own interpreter, so all bets are off.
First of all, there is no way to safely import unsafe modules from an untrusted source. It doesn't matter if you are using a low-access user. NEVER IMPORT UNTRUSTED CODE. The moment the code is imported it could have exploited security holes in your system well beyond the Python process itself. Python is a general purpose programming language, not a sandboxed environment, and any code you import has the full run of your system.
Instead of using a low-access user, at the very least run this in a virtual machine. The virtual machine environment can be set up from a known-good snapshot, without network access, and be shut down when a time limit has been reached. You can then compare the snapshots to see what, if anything, the code has attempted to do. Any security breach at that level is short-lived and without value. Also see Best practices for execution of untrusted code over on Software Engineering Stack Exchange.
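As a much weaker, process-level illustration of "shut it down when the time limit is reached", the sketch below runs the import in a child interpreter and lets subprocess.run kill it on timeout. This is not the virtual machine setup described above and it is not a sandbox: the child still has whatever access the user has. The helper name import_finishes_in_time is hypothetical, and it only reports whether the import completed, it does not hand back a module object.

```python
import subprocess
import sys


def import_finishes_in_time(modulename, timeout=5.0):
    """Return True if importing `modulename` in a child process succeeds in time.

    Hypothetical helper for illustration only; a child process is NOT a
    security boundary.
    """
    # With `-c`, sys.argv[1] in the child is the first extra argument.
    code = "import importlib, sys; importlib.import_module(sys.argv[1])"
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code, modulename],
            stdin=subprocess.DEVNULL,    # a stray input() call hits EOF at once
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,             # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0
```

Because the operating system ends the child process, it does not matter whether the imported code catches exceptions or blocks on I/O.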
Next, because you can't control what the imported code does, it can trivially interfere with any attempts to time out the code. The first thing the imported code could do is revoke the protections you put in place! Imported code can access all of Python's global state, including the code that triggered the import. The code could set the thread switch interval to the maximum value (internally, an unsigned long counting microseconds, so with a 32-bit unsigned long the maximum is (2 ** 32) - 1 microseconds, just a smidgen under 71 minutes 35 seconds) to mess with scheduling.
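To make that number concrete, here is a small sketch that converts (2 ** 32) - 1 microseconds into minutes and seconds, and shows that the switch interval is process-wide state that any code can read and change. The function name demo_switch_interval_is_global and the 0.5 second value are illustrative assumptions; the sketch restores the original interval afterwards.

```python
import sys

# Largest value a 32-bit unsigned long can hold, read as microseconds.
MAX_INTERVAL_US = 2 ** 32 - 1
minutes, seconds = divmod(MAX_INTERVAL_US / 1_000_000, 60)
print(f"{int(minutes)} min {seconds:.2f} s")  # 71 min 34.97 s


def demo_switch_interval_is_global(new_interval=0.5):
    """Change the interpreter-wide switch interval, then restore it.

    Returns the value the interpreter reported while the change was active.
    """
    original = sys.getswitchinterval()
    try:
        sys.setswitchinterval(new_interval)  # imported code could do the same
        return sys.getswitchinterval()
    finally:
        sys.setswitchinterval(original)
```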
Exiting a thread in Python is handled by raising an exception:
Raise the SystemExit exception. When not caught, this will cause the thread to exit silently.
(Bold emphasis mine.)
From pure Python code, you can only exit a thread from code running in that thread, but there is a way around this, see below.
But you can't guarantee that the code you are importing isn't just catching and handling all exceptions; if that's the case, the code will just keep on running. At that point it becomes an arms race: can your thread manage to insert the exception at the point the other thread is inside an exception handler? If so, you can exit that thread; otherwise, you lose. You'd have to keep trying until you succeed.
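A minimal sketch of why this matters, using a hypothetical worker function named stubborn: the thread raises SystemExit itself via sys.exit(), which is the same exception an outside "kill" would inject, and a broad handler simply swallows it each time.

```python
import sys
import threading


def stubborn(log):
    """Hypothetical worker that refuses to exit by swallowing SystemExit."""
    for attempt in range(3):
        try:
            sys.exit()            # raises SystemExit inside this thread
        except BaseException:     # a bare `except:` behaves the same way
            log.append(attempt)   # swallowed; the loop just carries on
    log.append("finished on its own terms")


log = []
worker = threading.Thread(target=stubborn, args=(log,))
worker.start()
worker.join()
```

The thread only ends because the loop runs out; none of the three exit requests had any effect.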
If the code you import waits on blocking I/O (such as an input() call) then you can't interrupt that call. Raising an exception does nothing, and you can't use signals (as Python handles those on the main thread only). You'd have to find and close every open I/O channel they could be blocked on. This is outside the scope of my answer here; there are just too many ways to start I/O operations.
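The sketch below illustrates the effect with time.sleep standing in for a blocking call, which is an assumption made so the example terminates on its own. An exception requested from another thread (via the PyThreadState_SetAsyncExc C API through ctypes) is only acted on once the blocked call returns, so the request does not shorten the wait. The function name seconds_until_exit_after_request is hypothetical.

```python
import ctypes
import threading
import time


def seconds_until_exit_after_request(block_for=0.5):
    """Ask a blocked thread to exit; return how long it kept running anyway."""
    # time.sleep stands in for input(), a socket read, or similar.
    worker = threading.Thread(target=time.sleep, args=(block_for,), daemon=True)
    worker.start()
    time.sleep(0.05)  # give the worker time to enter the blocking call

    requested = time.monotonic()
    ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(worker.ident), ctypes.py_object(SystemExit)
    )
    worker.join(timeout=5)
    return time.monotonic() - requested
```

With the default arguments the worker should stay alive for most of the half second, even though the exit was requested almost immediately.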
If the code started something implemented in native code (a Python extension) and that blocks, all bets are off entirely.
The code you import could have done anything by the time you managed to stop it. Imported modules could have been replaced. Source code on disk could have been altered. You can't be certain that no other threads have been started. Anything is possible in Python, so assume that it has happened.
With those caveats in mind, and provided you accept the risks described above, you can time out imports by running them in a separate thread and then raising a SystemExit exception in that thread. You can raise exceptions in another thread by calling the PyThreadState_SetAsyncExc C-API function via the ctypes.pythonapi object. The Python test suite actually uses this path in a test; I used that as a template for my solution below.
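Before the full implementation, here is a stripped-down sketch of just that mechanism, with a cooperative worker that does not catch the exception. The names async_raise, spin and stop_spinning_thread are hypothetical; the only real API used is PyThreadState_SetAsyncExc, which returns the number of thread states it modified.

```python
import ctypes
import threading
import time


def async_raise(thread, exc_type):
    """Request that `exc_type` be raised in `thread`; True if a thread was hit."""
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type)
    )
    return modified == 1


def spin():
    """Loop forever in short sleeps, without any exception handling."""
    while True:
        time.sleep(0.01)


def stop_spinning_thread():
    """Start a looping thread, inject SystemExit, report whether it stopped."""
    worker = threading.Thread(target=spin, daemon=True)
    worker.start()
    delivered = async_raise(worker, SystemExit)
    worker.join(timeout=2)
    return delivered and not worker.is_alive()
```

The exception is raised the next time the target thread executes Python bytecode, which here is right after the current short sleep returns.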
So here is a full implementation that times out an import this way, and raises a custom UninterruptableImport exception (a subclass of ImportError) if the import could not be interrupted. If the import raised an exception, then that exception is re-raised in the thread that started the import process:
"""Import a module within a timeframe

Uses the PyThreadState_SetAsyncExc C API to interrupt the stack of calls
triggered from an import within a timeframe

No guarantees are made as to the state of the interpreter after interrupting

"""
import ctypes
import importlib
import random
import sys
import threading
import time

_set_async_exc = ctypes.pythonapi.PyThreadState_SetAsyncExc
_set_async_exc.argtypes = (ctypes.c_ulong, ctypes.py_object)
_system_exit = ctypes.py_object(SystemExit)


class UninterruptableImport(ImportError):
    pass


class TimeLimitedImporter():
    def __init__(self, modulename, timeout=5):
        self.modulename = modulename
        self.module = None
        self.exception = None
        self.timeout = timeout

        self._started = None
        self._started_event = threading.Event()
        self._importer = threading.Thread(target=self._import, daemon=True)
        self._importer.start()
        # don't return until the import thread has recorded its start time
        self._started_event.wait()

    def _import(self):
        self._started = time.time()
        self._started_event.set()
        timer = threading.Timer(self.timeout, self.exit)
        # daemon, so a kill loop that never succeeds can't block interpreter exit
        timer.daemon = True
        timer.start()
        try:
            self.module = importlib.import_module(self.modulename)
        except Exception as e:
            self.exception = e
        finally:
            timer.cancel()

    def result(self, timeout=None):
        # give the importer a chance to finish first
        if timeout is not None:
            # add whatever is left of the import timeout itself
            timeout += max(self._started + self.timeout - time.time(), 0)
        self._importer.join(timeout)
        if self._importer.is_alive():
            raise UninterruptableImport(
                f"Could not interrupt the import of {self.modulename}")
        if self.module is not None:
            return self.module
        if self.exception is not None:
            raise self.exception

    def exit(self):
        target_id = self._importer.ident
        if target_id is None:
            return
        # set a very low switch interval to be able to interrupt an exception
        # handler if SystemExit is being caught
        old_interval = sys.getswitchinterval()
        sys.setswitchinterval(1e-6)

        try:
            # repeatedly raise SystemExit until the import thread has exited.
            # If the exception is being caught by an exception handler,
            # our only hope is to raise it again *while inside the handler*
            while True:
                _set_async_exc(target_id, _system_exit)

                # short randomised wait times to 'surprise' an exception
                # handler
                self._importer.join(
                    timeout=random.uniform(1e-5, 1e-4)
                )
                if not self._importer.is_alive():
                    return
        finally:
            sys.setswitchinterval(old_interval)


def import_with_timeout(modulename, import_timeout=5, exit_timeout=1):
    importer = TimeLimitedImporter(modulename, import_timeout)
    return importer.result(exit_timeout)
If the code can't be killed, it'll be running in a daemon thread, meaning you can at least exit Python gracefully.
Use it like this:
module = import_with_timeout(modulename)
for a default 5 second timeout, and a 1 second wait to see if the import really is unkillable.
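If you want None instead of an exception when the import cannot complete, one option is a thin wrapper like the sketch below. The name import_or_none is hypothetical, and import_func is a stand-in parameter for import_with_timeout (or any function with the same call shape) so that the example is self-contained. Since UninterruptableImport is a subclass of ImportError, catching ImportError covers it; other exceptions raised by the module itself still propagate.

```python
import importlib


def import_or_none(import_func, modulename):
    """Call `import_func(modulename)`, returning None on any ImportError.

    `import_func` is an assumed stand-in for import_with_timeout.
    """
    try:
        return import_func(modulename)
    except ImportError:
        # covers a missing module as well as ImportError subclasses
        return None


# Demonstration with the standard importer in place of import_with_timeout.
found = import_or_none(importlib.import_module, "json")
missing = import_or_none(importlib.import_module, "no_such_module_xyz_12345")
```

With the implementation above you would call it as import_or_none(import_with_timeout, modulename).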