
Python eval: is it still dangerous if I disable builtins and attribute access?

We all know that eval is dangerous, even if you hide dangerous functions, because you can use Python's introspection features to dig down into things and re-extract them. For example, even if you delete __builtins__, you can retrieve them with

[c for c in ().__class__.__base__.__subclasses__()
 if c.__name__ == 'catch_warnings'][0]()._module.__builtins__

However, every example I've seen of this uses attribute access. What if I disable all builtins, and disable attribute access (by tokenizing the input with a Python tokenizer and rejecting it if it has an attribute access token)?

And before you ask, no, for my use-case, I do not need either of these, so it isn't too crippling.

What I'm trying to do is make SymPy's sympify function safer. Currently it tokenizes the input, does some transformations on it, and evals it in a namespace. But it's unsafe because it allows attribute access (even though it really doesn't need it).
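For illustration, the tokenizer check described above could look roughly like this. It is only a sketch built on the standard tokenize module, not SymPy's actual code, and has_attribute_access is a made-up name:

```python
import io
import tokenize


def has_attribute_access(source):
    """Return True if the token stream of `source` contains a '.' operator.

    Hypothetical helper for illustration only. A '.' inside a number
    literal such as 1.5 is part of a NUMBER token, so it is not flagged.
    """
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return any(tok.type == tokenize.OP and tok.string == '.' for tok in tokens)


print(has_attribute_access("().__class__"))
print(has_attribute_access("1.5 * x"))
```

Malformed input (for example an unclosed parenthesis) makes the tokenizer raise, so a real filter would also have to decide how to treat that case.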

asmeurer asked Mar 04 '16 19:03


2 Answers

I'm going to mention one of the new features of Python 3.6 - f-strings.

They can evaluate expressions,

>>> eval('f"{().__class__.__base__}"', {'__builtins__': None}, {})
"<class 'object'>"

but the attribute access won't be detected by Python's tokenizer:

0,0-0,0:            ENCODING       'utf-8'
1,0-1,1:            ERRORTOKEN     "'"
1,1-1,27:           STRING         'f"{().__class__.__base__}"'
2,0-2,0:            ENDMARKER      ''
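As a contrast, and only as a sketch of a different technique from the tokenizer check in the question, walking the parsed AST does see inside the f-string, because the parser turns the replacement fields into ordinary expression nodes. The name contains_attribute_access is hypothetical, and nothing here implies that such a check makes eval safe:

```python
import ast


def contains_attribute_access(source):
    """Return True if the parsed expression contains any ast.Attribute node.

    Hypothetical helper for illustration only. Expressions embedded in
    f-strings appear in the tree as regular nodes, so they are visited too.
    """
    tree = ast.parse(source, mode="eval")
    return any(isinstance(node, ast.Attribute) for node in ast.walk(tree))


print(contains_attribute_access('f"{().__class__.__base__}"'))
print(contains_attribute_access("x + 1"))
```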
vaultah answered Sep 23 '22 06:09


It is possible to construct a return value from eval that throws an exception outside eval as soon as you try to print, log, or repr it:

eval('''((lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args))))
        (lambda f: lambda n: (1,(1,(1,(1,f(n-1))))) if n else 1)(300))''')

This creates a nested tuple of the form (1,(1,(1,(1...; that value cannot be printed (on Python 3) or passed to str or repr; every attempt to debug it leads to

RuntimeError: maximum recursion depth exceeded while getting the repr of a tuple 
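The same kind of value can be reproduced without eval or the lambda trick. The sketch below builds the nesting in a loop and shows that repr gives up on it. The name deep_tuple and the depth of 100,000 are my own choices for illustration; the depth is simply picked to be far past the default recursion limit:

```python
def deep_tuple(depth):
    """Build (1, (1, (1, ... 1))) iteratively, nested `depth` levels deep."""
    value = 1
    for _ in range(depth):
        value = (1, value)
    return value


# Hypothetical depth, assumed to be well beyond any default recursion limit.
nested = deep_tuple(100_000)

try:
    repr(nested)
    repr_failed = False
except RecursionError:  # a subclass of RuntimeError since Python 3.5
    repr_failed = True

print(repr_failed)
```

Building the value iteratively needs no recursion at all, which is why the failure only shows up later, in whatever code tries to stringify it.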

pprint and saferepr fail too:

...
  File "/usr/lib/python3.4/pprint.py", line 390, in _safe_repr
    orepr, oreadable, orecur = _safe_repr(o, context, maxlevels, level)
  File "/usr/lib/python3.4/pprint.py", line 340, in _safe_repr
    if issubclass(typ, dict) and r is dict.__repr__:
RuntimeError: maximum recursion depth exceeded while calling a Python object

Thus there is no safe built-in function to stringify this value. The following helper could be of use:

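def excsafe_repr(obj):
    """Return repr(obj), or a placeholder if repr() itself raises."""
    try:
        return repr(obj)
    except Exception:
        # Fall back to the default object repr, which does not recurse.
        return object.__repr__(obj).replace('>', ' [exception raised]>')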

Then there is the problem that the print statement in Python 2 does not actually go through str/repr, so it has none of their recursion checks. Take the return value of the lambda monster above: you cannot call str or repr on it, but the ordinary print statement (not print_function!) prints it nicely. However, this can be exploited to trigger a SIGSEGV on Python 2 if you know the value will be printed with the print statement:

print eval('(lambda i: [i for i in ((i, 1) for j in range(1000000))][-1])(1)') 

This crashes Python 2 with SIGSEGV, and the bug is marked WONTFIX in the bug tracker. So never use the print statement if you want to be safe: use from __future__ import print_function!


This is not a crash, but

eval('(1,' * 100 + ')' * 100) 

when run, outputs

s_push: parser stack overflow
Traceback (most recent call last):
  File "yyy.py", line 1, in <module>
    eval('(1,' * 100 + ')' * 100)
MemoryError

The MemoryError can be caught, since it is a subclass of Exception. The parser has some really conservative limits to avoid crashes from stack overflows (pun intended). However, s_push: parser stack overflow is written to stderr by C code and cannot be suppressed.
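A caller that wants to survive such input could wrap the eval roughly as follows. This is only a sketch: try_eval is a hypothetical name, and which of these exceptions is raised, if any, depends on the interpreter version. It also does nothing about messages that C code writes directly to stderr:

```python
def try_eval(source):
    """Evaluate `source`, reporting parser and compiler limits as data.

    Hypothetical helper for illustration only. Returns ("ok", value) on
    success or ("error", exception_class_name) when one of the listed
    exceptions is raised while parsing, compiling or evaluating.
    """
    try:
        return ("ok", eval(source, {"__builtins__": {}}, {}))
    except (MemoryError, SyntaxError, RecursionError) as exc:
        return ("error", type(exc).__name__)


status, detail = try_eval('(1,' * 100 + ')' * 100)
print(status)
```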


And just yesterday I asked why Python 3.4 isn't being fixed for a crash from:

% python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def f(self):
...         nonlocal __x
...
[4]    19173 segmentation fault (core dumped)  python3

and Serhiy Storchaka's answer confirmed that Python core devs do not consider SIGSEGV on seemingly well-formed code a security issue:

Only security fixes are accepted for 3.4.

Thus executing third-party code in Python can never be considered safe, sanitized or not.

And Nick Coghlan then added:

And as some additional background as to why segmentation faults provoked by Python code aren't currently considered a security bug: since CPython doesn't include a security sandbox, we're already relying entirely on the OS to provide process isolation. That OS level security boundary isn't affected by whether the code is running "normally", or in a modified state following a deliberately triggered segmentation fault.
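To make the process-isolation point concrete, here is a rough sketch of evaluating an expression in a separate interpreter process, so that a segmentation fault kills only the child. The names eval_in_child and CHILD_CODE are hypothetical. This only contains crashes and hangs: the child still runs with all the privileges of the parent, so it is not a sandbox.

```python
import subprocess
import sys

# Hypothetical child program: read an expression from stdin, print its repr.
CHILD_CODE = (
    "import sys; "
    "print(repr(eval(sys.stdin.read(), {'__builtins__': {}}, {})))"
)


def eval_in_child(expression, timeout=5.0):
    """Evaluate `expression` in a separate interpreter process.

    Returns (returncode, stdout, stderr). On POSIX a negative returncode
    means the child was killed by a signal. subprocess.TimeoutExpired is
    raised if the child does not finish within `timeout` seconds.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=expression,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout.strip(), proc.stderr


returncode, output, errors = eval_in_child("1 + 2")
print(returncode, output)
```

Any real isolation (restricted user, container, seccomp and so on) has to come from the operating system, which is exactly what the quote above is saying.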