
Can Python's string .format() be made safe for untrusted format strings?

I'm working on a web app where users will be able to supply strings that the server will then substitute variables into.

Preferably I'd like to use PEP 3101 format() syntax and I'm looking at the feasibility of overriding methods in Formatter to make it secure for untrusted input.

Here are the risks I can see with .format() as it stands:

  • Padding lets you specify arbitrary lengths, so '{:>9999999999}'.format(..) could run the server out of memory and act as a denial-of-service (DoS) vector. I'd need to disable this.
  • Format lets you access the fields inside objects, which is useful, but it's creepy that you can access dunder variables and start drilling into bits of the standard library. There's no telling where there might be a getattr() that has side effects or returns something secret. I would whitelist attribute/index access by overriding get_field().
  • I'd need to catch some exceptions, naturally.
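
A minimal sketch of what such overrides might look like, assuming an explicit whitelist of field names and a cap on any number appearing in the format spec. The class name WhitelistFormatter, the User class, and the limit of 100 are hypothetical choices for illustration; string.Formatter, get_field() and format_field() are the standard library hooks.

```python
import re
import string


class WhitelistFormatter(string.Formatter):
    """Hypothetical sketch: only whitelisted fields, bounded width/precision."""

    def __init__(self, allowed_fields, max_width=100):
        super().__init__()
        self.allowed_fields = frozenset(allowed_fields)
        self.max_width = max_width  # assumed limit, tune per application

    def get_field(self, field_name, args, kwargs):
        # field_name is the full expression, e.g. "user.name" or "a[0]".
        # Anything not listed verbatim is refused, including dunder access.
        if field_name not in self.allowed_fields:
            raise ValueError("field not allowed: %r" % (field_name,))
        return super().get_field(field_name, args, kwargs)

    def format_field(self, value, format_spec):
        # Conservative check: reject any number in the spec above the cap.
        # This covers both the width and the precision.
        for digits in re.findall(r"\d+", format_spec):
            if int(digits) > self.max_width:
                raise ValueError("width or precision too large")
        return super().format_field(value, format_spec)


class User:
    def __init__(self, name):
        self.name = name


fmt = WhitelistFormatter({"user.name", "count"})
result = fmt.format("{user.name:>8}|{count}", user=User("ann"), count=3)
```

Nested replacement fields inside a format spec are resolved through get_field() as well, so the whitelist applies to them too; format_field() then sees the already expanded spec.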

My assumptions are:

  • None of the traditional C format string exploits apply to Python, because specifying a parameter is a bounds-checked access into a collection, rather than directly popping off the thread's stack.
  • The web framework I'm using escapes every variable that's substituted into a page template, and so long as it's the last stop before output, I'm safe from cross-site scripting attacks emerging from de-escaping.

What are your thoughts? Possible? Impossible? Merely unwise?


Edit: Armin Ronacher outlines a nasty information leak if you don't filter out dunder variable access, but seems to regard securing format() as feasible:

{local_foo.__init__.__globals__[secret_global]} 

Be Careful with Python's New-Style String Format | Armin Ronacher's Thoughts and Writings
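
To make the leak concrete, here is a small reproduction. The names SECRET_KEY and Event are hypothetical and exist only for this illustration; the placeholder has the same shape as the one above.

```python
SECRET_KEY = "hunter2"  # hypothetical module-level secret


class Event:
    def __init__(self, name):
        self.name = name


# An attacker-supplied template: starts from a harmless object, walks to
# the method's function globals, and indexes into them by key.
template = "{event.__init__.__globals__[SECRET_KEY]}"
leaked = template.format(event=Event("launch"))
```

Here leaked ends up holding the value of SECRET_KEY, even though only an Event instance was passed to format().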

Personally, I didn't actually go the untrusted format() route in my product, but I'm updating this for the sake of completeness.

Craig Timpany asked Mar 12 '13 08:03


1 Answer

Good instinct. Yes, an attacker being able to supply an arbitrary format string is a vulnerability in Python.

  • The denial of service is probably the simplest to address. Limiting the length of the format string and the number of replacement fields in it mitigates this issue, provided field widths are capped as well, since a short string such as '{:>9999999999}' can still request a huge amount of padding. There should be some limit X such that no reasonable user needs more than X variables in a string, and the resulting computation is too small to be exploited in a DoS attack.
  • Being able to access attributes within an object could be dangerous. However, I don't think that the object base class has any useful information; the object supplied to format() would have to contain something sensitive. In any case, this type of notation can be limited with a regular expression.
  • If the format strings are user-supplied, then a user might need to see the error message for debugging. However, error messages can contain sensitive information such as local paths or class names. Make sure to limit the information that an attacker can obtain.
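
As a rough illustration of the last point, formatting errors can be caught and replaced with a generic message. The function name render_user_template and the message text are hypothetical, and the sketch assumes the template has already been validated by other means.

```python
GENERIC_ERROR = "Template could not be rendered; check your placeholders."


def render_user_template(template, **values):
    """Return (text, error); exactly one of the two is None."""
    try:
        # Assumption: template was validated before reaching this point.
        return template.format(**values), None
    except (KeyError, IndexError, AttributeError, ValueError, TypeError):
        # Do not echo the exception text: it may name classes or paths.
        return None, GENERIC_ERROR


ok_text, ok_error = render_user_template("Hi {name}", name="Ann")
bad_text, bad_error = render_user_template("Hi {nme}", name="Ann")
```

The full exception can still be written to a server-side log for the developers, while the user only sees the generic message.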

Look over the Python format string specification and use a regex to forbid any functionality you don't want the user to have.
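
One possible shape for such a check, as a sketch rather than a vetted filter: accept only literal text, escaped braces, and bare lowercase placeholder names, and bound the template length and field count. The function name is_allowed_template and both limits are assumptions made for this example.

```python
import re
import string

# Literal characters, "{{", "}}", or a bare placeholder such as {name}.
# No attribute access, no indexing, no conversions, no format specs.
_TEMPLATE_RE = re.compile(r"(?:[^{}]|\{\{|\}\}|\{[a-z][a-z0-9_]*\})*")


def is_allowed_template(template, max_len=500, max_fields=20):
    if len(template) > max_len:
        return False
    if _TEMPLATE_RE.fullmatch(template) is None:
        return False
    # Count replacement fields using the standard library's own parser.
    parsed = string.Formatter().parse(template)
    field_count = sum(1 for _, name, _, _ in parsed if name is not None)
    return field_count <= max_fields
```

Because format specs are rejected outright, the padding problem disappears along with attribute and index access; if users need padding, the pattern would have to be extended to allow a bounded width.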

rook answered Oct 23 '22 20:10