
Why does using a list as a string formatting parameter, even with no %s identifier, return the original string?

>>> 'string with no string formatting markers' % ['string']
'string with no string formatting markers'
>>> 'string with no string formatting markers' % ('string',)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

I would expect both cases to raise a TypeError, but this is not the case. Why not?

The Python documentation on this subject talks about strings, tuples and dictionaries, but says nothing about lists. I'm a bit confused about this behavior. I've been able to duplicate it in Python 2.7 and 3.2.

Fredrick Brennan asked Dec 13 '12 16:12




1 Answer

Reading carefully, the documentation states that:

If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format string, or a single mapping object (for example, a dictionary).

In this case the format does not require a single argument, so the documentation tells us that the argument should be a tuple or a mapping. Anything else falls under "undefined behaviour", which is what is happening here: the behaviour is not consistent across cases.

This should probably be considered the final answer to the question: if the string does not contain any format specifier, passing a list (or any type other than a tuple or a mapping) is a bug in itself and leads to undefined behaviour.

It follows that you ought to always use a tuple or a dict as the argument; otherwise you have to check for format specifiers by hand or handle odd behaviours.
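As a small illustration of the three argument forms the quoted documentation describes, here is a minimal sketch. The variable names and sample values are made up for the example.

```python
# One non-tuple object: allowed because the format needs exactly one argument.
single = 'value: %s' % 'a'

# A tuple with exactly as many items as the format string has specifiers.
several = '%s and %d' % ('a', 2)

# A single mapping, used together with named specifiers.
named = '%(name)s is %(age)d' % {'name': 'Ann', 'age': 30}

print(single)
print(several)
print(named)
```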

In your case you can probably fix the problem by using (['string'],) instead of ['string'].
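A minimal sketch of that fix: wrapping the value in a one-element tuple makes the argument count get checked, so a format string without markers fails loudly instead of silently returning itself. The helper name format_one is hypothetical and only serves the example.

```python
def format_one(fmt, value):
    """Format fmt with exactly one value, always passed inside a 1-tuple.

    Hypothetical helper, not part of any library.
    """
    return fmt % (value,)


# With a %s marker the list is formatted as a single value.
with_marker = format_one('got %s', ['string'])

# Without a marker the tuple form raises TypeError, as the question expected.
try:
    format_one('string with no string formatting markers', ['string'])
    raised = False
except TypeError:
    raised = True

print(with_marker, raised)
```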


Possible "explanation" of why the resultant behaviour seems to be so random:

There seems to have been a buggy check in the original implementation of PyString_Format/PyUnicode_Format. Instead of using PyMapping_Check, as in this line:

if (PyMapping_Check(args) && !PyTuple_Check(args) &&
    !PyObject_TypeCheck(args, &PyBaseString_Type))
    dict = args;

this code was used:

if (Py_TYPE(args)->tp_as_mapping && !PyTuple_Check(args) &&
    !PyObject_TypeCheck(args, &PyBaseString_Type))
    dict = args;

The two checks are not equivalent. For example, set does not define tp_as_mapping (at least in the Python 2.7.3 source code that I downloaded a few weeks ago), while list does.

This might be the reason why list (and possibly other objects) does not raise the TypeError, while set, int and many others do.
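The difference can be observed from Python itself. The sketch below is only a probe, with a hypothetical helper raises_type_error; the results noted in the comments are what the question and this answer report for CPython, and other versions or implementations may behave differently.

```python
def raises_type_error(arg):
    """Return True if formatting a marker-free string with arg raises TypeError."""
    try:
        'some string' % arg
        return False
    except TypeError:
        return True


# One sample argument per type; the list and tuple match the question's examples.
samples = (['string'], {'k': 'v'}, {'string'}, 5, ('string',))
results = {type(arg).__name__: raises_type_error(arg) for arg in samples}

# list and dict are treated as mappings and pass silently;
# set, int and a non-empty tuple raise TypeError.
for name, raised in results.items():
    print(name, raised)
```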

As I stated before in this same answer, I do get a TypeError even with lists:

$ python2
Python 2.7.3 (default, Sep 26 2012, 21:53:58) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 'some string' % []
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

This probably shows that the above issue is not the only one here.

Looking at the source code, I agree that, in theory, the number of arguments is not checked if the argument is not a tuple. But this would imply that 'some string' % 5 returns 'some string' rather than raising a TypeError, so there must be something fishy in that code.

Bakuriu answered Nov 15 '22 21:11