Simple use of Python's str.format() method:
>>> '{0}'.format('zero')
'zero'
Hex, octal, and binary literals do not work:
>>> '{0x0}'.format('zero')
KeyError: '0x0'
>>> '{0o0}'.format('zero')
KeyError: '0o0'
>>> '{0b0}'.format('zero')
KeyError: '0b0'
According to the replacement field grammar, though, they should:
replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name        ::= arg_name ("." attribute_name | "[" element_index "]")*
arg_name          ::= [identifier | integer]
attribute_name    ::= identifier
element_index     ::= integer | index_string
index_string      ::= <any source character except "]"> +
conversion        ::= "r" | "s"
format_spec       ::= <described in the next section>
The integer grammar is as follows:
longinteger    ::= integer ("l" | "L")
integer        ::= decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::= nonzerodigit digit* | "0"
octinteger     ::= "0" ("o" | "O") octdigit+ | "0" octdigit+
hexinteger     ::= "0" ("x" | "X") hexdigit+
bininteger     ::= "0" ("b" | "B") bindigit+
nonzerodigit   ::= "1"..."9"
octdigit       ::= "0"..."7"
bindigit       ::= "0" | "1"
hexdigit       ::= digit | "a"..."f" | "A"..."F"
Have I misunderstood the documentation, or does Python not behave as advertised? (I'm using Python 2.7.)
This looks like a mistake in the grammar, and the accompanying text does nothing to clarify it: it just describes arg_name as "a number or an identifier" and explains how it is interpreted when it is a number.
Testing it out, the field is clearly not treated as an integer:
>>> '{08}'.format(*range(10)) # should be SyntaxError
'8'
>>> '{010}'.format(*range(10)) # should be '8'
'10'
>>> '{-1}'.format(*range(10)) # should be '9', but looked up as a string
KeyError: '-1'
>>> '{1 }'.format(*range(10)) # should be '1', but looked up as a string
KeyError: '1 '
>>> '{10000000000000000000}'.format(1) # should be IndexError
ValueError: Too many decimal digits in format string
Looking at the code, it doesn't borrow from the Python parser to parse format strings; it uses custom parsing, and the code to interpret an arg_spec as a number uses a get_integer function that just converts each digit and shifts and adds until the field is over or we get within a digit of PY_SSIZE_T_MAX.
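To make that behavior concrete, here is a rough Python model of the digit-by-digit parse. It is only a sketch: get_integer_model is a hypothetical name, it assumes only ASCII digits count, it uses sys.maxsize to stand in for PY_SSIZE_T_MAX, and the exact overflow check in the C code may differ.

```python
import sys


def get_integer_model(field):
    """Return the parsed index, or None if the field is not all ASCII digits."""
    if not field:
        return None  # an empty field is not a number
    accumulator = 0
    for ch in field:
        if ch not in "0123456789":
            # Any non-digit (sign, space, "x", ...) means the caller
            # should treat the whole field as a string key instead.
            return None
        digit = ord(ch) - ord("0")
        # Refuse before the running total could exceed the largest index
        # (assumption: sys.maxsize plays the role of PY_SSIZE_T_MAX).
        if accumulator > (sys.maxsize - digit) // 10:
            raise ValueError("Too many decimal digits in format string")
        accumulator = accumulator * 10 + digit  # shift and add
    return accumulator
```

Under this model, "08" and "010" parse as 8 and 10 with no octal handling, while "-1", "1 " and "0x0" are not numbers at all, which matches the KeyError results above.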
PEP 3101 suggests that this is intentional:
Simple field names are either names or numbers. If numbers, they must be valid base-10 integers …
It doesn't specifically say that the number must not be too close to the maximum index value, nor that negative indices can't be used. But most of the other quirks could be explained by using the "valid base-10 integer" description instead of just "integer". In fact, just describing it as digit+ instead of integer would account for all of the quirks.
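A quick way to check the digit+ reading is to compare a digits-only regular expression against what str.format() actually does. This is just an illustrative sketch; DIGITS_ONLY and lookup_kind are made-up names, and the sample list only covers the cases discussed here.

```python
import re

# Hypothetical stand-in for the proposed grammar: arg_name ::= digit+
DIGITS_ONLY = re.compile(r"[0-9]+\Z")


def lookup_kind(arg_name):
    """Report whether str.format() treats arg_name as a position or a key."""
    template = "{" + arg_name + "}"
    try:
        template.format(*range(20))  # positional arguments only
        return "positional"
    except KeyError:
        # No keyword arguments were passed, so a KeyError means the
        # field was looked up as a string key.
        return "keyword"


samples = ["0", "08", "010", "-1", "1 ", "0x0", "0o0", "0b0"]
results = {name: lookup_kind(name) for name in samples}

for name in samples:
    matches_digits = DIGITS_ONLY.match(name) is not None
    print(repr(name), results[name], matches_digits)
```

For every sample, the field is positional exactly when it consists of digits only.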
The element_index is parsed in exactly the same way as the arg_name. Issue #8985 says that element_index intentionally "… uses the narrowest possible definition for integer indexes, in order to pass all other strings to mappings." Whether that's also intentional for arg_name, or whether it's an unintended consequence of using the same code, I'm not sure.
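The effect on element_index is easy to see with a mapping and a list. This is a small sketch with made-up data (mapping and letters are just example values): anything that isn't all digits reaches the container as a string.

```python
mapping = {"0x1": "hex-looking key", "-1": "minus-one key"}
letters = ["a", "b", "c"]

# Not all digits, so both are passed to the mapping as string keys.
hex_key = "{0[0x1]}".format(mapping)
neg_key = "{0[-1]}".format(mapping)

# All digits, so this is converted to the integer index 1.
second = "{0[1]}".format(letters)

# "-1" stays a string, and a list cannot be indexed by a string.
try:
    "{0[-1]}".format(letters)
    negative_index_error = None
except TypeError as exc:
    negative_index_error = exc

print(hex_key, neg_key, second, type(negative_index_error).__name__)
```

So negative indexing into a sequence is not available inside a replacement field, but a mapping with keys like "-1" or "0x1" works fine.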
The docs are unchanged in 3.4, and the code is effectively unchanged in the current trunk.
I'd suggest searching the bug tracker and the python-dev archives to see if this has been raised before. And, if not, figure out whether you think the docs or the code should be changed, file a bug, and ideally submit a patch.