I would like to understand why this works fine:
>>> test_string = 'long brown fox jump over a lazy python'
>>> 'formatted "{test_string[0]}"'.format(test_string=test_string)
'formatted "l"'
Yet this fails:
>>> 'formatted "{test_string[-1]}"'.format(test_string=test_string)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
>>> 'formatted "{test_string[11:14]}"'.format(test_string=test_string)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
I know this could be used:
'formatted "{test_string}"'.format(test_string=test_string[11:14])
...but that is not possible in my situation.
I am dealing with a sandbox-like environment where a set of variables is passed to str.format() as a dictionary of keyword arguments. These variables are outside of my control. I know their names and types in advance, and the format string is my only input. It all works fine when I need to combine a few strings or manipulate numbers and their precision, but it falls apart when I need to extract a substring.
This is explained in the spec of str.format():
The arg_name can be followed by any number of index or attribute expressions. An expression of the form '.name' selects the named attribute using getattr(), while an expression of the form '[index]' does an index lookup using __getitem__().
That is, you can index the string using bracket notation, and whatever you put inside the brackets becomes the argument of the string's __getitem__() method. This is indexing, not slicing. The bottom line is that str.format() simply doesn't support slicing in the replacement field (the part between {}), because that functionality isn't part of the spec.
Regarding negative indices, the grammar specifies:
element_index ::= digit+ | index_string
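This means that the index is either a sequence of digits (digit+) or a string. A negative index such as -1 is not a sequence of digits, so it is parsed as index_string and passed to __getitem__() as the string '-1'. The same happens with 11:14, which arrives as the string '11:14' rather than a slice object. However, str.__getitem__() accepts only integers and slice objects, hence the error TypeError: string indices must be integers, not 'str' (the exact wording varies between Python versions).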
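To see the parsing rule in action, here is a minimal sketch that shows what str.format() actually hands to __getitem__(). The KeyRecorder class is a hypothetical helper written for this illustration; it is not part of the standard library.

```python
class KeyRecorder:
    """Hypothetical helper: reports the type and value of the key it receives."""

    def __getitem__(self, key):
        return f"{type(key).__name__}:{key!r}"


recorder = KeyRecorder()

# Only a pure run of digits is converted to int; everything else stays a str.
as_digits = "{obj[0]}".format(obj=recorder)
as_negative = "{obj[-1]}".format(obj=recorder)
as_slice = "{obj[11:14]}".format(obj=recorder)

print(as_digits)    # int:0
print(as_negative)  # str:'-1'
print(as_slice)     # str:'11:14'
```

A real str receives exactly the same keys, which is why the last two cases raise TypeError.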
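With f-strings (Python 3.6+), the expression inside the braces is evaluated as ordinary Python, so indexing, slicing, and negative indices all work:
>>> test_string = 'long brown fox jump over a lazy python'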
>>> f"formatted {test_string[0]}"
'formatted l'
>>> f"formatted {test_string[0:2]}"
'formatted lo'
>>> f"formatted {test_string[-1]}"
'formatted n'
Or use str.format() but slice the argument of str.format() directly, rather than the replacement field:
>>> test_string = 'long brown fox jump over a lazy python'
>>> 'formatted {replacement}'.format(replacement=test_string[0:2])
'formatted lo'
>>> 'formatted {replacement}'.format(replacement=test_string[-1])
'formatted n'
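If, as in the question, the format string is the only thing you can change, two partial workarounds stay within what str.format() supports. This is a sketch, not a general slicing mechanism: it assumes the positions you need are known in advance and are non-negative.

```python
test_string = 'long brown fox jump over a lazy python'

# A precision in the format spec truncates a string,
# which is equivalent to test_string[:4].
prefix = '{test_string:.4}'.format(test_string=test_string)

# One replacement field per character, each with a plain digit index,
# which is equivalent to test_string[11:14].
middle = '{test_string[11]}{test_string[12]}{test_string[13]}'.format(
    test_string=test_string
)

print(prefix)  # long
print(middle)  # fox
```

Neither trick helps with negative indices: reaching the last character this way requires knowing the length of the string beforehand.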
The str.format() method uses different syntax than f-string literals.
With f-strings, it works as expected:
>>> test_string = 'long brown fox jump over a lazy python'
>>> f'formatted "{test_string[-1]}"'
'formatted "n"'
Also, compare the syntax when indexing a dict. The f-string needs a quoted key, while str.format() takes the key unquoted:
>>> x = {'key': 'value'}
>>> f'{x["key"]}'
'value'
>>> '{x[key]}'.format(x=x)
'value'
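The difference matters in practice, because str.format() takes the text between the brackets literally. A short sketch of two consequences (the dictionaries here are made up for illustration):

```python
x = {'key': 'value'}

# Quotes inside the brackets become part of the key that is looked up.
try:
    '{x["key"]}'.format(x=x)
    quoted_lookup = None
except KeyError as exc:
    quoted_lookup = exc.args[0]

# A run of digits is converted to int, so a dict keyed by the string '0'
# cannot be reached from a format string, while one keyed by the int 0 can.
by_int = {0: 'int key'}
by_str = {'0': 'str key'}

found = '{d[0]}'.format(d=by_int)
try:
    '{d[0]}'.format(d=by_str)
    digit_lookup = None
except KeyError as exc:
    digit_lookup = exc.args[0]

print(repr(quoted_lookup))  # '"key"'
print(found)                # int key
print(repr(digit_lookup))   # 0
```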