
What is under the hood of x = 'y' 'z' in Python?

If you run x = 'y' 'z' in Python, you get x set to 'yz', which means that some kind of string concatenation is occurring when Python sees multiple strings next to each other.

But what kind of concatenation is this? Is it actually running 'y' + 'z', or ''.join(['y', 'z']), or something else?

Merlin -they-them- asked Oct 17 '14 20:10



1 Answer

The Python parser interprets that as one string. This is well documented in the Lexical Analysis documentation:

String literal concatenation

Multiple adjacent string literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. Thus, "hello" 'world' is equivalent to "helloworld".

The compiled Python code sees just the one string object; you can see this by asking Python to produce an AST of such strings:

    >>> import ast
    >>> ast.dump(ast.parse("'hello' 'world'", mode='eval').body)
    "Str(s='helloworld')"
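The Str(...) output above comes from an older Python release; newer versions report the same node as a Constant. As a minimal sketch that should behave the same on either, the check below reads the node's value through ast.literal_eval instead of relying on the dump format. The helper name folded_value is made up for this illustration.

```python
import ast

def folded_value(source):
    """Parse one expression; return its node class name and literal value."""
    node = ast.parse(source, mode='eval').body
    return type(node).__name__, ast.literal_eval(node)

kind, value = folded_value("'hello' 'world'")
print(kind, repr(value))
```

Whatever the node class is called, there is only one node, and its value is already the joined string.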

In fact, it is the very act of building the AST that triggers the concatenation: the adjacent literals are joined as the parse tree is traversed. See the parsestrplus() function in the AST C source.
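To relate this back to the question: neither 'y' + 'z' nor a join call appears in the tree for adjacent literals. A rough way to see the difference is to compare the top-level AST node for each spelling. The helper name top_node is hypothetical, and the sketch assumes ast.parse is called with its default settings, which do not constant-fold the explicit + form.

```python
import ast

def top_node(source):
    """Return the class name of the top-level expression node."""
    return type(ast.parse(source, mode='eval').body).__name__

adjacent = top_node("'y' 'z'")    # one string node, already joined
explicit = top_node("'y' + 'z'")  # a BinOp with two separate string operands
print(adjacent, explicit)

# Adjacency is a literal-only feature: a name next to a literal does not parse.
try:
    ast.parse("x 'z'", mode='eval')
    literal_only = False
except SyntaxError:
    literal_only = True
print(literal_only)
```

The explicit + form is an addition expression in the tree, while the adjacent form never is.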

The feature is specifically aimed at reducing the need for backslashes; use it to break up a string across physical lines when still within a logical line:

    print('Hello world!', 'This string spans just one '
          'logical line but is broken across multiple physical '
          'source lines.')

Multiple physical lines can implicitly be joined into one logical line by using parentheses, square brackets or curly braces.
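A small sketch of the two features working together, with illustrative variable names: parentheses keep the statement on one logical line, and the adjacent literals inside them become a single string. The second half shows the flip side, where a forgotten comma inside a list silently merges two elements.

```python
# Parentheses join the physical lines; adjacent literals join the strings.
message = ('This string spans just one '
           'logical line but is broken across multiple physical '
           'source lines.')
print(message)

# Pitfall: the comma after 'beta' is missing, so the list has two items.
items = ['alpha',
         'beta'
         'gamma']
print(len(items), items)
```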

This string concatenation feature was copied from C, but Guido van Rossum is on record regretting adding it to Python. That post kicked off a long and very interesting thread, with a lot of support for removing the feature altogether.

Martijn Pieters answered Oct 01 '22 07:10
