How does source encoding apply within string literals?

Question

PEP 263 specifies that the encoding declared in a source file is applied in the following steps:

1. read the file
2. decode it into Unicode assuming a fixed per-file encoding
3. convert it into a UTF-8 byte string
4. tokenize the UTF-8 content
5. compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the UTF-8 data into 8-bit string data using the given file encoding

So, if I take this code:

print 'abcdefgh'
print u'abcdefgh'

And convert it to ROT-13:

# coding: rot13

cevag 'nopqrstu'
cevag h'nopqrstu'
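
For reference, the converted body can be reproduced with the standard codecs module. This is only a sketch: it assumes the rot_13 codec is used for the conversion, and the variable names are made up for illustration.

```python
import codecs

# The original two-line program, kept as plain text.
original = "print 'abcdefgh'\nprint u'abcdefgh'\n"

# Only the body is ROT-13 encoded; the coding line stays readable
# so the interpreter can recognize the declared encoding.
converted = "# coding: rot13\n\n" + codecs.encode(original, 'rot_13')

print(converted)
```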

I would expect that it is first decoded and then becomes identical to the original, printing:

abcdefgh
abcdefgh

But instead, it prints:

nopqrstu
abcdefgh

So the unicode literal works as expected, but the str literal remains unconverted. Why?

Eliminating some possibilities:

I confirmed that the problem is not in a later phase (printing to the console) but occurs immediately at parsing, because this code produces "ValueError: unsupported format character 'q' (0x71) at index 1":

x = '%q' % 1  # that is %d !
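
The same error can be reproduced without a rot13 source file, which suggests the format string really holds '%q' by the time it is evaluated. A minimal sketch, assuming the rot_13 codec from the codecs module (the names fmt and message are illustrative):

```python
import codecs

# ROT-13 maps 'd' to 'q', so the intended '%d' turns into '%q'.
fmt = codecs.encode('%d', 'rot_13')
print(fmt)

# Formatting with the converted string fails before anything is printed.
try:
    fmt % 1
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```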

zvone · Accepted Answer

The last step actually explains quite precisely what happens:

compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the UTF-8 data into 8-bit string data using the given file encoding

After the first four steps, the source file has been decoded to Unicode and tokenized, so its contents are equivalent to:

print 'abcdefgh'
print u'abcdefgh'

Then, in step 5, the str literal 'abcdefgh' is re-encoded into 8-bit string data using the given file encoding (rot13), while the unicode literal keeps its decoded value, so the effective program becomes:

print 'nopqrstu'
print u'abcdefgh'
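
The round trip can be imitated by hand with the codecs module. This is a rough sketch of the steps described above, not the interpreter's actual implementation; FILE_ENCODING and the other names are illustrative.

```python
import codecs

FILE_ENCODING = 'rot_13'  # taken from the "# coding: rot13" line

# The literal text exactly as it sits in the ROT-13 source file.
literal_in_file = 'nopqrstu'

# Steps 1-4: the source is decoded with the file encoding before tokenizing.
decoded_literal = codecs.decode(literal_in_file, FILE_ENCODING)

# Step 5, u'...' literal: the decoded text is used as is.
unicode_value = decoded_literal

# Step 5, plain '...' literal: the decoded text is re-encoded
# with the file encoding, which undoes the decoding for that literal.
str_value = codecs.encode(decoded_literal, FILE_ENCODING)

print(str_value)
print(unicode_value)
```

This prints nopqrstu followed by abcdefgh, matching the output reported in the question.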

Tags:

python

character-encoding

python-2.7
