
Write different hex-values in Python2 and Python3

I'm currently porting a Python2 script to Python3 and have problems with this line:

print('\xfe')

When I run it with Python2 (python test.py > test.out), the file consists of the hex values FE 0A, as expected.

But when I run it with Python3 (python3 test.py > test.out), the file consists of the hex values C3 BE 0A.

What's going wrong here? How can I get the desired output FE 0A with Python3?

Jakube asked Jan 07 '23 18:01


1 Answer

The byte-sequence C3 BE is the UTF-8 encoded representation of the character U+00FE.

Python 2 handles strings as a sequence of bytes rather than characters. So '\xfe' is a str object containing one byte.

In Python 3, strings are sequences of (Unicode) characters. So the code '\xfe' is a string containing one character. When you print the string, it must be encoded to bytes. Since your environment chose a default encoding of UTF-8, it was encoded accordingly.
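The encoding step can be seen directly in a Python 3 session. This is only a small sketch; the variable names are illustrative and not from the original script.

```python
# Python 3: a str holds characters (code points); bytes only exist after encoding.
s = '\xfe'                          # one character, U+00FE

utf8_bytes = s.encode('utf-8')      # two bytes: C3 BE
latin1_bytes = s.encode('latin-1')  # one byte: FE

print(len(s))               # 1
print(utf8_bytes.hex())     # c3be
print(latin1_bytes.hex())   # fe
```

The same character therefore ends up as different bytes depending on which encoding the output stream uses.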

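How to solve this depends on your data. Is it bytes or characters? If bytes, then tell the interpreter so by using a bytes literal, b'\xfe'. Note that in Python 3 print(b'\xfe') writes the textual representation b'\xfe' rather than the byte itself, so write the bytes to the underlying binary stream instead: sys.stdout.buffer.write(b'\xfe\n'). If it is characters, but you want a different encoding, then encode the string accordingly and write the result the same way: sys.stdout.buffer.write('\xfe\n'.encode('latin1')).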
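One caveat for Python 3: print() converts a bytes object to its textual representation, so the raw byte has to go to a binary stream. Below is a minimal sketch, assuming the standard text stdout that exposes a .buffer attribute; write_raw is a hypothetical helper name, and an in-memory stream is used so the result can be inspected.

```python
import io
import sys


def write_raw(data, stream=None):
    """Write bytes unchanged to a binary stream.

    Defaults to sys.stdout.buffer, the binary layer under Python 3's text stdout.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(data)
    stream.flush()


# What print(b'\xfe') would emit: the repr text, not the single byte FE.
printed_text = repr(b'\xfe')

# Demonstrated against an in-memory binary stream.
out = io.BytesIO()
write_raw(b'\xfe\n', out)                    # data that is already bytes
write_raw('\xfe\n'.encode('latin-1'), out)   # text, encoded explicitly
written = out.getvalue()                     # FE 0A FE 0A
```

In the script itself, calling write_raw(b'\xfe\n') without a stream argument should make python3 test.py > test.out produce FE 0A.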

dsh answered Jan 15 '23 15:01
