Printing escaped Unicode in Python

>>> s = 'auszuschließen'
>>> print(s.encode('ascii', errors='xmlcharrefreplace'))
b'auszuschlie&#223;en'
>>> print(str(s.encode('ascii', errors='xmlcharrefreplace'), 'ascii'))
auszuschlie&#223;en

Is there a prettier way to print any string without the b''?

EDIT:

I'm just trying to print escaped characters from Python, and my only gripe is that Python adds "b''" when I do that.

If I try to print the actual character in a dumb terminal like Windows 7's, I get this:

Traceback (most recent call last):
  File "Mailgen.py", line 378, in <module>
    marked_copy = mark_markup(language_column, item_row)
  File "Mailgen.py", line 210, in mark_markup
    print("TP: %r" % "".join(to_print))
  File "c:\python32\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2026' in position 29: character maps to <undefined>
Cees Timmerman asked Jan 04 '12 13:01


1 Answer

>>> s='auszuschließen…'
>>> s
'auszuschließen…'
>>> print(s)
auszuschließen…
>>> b=s.encode('ascii','xmlcharrefreplace')
>>> b
b'auszuschlie&#223;en&#8230;'
>>> print(b)
b'auszuschlie&#223;en&#8230;'
>>> b.decode()
'auszuschlie&#223;en&#8230;'
>>> print(b.decode())
auszuschlie&#223;en&#8230;

You start out with a Unicode string. Encoding it to ASCII with xmlcharrefreplace creates a bytes object in which every non-ASCII character has been replaced by a numeric character reference. print() does not decode bytes; it calls str() on the object, and for bytes that returns the repr, which includes the b prefix and the quotes. Calling decode() explicitly converts the bytes back to a string. The default encoding for decode() is UTF-8, and since your bytes consist only of ASCII, which is a subset of UTF-8, decoding is guaranteed to work.
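
If you do this in more than one place, the encode/decode round trip can be wrapped in a small helper. This is only a sketch: the name escape_non_ascii is made up for illustration, and the backslashreplace variant is an extra I am assuming might be useful on a console that cannot display the characters. Both error handlers are standard ones from the codecs module.

```python
def escape_non_ascii(text, errors="xmlcharrefreplace"):
    """Return text as a str with every non-ASCII character escaped.

    Hypothetical helper, not part of the standard library. Encoding to
    ASCII applies the chosen error handler, and decoding the resulting
    ASCII-only bytes turns them back into a str, so print() shows no b''.
    """
    return text.encode("ascii", errors=errors).decode("ascii")


s = "auszuschließen…"
print(escape_non_ascii(s))                      # auszuschlie&#223;en&#8230;
print(escape_non_ascii(s, "backslashreplace"))  # auszuschlie\xdfen\u2026
```

Because the result contains only ASCII characters, printing it should not raise UnicodeEncodeError on a cp437 console like the one in the traceback.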

Mark Ransom answered Oct 01 '22 03:10