UnicodeEncodeError: 'charmap' codec can't encode - character maps to <undefined>, print function [duplicate]

People also ask

How do I fix UnicodeEncodeError in Python?

Only a limited number of Unicode characters are mapped to strings. Thus, any character that is not-represented / mapped will cause the encoding to fail and raise UnicodeEncodeError. To avoid this error use the encode( utf-8 ) and decode( utf-8 ) functions accordingly in your code.

What is Charmap codec in Python?

The Python "UnicodeEncodeError: 'charmap' codec can't encode characters in position" occurs when we use an incorrect codec to encode a string to bytes. To solve the error, specify the correct encoding when opening the file or encoding the string, e.g. utf-8 . Here is an example of how the error occurs.

Does Python use UTF-8?

UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding.

I see three solutions to this:

Change the output encoding, so it will always output UTF-8. See e.g. Setting the correct encoding when piping stdout in Python, but I could not get these example to work.
Following example code makes the output aware of your target charset.
```
# -*- coding: utf-8 -*-
import sys

print sys.stdout.encoding
print u"Stöcker".encode(sys.stdout.encoding, errors='replace')
print u"Стоескер".encode(sys.stdout.encoding, errors='replace')
```
This example properly replaces any non-printable character in my name with a question mark.

If you create a custom print function, e.g. called myprint, using that mechanisms to encode output properly you can simply replace print with myprint whereever necessary without making the whole code look ugly.
Reset the output encoding globally at the begin of the software:

The page http://www.macfreek.nl/memory/Encoding_of_Python_stdout has a good summary what to do to change output encoding. Especially the section "StreamWriter Wrapper around Stdout" is interesting. Essentially it says to change the I/O encoding function like this:

In Python 2:
```
if sys.stdout.encoding != 'cp850':
  sys.stdout = codecs.getwriter('cp850')(sys.stdout, 'strict')
if sys.stderr.encoding != 'cp850':
  sys.stderr = codecs.getwriter('cp850')(sys.stderr, 'strict')
```
In Python 3:
```
if sys.stdout.encoding != 'cp850':
  sys.stdout = codecs.getwriter('cp850')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'cp850':
  sys.stderr = codecs.getwriter('cp850')(sys.stderr.buffer, 'strict')
```
If used in CGI outputting HTML you can replace 'strict' by 'xmlcharrefreplace' to get HTML encoded tags for non-printable characters.

Feel free to modify the approaches, setting different encodings, .... Note that it still wont work to output non-specified data. So any data, input, texts must be correctly convertable into unicode:
```
# -*- coding: utf-8 -*-
import sys
import codecs
sys.stdout = codecs.getwriter("iso-8859-1")(sys.stdout, 'xmlcharrefreplace')
print u"Stöcker"                # works
print "Stöcker".decode("utf-8") # works
print "Stöcker"                 # fails
```

Based on Dirk Stöcker's answer, here's a neat wrapper function for Python 3's print function. Use it just like you would use print.

As an added bonus, compared to the other answers, this won't print your text as a bytearray ('b"content"'), but as normal strings ('content'), because of the last decode step.

def uprint(*objects, sep=' ', end='\n', file=sys.stdout):
    enc = file.encoding
    if enc == 'UTF-8':
        print(*objects, sep=sep, end=end, file=file)
    else:
        f = lambda obj: str(obj).encode(enc, errors='backslashreplace').decode(enc)
        print(*map(f, objects), sep=sep, end=end, file=file)

uprint('foo')
uprint(u'Antonín Dvořák')
uprint('foo', 'bar', u'Antonín Dvořák')

For debugging purposes, you could use print(repr(data)).

To display text, always print Unicode. Don't hardcode the character encoding of your environment such as Cp850 inside your script. To decode the HTTP response, see A good way to get the charset/encoding of an HTTP response in Python.

To print Unicode to Windows console, you could use win-unicode-console package.

I dug deeper into this and found the best solutions are here.

http://blog.notdot.net/2010/07/Getting-unicode-right-in-Python

In my case I solved "UnicodeEncodeError: 'charmap' codec can't encode character "

original code:

print("Process lines, file_name command_line %s\n"% command_line))

New code:

print("Process lines, file_name command_line %s\n"% command_line.encode('utf-8'))

Related questions
                            
                                How do I create a dictionary with keys from a list and values defaulting to (say) zero? [duplicate]
                            
                                Get list of all routes defined in the Flask app
                            
                                How do you read a file into a list in Python? [duplicate]
                            
                                Not able to install Python packages [SSL: TLSV1_ALERT_PROTOCOL_VERSION]
                            
                                mongodb: insert if not exists
                            
                                How do I include related model fields using Django Rest Framework?
                            
                                Items in JSON object are out of order using "json.dumps"?
                            
                                'id' is a bad variable name in Python
                            
                                Why doesn't a python dict.update() return the object?
                            
                                How to uninstall Anaconda completely from macOS
                            
                                How can I generate a unique ID in Python? [duplicate]
                            
                                Best way to structure a tkinter application?
                            
                                Calculate difference in keys contained in two Python dictionaries
                            
                                Perform commands over ssh with Python
                            
                                Bulk insert with SQLAlchemy ORM
                            
                                How can I build multiple submit buttons django form?
                            
                                Fail during installation of Pillow (Python module) in Linux
                            
                                Is it ok to use dashes in Python files when trying to import them?
                            
                                Pipenv: Command Not Found
                            
                                Cosine Similarity between 2 Number Lists

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

UnicodeEncodeError: 'charmap' codec can't encode - character maps to <undefined>, print function [duplicate]

Tags:

python

encode

encoding

decode

People also ask

Recent Activity

Donate For Us