UnicodeEncodeError when using the compile function

Tags: python, python-3.x, windows, unicode

Using Python 3.2 on Windows 7, I get the following in IDLE:

>>> compile('pass', r'c:\temp\工具\module1.py', 'exec')
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

Can anybody explain why compile() tries to encode the Unicode filename using mbcs? I know that sys.getfilesystemencoding() returns 'mbcs' on Windows, but I thought this encoding was not used when Unicode filenames are provided.

For example, this works:

f = open(r'c:\temp\工具\module1.py')

For a more complete test, save the following in a UTF-8 encoded file and run it with the standard python.exe, version 3.2:

# -*- coding: utf8 -*-
fname = r'c:\temp\工具\module1.py'
# I do have a file named fname, but you can comment out the following two lines
f = open(fname)
print('ok')
cmp = compile('pass', fname, 'exec')
print(cmp)

Output:

ok
Traceback (most recent call last):
  File "module8.py", line 6, in <module>
    cmp = compile('pass', fname, 'exec')
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

230

asked Jan 10 '12 04:01

PyScripter

2 Answers

Judging from Python issue 10114, the reasoning seems to be that all filenames used by Python should be valid for the platform where they are used. The filename is encoded with the filesystem encoding so that it can be used in Python's C internals.
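To see why that encoding step can fail, here is a minimal sketch that checks whether the filename from the question is representable in a few codecs. The helper name can_encode is made up for illustration, and cp1252 and gbk are only stand-ins for possible Windows ANSI code pages (the 'mbcs' codec itself exists only on Windows), so treat this as an approximation of what happens rather than the exact code path inside compile().

```python
import sys


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


filename = r'c:\temp\工具\module1.py'

# The encoding this interpreter reports for filenames
# ('mbcs' on the Python 3.2 / Windows 7 setup described in the question).
print('filesystem encoding:', sys.getfilesystemencoding())

# cp1252 and gbk are illustrative stand-ins for an ANSI code page,
# not necessarily what 'mbcs' maps to on any given machine.
for codec in ('cp1252', 'gbk', 'utf-8'):
    print(codec, can_encode(filename, codec))
```

A Western European code page has no slot for the two CJK characters, so the encode step fails there, while a Chinese code page or UTF-8 can represent them.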

I agree that it probably shouldn't raise an error on Windows, because any Unicode filename is valid there. You may wish to file a bug report with Python for this. Be aware, though, that the necessary changes might not be trivial, because any C code that uses the filename needs a fallback for the case where it can't be encoded.
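Until the interpreter handles this itself, one possible workaround on the caller's side is to retry with an ASCII-escaped filename. This is only a sketch: safe_compile and ascii_fallback_name are hypothetical helpers, not part of Python, and the escaped name is an assumption about what is acceptable for your tooling.

```python
def ascii_fallback_name(filename):
    """Escape non-ASCII characters so that any filesystem encoding can hold the name."""
    return filename.encode('ascii', 'backslashreplace').decode('ascii')


def safe_compile(source, filename, mode='exec'):
    """Hypothetical helper: compile with the real filename, else with an escaped one."""
    try:
        return compile(source, filename, mode)
    except UnicodeEncodeError:
        # Only reached when the interpreter cannot encode the filename,
        # as in the Python 3.2 / mbcs case from the question.
        return compile(source, ascii_fallback_name(filename), mode)


code = safe_compile('pass', r'c:\temp\工具\module1.py')
print(code.co_filename)
```

The trade-off is that tracebacks and co_filename show the escaped name when the fallback is taken, so anything that maps code objects back to files on disk would need to undo the escaping.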

169

answered Oct 23 '22 07:10

Thomas K

Here is a solution that worked for me, from Issue 427 (UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-6: ordinal not in range (128)):

If you look at the PyScripter help file, the topic "Encoded Python Source Files" (last paragraph) tells you how to configure Python to support other encodings by modifying the site.py file. This file is in the lib subdirectory of the Python installation directory. Find the function setencoding and make sure that support for locale-aware default string encodings is turned on (see below).

def setencoding():
  """Set the string encoding used by the Unicode implementation.  The
  default is 'ascii', but if you're willing to experiment, you can
  change this."""
  # Note: sys is already imported at the top of site.py.
  encoding = "ascii" # Default value set by _PyUnicode_Init()
  if 0:  # <<<--- set this to 1
      # Enable to support locale aware default string encodings.
      import locale
      loc = locale.getdefaultlocale ()
      if loc[1]:
          encoding = loc[1]
  if 0:
      # Enable to switch off string to Unicode coercion and implicit
      # Unicode to string conversion.
      encoding = "undefined"
  if encoding != "ascii":
      # On Non-Unicode builds this will raise an AttributeError...
      sys.setdefaultencoding (encoding) # Needs Python Unicode build !
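Before editing site.py, it may be worth checking what your interpreter actually reports, since this recipe depends on sys.setdefaultencoding being available. The sketch below only inspects values. The function name report_encodings and the dictionary keys are made up for illustration.

```python
import locale
import sys


def report_encodings():
    """Collect the encoding settings relevant to this discussion."""
    return {
        'default': sys.getdefaultencoding(),
        'filesystem': sys.getfilesystemencoding(),
        'locale': locale.getpreferredencoding(False),
        # The site.py recipe can only work if this hook exists.
        'has_setdefaultencoding': hasattr(sys, 'setdefaultencoding'),
    }


for name, value in report_encodings().items():
    print(name, value)
```

If has_setdefaultencoding comes back False, the setencoding change above has nothing to call, and the filesystem encoding is the value to look at for the compile() error in the question.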

answered Oct 23 '22 07:10

Framester
