When running this code with python myscript.py from the Windows console cmd.exe (i.e. outside of Sublime Text), it works:
# coding: utf8
import json
d = json.loads("""{"mykey": {"readme": "Café"}}""")
print d['mykey']['readme']
Café
When running it inside Sublime Text 2 with CTRL+B, it fails:
Either like this (by default):
print d['mykey']['readme']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
[Finished in 0.1s with exit code 1]
or like this, after applying the solution from this answer to "printing UTF-8 in Python 3 using Sublime Text 3" (i.e. adding "env": {"PYTHONIOENCODING": "utf8"}, in the build system):
[Decode error - output not utf-8]
[Decode error - output not utf-8]
[Finished in 0.1s]
Adding "encoding": "utf-8" in the Python.sublime-build file doesn't help either.
How can I print properly in the Sublime Text 2 (for Windows) console when the output contains UTF-8 characters?
Note: this is not a duplicate of printing UTF-8 in Python 3 using Sublime Text 3; I already linked to that question above.
Here is the Python.sublime-build file:
{ "cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python" }
(I tried with and without "env": ..., and with and without "encoding": ...)
A possible quick fix:
# coding: utf8
import json
d = json.loads("""{"mykey": {"readme": "Café"}}""", encoding='latin1')
print d['mykey']['readme'].encode('latin1')
This is a long answer full of gory details, but the TL;DR version is that this appears to be a bug in Sublime Text 2 (in particular in its exec command).
There are instructions below on how to patch Sublime in order to potentially solve the problem (it worked in all of my tests at least) if upgrading to Sublime Text 3 is not an option, as Sublime 3 has an enhanced exec command.
Something to note is that the error you're seeing in the form of:
[Decode error - output not utf-8]
is generated by Sublime as it adds data to the output panel, not by Python. Even with the fix outlined below, it may still be necessary (depending on system setup and/or platform) to include the env setting mentioned in your question, since that tells Python to generate its output in UTF-8 regardless of what it thinks it should do.
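To see what the env setting changes, here is a minimal sketch (not part of the original tests) that launches a child interpreter with PYTHONIOENCODING set and captures the raw bytes it writes to a pipe, much as a build system would. The helper name child_output is hypothetical, and the sketch assumes the parent process runs on Python 3.

```python
import os
import subprocess
import sys

def child_output(io_encoding):
    """Run a child Python that prints 'Café' and return its raw stdout bytes.

    child_output is a hypothetical helper used only for this illustration.
    """
    # Copy the current environment and force the child's stdio encoding.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    # The child program uses an escape so its source stays pure ASCII.
    code = "print(u'Caf\\xe9')"
    raw = subprocess.check_output([sys.executable, "-c", code], env=env)
    return raw.strip()  # drop the trailing newline (\n or \r\n)

print(repr(child_output("utf-8")))    # é is written as the two bytes C3 A9
print(repr(child_output("latin-1")))  # é is written as the single byte E9
```

The same text therefore reaches the pipe as different bytes depending on the variable, which is why the reader on the other end (here, Sublime's output panel) has to decode with a matching encoding.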
For the purposes of the following tests, I installed Sublime Text 2 and Python 2.7.14 on my Windows 7 machine. This machine already has Python 3 installed and added to the PATH, so I installed this version into C:\Python27-64 as indicated in your sample build file and left it out of the PATH.
With the exception of installing PackageResourceViewer and bumping up the default font size, Sublime is otherwise stock.
The test script is the following, slightly modified from the version outlined in your question:
# coding: utf8
import sys
print(sys.version)
print("Café")
Since everything is stock, the Build System in Tools > Build System is set to Automatic, and trying to run the build with Ctrl+B produces the following output:
3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)]
[Decode error - output not utf-8]
[Finished in 0.1s]
This makes sense because, as mentioned above, Python 3 is on my PATH but Python 2 is not, so it's picking Python 3.
The default Python.sublime-build is the following:
{
"cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python"
}
Using PackageResourceViewer, I opened up the file and modified it to invoke the Python 2 interpreter directly:
{
"cmd": ["C:\\Python27-64\\python.exe", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python"
}
With this in place, the build results look like this:
2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:25:58) [MSC v.1500 64 bit (AMD64)]
Café
[Finished in 0.1s]
Notice that it's running Python 2, but it's also properly displaying the data now, without having to modify anything.
That's somewhat curious, and I must admit I went down a few rabbit holes on this because it seemed to work right off the bat. However, if you comment out the print of sys.version:
# coding: utf8
import sys
#print(sys.version)
print("Café")
It stops working:
[Decode error - output not utf-8]
[Decode error - output not utf-8]
[Finished in 0.1s]
Alternatively, if you slightly modify the text that's being printed so that it doesn't end on the accented character:
# coding: utf8
import sys
# print(sys.version)
print("Café au lait")
Now it works as you might expect:
Café au lait
[Finished in 0.1s]
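I believe this to be a bug in the exec command that ships with Sublime Text in the Default package. In particular, it decodes data just prior to it being inserted into the build results, so it is potentially sensitive to where the buffer cutoffs happen when the data is being read: if a read ends partway through a multi-byte UTF-8 character such as é, the chunks on either side of the cut are not valid UTF-8 by themselves.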
Conversely, Sublime Text 3 has a modified version of the exec command which (among other enhancements) uses an incremental decoder at the point where the data is read from the pipe, and doesn't exhibit this issue.
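The difference between the two approaches can be sketched in a few lines. This is only an illustration and not the code from either version of exec.py: the function names are hypothetical, and the chunk boundary is an assumption chosen to fall between the two bytes that encode é.

```python
import codecs

data = u"Caf\xe9\n".encode("utf-8")   # b'Caf\xc3\xa9\n'
# Assumed read boundary: the cut falls between the two bytes of é.
chunks = [data[:4], data[4:]]          # b'Caf\xc3' and b'\xa9\n'

def decode_per_chunk(chunks):
    """Decode every chunk on its own, with no state kept between reads."""
    out = []
    for chunk in chunks:
        try:
            out.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            out.append(u"[Decode error - output not utf-8]")
    return out

def decode_incrementally(chunks):
    """Feed all chunks to one incremental decoder, which holds back
    an incomplete character until the rest of its bytes arrive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    return [decoder.decode(chunk) for chunk in chunks]

print(decode_per_chunk(chunks))       # both chunks fail to decode
print(decode_incrementally(chunks))   # 'Caf', then 'é' plus the newline
```

With this particular split, per-chunk decoding fails twice, while the incremental decoder returns the complete text once the second chunk arrives.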
Modifying the exec command in Sublime 2 to also use incremental decoding appears to fix the problem, although I will admit that I didn't do any exhaustive testing of this.
I have created a public gist that contains a modified version of the exec.py file that provides the exec command used by the build system, along with instructions on how to apply it.
If you use that, your existing build system (and even the default) should work fine for you, with the caveat mentioned above that you may still need the env setting in the build to force the Python interpreter to actually output UTF-8 if it doesn't already.
I have found a possible fix: add the encoding parameter in the Python.sublime-build file:
{
"cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python",
"encoding": "cp1252",
...
Note: "encoding": "latin1" seems to work as well, but (I don't know why) "encoding": "utf8" does not work, even if the .py file is UTF-8, even if Python 3 uses UTF-8, etc. Mystery!
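One plausible explanation, offered as an assumption rather than something verified on this setup: without the env setting, the interpreter may be writing to the pipe in the Windows ANSI code page (cp1252) instead of UTF-8, so the bytes only decode cleanly when the build's encoding matches. A minimal sketch, with try_decode as a hypothetical helper:

```python
# Assumption: the interpreter wrote 'Café' to the pipe using cp1252.
raw = u"Caf\xe9".encode("cp1252")      # b'Caf\xe9'

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid
    in the given encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

for name in ("cp1252", "latin1", "utf8"):
    print(name, repr(try_decode(raw, name)))
```

Under that assumption, cp1252 and latin1 both map the byte E9 to é, while utf8 rejects it because E9 on its own is an incomplete multi-byte sequence.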
Edit: This works now:
{
"cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python",
"encoding": "utf8",
"env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},
}
Linked topics:
Setting the correct encoding when piping stdout in Python (and this answer in particular)
How to change the preferred encoding in Sublime Text 3 for MacOS (for the env trick)