
UTF-8 in AppVeyor, Python 3.6

1. Summary

I can't find out how to use non-ASCII characters in AppVeyor builds.


2. Settings

My simple SashaAppVeyorEncoding.py file:

print('Саша наилучшая!')

My simple appveyor.yml file:

environment:

  matrix:

    - PYTHON: "C:\\Python36-x64"
      PYTHON_VERSION: "3.6.3"
      PYTHON_ARCH: "64"
      PIP: "C:\\Python36-x64\\Scripts\\pip"

platform: x64

build_script:

  - cmd: "%PYTHON%\\python SashaAppVeyorEncoding.py"

I save both files in UTF-8 encoding.


3. Expected behavior

If I run SashaAppVeyorEncoding.py in my terminal or in the SublimeREPL interpreter, I get:

D:\SashaPythonista>python SashaAppVeyorEncoding.py
Саша наилучшая!

If my SashaAppVeyorEncoding.py file does not contain Cyrillic characters:

print('Sasha superior!')

the AppVeyor build passes:

Build started
git clone -q --branch=master https://github.com/Kristinita/SashaPythonista.git C:\projects\sashapythonista-7l3yk
git checkout -qf 3a0393a5b9548a5debabebfc5e28d17f3000a768
%PYTHON%\python SashaAppVeyorEncoding.py
Sasha superior!
Discovering tests...OK
Build success

4. Actual behavior

With the Cyrillic string, my AppVeyor build fails:

Build started
git clone -q --branch=master https://github.com/Kristinita/SashaPythonista.git C:\projects\sashapythonista-7l3yk
git checkout -qf 262cef287d45b1548640b9a773b680de90b7d138
%PYTHON%\python SashaAppVeyorEncoding.py
Traceback (most recent call last):
  File "SashaAppVeyorEncoding.py", line 1, in <module>
    print('\u0421\u0430\u0448\u0430 \u043d\u0430\u0438\u043b\u0443\u0447\u0448\u0430\u044f!')
  File "C:\Python36-x64\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
Command exited with code 1
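
The failure can be reproduced without AppVeyor. This is a minimal sketch, assuming only that the output stream uses cp1252, as the traceback shows. Encoding the same string by hand fails on the same characters, while UTF-8 succeeds.

```python
text = 'Саша наилучшая!'

# cp1252 contains no Cyrillic letters, so encoding stops at the first word.
try:
    text.encode('cp1252')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Codec name and the half-open range [start, end) of the offending characters.
print(error.encoding, error.start, error.end)

# UTF-8 can represent every Unicode character, so this does not raise.
encoded = text.encode('utf-8')
print(encoded.decode('utf-8') == text)
```

The range reported by the exception corresponds to "position 0-3" in the build log, that is, the four letters of the first word.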

5. What did not help

  1. I added # -*- coding: utf-8 -*- at the top of the SashaAppVeyorEncoding.py file,
  2. I added the chcp 65001 command to the appveyor.yml file,
  3. I added installation of win-unicode-console to the appveyor.yml file.

My updated SashaAppVeyorEncoding.py file:

# -*- coding: utf-8 -*-
print('Саша наилучшая!')

My updated appveyor.yml file:

environment:

  matrix:

    - PYTHON: "C:\\Python36-x64"
      PYTHON_VERSION: "3.6.3"
      PYTHON_ARCH: "64"
      PIP: "C:\\Python36-x64\\Scripts\\pip"

platform: x64

install:

  - cmd: "%PIP% install win-unicode-console"
  - cmd: chcp 65001

build_script:

  - cmd: "%PYTHON%\\python SashaAppVeyorEncoding.py"

My updated AppVeyor build still fails:

Build started
git clone -q --branch=master https://github.com/Kristinita/SashaPythonista.git C:\projects\sashapythonista-7l3yk
git checkout -qf 11df07d4c424cd8e28a1b0db0f43906aa63f42f1
Running Install scripts
%PIP% install win-unicode-console
Collecting win-unicode-console
  Downloading win_unicode_console-0.5.zip
Installing collected packages: win-unicode-console
  Running setup.py install for win-unicode-console: started
    Running setup.py install for win-unicode-console: finished with status 'done'
Successfully installed win-unicode-console-0.5
chcp 65001
Active code page: 65001
%PYTHON%\python SashaAppVeyorEncoding.py
Traceback (most recent call last):
  File "SashaAppVeyorEncoding.py", line 2, in <module>
    print('\u0421\u0430\u0448\u0430 \u043d\u0430\u0438\u043b\u0443\u0447\u0448\u0430\u044f!')
  File "C:\Python36-x64\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
Command exited with code 1

6. Local environment

Operating system and version:
Windows 10 Enterprise LTSB 64-bit EN
Python:
3.6.3
chcp:
Active code page: 65001
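
To compare the local environment with the build machine, it may help to print the encodings Python itself reports, not only the chcp value. This is a small diagnostic sketch; the report dictionary and its key names are made up for illustration.

```python
import locale
import sys

# Collect the values that decide how print() turns text into bytes.
report = {
    # True when stdout is an interactive console, False when it is redirected.
    'stdout_is_console': sys.stdout.isatty(),
    # Encoding Python actually uses for stdout.
    'stdout_encoding': sys.stdout.encoding,
    # Locale encoding, which is independent of the console code page.
    'preferred_encoding': locale.getpreferredencoding(),
}

for name, value in report.items():
    print(name, '=', value)
```

Running the same script locally and as a build step would show whether the two environments differ in these values.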

Саша Черных asked Oct 17 '22 16:10



1 Answer

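It looks like Python doesn't print to the console on AppVeyor; the output is redirected. Therefore locale.getpreferredencoding() is used to encode Unicode text to bytes when printing to stdout. On the build machine that encoding is cp1252, which supports only a few of the more than a million Unicode characters and none of the Cyrillic ones. This would also explain why chcp 65001 and win-unicode-console did not help: both affect console output, not a redirected stream. To change sys.stdout.encoding here, you could set the PYTHONIOENCODING=utf-8 environment variable; the UTF-8 character encoding supports all Unicode characters.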
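
The effect of PYTHONIOENCODING can be checked locally. The sketch below is only an approximation of the AppVeyor setup: it starts a child interpreter with its stdout redirected to a pipe and with PYTHONIOENCODING=utf-8 in its environment, then decodes the captured bytes. The names child_code, env and decoded are illustrative.

```python
import os
import subprocess
import sys

# The one-line script from the question, written with escapes so that the
# command line itself stays ASCII-only.
child_code = (
    r"print('\u0421\u0430\u0448\u0430 "
    r"\u043d\u0430\u0438\u043b\u0443\u0447\u0448\u0430\u044f!')"
)

# Copy the current environment and force UTF-8 for the child's stdio.
env = dict(os.environ, PYTHONIOENCODING='utf-8')

# stdout is a pipe here, so the child is not writing to a console.
result = subprocess.run(
    [sys.executable, '-c', child_code],
    stdout=subprocess.PIPE,
    env=env,
    check=True,
)

# The captured bytes decode cleanly as UTF-8.
decoded = result.stdout.decode('utf-8').strip()
print(ascii(decoded))
```

On AppVeyor the variable would presumably sit next to the other entries under environment in appveyor.yml, for example as PYTHONIOENCODING: utf-8. That placement is an assumption and is not part of the original answer.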

jfs answered Oct 21 '22 08:10
