In what encoding are the elements of sys.argv, in Python? Are they encoded with the sys.getdefaultencoding() encoding?
sys.getdefaultencoding(): Return the name of the current default string encoding used by the Unicode implementation.
PS: As pointed out in some of the answers, sys.stdin.encoding would indeed be a better guess. I would love to see a definitive answer to this question, though, with pointers to solid sources!
PPS: As Wim pointed out, Python 3 solves this issue by putting str objects in sys.argv (if I understand correctly). The question remains open for Python 2.x, though. Under Unix, the LC_CTYPE environment variable seems to be the correct thing to check, no? What should be done with Windows (so that sys.argv elements are correctly interpreted whatever the console)?
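To illustrate the Python 3 behaviour mentioned in the PPS, here is a minimal sketch. The helper name argv_as_bytes is made up for this example. It relies on os.fsencode, which converts each str argument back to bytes using the filesystem encoding. The result should be treated as a best-effort view of the original bytes, not a guarantee on every platform.

```python
import os
import sys


def argv_as_bytes(argv=None):
    """Return the given arguments re-encoded to bytes (Python 3 only).

    argv_as_bytes is a hypothetical helper; it defaults to sys.argv.
    """
    if argv is None:
        argv = sys.argv
    # os.fsencode uses the filesystem encoding and its error handler.
    return [os.fsencode(arg) for arg in argv]


if __name__ == "__main__":
    for text, raw in zip(sys.argv, argv_as_bytes()):
        # In Python 3 every element of sys.argv is a str (Unicode) object.
        print(type(text).__name__, repr(text), "->", repr(raw))
```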
What is the data type of the elements of sys.argv? sys.argv is a Python list; the command-line arguments are stored in it in the order they were given. Its elements are strings: byte strings (str) in Python 2 and Unicode str objects in Python 3.
sys.argv is the list of command-line arguments passed to the Python program. It holds every item supplied on the command line, much like an array of the program's arguments. Don't forget that indexing starts at zero (0), not one (1); sys.argv[0] is the script name.
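A small sketch of the zero-based indexing described above. The function name split_argv and the sample argument list are made up for illustration.

```python
import sys


def split_argv(argv):
    """Split an argv-style list into (script_name, remaining_arguments)."""
    # Index 0 is the script name; the user's arguments start at index 1.
    return argv[0], argv[1:]


if __name__ == "__main__":
    script, args = split_argv(sys.argv)
    print("script:", script)
    for index, value in enumerate(args, start=1):
        print(index, repr(value))
```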
UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding.
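To make the "8-bit values" point concrete, the following sketch lists the 8-bit units that UTF-8 produces for a few characters. The helper name utf8_units is hypothetical. Characters are written as escape sequences so that the example does not depend on the source file's own encoding.

```python
def utf8_units(text):
    """Return the UTF-8 encoding of text as a list of 8-bit integers."""
    return list(bytearray(text.encode("utf-8")))


if __name__ == "__main__":
    # ASCII letter, e with acute accent, euro sign.
    for char in (u"A", u"\u00e9", u"\u20ac"):
        print(repr(char), [hex(unit) for unit in utf8_units(char)])
```

A single character may take one, two, or more 8-bit units, which is why the byte length of an argument can differ from its character count.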
I'm guessing that you are asking this because you ran into issue 2128. Note that this has been fixed in Python 3.0.
A few observations:
(1) It's certainly not sys.getdefaultencoding.
(2) sys.stdin.encoding appears to be a much better bet.
(3) On Windows, the actual value of sys.stdin.encoding will vary, depending on what software is providing the stdio. IDLE will use the system "ANSI" code page, e.g. cp1252 in most of Western Europe and America and former colonies thereof. However, in the Command Prompt window, which emulates MS-DOS more or less, the corresponding old DOS code page (e.g. cp850) will be used by default. This can be changed by using the CHCP (change code page) command.
(4) The documentation for the subprocess module doesn't provide any suggestions on what encoding to use for args and stdout.
(5) One trusts that assert sys.stdin.encoding == sys.stdout.encoding never fails.
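The observations above can be checked on a given machine with a short sketch like the one below. It is a guess-based approach, not a definitive answer. The names candidate_encodings and decode_arg are made up for this example. Falling back to locale.getpreferredencoding() when sys.stdin.encoding is unavailable is an assumption, not something the observations establish.

```python
import locale
import sys


def candidate_encodings():
    """Collect the encodings discussed above so they can be compared."""
    return {
        "sys.getdefaultencoding()": sys.getdefaultencoding(),
        # stdin/stdout may be None or lack .encoding (e.g. when redirected).
        "sys.stdin.encoding": getattr(sys.stdin, "encoding", None),
        "sys.stdout.encoding": getattr(sys.stdout, "encoding", None),
        "locale.getpreferredencoding()": locale.getpreferredencoding(),
    }


def decode_arg(arg, encoding=None):
    """Decode a byte-string argument; text arguments pass through unchanged.

    If no encoding is given, guess sys.stdin.encoding, then the locale's
    preferred encoding (an assumption, see the note above).
    """
    if not isinstance(arg, bytes):
        return arg  # Python 3 argv elements are already text
    if encoding is None:
        encoding = (getattr(sys.stdin, "encoding", None)
                    or locale.getpreferredencoding())
    # "replace" avoids an exception when the guess turns out to be wrong.
    return arg.decode(encoding, "replace")


if __name__ == "__main__":
    for name, value in sorted(candidate_encodings().items()):
        print(name, "=", value)
    print([decode_arg(arg) for arg in sys.argv])
```

The same bytes decode differently under different code pages, which is the practical consequence of observation (3): for example, the byte 0x82 is "é" in cp850, whereas cp1252 uses 0xE9 for that character.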