How do I tell Python that sys.argv is in Unicode?

Tags: python, terminal, macos, unicode

Here is a little program:

import sys

f = sys.argv[1]
print type(f)
print u"f=%s" % (f)

Here is my running of the program:

$ python x.py 'Recent/רשימת משתתפים.LNK'
<type 'str'>
Traceback (most recent call last):
  File "x.py", line 5, in <module>
    print u"f=%s" % (f)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 7: ordinal not in range(128)
$

The problem is that Python treats sys.argv[1] as an ASCII string, which it can't convert to Unicode. But I'm using a Mac with a fully Unicode-aware Terminal, so x.py is actually receiving Unicode text. How do I tell Python that sys.argv[] is Unicode and not ASCII? Failing that, how do I convert ASCII (that has Unicode inside it) into Unicode? The obvious conversions don't work.


asked Feb 25 '11 04:02

vy32

1 Answer

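The UnicodeDecodeError you see occurs because you are mixing the Unicode string u"f=%s" with the sys.argv[1] bytestring. Python 2 then tries to decode the bytestring implicitly with the ASCII codec, which fails on the first non-ASCII byte. Make both operands the same type: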

Both bytestrings:
```
  $ python2 -c'import sys; print "f=%s" % (sys.argv[1],)' 'Recent/רשימת משתתפים'
```
This passes bytes transparently to and from your terminal, so it works for any encoding.

Both Unicode:
```
  $ python2 -c'import sys; print u"f=%s" % (sys.argv[1].decode("utf-8"),)' 'Rec..
```
Here, replace 'utf-8' with the encoding your terminal uses. You might use sys.getfilesystemencoding() here if the terminal is not Unicode-aware.

Both commands produce the same output:

f=Recent/רשימת משתתפים
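
The failure can also be reproduced without a terminal. The sketch below is only an illustration: `raw` is a hypothetical stand-in for what sys.argv[1] holds under Python 2, assuming the terminal sends UTF-8. It decodes explicitly with the ASCII codec (which is what Python 2 does implicitly by default when a bytestring meets a Unicode string) and then with UTF-8:

```python
# Hypothetical stand-in for sys.argv[1] under Python 2: "Recent/" followed by
# the UTF-8 bytes of the Hebrew letter resh (U+05E8). Not taken from a real run.
raw = b"Recent/\xd7\xa8"

error = None
try:
    # The same decode that Python 2 attempts implicitly for u"f=%s" % raw.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc  # byte 0xd7 is outside the ASCII range

# Decoding with the encoding the bytes were actually written in succeeds.
text = raw.decode("utf-8")
message = u"f=%s" % text
```

The error is raised at the first byte after "Recent/", which matches the position reported in the traceback from the question.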

In general, you should convert bytestrings that you consider to be text to Unicode as soon as possible.
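
One way to apply that advice is to decode the arguments once, at the start of the program. This is a minimal sketch, not a definitive implementation: `to_text` is a hypothetical helper name, and falling back to sys.getfilesystemencoding() and then "utf-8" is an assumption that may not match every terminal.

```python
import sys


def to_text(value, encoding=None):
    """Return value as Unicode text, decoding it if it is a bytestring."""
    if isinstance(value, bytes):  # on Python 2, bytes is an alias for str
        # Assumed fallback order: explicit argument, filesystem encoding, UTF-8.
        encoding = encoding or sys.getfilesystemencoding() or "utf-8"
        return value.decode(encoding)
    return value  # already Unicode text, leave untouched


# Decode every argument at the boundary; the rest of the program sees text only.
args = [to_text(arg) for arg in sys.argv[1:]]
for arg in args:
    print(u"f=%s" % arg)
```

On Python 3, sys.argv already contains Unicode strings, so the helper returns them unchanged there.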

answered Sep 30 '22 11:09

jfs
