Why does Python 2's raw_input output unicode strings?

Question

I tried the following in Codecademy's Python lesson:

hobbies = []

# Add your code below!
for i in range(3):
    Hobby = str(raw_input("Enter a hobby:"))
    hobbies.append(Hobby)

print hobbies

With this, it works fine, but if I instead try

Hobby = raw_input("Enter a hobby:")

I get [u'Hobby1', u'Hobby2', u'Hobby3']. Where are the extra u prefixes coming from?

John Y · Accepted Answer

The question's subject line might be a bit misleading: Python 2's raw_input() normally returns a byte string, NOT a Unicode string.

However, it could return a Unicode string if it or sys.stdin has been altered or replaced (by an application, or as part of an alternative implementation of Python).
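As a rough illustration of that second case, here is a minimal sketch of a replaced raw_input that decodes whatever it reads. The names make_decoding_raw_input, fake_stdin, and patched_raw_input are made up for this example, and an in-memory stream stands in for the real input source. It is not a claim about how Codecademy actually does it.

```python
import io


def make_decoding_raw_input(stream, encoding="utf-8"):
    """Return a raw_input-like function that decodes each line to Unicode.

    Hypothetical helper for illustration only.
    """
    def decoding_raw_input(prompt=""):
        # The prompt is ignored in this sketch; only the return type matters.
        line = stream.readline()  # raw bytes from the underlying stream
        return line.rstrip(b"\r\n").decode(encoding)
    return decoding_raw_input


# Stand-in for whatever the hosting environment feeds to the program.
fake_stdin = io.BytesIO(b"Hobby1\nHobby2\nHobby3\n")
patched_raw_input = make_decoding_raw_input(fake_stdin)

hobbies = [patched_raw_input("Enter a hobby:") for _ in range(3)]
```

On Python 2, printing hobbies after this shows [u'Hobby1', u'Hobby2', u'Hobby3'], because printing a list shows the repr of each item and the repr of a unicode object carries the u prefix.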

Therefore, I believe @ByteCommander is on the right track with his comment:

Maybe this has something to do with the console it's running in?

The Python used by Codecademy is ostensibly 2.7, but (a) it was implemented by compiling the Python interpreter to JavaScript using Emscripten and (b) it's running in the browser; so between those factors, there could very well be some string encoding and decoding injected by Codecademy that isn't present in plain-vanilla CPython.

Note: I have not used Codecademy myself nor do I have any inside knowledge of its inner workings.

sriramganesh · Answer

The 'u' prefix means the value is a Unicode string. You can also call raw_input().encode('utf8') to convert it to a byte string.
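To make that concrete, here is a small sketch with the list hard-coded in place of real input. The variable names are only for illustration.

```python
# Unicode strings, as the environment in the question appears to return them.
hobbies = [u"Hobby1", u"Hobby2", u"Hobby3"]

# Encoding each item yields byte strings (on Python 2, bytes is an alias of str).
encoded = [h.encode("utf-8") for h in hobbies]

# Joining the items into one string sidesteps the list repr altogether.
joined = u", ".join(hobbies)
```

The u only shows up in the repr, which is what you see when a whole list is printed. This also explains why the str(...) wrapper in the question hides the prefix: on Python 2, calling str() on an ASCII-only unicode object gives a plain byte string.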

Edited: I checked in Python 2.7, and raw_input() returns a byte string, not a Unicode string, so the problem here is something else.

Edited: raw_input() can return a Unicode string if the environment's sys.stdin decodes its input to Unicode.

In the Codecademy Python environment, sys.stdin.encoding and sys.stdout.encoding are both None, and the default encoding is ASCII.

Python falls back to this default encoding only if it cannot determine a proper encoding from the environment.
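One way to check these values in whatever environment you are running is sketched below. describe_encodings is a helper name made up here, and getattr is used on the assumption that a replaced sys.stdin or sys.stdout might not have an encoding attribute at all.

```python
import sys


def describe_encodings():
    """Collect the encodings Python reports for stdin, stdout, and its default."""
    return {
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "default": sys.getdefaultencoding(),
    }


report = describe_encodings()
for name in sorted(report):
    # Single-argument print(...) behaves the same on Python 2 and Python 3.
    print("%s: %r" % (name, report[name]))
```

A value of None for stdin or stdout means Python could not determine an encoding for that stream.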

Tags:

python

input

unicode

python-2.x

user1936752
