Will everything in the standard library treat strings as unicode in Python 3.0?

Question

I'm a little confused about how the standard library will behave now that Python (from 3.0) is Unicode-based. Will modules such as cgi and urllib use Unicode strings, or will they use the new 'bytes' type and just provide encoded data?

pdc · Accepted Answer

Logically, a lot of things like MIME-encoded mail messages, URLs, XML documents, and so on should be returned as bytes, not strings. This could cause some consternation as the libraries start to be nailed down for Python 3 and people discover that they have to be more aware of bytes/string conversions than they were of str/unicode conversions ...
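To make the bytes/string conversion concrete, here is a minimal sketch of what that awareness looks like in Python 3. The sample text and the choice of UTF-8 are assumptions for illustration only; real data may use a different encoding.

```python
# Text lives in str; data on the wire lives in bytes.
text = "café"

# str -> bytes: the encoding has to be named explicitly.
data = text.encode("utf-8")

# bytes -> str: decode with the same encoding to get the text back.
round_trip = data.decode("utf-8")

# Mixing the two types is an error in Python 3, not a silent conversion.
mixing_error = None
try:
    "abc" + b"def"
except TypeError as exc:
    mixing_error = exc
```

In Python 2 the equivalent str/unicode mix would often be converted implicitly, which is why the Python 3 behaviour can come as a surprise.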

cdleary · Answer

One of the great things about this question (and Python in general) is that you can just mess around in the interpreter! Python 3.0 rc1 is currently available for download.

>>> import urllib.request
>>> fh = urllib.request.urlopen('http://www.python.org/')
>>> print(type(fh.read(100)))
<class 'bytes'>
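Since read() hands back bytes, turning the page into text is left to the caller. The sketch below shows one way to do that; decode_body is a hypothetical helper (not part of urllib), and falling back to UTF-8 when no charset is declared is an assumption, not a rule.

```python
from email.message import Message


def decode_body(raw: bytes, content_type: str, default: str = "utf-8") -> str:
    """Decode a response body using the charset named in a Content-Type value.

    Hypothetical helper for illustration; `default` is an assumed fallback.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    charset = msg.get_content_charset() or default
    return raw.decode(charset, errors="replace")


latin1_page = decode_body(b"caf\xe9", "text/html; charset=ISO-8859-1")
utf8_page = decode_body(b"caf\xc3\xa9", "text/html")
```

With a live response, fh.headers.get_content_charset() should give the same charset lookup directly, because the headers object is built on the same email.message.Message class.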

S.Lott · Answer

There will be a two-step dance here. See Python 3000 and You.

Step 1 is to get running under 3.0.

Step 2 is to rethink your APIs to, perhaps, do something more sensible.

The most likely course is that the libraries will switch to unicode strings to remain as compatible as possible with how they used to work.

Then, perhaps, some will switch to bytes to more properly implement the RFC standards for the various protocols.
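For a sense of how this can play out, the sketch below uses urllib.parse as it behaves in later Python 3 releases (an assumption about your interpreter; it does not describe 3.0 itself). There the parsing functions accept either str or bytes and return the same type they were given, while quoting converts between the two explicitly.

```python
from urllib.parse import quote, unquote, unquote_to_bytes, urlsplit

# str in, str out
text_parts = urlsplit("http://www.python.org/doc/")

# bytes in, bytes out (later Python 3 releases)
byte_parts = urlsplit(b"http://www.python.org/doc/")

# Percent-encoding goes through an explicit encoding (UTF-8 by default).
quoted = quote("café")
as_text = unquote(quoted)
as_bytes = unquote_to_bytes(quoted)
```

Here text_parts.netloc is a str and byte_parts.netloc is bytes, so code that sticks to one type throughout does not need to convert at every call.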

Tags:

python

string

python-3.x

unicode

cgi

hacama
