Is there any reason to prefer <code>unicode(somestring, 'utf8')</code> over <code>somestring.decode('utf8')</code>? My only thought is that <code>.decode()</code> is a bound method, so Python may be able to resolve it more efficiently, but correct me if I'm wrong.


unicode() vs. str.decode() for a utf8 encoded byte string (python 2.x)

2 Answers

I'd prefer 'something'.decode(...), since the unicode type no longer exists in Python 3.0, while text = b'binarydata'.decode(encoding) is still valid.
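To illustrate the portability point, here is a minimal sketch (not part of the original answer) that runs unchanged on Python 2.6+ and Python 3. The variable names raw and text are just for illustration, and the bytes are written with escapes so the snippet does not depend on the source file's encoding.

```python
from __future__ import print_function

# UTF-8 encoded bytes for the three-character string 'ééé'
raw = b'\xc3\xa9\xc3\xa9\xc3\xa9'

# .decode() is spelled the same way on Python 2 (str) and Python 3 (bytes)
text = raw.decode('utf-8')

# Python 2 reports <type 'unicode'>, Python 3 reports <class 'str'>
print(type(text))

# 6 bytes in, 3 code points out
print(len(raw), len(text))
```

By contrast, unicode(raw, 'utf-8') raises NameError on Python 3, because the name unicode is no longer defined there.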


answered Oct 11 '22 12:10

dF.

It's easy to benchmark it:

>>> from timeit import Timer
>>> ts = Timer("s.decode('utf-8')", "s = 'ééé'")
>>> ts.timeit()
8.9185450077056885
>>> tu = Timer("unicode(s, 'utf-8')", "s = 'ééé'")
>>> tu.timeit()
2.7656929492950439

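In this run, unicode() is clearly faster: about 2.8 s versus 8.9 s for timeit's default of one million calls, roughly a threefold difference.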

FWIW, I don't know where you get the impression that methods would be faster - it's quite the contrary.
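If you want to repeat the measurement yourself, something like the following should work as a script on both Python 2 and Python 3. It is only a sketch: the helper name bench and the alias text_type are made up for this example, and on Python 3 it times str(raw, 'utf-8') as the closest equivalent, since unicode() is gone. The absolute numbers and even the ratio depend on the interpreter version and the machine, so don't expect to reproduce the figures above.

```python
from __future__ import print_function
import timeit

try:
    text_type = unicode   # Python 2: the built-in unicode type
except NameError:
    text_type = str       # Python 3: str(bytes, encoding) plays the same role

# UTF-8 encoded bytes for 'ééé'
RAW = b'\xc3\xa9\xc3\xa9\xc3\xa9'


def bench(number=100000):
    """Return (method_seconds, constructor_seconds) for `number` calls each."""
    # Both candidates pay the same lambda call overhead, so the comparison is fair
    method = timeit.timeit(lambda: RAW.decode('utf-8'), number=number)
    constructor = timeit.timeit(lambda: text_type(RAW, 'utf-8'), number=number)
    return method, constructor


method_s, constructor_s = bench()
print('RAW.decode(...):     %.4f s' % method_s)
print('text_type(RAW, ...): %.4f s' % constructor_s)
```

Run it a few times and compare the two lines rather than trusting a single measurement.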

answered Oct 11 '22 12:10

bruno desthuilliers


Tags:

python

unicode

utf-8

ʞɔıu
