
Should I avoid converting to a string if a value is already a string?

Tags: python, string

Sometimes you have to use a list comprehension to convert everything to strings, including items that are already strings.

b = [str(a) for a in l]

But do I have to do:

b = [a if type(a)==str else str(a) for a in l]

I was wondering if str on a string is optimized enough to not create another copy of the string.

I have tried:

>>> x="aaaaaa"
>>> str(x) is x
True

but that may just be because Python can cache strings and reuse them. Is that behaviour guaranteed for any string value?

Jean-François Fabre asked Feb 14 '17 10:02



1 Answer

Testing if an object is already a string is slower than just always converting to a string.

That's because the str() call itself makes the exact same test (is the object already a string?). With an explicit check you are a) doing double the work, and b) running a test that is slower to boot.
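As a small check of the point above, here is a minimal sketch, assuming CPython 3. The strings are built at runtime so that caching of literals cannot explain the result; treat the identity outcome as an observation about CPython rather than a promise for every implementation. The name samples is just for illustration.

```python
# Build strings at runtime so interning of literals cannot explain the result.
samples = ["".join(["a", "b"] * 1000), "x" * 3 + str(42), ""]

for s in samples:
    # For an exact str instance, CPython's str() hands back the same object.
    assert str(s) is s

same_object = all(str(s) is s for s in samples)
print(same_object)
```

So b = [str(a) for a in l] should not copy the items that are already plain strings, at least on CPython.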

Note: for Python 2, using str() on unicode objects includes an implicit encode to ASCII, and this can fail. You may still have to special-case the handling of such objects. In Python 3, there is no need to worry about that edge case.

As there is some discussion around this:

  • isinstance(s, str) has a different meaning when s can be a subclass of str. As subclasses are treated exactly like any other type of object by str() (either __str__ or __repr__ is called on the object), this difference matters here.
  • You should use type(s) is str for exact type checks. Types are singletons, so an identity test with is works and is slightly faster than ==:

    >>> import timeit
    >>> timeit.timeit("type(s) is str", "s = ''")
    0.10074466899823165
    >>> timeit.timeit("type(s) == str", "s = ''")
    0.1110201120027341
    
  • Using s if type(s) is str else str(s) is significantly slower for the non-string case:

    >>> import timeit
    >>> timeit.timeit("str(s)", "s = None")
    0.1823573520014179
    >>> timeit.timeit("s if type(s) is str else str(s)", "s = None")
    0.29589492800005246
    >>> timeit.timeit("str(s)", "s = ''")
    0.11716728399915155
    >>> timeit.timeit("s if type(s) is str else str(s)", "s = ''")
    0.12032335300318664
    

    (The timings for the s = '' cases are very close and keep swapping places).
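The subclass point from the first bullet can be sketched as follows, assuming CPython 3. Tag and Loud are hypothetical classes made up for this illustration; they are not part of the question.

```python
class Tag(str):
    """Hypothetical str subclass that does not override __str__."""


class Loud(str):
    """Hypothetical str subclass with its own __str__."""

    def __str__(self):
        return self.upper()


t = Tag("abc")

# isinstance() accepts subclasses, the exact type check does not.
print(isinstance(t, str))  # True
print(type(t) is str)      # False

# str() treats the subclass like any other object and calls __str__,
# so the result is a plain str rather than the original Tag instance.
converted = str(t)
print(type(converted) is str)  # True

# A subclass is free to return something else entirely from __str__.
print(str(Loud("abc")))  # ABC
```

With an isinstance() test, both t and Loud("abc") would be passed through unchanged, while str() or an exact type test would convert them, so the two checks are not interchangeable.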

All timings in this post were taken on Python 3.6.0 on a MacBook Pro 15" (Mid 2015), OS X 10.12.3.
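To repeat the comparison on your own machine, something along these lines should do. It is only a sketch: the helper name measure and the number and repeat values are arbitrary choices for illustration, and the figures you get will differ from the ones quoted in this post.

```python
import timeit

# The two candidate expressions and the two inputs compared in this post.
STATEMENTS = ("str(s)", "s if type(s) is str else str(s)")
SETUPS = ("s = None", "s = ''")


def measure(number=100_000, repeat=3):
    """Return the best-of-repeat time in seconds for each (setup, statement) pair."""
    results = {}
    for setup in SETUPS:
        for stmt in STATEMENTS:
            # Take the minimum of several runs to reduce noise from other processes.
            best = min(timeit.repeat(stmt, setup, number=number, repeat=repeat))
            results[(setup, stmt)] = best
    return results


results = measure()
for (setup, stmt), seconds in results.items():
    print(f"{setup:10} {stmt:35} {seconds:.4f}s")
```

Taking the minimum over a few repeats is a common way to make such micro-benchmarks less sensitive to background load, but very close results, as in the s = '' case, can still swap places between runs.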

Martijn Pieters answered Oct 16 '22 15:10