
String formatting options: pros and cons

These are two very popular ways of formatting a string in Python. One is using a dict:

>>> 'I will be %(years)i on %(month)s %(day)i' % {'years': 21, 'month': 'January', 'day': 23}
'I will be 21 on January 23'

And the other one using a simple tuple:

>>> 'I will be %i on %s %i' % (21, 'January', 23)
'I will be 21 on January 23'

The first one is way more readable, but the second one is faster to write. I actually use them interchangeably.

What are the pros and cons of each one, regarding performance, readability, code optimization (is one of them transformed into the other?), and anything else you think is useful to share?

juliomalegria asked Dec 06 '11 05:12



1 Answer

Why format() is more flexible than % string operations

I think you should really stick to the format() method of str, because it is the preferred way to format strings and will probably replace the % string formatting operation in the future.

Furthermore, it has some really good features. For example, it can combine position-based formatting with keyword-based formatting:

>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11]  # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'

Even the following works:

>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'

There is also one other reason in favor of format(). Because it is a method, it can be passed as a callback, as in the following example:

>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
...
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"

Isn't that a lot more flexible than the % string formatting operation?

See the documentation page for more examples comparing % operations with the .format() method.
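For comparison, here is a minimal sketch of the same callback idea in Python 3 syntax. It assumes the % version is wrapped in a lambda (the bound `__mod__` method would also work, but reads worse). The names `pairs`, `with_format`, and `with_percent` are made up for illustration.

```python
pairs = [(1, 2), ('a', 'b'), (5, 'ABC')]

# The bound format method can be handed to map() directly.
with_format = list(map('First is "{0[0]}", then comes "{0[1]}"'.format, pairs))

# The % operator is wrapped in a lambda here so it can be passed as a callback.
with_percent = list(map(lambda pair: 'First is "%s", then comes "%s"' % pair, pairs))

# Both approaches build the same strings.
print(with_format == with_percent)
```

Both lists hold the same three strings; the difference is only in how the formatter is passed around.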

Comparing tuple-based % string formatting with dictionary-based

Generally, there are three ways of invoking % string operations (yes, three, not two):

base_string % values 

and they differ in the type of values (which in turn depends on the content of base_string):

  • it can be a tuple, in which case the values are substituted one by one, in the order they appear in the tuple,

    >>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
    'Three first values are: 3.140000, 2.710000 and 1.000000'
  • it can be a dict (dictionary), in which case the values are substituted by keyword,

    >>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
    'My name is John, I am 98 years old'
  • it can be a single value, if base_string contains a single place where the value should be inserted:

    >>> 'This is a string: %s' % 'abc'
    'This is a string: abc'
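One pitfall of the single-value form is worth a short sketch: if the single value happens to be a tuple itself, % treats it as a sequence of positional values. Wrapping it in a one-element tuple avoids that. The variable names below are made up for illustration.

```python
point = (1, 2)  # hypothetical value that happens to be a tuple
error = None

try:
    # The tuple is unpacked into two positional values for one %s.
    'Point: %s' % point
except TypeError as exc:
    error = str(exc)

# A one-element tuple makes % use the inner tuple as a single value.
safe = 'Point: %s' % (point,)

print(error)
print(safe)
```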

There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which is able to combine some features, as mentioned above).

But there is something specific to the dictionary-based string formatting operation that is not available in the other two types of % formatting operation. This is the ability to fill specifiers from actual variable names in a simple manner:
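A small sketch of that limitation, with made-up values: a format string that mixes a positional and a named specifier fails when it is given a tuple, while format() accepts both kinds of argument in one call.

```python
mix_error = None

try:
    # A named specifier needs a mapping, which a tuple cannot provide.
    '%s is %(age)i' % ('John',)
except TypeError as exc:
    mix_error = str(exc)

# format() takes positional and keyword arguments together.
mixed = '{} is {age}'.format('John', age=98)

print(mix_error)
print(mixed)
```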

>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
>>> # some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'

Just for the record: of course, the above could easily be replaced with format() by unpacking the dictionary like this:

>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
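On Python 3.2 and later, str.format_map() offers a related option: it takes the mapping as it is, without unpacking it into keyword arguments. A minimal sketch, where the function `introduce` is hypothetical and only there to give locals() a small scope:

```python
def introduce(name, surname, age):
    template = 'My name is {surname}, {name} {surname}. I am {age}.'
    # format_map() looks the field names up in the mapping directly.
    return template.format_map(locals())

print(introduce('John', 'Smith', 87))
```

Inside a function, locals() only holds that function's own names, which keeps the lookup easier to reason about than at module level.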

Does anyone else have an idea of a feature that is specific to one type of string formatting operation but not the other? It would be quite interesting to hear about it.

Tadeck answered Sep 16 '22 14:09