Let's say we have a dict that will always have the keys first_name and last_name, but whose values may be None.
account = {
    'first_name': None,
    'last_name': 'Bloggs'
}
We want to save the first name if it is passed in or save it as an empty string if None is passed in.
first_name = account['first_name'] if account['first_name'] else ""
vs
first_name = account['first_name'] or ""
Both of these work. However, what is the difference behind the scenes? Is one more efficient than the other?
None is a singleton object (only one None ever exists). The is operator checks whether two names refer to the same object, while == only checks whether they are equal. Because there is only one None, x is None is True exactly when x is that one object, and the result cannot be changed by a custom __eq__ method.
Use the is not operator to check if a variable is not None in Python, e.g. if my_var is not None: . The is not operator returns True if the values on the left-hand and right-hand sides don't point to the same object (same location in memory).
The None keyword is used to define a null value, or no value at all. None is not the same as 0, False, or an empty string. None is the sole instance of its own type (NoneType), and only None can be None.
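To illustrate why the identity check is the more reliable test, here is a minimal sketch. The class AlwaysEqual is hypothetical and invented purely for this example; it overrides __eq__ so that == reports equality with anything, including None.

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to equal anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# == delegates to __eq__, so it can be fooled by a custom class.
equal_to_none = (obj == None)  # noqa: E711 (deliberate comparison)
# is compares object identity and cannot be overridden.
identical_to_none = (obj is None)

# None is distinct from the other falsy values.
falsy_values = [0, False, "", []]
any_is_none = any(value is None for value in falsy_values)

print(equal_to_none, identical_to_none, any_is_none)
```

Only the identity check gives the expected answer for the custom object, which is why is None and is not None are the usual way to test for None.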
What is the difference between the two following expressions?
first_name = account['first_name'] if account['first_name'] else ""
vs
first_name = account['first_name'] or ""
The primary difference is that the first, in Python, is the conditional expression:
The expression x if C else y first evaluates the condition, C, rather than x. If C is true, x is evaluated and its value is returned; otherwise, y is evaluated and its value is returned.
while the second uses the boolean operation:
The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Note that the first may require two key lookups versus the second, which only requires one key lookup.
This lookup is called subscript notation:
name[subscript_argument]
Subscript notation exercises the __getitem__ method of the object referenced by name.
It requires both the name and the subscript argument to be loaded.
Now, in the context of the question: if the value tests as True in a boolean context (which a non-empty string does, but None does not), the conditional expression requires a second (redundant) loading of both the dictionary and the key, while the boolean or operation simply returns the result of the first lookup.
Therefore I would expect the second, the boolean operation, to be slightly more efficient in cases where the value is not None.
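The lookup counts described above can be checked directly. The following is a minimal sketch, assuming a hypothetical dict subclass, CountingDict, that is not part of the question and exists only to count calls to __getitem__.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts subscript lookups."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        return super().__getitem__(key)


def count_lookups(first_name):
    """Return (lookups for the conditional, lookups for the or form)."""
    account = CountingDict(first_name=first_name, last_name="Bloggs")
    _ = account["first_name"] if account["first_name"] else ""
    conditional_lookups = account.lookups

    account.lookups = 0
    _ = account["first_name"] or ""
    or_lookups = account.lookups
    return conditional_lookups, or_lookups


print(count_lookups("Joe"))   # (2, 1): the conditional looks the key up twice
print(count_lookups(None))    # (1, 1): both forms stop after one lookup
```

The redundant lookup only happens when the value is truthy, which matches the expectation that the two forms differ in cost only in that case.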
Others have compared the bytecode generated by both expressions.
However, the AST represents the first breakdown of the language as parsed by the interpreter.
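The following AST dumps show that the conditional expression contains the subscript twice, so it likely involves more work (note I have formatted the output for easier reading; it comes from an older Python 3 release, and newer versions print Constant nodes instead of Str and drop the Index wrapper):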
>>> import ast
>>> print(ast.dump(ast.parse("account['first_name'] if account['first_name'] else ''").body[0]))
Expr(
    value=IfExp(
        test=Subscript(value=Name(id='account', ctx=Load()),
                       slice=Index(value=Str(s='first_name')), ctx=Load()),
        body=Subscript(value=Name(id='account', ctx=Load()),
                       slice=Index(value=Str(s='first_name')), ctx=Load()),
        orelse=Str(s='')
    )
)
versus
>>> print(ast.dump(ast.parse("account['first_name'] or ''").body[0]))
Expr(
    value=BoolOp(
        op=Or(),
        values=[
            Subscript(value=Name(id='account', ctx=Load()),
                      slice=Index(value=Str(s='first_name')), ctx=Load()),
            Str(s='')]
    )
)
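Because the exact node names in the dumps above vary between Python versions, a version-independent way to make the same point is to count the Subscript nodes in each tree. This is a minimal sketch; the helper name count_subscripts is made up for this example.

```python
import ast


def count_subscripts(expression):
    """Count Subscript nodes in the AST of a single expression."""
    tree = ast.parse(expression, mode="eval")
    return sum(isinstance(node, ast.Subscript) for node in ast.walk(tree))


conditional_count = count_subscripts(
    "account['first_name'] if account['first_name'] else ''"
)
or_count = count_subscripts("account['first_name'] or ''")

print(conditional_count, or_count)  # 2 1
```

The conditional expression carries two subscript nodes (one in the test, one in the body), while the or form carries one.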
Here we see that the bytecode for the conditional expression is much longer. This usually bodes poorly for relative performance in my experience.
>>> import dis
>>> dis.dis("d['name'] if d['name'] else ''")
  1           0 LOAD_NAME                0 (d)
              2 LOAD_CONST               0 ('name')
              4 BINARY_SUBSCR
              6 POP_JUMP_IF_FALSE       16
              8 LOAD_NAME                0 (d)
             10 LOAD_CONST               0 ('name')
             12 BINARY_SUBSCR
             14 RETURN_VALUE
        >>   16 LOAD_CONST               1 ('')
             18 RETURN_VALUE
For the boolean operation, it is considerably shorter, six instructions rather than ten:
>>> dis.dis("d['name'] or ''")
  1           0 LOAD_NAME                0 (d)
              2 LOAD_CONST               0 ('name')
              4 BINARY_SUBSCR
              6 JUMP_IF_TRUE_OR_POP     10
              8 LOAD_CONST               1 ('')
        >>   10 RETURN_VALUE
Here I would expect the boolean operation to be noticeably quicker than the conditional expression.
So let's see whether there is much difference in performance in practice.
Performance is not very important here, but sometimes I have to see for myself:
import timeit

def cond(name=False):
    # Value is a non-empty string when name is true, otherwise None.
    d = {'name': 'thename' if name else None}
    return lambda: d['name'] if d['name'] else ''

def bool_op(name=False):
    d = {'name': 'thename' if name else None}
    return lambda: d['name'] or ''
We see that when the value is a non-empty string, the boolean operation is about 10% faster than the conditional.
>>> min(timeit.repeat(cond(name=True), repeat=10))
0.11814919696189463
>>> min(timeit.repeat(bool_op(name=True), repeat=10))
0.10678509017452598
However, when the value is None, we see that there is almost no difference:
>>> min(timeit.repeat(cond(name=False), repeat=10))
0.10031125508248806
>>> min(timeit.repeat(bool_op(name=False), repeat=10))
0.10030031995847821
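In general, I would prefer the or boolean operation to the conditional expression, with the caveat that the values must be guaranteed to be either non-empty strings or None, since or replaces every falsy value (such as 0 or an empty list) with the empty string, not just None.
Where that is not guaranteed, I would prefer the following for correctness: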
first_name = account['first_name']
if first_name is None:
    first_name = ''
Among the upsides is that the is None check is quite fast. This should also not be much less performant:
def correct(name=False):
    d = {'name': 'thename' if name else None}
    def _correct():
        first_name = d['name']
        if first_name is None:
            first_name = ''
    return _correct
We see that we get quite competitive performance when the value is a non-empty string:
>>> min(timeit.repeat(correct(name=True), repeat=10))
0.10948465298861265
>>> min(timeit.repeat(cond(name=True), repeat=10))
0.11814919696189463
>>> min(timeit.repeat(bool_op(name=True), repeat=10))
0.10678509017452598
When the value is None, it is not quite as good, though:
>>> min(timeit.repeat(correct(name=False), repeat=10))
0.11776355793699622
>>> min(timeit.repeat(cond(name=False), repeat=10))
0.10031125508248806
>>> min(timeit.repeat(bool_op(name=False), repeat=10))
0.10030031995847821
The difference between the conditional expression and the boolean operation is two lookups versus one, respectively, when the value is truthy, which makes the boolean operation more performant.
For correctness's sake, however, do the lookup one time, check for identity to None with is None, and then reassign to the empty string in that case.
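As a rough sketch of that recommendation, the single lookup and the identity check can be wrapped in a small function. The helper name value_or_default and the extra visits key are hypothetical, added here only to show how this differs from the or form when a value is falsy but not None.

```python
def value_or_default(mapping, key, default=""):
    """Hypothetical helper: replace only None, keep other falsy values."""
    value = mapping[key]  # single lookup
    if value is None:
        return default
    return value


# The visits key is an assumption made for this illustration.
account = {"first_name": None, "last_name": "Bloggs", "visits": 0}

first_name = value_or_default(account, "first_name")  # None becomes ""
visits = value_or_default(account, "visits")          # 0 is preserved
visits_with_or = account["visits"] or ""              # 0 is replaced by ""

print(repr(first_name), repr(visits), repr(visits_with_or))
```

For the dict in the question, where values are only strings or None, both approaches produce the same first_name; they diverge only when other falsy values such as 0 can appear.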