Considering this code snippet:
from os import walk

files = []
for (dirpath, _, filenames) in walk(mydir):
    # More code that modifies files
if len(files) == 0:  # <-- C1801
    return None
I was alarmed by Pylint with this message regarding the line with the if statement:
[pylint] C1801:Do not use len(SEQUENCE) as condition value
The rule C1801, at first glance, did not sound very reasonable to me, and the definition on the reference guide does not explain why this is a problem. In fact, it downright calls it an incorrect use.
len-as-condition (C1801): Do not use len(SEQUENCE) as condition value. Used when Pylint detects incorrect use of len(sequence) inside conditions.
My search attempts have also failed to provide me a deeper explanation. I do understand that a sequence's length property may be lazily evaluated, and that __len__ can be programmed to have side effects, but it is questionable whether that alone is problematic enough for Pylint to call such a use incorrect. Hence, before I simply configure my project to ignore the rule, I would like to know whether I am missing something in my reasoning.
When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?
It’s not really problematic to use len(SEQUENCE), though it may not be as efficient (see chepner’s comment). Regardless, Pylint checks code for compliance with the PEP 8 style guide, which states:
For sequences, (strings, lists, tuples), use the fact that empty sequences are false.
Yes: if not seq:
     if seq:

No:  if len(seq):
     if not len(seq):
As an occasional Python programmer who flits between languages, I’d consider the len(SEQUENCE) construct to be more readable and explicit (“Explicit is better than implicit”). However, using the fact that an empty sequence evaluates to False in a Boolean context is considered more “Pythonic”.
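As a minimal sketch of the truthiness rule quoted from PEP 8, the following checks a few built-in sequences. The helper name describe and the sample values are illustrative assumptions, not part of the original question.

```python
def describe(seq):
    """Classify a sequence using the implicit truthiness test."""
    # An empty list, tuple, or string is falsy, so `not seq` is True for it.
    if not seq:
        return "empty"
    return "non-empty"


# Hypothetical sample values: each empty sequence is followed by a
# non-empty one of the same type.
samples = [[], [0], "", "a", (), (None,)]
results = [describe(s) for s in samples]
print(results)
```

Note that [0] and (None,) count as non-empty: truthiness of a built-in sequence depends on its length, not on whether its items are themselves falsy.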
Note that the use of len(seq) is in fact required (instead of just checking the bool value of seq) when using NumPy arrays.
import numpy

a = numpy.array(range(10))
if a:
    print("a is not empty")
results in an exception: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
And hence for code that uses both Python lists and NumPy arrays, the C1801 message is less than helpful.
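For code that has to accept both kinds of container, an explicit length comparison sidesteps the ambiguity. The sketch below does not import NumPy; AmbiguousArray is a hypothetical stand-in that only imitates the error described above, and is_empty is an assumed helper name.

```python
class AmbiguousArray:
    """Hypothetical stand-in that imitates an array refusing truth tests.

    This is NOT NumPy's implementation; it only mimics the ValueError
    raised for multi-element arrays.
    """

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __bool__(self):
        if len(self._items) > 1:
            raise ValueError("truth value of a multi-element array is ambiguous")
        return bool(self._items and self._items[0])


def is_empty(container):
    """Explicit length comparison; never calls __bool__ on the container."""
    return len(container) == 0


arr = AmbiguousArray(range(10))

# The implicit truth test fails for the array-like object...
try:
    bool(arr)
    truth_test_raised = False
except ValueError:
    truth_test_raised = True

# ...while the explicit length comparison works for lists and array-likes.
print(truth_test_raised, is_empty(arr), is_empty([]))
```

The design choice here is to rely only on __len__, which lists, tuples, strings, and array-like objects all provide.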
This was an issue in Pylint, and it no longer considers len(x) == 0 as incorrect.
You should not use a bare len(x) as a condition. Comparing len(x) against an explicit value, such as if len(x) == 0 or if len(x) > 0, is totally fine and not prohibited by PEP 8.
From PEP 8:
# Correct:
if not seq:
if seq:

# Wrong:
if len(seq):
if not len(seq):
Note that explicitly testing for the length is not prohibited. The Zen of Python states:
Explicit is better than implicit.
In the choice between if not seq and if not len(seq), both are implicit, but the behaviour is different. But if len(seq) == 0 or if len(seq) > 0 are explicit comparisons and are in many contexts the correct behaviour.
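One case where the two checks behave differently is a value of None. This is a small sketch with assumed helper names: the truthiness test silently treats None as empty, while the explicit length comparison rejects it with a TypeError.

```python
def is_empty_by_truthiness(seq):
    """Implicit check: True for any falsy value, including None."""
    return not seq


def is_empty_by_length(seq):
    """Explicit check: requires an object that supports len()."""
    return len(seq) == 0


# Both checks agree on real sequences.
print(is_empty_by_truthiness([]), is_empty_by_length([]))

# They disagree on None: truthiness accepts it, len() raises TypeError.
none_looks_empty = is_empty_by_truthiness(None)
try:
    is_empty_by_length(None)
    length_check_rejected_none = False
except TypeError:
    length_check_rejected_none = True

print(none_looks_empty, length_check_rejected_none)
```

Whether that difference is a bug or a feature depends on the caller: the explicit form fails loudly when something that is not a sequence slips in.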
In Pylint, PR 2815 has fixed this bug, first reported as issue 2684. It will continue to complain about if len(seq), but it will no longer complain about if len(seq) > 0. The PR was merged 2019-03-19, so if you are using Pylint 2.4 (released 2019-09-14) or newer, you should not see this problem.