If I store a boolean value using the CSV module, it gets converted to the strings True or False by the str() function. However, when I load those values, a string of False evaluates to being True because it's a non-empty string.
I can work around it by manually checking the string at read time with an if statement, but that is less than elegant. Any better ideas, or is this just one of those things in the programming world?
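To make the problem concrete, here is a minimal sketch of the round trip using only the standard library. The in-memory buffer stands in for a real file and is an assumption for illustration.

```python
import csv
import io

# Write a row containing real booleans; csv.writer stringifies them with str().
buffer = io.StringIO()
csv.writer(buffer).writerow([True, False])

# Read the row back: every cell is now a plain string.
buffer.seek(0)
row = next(csv.reader(buffer))

print(row)                           # ['True', 'False']
print([bool(cell) for cell in row])  # [True, True]
```

Both cells come back truthy because bool() on a string only checks whether it is empty.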
Strings: true and false, True and False, but I've also seen yes and no.
Integers: 0 or 1
Floats: 0.0 or 1.0
Let's compare the respective advantages / disadvantages:
Strings:
+ A human can read it
- CSV readers will have it as a string, and both values will evaluate to True when bool is applied to them
Integers:
+ CSV readers might see that this column is integer, and bool(0) evaluates to False
+ A bit more space efficient
- Not totally clear that it is boolean
Floats:
+ CSV readers might see that this column is float, and bool(0.0) evaluates to False
- Not totally clear that it is boolean
+ Possible to have null (as NaN)

The Pandas CSV reader shows the described behaviour.
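One caveat for the integer option: the plain csv module does not infer column types, so 0 and 1 still come back as the strings '0' and '1'. A rough sketch of the round trip, assuming you control both the writing and the reading side (the flags list is made-up sample data):

```python
import csv
import io

flags = [True, False, True]

# Store each boolean as 0 or 1 instead of 'True' / 'False'.
buffer = io.StringIO()
csv.writer(buffer).writerow([int(flag) for flag in flags])

buffer.seek(0)
cells = next(csv.reader(buffer))  # ['1', '0', '1'], still strings

# bool('0') would be True, so convert to int before applying bool.
restored = [bool(int(cell)) for cell in cells]
print(restored)  # [True, False, True]
```

The explicit int() step is what makes this work. A reader that infers an integer column does that conversion for you.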
Have a look at mpu.string.str2bool:
>>> str2bool('True')
True
>>> str2bool('1')
True
>>> str2bool('0')
False
which has the following implementation:
def str2bool(string_, default='raise'):
    """
    Convert a string to a bool.

    Parameters
    ----------
    string_ : str
    default : {'raise', False}
        Default behaviour if none of the "true" strings is detected.

    Returns
    -------
    boolean : bool

    Examples
    --------
    >>> str2bool('True')
    True
    >>> str2bool('1')
    True
    >>> str2bool('0')
    False
    """
    true = ['true', 't', '1', 'y', 'yes', 'enabled', 'enable', 'on']
    false = ['false', 'f', '0', 'n', 'no', 'disabled', 'disable', 'off']
    if string_.lower() in true:
        return True
    elif string_.lower() in false or (not default):
        return False
    else:
        raise ValueError('The value \'{}\' cannot be mapped to boolean.'
                         .format(string_))
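A converter like this can be applied while reading, which keeps the check in one place instead of scattering if statements around. The sketch below is only an illustration. parse_bool is a hypothetical stand-in modelled on str2bool so the example runs without installing mpu, and the column names and sample rows are made up.

```python
import csv
import io

TRUE_STRINGS = {'true', 't', '1', 'y', 'yes', 'enabled', 'enable', 'on'}
FALSE_STRINGS = {'false', 'f', '0', 'n', 'no', 'disabled', 'disable', 'off'}


def parse_bool(text):
    """Hypothetical stand-in for str2bool; unlike it, strips whitespace first."""
    lowered = text.strip().lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    raise ValueError('The value {!r} cannot be mapped to boolean.'.format(text))


# Made-up sample data; the column names are illustrative only.
data = io.StringIO('name,active\nalice,True\nbob,False\ncarol,yes\n')

rows = []
for record in csv.DictReader(data):
    # Convert the boolean column once, at read time.
    record['active'] = parse_bool(record['active'])
    rows.append(record)

print([(r['name'], r['active']) for r in rows])
# [('alice', True), ('bob', False), ('carol', True)]
```

Raising on unknown values, rather than silently returning False, makes a typo in the data show up as an error instead of a wrong flag.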