Apparently an integer costs 24 bytes in Python. I can understand that, given the extra bells and whistles needed to represent unbounded numbers. However, it looks like the boolean type also costs a whopping 24 bytes, even though it can only ever represent two values. Why?
Edit: I'm not asking for the best way to store bools. I'm already aware of NumPy, BitArray, etc. from other answers. My question is why, not how. To keep the question focused on that, I've removed its second part.
A bool may be pretty huge for what it represents, but there are only two of them. A list full of Trues only contains 4- or 8-byte references to the one canonical True object.
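A quick way to see this for yourself is the sketch below. It assumes CPython; the names flags, all_same and per_slot are only for illustration, and the exact sizes reported depend on your Python version and platform.

```python
import sys

# A list of 1000 Trues: the list holds references, not copies of the bool.
flags = [True] * 1000

# Every slot refers to the same canonical True object.
all_same = all(item is True for item in flags)

# Rough per-element cost of the list itself: total size minus the
# empty-list overhead, divided by the number of slots.
per_slot = (sys.getsizeof(flags) - sys.getsizeof([])) / len(flags)

print(all_same)              # whether all slots share one object
print(sys.getsizeof(True))   # size of the single True object
print(per_slot)              # bytes per reference in the list
```

The size of the True object is paid once, no matter how long the list is; what grows with the list is the per-slot reference.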
If 8 bytes is still too big, and you really want to use Python for whatever it is you're doing, you could consider using an array type like that provided by the built-in array module or NumPy. These offer 1-byte-per-bool representations. If this is still too much, you could use a bitset, either manually with Python's built-in bignums or with something like BitVector from PyPI. These options are likely to slow your program way down. Some of them can offer speed improvements, but only if you take advantage of features that let you push work out of interpreted code and into C.
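As a rough illustration of the two standard-library options mentioned above, here is a minimal sketch. The helper names set_bit, clear_bit and get_bit are made up for this example and are not part of any library.

```python
from array import array

n = 1000

# One byte per flag: typecode 'B' stores each item as an unsigned char.
byte_flags = array('B', [1]) * n
byte_flags[3] = 0  # mark flag 3 as False

# Hand-rolled bitset on top of a plain Python int: bit i holds flag i.
# (Hypothetical helpers, written only for this sketch.)
def set_bit(bits, i):
    return bits | (1 << i)

def clear_bit(bits, i):
    return bits & ~(1 << i)

def get_bit(bits, i):
    return (bits >> i) & 1 == 1

bits = 0
bits = set_bit(bits, 5)
bits = set_bit(bits, 700)
bits = clear_bit(bits, 5)

print(byte_flags.itemsize)                  # bytes per flag in the array
print(get_bit(bits, 700), get_bit(bits, 5))
```

Note that ints are immutable, so every set_bit or clear_bit call builds a new int. For a large bitset that copying is part of why this approach can be slow in pure Python.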