I was playing around with sys's getsizeof() and found that False (or 0) consists of fewer bytes than True (or 1). Why is that?
import sys
print("Zero: " + str(sys.getsizeof(0)))
print("One: " + str(sys.getsizeof(1)))
print("False: " + str(sys.getsizeof(False)))
print("True: " + str(sys.getsizeof(True)))
# Prints:
# Zero: 24
# One: 28
# False: 24
# True: 28
In fact, other small integers (including ones with more than one digit) are also 28 bytes.
for n in range(0, 12):
    print(str(n) + ": " + str(sys.getsizeof(n)))
# Prints:
# 0: 24
# 1: 28
# 2: 28
# 3: 28
# 4: 28
# 5: 28
# 6: 28
# 7: 28
# 8: 28
# 9: 28
# 10: 28
# 11: 28
Even more: sys.getsizeof(999999999) is also 28 bytes! sys.getsizeof(9999999999), however, is 32.
So what's going on? I assume that the booleans True and False are internally converted to 0 and 1 respectively, but why does zero differ in size from the other small integers?
Side question: is this specific to how Python (3) represents these values, or is this generally how integers are represented by the operating system?
Zero is used to represent false, and one is used to represent true. When interpreting a value, zero is interpreted as false and anything non-zero is interpreted as true. To make life easier, C programmers typically define the terms "true" and "false" to have the values 1 and 0 respectively.
Python assigns boolean values to values of other types: for numeric types like integers and floats, zero is false and non-zero values are true; for strings, the empty string is false and non-empty strings are true.
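For example, a quick check in any Python 3 interpreter (nothing version-specific assumed):
print(bool(0), bool(42), bool(0.0))  # Prints: False True False
print(bool(""), bool("hello"))       # Prints: False True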
In Python, True and False are equivalent to 1 and 0. Use int() on a boolean to get its integer value: int(True) returns 1 and int(False) returns 0.
In Python 3.x, True and False are keywords and will always be equal to 1 and 0.
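You can verify this for yourself; bool is simply a subclass of int:
print(issubclass(bool, int))  # Prints: True
print(True == 1, False == 0)  # Prints: True True
print(int(True), int(False))  # Prints: 1 0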
Remember that Python int values are of arbitrary size. How does that work?
Well, in CPython,¹ an int is represented by a PyLongObject, which has an array of 4-byte chunks,² each holding 30 bits³ worth of the number.
0 takes no chunks at all.
1 through (1<<30)-1 takes 1 chunk.
1<<30 through (1<<60)-1 takes 2 chunks.
And so on.
This is slightly oversimplified; for full details, see longintrepr.h in the source.
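If that model is right, the reported size should jump by 4 bytes each time a value needs another 30-bit chunk. A quick check (the byte counts in the comment assume the same 64-bit CPython 3 build as above, where sys.getsizeof(0) is 24; other builds or versions may report different numbers):
import sys
for value in (0, 1, (1 << 30) - 1, 1 << 30, (1 << 60) - 1, 1 << 60):
    print(value, sys.getsizeof(value))
# Sizes printed: 24, 28, 28, 32, 32, 36
# i.e. 24 bytes of object header plus 4 bytes per 30-bit chunk.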
In Python 2, there are two separate types, called int and long. An int is represented by a C 32-bit signed integer⁴ embedded directly in the header, instead of an array of chunks. A long is like a Python 3 int.
If you do the same test with 0L, 1L, etc., to explicitly ask for long values, you will get the same results as in Python 3. But without the L suffix, any literal that fits in 32 bits gives you an int, and only literals that are too big give you longs.⁵ (This means that (1<<31)-1 is an int, but 1<<31 is a 2-chunk long.)
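As a sketch of what that looks like (Python 2.7 syntax only; it won't run under Python 3, and the exact sizes and the int/long cutoff depend on the build):
import sys
print type(1), type(1L)                      # <type 'int'> <type 'long'>
print sys.getsizeof(1), sys.getsizeof(1L)    # e.g. 24 vs 28 on a 64-bit build
print type((1 << 31) - 1), type(1 << 31)     # the second overflows into long
                                             # on builds where a C long is 32 bits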
1. In a different implementation, this might not be true. IIRC, Jython does roughly the same thing as CPython, but IronPython uses a C# "bignum" implementation.
2. Why 30 bits instead of 32? Mainly because the implementation of pow and ** can be simpler and faster if it can assume that the number of bits in two "digits" is divisible by 10.
3. It uses the C "struct hack". Technically, a PyLongObject is 28 bytes, but nobody ever allocates a PyLongObject; they malloc 24, 28, 32, 36, etc. bytes and then cast to PyLongObject *.
4. In fact, a Python int is a C long, just to make things confusing. So the C API is full of things like PyInt_FromLong, where the long means "32-bit int", and PyLong_FromSize_t, where the long means "bignum".
5. Early versions of Python 2.x didn't integrate int and long as nicely, but hopefully nobody has to worry about those anymore.