
Get size in Bytes needed for an integer in Python

How can I find out the number of Bytes a certain integer number takes up to store?

E.g. for

  • hexadecimal 0x00 - 0xff (or decimal 0 - 255 = binary 0000 0000 - 1111 1111) I'm looking to get 1 (byte),
  • hexadecimal 0x100 - 0xffff (or decimal 256 - 65535 = binary 0000 0001 0000 0000 - 1111 1111 1111 1111) would give me 2 (bytes)

and so on.

Any clue for hexadecimal or decimal format as the input?

stdcerr asked Jan 15 '13 01:01


2 Answers

Unless you're dealing with an array.array or a numpy array, the size always includes object overhead. And since Python integers are arbitrary-precision (bignums), it's really hard to say how many bytes a given value takes.

    >>> i = 5
    >>> import sys
    >>> sys.getsizeof(i)
    24

So on a 64-bit platform it takes 24 bytes to store a value that would fit in 3 bits.

Storing the value as a one-character string doesn't help either:

    >>> s = '\x05'
    >>> sys.getsizeof(s)
    38

So no, not really: what you're measuring is the memory overhead of the object itself rather than the raw storage.
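To see the overhead on your own interpreter, a minimal sketch along these lines should work. The sample values are arbitrary and the reported sizes depend on the Python version and platform, so no particular output is assumed here.

```python
import sys

# Compare the number of bits each value needs with the size of the
# int object that holds it. Sizes are implementation-specific.
for value in (0, 5, 2**30, 2**64, 2**1000):
    print(value.bit_length(), sys.getsizeof(value))

# The object size grows with the magnitude of the value,
# but it never drops to the "raw" byte count.
small_size = sys.getsizeof(5)
huge_size = sys.getsizeof(2**1000)
```

On CPython the last value should report a noticeably larger object than the small ones, while even the smallest int still costs far more than one byte.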

If you then take:

    >>> import array
    >>> a = array.array('i', [3])
    >>> a
    array('i', [3])
    >>> sys.getsizeof(a)
    60L
    >>> a = array.array('i', [3, 4, 5])
    >>> sys.getsizeof(a)
    68L

Here the storage grows on normal byte boundaries: the two extra 'i' items add 8 bytes, i.e. a fixed 4 bytes each, on top of the array's own overhead.
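If you want the raw payload of an array separately from the object overhead, something like the following sketch may help. It relies on the standard `itemsize` attribute of `array.array`; the variable names are just for illustration, and the exact numbers depend on the platform.

```python
import array
import sys

a = array.array('i', [3, 4, 5])

# Raw payload: number of items times the size of one item in bytes.
payload_bytes = len(a) * a.itemsize

# Size of the whole array object as reported by the interpreter,
# which includes the object header on top of the payload.
total_bytes = sys.getsizeof(a)

print(a.itemsize, payload_bytes, total_bytes)
```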

If you just want what "purely" should be stored, minus object overhead, then from Python 2.7 (and 3.1) you can use some_int.bit_length() (on older versions, just bit-shift the value in a loop, as other answers have shown) and work from there.
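For interpreters without bit_length(), the bit-shifting approach could look roughly like this. The function name is made up for this sketch, and it only handles non-negative integers.

```python
def bit_length_by_shifting(n):
    """Count the bits needed for a non-negative integer by shifting right."""
    if n < 0:
        raise ValueError("n must be non-negative")
    bits = 0
    while n:
        n >>= 1      # drop the lowest bit
        bits += 1    # and count it
    return bits
```

Where bit_length() is available, it should give the same result for non-negative values, so the loop is only a fallback.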

Jon Clements answered Nov 12 '22 19:11


    def byte_length(i):
        return (i.bit_length() + 7) // 8

Of course, as Jon Clements points out, this isn't the size of the actual PyIntObject. That object has a PyObject header and stores the value as a bignum in whatever way is easiest to deal with rather than most compact. You also need at least one pointer (4 or 8 bytes) to the object on top of the object itself, and so on.

But this is the byte length of the number itself. It's almost certainly the most efficient answer, and probably also the easiest to read.

Or is ceil(i.bit_length() / 8.0) more readable?
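Here is a quick sketch comparing the two spellings on the ranges from the question. One thing to be aware of: 0 has a bit length of 0, so both versions return 0 for it, whereas the question expects 1 byte for the whole 0 - 255 range. The `byte_length_min1` helper is a hypothetical variant added here to cover that case; it is not part of the answer above.

```python
import math

def byte_length(i):
    # Round the bit count up to the next whole byte.
    return (i.bit_length() + 7) // 8

def byte_length_ceil(i):
    # Same calculation written with ceil().
    return int(math.ceil(i.bit_length() / 8.0))

def byte_length_min1(i):
    # Hypothetical variant: report at least one byte, so that 0 maps to 1.
    return max(1, byte_length(i))

for n in (0, 1, 255, 256, 65535, 65536):
    print(n, byte_length(n), byte_length_ceil(n), byte_length_min1(n))
```

On Python 3 you can sanity-check a result with `n.to_bytes(byte_length(n), 'big')`, which raises OverflowError if the length is too small for the value.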

abarnert answered Nov 12 '22 20:11