Python: pick appropriate datatype size (int) automatically

Tags: python, numpy

I'm allocating a (possibly large) matrix of zeros with Python and numpy. I plan to put unsigned integers from 1 to N in it.

N is quite variable: it could easily range from 1 all the way up to a million, perhaps even more.

I know N prior to matrix initialisation. How can I choose the data type of my matrix so that it is guaranteed to hold (unsigned) integers as large as N?

Furthermore, I want to pick the smallest such data type that will do.

For example, if N was 1000, I'd pick np.dtype('uint16'). If N is 240, uint16 would work, but uint8 would also work and is the smallest data type I can use to hold the numbers.

This is how I initialise the array. I'm looking for the SOMETHING_DEPENDING_ON_N:

import numpy as np
# N is known by some other calculation.
lbls = np.zeros( (10,20), dtype=np.dtype( SOMETHING_DEPENDING_ON_N ) )

cheers!

Aha!

Just realised numpy v1.6.0+ has np.min_scalar_type (see its documentation). D'oh! (The answers are still useful, though, because I don't have 1.6.0.)
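
For readers who do have NumPy 1.6.0 or later, here is a minimal sketch of how np.min_scalar_type could fill in SOMETHING_DEPENDING_ON_N. The value N = 1000 and the name label_dtype are illustrative assumptions, not part of the original question.

```python
import numpy as np

# N is assumed to be known already; 1000 is just an illustrative value.
N = 1000

# For a non-negative Python int, min_scalar_type returns the smallest
# unsigned integer dtype that can represent it.
label_dtype = np.min_scalar_type(N)

lbls = np.zeros((10, 20), dtype=label_dtype)
print(label_dtype, lbls.itemsize)
```

Note that this only guarantees the dtype can hold N itself, which is enough when the labels run from 1 to N.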

mathematical.coffee asked Dec 19 '11 04:12




2 Answers

What about writing a simple function to do the job?

import numpy as np

def type_chooser(N):
    for dtype in [np.uint8, np.uint16, np.uint32, np.uint64]:
        if N <= np.iinfo(dtype).max:  # dtype(-1) relied on wraparound, which newer NumPy rejects
            return dtype
    raise Exception('{} is really big!'.format(N))

Example usage:

>>> type_chooser(255)
<class 'numpy.uint8'>
>>> type_chooser(256)
<class 'numpy.uint16'>
>>> type_chooser(18446744073709551615)
<class 'numpy.uint64'>
>>> type_chooser(18446744073709551616)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "spam.py", line 6, in type_chooser
    raise Exception('{} is really big!'.format(N))
Exception: 18446744073709551616 is really big!

wim answered Oct 24 '22 10:10


Create a mapping from each type's upper bound (its maximum value plus one) to the type, and then look for the smallest key larger than N.

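import numpy as np

# Keys are one past the largest value each type can hold.
typemap = {
    256: np.uint8,
    65536: np.uint16,
    # ...
}

def choose_type(N):
    # Smallest key strictly greater than N; min() raises ValueError if none fits.
    return typemap[min(x for x in typemap if x > N)]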
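
One way to flesh out the mapping idea, as a sketch rather than the answerer's own code: build the bounds from np.iinfo instead of typing them by hand. The names UNSIGNED_TYPES and smallest_unsigned_type are hypothetical, and the sketch assumes n is a non-negative Python int.

```python
import numpy as np

# Build the bound -> type mapping from np.iinfo instead of typing the numbers.
# Keys are one past each type's maximum, matching the 256 / 65536 pattern.
UNSIGNED_TYPES = (np.uint8, np.uint16, np.uint32, np.uint64)
typemap = {int(np.iinfo(t).max) + 1: t for t in UNSIGNED_TYPES}


def smallest_unsigned_type(n):
    """Return the smallest unsigned type whose range includes n (hypothetical helper)."""
    if n < 0:
        raise ValueError("{} is negative; only unsigned types are considered".format(n))
    # Keep only the bounds that n fits under, then take the tightest one.
    candidates = [bound for bound in typemap if bound > n]
    if not candidates:
        raise ValueError("{} does not fit in any unsigned NumPy integer type".format(n))
    return typemap[min(candidates)]
```

Raising explicitly when no bound fits gives a clearer message than the bare ValueError that min() produces on an empty sequence.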

Ignacio Vazquez-Abrams answered Oct 24 '22 08:10