 

c-style union with numpy dtypes?

Tags: union, numpy

I'm interested in using numpy arrays of somewhat inhomogeneous data types. Since numpy requires that the data be homogeneous, this would be accomplished by defining a super-dtype that acts as a union wrapper over all the sub-dtypes. Accessing the fields of the sub-dtypes then gives a different interpretation of the underlying data.

There's already some facility for this, for example

dtype(('|S2', [('x', '|i1'), ('y', '|i1')]))

refers to an array of two-byte strings, but the first and second bytes can also be interpreted as integers through the 'x' and 'y' field names. I can't figure out how to assign a field label to the two-byte string, though.

Can this be made more general, so that we can overlay any number of different field specifications on the data?

My first try was to specify the field offsets in the dtype, but it failed with a complaint that the offsets must be ordered (i.e. non-overlapping data).

dtype1 = np.dtype(dict(
   names=['a','b'], 
   formats=['|a2','<i2'], 
   offsets=[0,0]))

Another technique works, but is cumbersome. Here I define several variables as views onto the same underlying data, and change the dtype of each variable so that I can access the data in different formats, i.e.

a=np.zeros(3, dtype='<a2')
b=a[:]
b.dtype='<i2'

This lets me access the data either as strings or integers depending on whether I'm looking at a or b. But it is a cumbersome way of manipulating the data. Ideally, I'd like to be able to specify a variety of different fields with arbitrary offsets. Is there any way to do this?
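For reference, the same two-view idea can be sketched with `ndarray.view`, which returns a second array over the same buffer without reassigning `dtype` in place. This is only a minimal illustration, assuming a Python 3 NumPy where `'S2'` is the spelling of the two-byte string type:

```python
import numpy as np

a = np.zeros(3, dtype="S2")   # three two-byte strings
b = a.view("<i2")             # same buffer, read as little-endian int16

b[0] = 0x3456                 # stored in memory as bytes 0x56, 0x34
shared = np.shares_memory(a, b)  # confirm that a and b alias the same data
print(a[0], shared)
```

This still needs one variable per interpretation, so it does not remove the cumbersomeness described above.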

asked Jan 14 '13 09:01 by russt



1 Answer

Union dtypes have been allowed since June 2011: https://github.com/numpy/numpy/pull/94

You'll need to upgrade to NumPy 1.7.x to use this.
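As a rough sketch of what this looks like once overlapping offsets are accepted, the dict form from the question can be given two fields at offset 0. The field names `as_bytes` and `as_int` are hypothetical, and `'S2'` is assumed here as the Python 3 spelling of the two-byte string type:

```python
import numpy as np

# Two fields that both start at byte offset 0, so they overlap like a C union.
# The field names are hypothetical, chosen only for this illustration.
union_dtype = np.dtype({
    "names": ["as_bytes", "as_int"],
    "formats": ["S2", "<i2"],
    "offsets": [0, 0],
})

arr = np.zeros(3, dtype=union_dtype)
arr["as_int"][0] = 0x3456            # write through the integer field
first_as_bytes = arr["as_bytes"][0]  # read the same two bytes as a string
print(union_dtype.itemsize, first_as_bytes)
```

Because both fields share the same two bytes, the item size stays at 2, and further overlapping fields can be added by extending the three lists.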

However, in previous versions you can use the overlay dtype constructor:

>>> a = np.zeros(3, dtype=np.dtype(('<i2', [('a', '|a2')])))
>>> a[0] = 0x3456
>>> a['a'][0]
'V4'

This is documented at http://docs.scipy.org/doc/numpy-dev/reference/arrays.dtypes.html#specifying-and-constructing-data-types (search for (base_dtype, new_dtype)).

answered Oct 01 '22 07:10 by ecatmur