I'm interested in using numpy arrays of somewhat inhomogenous data types. Since numpy specifies that the data must be homogenous, this would be accomplished by defining a super-dtype that acts as a union wrapper over all the sub-dtypes. Accessing the fields of the sub-dtypes then gives a different interpretation of the underlying data.
There's already some facility for this, for example
dtype(('|S2', [('x', '|i1'), ('y', '|i1')]))
refers to an array of two-byte strings, but the first and second bytes can also be interpreted as integers through the 'x' and 'y' field names. I can't figure out how to assign a field label to the two-byte string, though.
Can this be made more general, so that we can overlay any number of different field specifications on the data?
My first try was to specify the field offsets in the dtype, but it failed with a complaint that the offsets must be ordered (i.e. non-overlapping data).
dtype1 = np.dtype(dict(
names=['a','b'],
formats=['|a2','<i2'],
offsets=[0,0]))
Another technique works, but is cumbersome. In this technique I can define several variables as view onto the same underlying data, and change the dtype of the different variables to let me access the data in different formats, i.e.
a=np.zeros(3, dtype='<a2')
b=a[:]
b.dtype='<i2'
This lets me access the data either as strings or integers depending on whether I'm looking at a or b. But it is a cumbersome way of manipulating the data. Ideally, I'd like to be able to specify a variety of different fields with arbitrary offsets. Is there any way to do this?
# dtype('<U11') In the first case, each element of the list we pass to the array constructor is an integer. Therefore, NumPy decides that the dtype should be integer (32 bit integer to be precise). In the second case, one of the elements (3.0) is a floating-point number.
dtype dtype('uint8') dtype objects also contain information about the type, such as its bit-width and its byte-order. The data type can also be used indirectly to query properties of the type, such as whether it is an integer: >>> d = np. dtype(int) >>> d dtype('int32') >>> np.
In order to change the dtype of the given array object, we will use numpy. astype() function. The function takes an argument which is the target data type. The function supports all the generic types and built-in types of data.
Union dtypes have been allowed since June 2011: https://github.com/numpy/numpy/pull/94
You'll need to upgrade to NumPy 1.7.x to use this.
However, in previous versions you can use the overlay dtype constructor:
>>> a = np.zeros(3, dtype=np.dtype(('<i2', [('a', '|a2')])))
>>> a[0] = 0x3456
>>> a['a'][0]
'V4'
This is documented at http://docs.scipy.org/doc/numpy-dev/reference/arrays.dtypes.html#specifying-and-constructing-data-types (search for (base_dtype, new_dtype)
).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With