I have two arrays of strings:
In [51]: r['Z']
Out[51]:
array(['0', '0', '0', ..., '0', '0', '0'],
dtype='|S1')
In [52]: r['Y']
Out[52]:
array(['X0', 'X0', 'X0', ..., 'X0', 'X1', 'X1'],
dtype='|S2')
What is the difference between S1 and S2? Is it just that they hold entries of different length?
What if my arrays have strings of different lengths?
Where can I find a list of all possible dtypes and what they mean?
The |S1 and |S2 strings are data type descriptors: the first means the array holds fixed-width byte strings of length 1, the second of length 2. The | pipe symbol is the byte-order flag; byte order does not matter for single-byte character data, so it is set to |, meaning not applicable.
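As a rough illustration of the descriptors described above, the sketch below builds two small byte-string arrays and inspects their dtype. The variable names are made up for the example, and the exact repr depends on the NumPy version (recent releases print dtype('S1') rather than dtype='|S1').

```python
import numpy as np

# Hypothetical stand-ins for r['Z'] and r['Y'] from the question
z = np.array([b'0', b'0', b'0'])       # every item fits in 1 byte
y = np.array([b'X0', b'X0', b'X1'])    # every item fits in 2 bytes

# kind 'S' = byte string, itemsize = bytes per item, byteorder '|' = not applicable
z_info = (z.dtype.kind, z.dtype.itemsize, z.dtype.byteorder)
y_info = (y.dtype.kind, y.dtype.itemsize, y.dtype.byteorder)
print(z_info)
print(y_info)

# The width is fixed: a longer value assigned into an S1 array is truncated
z[0] = b'abc'
print(z[0])
```

The last assignment shows why the length in the descriptor matters: the array never grows to fit a longer string.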
A data type object (an instance of the numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. Among other things, it describes the type of the data (integer, float, Python object, etc.).
The type of a NumPy array is numpy.ndarray; this is just the type of Python object it is (similar to how type("hello") is str). The dtype defines how the bytes in memory are interpreted, either for a scalar (i.e. a single value) or for each element of an array (e.g. as int or float).
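To make the distinction between type and dtype concrete, here is a minimal sketch (the array contents and variable names are arbitrary) that inspects both and then reinterprets the same bytes under a different dtype:

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)

container_type = type(a)   # the Python type of the container
element_dtype = a.dtype    # how each 4-byte item is interpreted

# Same memory, different interpretation: 3 items * 4 bytes = 12 single bytes
as_bytes = a.view(np.uint8)

print(container_type, element_dtype, a.itemsize, as_bytes.shape)
```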
To convert an array to a different dtype, use the ndarray.astype() method. It takes the target data type as its argument and returns a new array; the original array is left unchanged.
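A small sketch of astype() on a byte-string array like the one in the question. The data here is invented; the two conversions shown (widening the string width and parsing the digits as integers) are just examples of possible targets.

```python
import numpy as np

z = np.array(['0', '1', '0'], dtype='S1')   # invented sample data

z_wide = z.astype('S3')      # same values, room for up to 3 bytes per item
z_int = z.astype(np.int64)   # parse each byte string as an integer

print(z.dtype, z_wide.dtype, z_int.dtype)
print(z_int.tolist())
```

Both calls return new arrays, so z keeps its original S1 dtype.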
See the dtypes documentation.
For storing strings of variable length in a numpy array you could store them as python objects. For example:
In [456]: x=np.array(('abagd','ds','asdfasdf'),dtype=np.object_)
In [457]: x[0]
Out[457]: 'abagd'
In [459]: list(map(len,x))
Out[459]: [5, 2, 8]
In [460]: x[1]=='ds'
Out[460]: True
In [461]: x
Out[461]: array([abagd, ds, asdfasdf], dtype=object)
In [462]: str(x)
Out[462]: '[abagd ds asdfasdf]'
In [463]: x.tolist()
Out[463]: ['abagd', 'ds', 'asdfasdf']
In [464]: list(map(type,x))
Out[464]: [str, str, str]
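For comparison, here is a sketch of what happens when no dtype is given for strings of different lengths: NumPy picks a fixed width large enough for the longest item. This assumes Python 3, where str input produces a Unicode (U) dtype rather than the byte-string (S) dtype shown in the question; the variable names are made up.

```python
import numpy as np

words = ['abagd', 'ds', 'asdfasdf']

fixed = np.array(words)                       # width chosen from the longest item
as_obj = np.array(words, dtype=np.object_)    # stores references to Python str objects

# Number of characters each fixed-width slot can hold
fixed_width = fixed.dtype.itemsize // np.dtype('U1').itemsize
lengths = [len(w) for w in as_obj]
print(fixed.dtype.kind, fixed_width, lengths)

# Later assignments show the trade-off
fixed[1] = 'much longer string'    # truncated to the fixed width
as_obj[1] = 'much longer string'   # stored whole
print(fixed[1], '|', as_obj[1])
```

The fixed-width array is compact but truncates anything longer than its slot, while the object array keeps every string intact because each item is an ordinary Python object.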