
How to interpret Python output dtype='<U32'?

Tags: python, numpy

I am taking an online course, and the following supposedly demonstrates that "NumPy arrays: contain only one type":

In [19]: np.array([1.0, "is", True])
Out[19]:
array(['1.0', 'is', 'True'],
dtype='<U32')

At first, I thought that the output was a form of error message, but a web search did not confirm this. In fact, I haven't come across an explanation. Can anyone explain how to interpret the output?

Afternote: After reviewing the answers, the dtype page, and the numpy.array() page, it seems that dtype='<U32' would be more accurately described as dtype('<U32'). Is this correct? It seems so to me, but I'm a newbie, and even the numpy.array() page assigns a string to the dtype parameter rather than an actual dtype object.

Also, why does '<U32' specify a 32-character string when all of the elements are much shorter strings?

user36800 asked Jul 09 '19 03:07


2 Answers

It is fully explained in the manual:

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

[...]

The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are

[...]

'U'        Unicode string

So, a little-endian Unicode string of 32 characters.
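To make the pieces of the string concrete, here is a minimal sketch that inspects a dtype built from '<U32'. It uses only standard dtype attributes (str, kind, itemsize); the variable names are illustrative. It also touches on the afternote in the question: the array repr prints dtype='<U32' as a keyword argument, while the attribute itself is a dtype object that compares equal to the string.

```python
import numpy as np

dt = np.dtype('<U32')

print(dt.str)       # byte order, kind and length in one string
print(dt.kind)      # 'U' means Unicode string
print(dt.itemsize)  # bytes per element

# NumPy stores each Unicode character in 4 bytes (UCS-4),
# so the character capacity is itemsize // 4.
max_chars = dt.itemsize // 4

# Passing the string and passing the dtype object are equivalent;
# arr.dtype is always a dtype object.
arr = np.array(['1.0', 'is', 'True'], dtype='<U32')
same = (arr.dtype == dt) and (arr.dtype == '<U32')
print(max_chars, same)
```

Note that dt.byteorder may report '=' rather than '<' on a little-endian machine, because little-endian is then the native order; dt.str keeps the explicit '<'.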

Amadan answered Sep 18 '22 16:09


dtype='<U32' is a little-endian Unicode string of up to 32 characters.

The documentation on dtypes goes into more depth about each of the characters.

'U' Unicode string

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

Examples:

import numpy as np

dt = np.dtype('f8')   # 64-bit floating-point number
dt = np.dtype('c16')  # 128-bit complex floating-point number
dt = np.dtype('S25')  # 25-byte string (written 'a25' in older NumPy docs)
dt = np.dtype('U25')  # 25-character Unicode string
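On the follow-up question of why the width is 32 when every element is shorter: the number is a per-element capacity, not the length of any particular string. The sketch below illustrates that under stated assumptions. My understanding is that the 32 comes from the width NumPy reserves when it converts a float64 such as 1.0 to text, but treat that as an assumption rather than a documented guarantee; the variable names are illustrative.

```python
import numpy as np

# Mixed input is coerced to one common type, here a Unicode string.
mixed = np.array([1.0, "is", True])
print(mixed.dtype)        # the question reports <U32 for this array
print(len(mixed[1]))      # the stored string keeps its own length

# With strings only, the width is that of the longest element.
words = np.array(["is", "True"])
print(words.dtype)

# The width is a hard limit: longer values are silently truncated.
fixed = np.array(["is", "True"], dtype="U4")
fixed[0] = "longer text"
print(fixed[0])
```

If the truncation is a concern, an explicit wider dtype such as "U32" (or dtype=object for arbitrary Python strings) can be passed to np.array.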
adlopez15 answered Sep 16 '22 16:09