Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

dtypes. Difference between S1 and S2 in Python

Tags:

python

numpy

I have two arrays of strings:

In [51]: r['Z']
Out[51]: 
array(['0', '0', '0', ..., '0', '0', '0'], 
      dtype='|S1')

In [52]: r['Y']                                                                                                                
Out[52]: 
array(['X0', 'X0', 'X0', ..., 'X0', 'X1', 'X1'], 
      dtype='|S2')

What is the difference between S1 and S2? Is it just that they hold entries of different length?

What if my arrays have strings of different lengths?

Where can I find a list of all possible dtypes and what they mean?

like image 979
Amelio Vazquez-Reina Avatar asked Feb 09 '13 16:02

Amelio Vazquez-Reina


People also ask

What is S1 and S2 in Python?

The |S1 and |S2 strings are data type descriptors; the first means the array holds strings of length 1, the second of length 2. The | pipe symbol is the byteorder flag; in this case there is no byte order flag needed, so it's set to | , meaning not applicable.

What does Dtypes mean in Python?

A data type object (an instance of numpy. dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data: Type of the data (integer, float, Python object, etc.)

What is the difference between type and Dtype in Python?

The type of a NumPy array is numpy. ndarray ; this is just the type of Python object it is (similar to how type("hello") is str for example). dtype just defines how bytes in memory will be interpreted by a scalar (i.e. a single number) or an array and the way in which the bytes will be treated (e.g. int / float ).

How do I change Dtype in Python?

In order to change the dtype of the given array object, we will use numpy. astype() function. The function takes an argument which is the target data type.


2 Answers

See the dtypes documentation.

The |S1 and |S2 strings are data type descriptors; the first means the array holds strings of length 1, the second of length 2. The | pipe symbol is the byteorder flag; in this case there is no byte order flag needed, so it's set to |, meaning not applicable.

like image 69
Martijn Pieters Avatar answered Sep 19 '22 17:09

Martijn Pieters


For storing strings of variable length in a numpy array you could store them as python objects. For example:

In [456]: x=np.array(('abagd','ds','asdfasdf'),dtype=np.object_)

In [457]: x[0]
Out[457]: 'abagd'

In [459]: map(len,x)
Out[459]: [5, 2, 8]

In [460]: x[1]=='ds'
Out[460]: True

In [461]: x
Out[461]: array([abagd, ds, asdfasdf], dtype=object)

In [462]: str(x)
Out[462]: '[abagd ds asdfasdf]'

In [463]: x.tolist()
Out[463]: ['abagd', 'ds', 'asdfasdf']

In [464]: map(type,x)
Out[464]: [str, str, str]
like image 44
MrCartoonology Avatar answered Sep 18 '22 17:09

MrCartoonology