What is the default dtype for str-like input in NumPy?

I want to confirm whether the default data type for strings is Unicode when creating an ndarray. I could not find any reference that states this clearly. Maybe it is too obvious to need stating.

When dtype is specified:

>>> import numpy as np
>>> g = np.array([['a', 'b'],['c', 'd']], dtype='S')
>>> g
array([[b'a', b'b'],
       [b'c', b'd']], 
      dtype='|S1')

Without specifying the dtype:

>>> g = np.array([['a', 'b'],['c', 'd']])
>>> g
array([['a', 'b'],
       ['c', 'd']], 
      dtype='<U1')

Also, what does the literal b indicate when the dtype is specified? As per the documentation, it indicates bool, which doesn't seem to be the case here.

Can someone please clarify?

asked Sep 05 '17 09:09

Isha Garg

1 Answer

b'...' means the value is a byte string, not a bool. The default dtype for an array of strings depends on the kind of string you pass in: Unicode strings (Python 3 str is Unicode) get the dtype U, while Python 2 str and Python 3 bytes get the dtype S. The NumPy documentation explains the dtype codes as follows:

Array-protocol type strings

The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are:

'?' boolean

'b' (signed) byte

'B' unsigned byte

'i' (signed) integer

'u' unsigned integer

'f' floating-point

'c' complex-floating point

'm' timedelta

'M' datetime

'O' (Python) objects

'S', 'a' zero-terminated bytes (not recommended)

'U' Unicode string

'V' raw data (void)

In your first case, however, you forced NumPy to convert the input to bytes by specifying dtype='S', which is why the elements are displayed with the b prefix.
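To check this yourself, here is a minimal sketch that assumes Python 3 and a reasonably recent NumPy. It inspects the inferred dtype through `dtype.kind` and `dtype.itemsize` rather than the printed form, because the byte-order prefix (`<` or `>`) depends on the platform. The variable names are only illustrative.

```python
import numpy as np

# Python 3 str input: NumPy infers a Unicode dtype (kind 'U'),
# sized to the longest string in the input.
u = np.array([['a', 'b'], ['c', 'd']])
print(u.dtype.kind, u.dtype.itemsize)   # U 4  (one character, 4 bytes each)

# bytes input: NumPy infers a byte-string dtype (kind 'S').
s = np.array([b'a', b'bcd'])
print(s.dtype.kind, s.dtype.itemsize)   # S 3  (longest item is 3 bytes)

# Forcing dtype='S' converts str input to bytes, hence the b'...' display.
forced = np.array(['a', 'b'], dtype='S')
print(forced.dtype.kind)                # S

# For ASCII content, astype('U') converts the bytes back to Unicode.
decoded = forced.astype('U')
print(decoded.dtype.kind)               # U
```

Note that for U the number in the type string counts characters, while `itemsize` reports bytes, which is why a `U1` element occupies 4 bytes.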

answered Nov 14 '22 22:11

MSeifert

Tags: python, string, numpy, numpy-dtype

Isha Garg

People also ask

1 Answers

Array-protocol type strings

MSeifert

Recent Activity

Donate For Us