
How to decode a numpy array of dtype=numpy.string_?

I need to decode, with Python 3, a string that was encoded the following way:

>>> s = numpy.asarray(numpy.string_("hello\nworld"))
>>> s
array(b'hello\nworld', 
      dtype='|S11')

I tried:

>>> str(s)
"b'hello\\nworld'"

>>> s.decode()
AttributeError                            Traceback (most recent call last)
<ipython-input-31-7f8dd6e0676b> in <module>()
----> 1 s.decode()

AttributeError: 'numpy.ndarray' object has no attribute 'decode'

>>> s[0].decode()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-34-fae1dad6938f> in <module>()
----> 1 s[0].decode()

IndexError: 0-d arrays can't be indexed

PiRK asked Oct 03 '16 12:10


2 Answers

Another option is the np.char collection of string operations.

In [255]: np.char.decode(s)
Out[255]: 
array('hello\nworld', 
      dtype='<U11')

np.char.decode accepts an encoding keyword if you need one. If you don't need to specify an encoding, .astype (for example s.astype('U')) is probably the simpler choice.
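As a rough sketch of the two options above, the snippet below decodes UTF-8 bytes with an explicit encoding and converts ASCII-only bytes with astype. It uses np.bytes_ in place of numpy.string_ because the string_ alias was removed in NumPy 2.0; the variable names and the sample text are only illustrative.

```python
import numpy as np

# np.bytes_ is the current name for the type the question calls numpy.string_.
raw = np.asarray(np.bytes_("café\nworld".encode("utf-8")))

# Explicit encoding: each element is decoded as UTF-8.
decoded = np.char.decode(raw, encoding="utf-8")
print(repr(decoded.item()))

# ASCII-only data: astype("U") converts without an encoding argument.
ascii_raw = np.asarray(np.bytes_(b"hello\nworld"))
converted = ascii_raw.astype("U")
print(converted.dtype.kind)
```

Both calls return arrays with the same shape as the input, so a 0d input still gives a 0d result.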

This s is 0d (its shape is ()), so s[0] fails; it has to be indexed with an empty tuple, s[()].

In [268]: s[()]
Out[268]: b'hello\nworld'
In [269]: s[()].decode()
Out[269]: 'hello\nworld'

s.item() also returns the bytes object, so s.item().decode() works too.
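Put together, a minimal sketch of pulling the scalar out of a 0d array and decoding it might look like this. np.bytes_ stands in for numpy.string_ (the alias was removed in NumPy 2.0), and via_index and via_item are just illustrative names.

```python
import numpy as np

s = np.asarray(np.bytes_(b"hello\nworld"))
print(s.shape)  # a 0d array has an empty shape tuple

# Indexing with an empty tuple returns the single element as a bytes scalar.
via_index = s[()].decode()

# item() returns the element as a plain Python bytes object.
via_item = s.item().decode()

print(repr(via_index), repr(via_item))
```

Both routes end in an ordinary bytes.decode() call, so the usual encoding and errors arguments are available there as well.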

hpaulj answered Sep 28 '22 07:09


In Python 3, two types represent sequences of characters: bytes (raw bytes) and str (Unicode text). When you use string_ as the type, numpy stores and returns bytes. If you want a regular str, use numpy's unicode_ type instead:

>>> s = numpy.asarray(numpy.unicode_("hello\nworld"))
>>> s
array('hello\nworld', 
      dtype='<U11')

>>> str(s)
'hello\nworld'

Note that if you don't specify a type for the string (string_ or unicode_), numpy uses the type of the Python object you pass in. In Python 3.x a plain string literal is a str (Unicode), so the array gets a Unicode dtype:

>>> s = numpy.asarray("hello\nworld")
>>> str(s)
'hello\nworld'
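One way to check which of the two types an array holds is to look at its dtype kind. The sketch below is only an illustration with made-up variable names; it passes a bytes literal and a str literal to numpy.asarray without naming a numpy type.

```python
import numpy as np

# A bytes literal produces a byte-string dtype (kind "S").
as_bytes = np.asarray(b"hello\nworld")

# A str literal produces a Unicode dtype (kind "U").
as_text = np.asarray("hello\nworld")

print(as_bytes.dtype.kind, as_text.dtype.kind)
print(type(as_bytes.item()).__name__, type(as_text.item()).__name__)
```

An array of kind "S" still has to be decoded before it can be used as text, while an array of kind "U" already holds str values.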

Mazdak answered Sep 28 '22 07:09