I have a NumPy array of strings, A, of length 100; the elements are sentences of different lengths. The elements are plain Python strings, NOT NumPy strings:
>>> type(A[0])
<type 'str'>
I want to find the positions of the strings in A that contain a certain pattern, such as 'zzz'.
I tried
np.core.defchararray.find(A, 'zzz')
which gives the error:
TypeError: string operation on non-string array
I assume I need to convert each 'str' in A to a NumPy string?
Edit:
I want to find the indices of the elements of A in which 'zzz' appears.
No need to be fancy with this; you can get the list of indices with a list comprehension and the in operator:
>>> import numpy as np
>>> lst = ["aaa","aazzz","zzz"]
>>> n = np.array(lst)
>>> [i for i,item in enumerate(n) if "zzz" in item]
[1, 2]
Note that here the elements of the array are actually NumPy strings, but the in operator works for regular strings too, so it's moot.
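If a NumPy array of indices is more convenient than a list, the same membership test can feed a boolean mask. This is a minimal sketch, assuming a small made-up object array A in place of the 100-sentence array from the question:

```python
import numpy as np

# Hypothetical sample data: an object array holding plain Python str values.
A = np.array(["aaa", "aazzz", "zzz", "zz z"], dtype=object)

# Boolean mask: True where the substring occurs in the element.
mask = np.array(["zzz" in s for s in A], dtype=bool)

# Positions of the matching elements, as a NumPy integer array.
indices = np.flatnonzero(mask)
print(indices)
```

The mask can also be used directly to pull out the matching sentences with A[mask].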
The issue here is the nature of your array of strings.
If I make the array like:
In [362]: x=np.array(['one','two','three'])
In [363]: x
Out[363]:
array(['one', 'two', 'three'],
dtype='<U5')
In [364]: type(x[0])
Out[364]: numpy.str_
The elements are a special kind of string, implicitly padded to 5 characters (the length of the longest, 'three'). np.char methods work on this kind of array:
In [365]: np.char.find(x,'one')
Out[365]: array([ 0, -1, -1])
But if I make an object array that contains strings, it produces your error:
In [366]: y=np.array(['one','two','three'],dtype=object)
In [367]: y
Out[367]: array(['one', 'two', 'three'], dtype=object)
In [368]: type(y[0])
Out[368]: str
In [369]: np.char.find(y,'one')
...
/usr/lib/python3/dist-packages/numpy/core/defchararray.py in find(a, sub, start, end)
...
TypeError: string operation on non-string array
And more often than not, an object array has to be treated as a list.
In [370]: y
Out[370]: array(['one', 'two', 'three'], dtype=object)
In [371]: [i.find('one') for i in y]
Out[371]: [0, -1, -1]
In [372]: np.array([i.find('one') for i in y])
Out[372]: array([ 0, -1, -1])
The np.char methods are convenient, but they aren't faster. They still have to iterate through the array applying regular string operations to each element.
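The two cases above can be folded into one function that checks the dtype first. This is only a sketch; substring_indices is a hypothetical helper name, not part of NumPy, and it assumes the array holds text:

```python
import numpy as np

def substring_indices(arr, sub):
    """Return the indices of elements of arr that contain sub (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.dtype.kind == "U":
        # Fixed-width unicode array: np.char.find works on it directly.
        pos = np.char.find(arr, sub)
    else:
        # Object array (or anything else): fall back to plain str.find per element.
        pos = np.array([str(s).find(sub) for s in arr])
    # find returns -1 when the substring is absent.
    return np.flatnonzero(pos >= 0)

x = np.array(["one", "two", "three"])                 # dtype '<U5'
y = np.array(["one", "two", "three"], dtype=object)   # plain str elements
print(substring_indices(x, "o"), substring_indices(y, "o"))
```

Both calls should give the same indices, since the only difference is which code path does the per-element search.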
You can try this one:
np.core.defchararray.find(A.astype(str), 'zzz')
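To get from the find result to the indices asked for in the question, keep the positions that are not -1. A minimal sketch, assuming a small made-up object array and using np.char.find, the public alias for the same function used in the other answer:

```python
import numpy as np

# Hypothetical sample data: an object array of plain Python strings.
A = np.array(["aaa", "aazzz", "zzz"], dtype=object)

# astype(str) converts the object array to a fixed-width unicode array,
# which the np.char functions accept.
pos = np.char.find(A.astype(str), "zzz")

# find gives the offset of the first match, or -1 if there is none.
hits = np.flatnonzero(pos >= 0)
print(pos, hits)
```

Note that astype(str) makes a copy padded to the longest sentence, which may matter for memory if the sentences are very uneven in length.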