Let's say I have arrays a and b:
a = np.array([1,2,3])
b = np.array(['red','red','red'])
If I were to apply some fancy indexing like this to these arrays
b[a<3]="blue"
the output I get is
array(['blu', 'blu', 'red'], dtype='<U3')
I understand that this happens because numpy initially allocates space for only 3 characters per element, so it can't fit the whole word "blue" into the array. What workaround can I use?
Currently I am doing
b = np.array([" "*100 for i in range(3)])
b[a>2] = "red"
b[a<3] = "blue"
but that's just a workaround. Is this a fault in my code, or is it an issue with numpy? How can I fix it?
You can handle variable-length strings by setting the dtype of b to "object":
import numpy as np
a = np.array([1,2,3])
b = np.array(['red','red','red'], dtype="object")
b[a<3] = "blue"
print(b)
This outputs:
['blue' 'blue' 'red']
This dtype will handle strings or other general Python objects. It also necessarily means that under the hood you'll have a numpy array of pointers, so don't expect the performance you get when using a primitive datatype.
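If the maximum string length is known in advance, a wider fixed-width dtype may be enough and keeps the strings stored natively. The following is only a minimal sketch of that idea: the width of 10 characters and the names wide and picked are arbitrary choices for illustration, not something from the question.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array(["red", "red", "red"])  # dtype is inferred as a 3-character string

# Option 1: widen the fixed-width dtype before assigning.
# "U10" (up to 10 characters) is an assumed upper bound for this sketch.
wide = b.astype("U10")
wide[a < 3] = "blue"
print(wide)  # ['blue' 'blue' 'red']

# Option 2: build a new array instead of assigning in place.
# np.where chooses a string dtype wide enough for both values.
picked = np.where(a < 3, "blue", "red")
print(picked)  # ['blue' 'blue' 'red']
```

Note that a fixed-width dtype still truncates silently if a later value is longer than the chosen width, so dtype="object" remains the safer choice when lengths are unpredictable.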