
Prevent strings being truncated when replacing values in a numpy array

Let's say I have arrays a and b:

a = np.array([1,2,3])
b = np.array(['red','red','red'])

If I were to apply some fancy indexing like this to these arrays

b[a<3]="blue"

the output I get is

array(['blu', 'blu', 'red'], dtype='<U3')

I understand that this happens because numpy initially allocates space for only 3 characters per element, so it can't fit the whole word "blue" into the array. What workaround can I use?

Currently I am doing

b = np.array([" "*100 for i in range(3)])
b[a>2] = "red"
b[a<3] = "blue"

but it's just a workaround. Is this a fault in my code, or an issue with numpy? How can I fix this?

Imtinan Azhar asked Oct 15 '25 03:10



1 Answer

You can handle variable length strings by setting the dtype of b to be "object":

import numpy as np
a = np.array([1,2,3])
b = np.array(['red','red','red'], dtype="object")

b[a<3] = "blue"

print(b)

This outputs:

['blue' 'blue' 'red']

This dtype will handle strings, or other general Python objects. This also necessarily means that under the hood you'll have a numpy array of pointers, so don't expect the performance you get when using a primitive datatype.
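If you would rather keep a fixed-width string dtype instead of "object", here is a minimal sketch of two alternatives that rely only on standard numpy calls. The variable names `wide` and `picked` are just illustrative, and the width `'<U4'` is an assumption chosen to fit "blue"; pick a width that covers the longest string you expect to store.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array(['red', 'red', 'red'])   # dtype is inferred as '<U3'

# Option 1: widen the fixed-width dtype first, then assign in place.
# '<U4' is an assumed width, large enough for "blue".
wide = b.astype('<U4')
wide[a < 3] = "blue"

# Option 2: build a new array with np.where, which chooses a result
# dtype wide enough for both "blue" and the existing elements.
picked = np.where(a < 3, "blue", b)

print(wide, wide.dtype)
print(picked, picked.dtype)
```

Option 1 still truncates silently if you later assign a string longer than the chosen width, so it fits best when the maximum length is known up front.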

Matt Messersmith answered Oct 17 '25 15:10



