
Filter numpy structured array based on unique elements in one dimension

So I have a rather large (200k+ rows) structured array:

import numpy as np

recordtype = np.dtype([('x', np.float32), ('y', np.float32), ('z', np.float32),
                       ('u', np.float32), ('v', np.float32), ('w', np.float32),
                       ('d', np.float32), ('T', np.float32), ('mdot', np.float32),
                       ('f', np.float32), ('t', np.float32), ('name', np.str_, 14)])
data = np.loadtxt('tmp2.out', dtype=recordtype, skiprows=2)

The 'name' column contains duplicate entries: len(data['name']) is larger than len(set(data['name'])). I would like to create a new array that keeps only one row per unique name; keeping the first occurrence is fine. How can I do this efficiently?

FrenchKheldar asked Feb 02 '26 21:02



1 Answer

To get the index of each unique name, you can use np.unique with return_index=True:

unique_elements, indices = np.unique(data['name'], return_index=True)

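indices now holds the position of the first occurrence of each unique name, so you can select just those rows. Note that np.unique returns its results sorted by name, so the selected rows come back in name order rather than in their original order: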

data = data[indices]
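
If the original row order matters, one option is to sort the indices before selecting. The following is a minimal sketch of that idea on a made-up four-row array; the three-field recordtype and the sample values are assumptions for illustration only, not the question's actual data:

```python
import numpy as np

# Hypothetical miniature of the question's record layout (only three fields).
recordtype = np.dtype([('x', np.float32), ('t', np.float32), ('name', np.str_, 14)])
data = np.array(
    [(0.0, 0.1, 'p1'), (1.0, 0.2, 'p2'), (2.0, 0.3, 'p1'), (3.0, 0.4, 'p0')],
    dtype=recordtype,
)

# return_index=True gives, for each unique name, the index of its first occurrence.
_, first_idx = np.unique(data['name'], return_index=True)

# np.unique orders its results by sorted name; sorting the indices
# restores the order in which the rows appeared in the original array.
unique_rows = data[np.sort(first_idx)]
print(unique_rows['name'])
```

Here the duplicate 'p1' row (the third one) is dropped and the remaining rows keep their original relative order. Both steps are vectorized, so this should scale to an array of a few hundred thousand rows without an explicit Python loop.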
Hammer answered Feb 04 '26 11:02



