Imagine you have a structured numpy array, generated from a csv with the first row as field names. The array has the form:
dtype([('A', '<f8'), ('B', '<f8'), ('C', '<f8'), ..., ('n','<f8'])
Now, let's say you want to remove the i-th column from this array. Is there a convenient way to do that?
I'd like it to work like delete:
new_array = np.delete(old_array, 'i')
Any ideas?
To delete an element from a NumPy array, use np.delete(array_name, index_value), where array_name is the array to delete from and index_value is the index of the element to delete.
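As a minimal sketch of that call (the array contents are made up for illustration), np.delete takes positional indices and, for multi-dimensional arrays, an optional axis. It returns a new array and leaves the input unchanged. Note that it works on positions along an axis, not on the field names of a structured array, which is why the question needs a different approach.

```python
import numpy as np

# Hypothetical sample data, not from the original question.
arr = np.array([10, 20, 30, 40])
without_second = np.delete(arr, 1)  # drop the element at index 1

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
without_middle_col = np.delete(grid, 1, axis=1)  # drop column 1 of a 2-D array

print(without_second)
print(without_middle_col)
```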
NumPy arrays often contain NaN (not-a-number) values that need to be removed before further processing. In a regular 2-D array, all columns containing NaN values can be removed by combining the bitwise NOT operator (~) with the np.isnan() function.
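A small sketch of the NaN approach, assuming a plain 2-D float array (the data is invented for illustration): build a boolean mask of the columns that contain any NaN, invert it with ~, and use it to index the columns.

```python
import numpy as np

# Hypothetical sample data: the middle column contains a NaN.
data = np.array([[1.0, np.nan, 3.0],
                 [4.0, 5.0, 6.0]])

has_nan = np.isnan(data).any(axis=0)  # True for each column holding a NaN
cleaned = data[:, ~has_nan]           # keep only the columns without NaN

print(cleaned)
```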
It's not quite a single function call, but the following shows one way to drop the i-th field:
In [67]: a
Out[67]:
array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
In [68]: i = 1 # Drop the 'B' field
In [69]: names = list(a.dtype.names)
In [70]: names
Out[70]: ['A', 'B', 'C']
In [71]: new_names = names[:i] + names[i+1:]
In [72]: new_names
Out[72]: ['A', 'C']
In [73]: b = a[new_names]
In [74]: b
Out[74]:
array([(1.0, 3.0), (4.0, 6.0)],
dtype=[('A', '<f8'), ('C', '<f8')])
Wrapped up as a function:
def remove_field_num(a, i):
    names = list(a.dtype.names)
    new_names = names[:i] + names[i+1:]
    b = a[new_names]
    return b
It might be more natural to remove a given field name:
def remove_field_name(a, name):
    names = list(a.dtype.names)
    if name in names:
        names.remove(name)
    b = a[names]
    return b
Also, check out the rec_drop_fields function that was part of the mlab module in older versions of matplotlib.
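NumPy itself also ships helpers for this in numpy.lib.recfunctions. The following is a sketch rather than part of the original answer, and it assumes a NumPy version that includes both drop_fields and repack_fields. drop_fields returns a new array without the named fields; repack_fields removes the padding that multi-field indexing can leave in the dtype on recent NumPy versions.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

a = np.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
             dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])

# Drop the 'B' field; usemask=False asks for a plain ndarray
# instead of a masked array.
b = rfn.drop_fields(a, 'B', usemask=False)

# Alternative: select the fields to keep, then repack so the result
# has a compact dtype with no gap where 'B' used to be.
c = rfn.repack_fields(a[['A', 'C']])

print(b.dtype.names)
print(c.dtype.itemsize)
```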
Update: See my answer at How to remove a column from a structured numpy array *without copying it*? for a method to create a view of subsets of the fields of a structured array without making a copy of the array.
Having googled my way here and learned what I needed to know from Warren's answer, I couldn't resist posting a more succinct version, with the added option to remove multiple fields efficiently in one go:
def rmfield(a, *fieldnames_to_remove):
    return a[[name for name in a.dtype.names if name not in fieldnames_to_remove]]
Examples:
a = rmfield(a, 'foo')
a = rmfield(a, 'foo', 'bar') # remove multiple fields at once
Or if we're really going to golf it, the following is equivalent:
rmfield=lambda a,*f:a[[n for n in a.dtype.names if n not in f]]
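For a quick check of the function above, here is a self-contained sketch; the field names foo, bar and baz and the values are made up for illustration.

```python
import numpy as np

def rmfield(a, *fieldnames_to_remove):
    # Keep every field whose name was not passed in for removal.
    return a[[name for name in a.dtype.names if name not in fieldnames_to_remove]]

# Hypothetical sample array with three fields.
a = np.array([(1, 2.0, 3.0), (4, 5.0, 6.0)],
             dtype=[('foo', '<i8'), ('bar', '<f8'), ('baz', '<f8')])

one_removed = rmfield(a, 'foo')
two_removed = rmfield(a, 'foo', 'bar')

print(one_removed.dtype.names)
print(two_removed.dtype.names)
```

Names that are not fields of the array are simply ignored, since the list comprehension only filters the existing names.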