
How to remove a column from a structured numpy array *without copying it*?

Given a structured numpy array, I want to remove certain columns by name without copying the array. I know I can do this:

names = list(a.dtype.names)
if name_to_remove in names:
    names.remove(name_to_remove)
a = a[names]

But this creates a temporary copy of the array which I want to avoid because the array I am dealing with might be very large.

Is there a good way to do this?

Konstantin Schubert asked May 06 '16 18:05



1 Answer

You can create a new data type that contains only the fields you want, with the same field offsets and the same itemsize as the original array's data type, and then use this new data type to create a view of the original array. np.dtype accepts its argument in several formats; the relevant one is described in the section of the NumPy documentation called "Specifying and constructing data types". Scroll down to the subsection that begins with

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}
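As a minimal sketch of that dictionary form (the three-field layout and the variable names full, partial, a and b are made up for illustration), the following builds a dtype that names only two of three fields while keeping the original offsets and itemsize, then uses it to view an existing array:

```python
import numpy as np

# Hypothetical original layout: three 8-byte fields, 24 bytes per record.
full = np.dtype([('x', '<f8'), ('y', '<f8'), ('i', '<i8')])

# Same record size, but only 'x' and 'i' are named.
# Bytes 8..15 (where 'y' lives) are left unnamed and simply skipped.
partial = np.dtype({'names': ['x', 'i'],
                    'formats': ['<f8', '<i8'],
                    'offsets': [0, 16],
                    'itemsize': full.itemsize})

a = np.zeros(3, dtype=full)
a['i'] = [1, 2, 3]

# Reinterpret the same buffer with the reduced dtype; no data is copied.
b = a.view(partial)
```

Because the itemsize is unchanged, the view is legal and each record of b lines up exactly with the corresponding record of a.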

Here are a couple of convenience functions that use this idea.

import numpy as np


def view_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to keep.

    Returns a view of the array `a` (not a copy).
    """
    dt = a.dtype
    formats = [dt.fields[name][0] for name in names]
    offsets = [dt.fields[name][1] for name in names]
    itemsize = a.dtype.itemsize
    newdt = np.dtype(dict(names=names,
                          formats=formats,
                          offsets=offsets,
                          itemsize=itemsize))
    b = a.view(newdt)
    return b


def remove_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to remove.

    Returns a view of the array `a` (not a copy).
    """
    dt = a.dtype
    keep_names = [name for name in dt.names if name not in names]
    return view_fields(a, keep_names)

For example,

In [297]: a
Out[297]: 
array([(10.0, 13.5, 1248, -2), (20.0, 0.0, 0, 0), (30.0, 0.0, 0, 0),
       (40.0, 0.0, 0, 0), (50.0, 0.0, 0, 999)], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

In [298]: b = remove_fields(a, ['i', 'j'])

In [299]: b
Out[299]: 
array([(10.0, 13.5), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (50.0, 0.0)], 
      dtype={'names':['x','y'], 'formats':['<f8','<f8'], 'offsets':[0,8], 'itemsize':32})

Verify that b is a view (not a copy) of a by changing b[0]['x']...

In [300]: b[0]['x'] = 3.14

and seeing that a is also changed:

In [301]: a[0]
Out[301]: (3.14, 13.5, 1248, -2)
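
The same check can be done without modifying any data. The sketch below is self-contained: it rebuilds a zero-filled array with the field layout shown above, constructs the reduced dtype inline instead of calling the helper functions, and uses np.shares_memory to confirm that no copy was made. The variable names keep, view_dt, shares and same_strides are illustrative only.

```python
import numpy as np

# Zero-filled stand-in with the same field layout as the example array.
a = np.zeros(5, dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

dt = a.dtype
keep = ['x', 'y']

# Reduced dtype: original formats and offsets, original itemsize.
view_dt = np.dtype({'names': keep,
                    'formats': [dt.fields[n][0] for n in keep],
                    'offsets': [dt.fields[n][1] for n in keep],
                    'itemsize': dt.itemsize})
b = a.view(view_dt)

# True if a and b overlap in memory, i.e. b is a view rather than a copy.
shares = np.shares_memory(a, b)

# The view steps through memory one full original record at a time.
same_strides = b.strides == a.strides
```

Note that b.dtype.itemsize is still 32, so the view only hides the removed fields; the bytes they occupy remain allocated for as long as the underlying buffer is alive. Separately, NumPy 1.16 and later document that multi-field indexing such as a[['x', 'y']] also returns a view, so on recent versions the original a[names] approach may already avoid the copy.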

Warren Weckesser answered Oct 15 '22 14:10