How to remove a column from a structured numpy array without copying it?

Tags: python, arrays, numpy

Given a structured numpy array, I want to remove certain columns by name without copying the array. I know I can do this:

names = list(a.dtype.names)
if name_to_remove in names:
    names.remove(name_to_remove)
a = a[names]

But this creates a temporary copy of the array, which I want to avoid because the array I am dealing with might be very large.

Is there a good way to do this?

asked May 06 '16 18:05

Konstantin Schubert

1 Answer

You can create a new data type containing just the fields that you want, with the same field offsets and the same itemsize as the original array's data type, and then use this new data type to create a view of the original array. The dtype function handles arguments with many formats; the relevant one is described in the section of the documentation called "Specifying and constructing data types". Scroll down to the subsection that begins with

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}
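As a minimal sketch of that dictionary form, here is how a data type that keeps only two fields of a larger record might be written. The field names and the 32-byte layout are assumptions chosen for illustration; they are not taken from the question.

```python
import numpy as np

# Hypothetical original layout: four 8-byte fields, 32 bytes per record.
full = np.dtype([('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

# Keep only 'x' and 'y' at their original byte offsets, and keep the
# original itemsize so each record still spans the full 32 bytes.
partial = np.dtype({'names': ['x', 'y'],
                    'formats': ['<f8', '<f8'],
                    'offsets': [full.fields['x'][1], full.fields['y'][1]],
                    'itemsize': full.itemsize})

print(partial.names)     # field names of the reduced dtype
print(partial.itemsize)  # same as full.itemsize
```

The explicit itemsize is what lets the reduced data type overlay the original records: without it, NumPy would compute a smaller itemsize from the remaining fields and the view would no longer line up with the buffer.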

Here are a couple of convenience functions that use this idea.

import numpy as np


def view_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to keep.

    Returns a view of the array `a` (not a copy).
    """
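    dt = a.dtype
    # Accept any iterable of names (tuple, generator, ...).
    names = list(names)
    # dt.fields maps each field name to a tuple (field dtype, byte offset, ...).
    formats = [dt.fields[name][0] for name in names]
    offsets = [dt.fields[name][1] for name in names]
    # Keep the original itemsize so the view lines up with the existing buffer.
    itemsize = dt.itemsize
    newdt = np.dtype(dict(names=names,
                          formats=formats,
                          offsets=offsets,
                          itemsize=itemsize))
    b = a.view(newdt)
    return b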


def remove_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to remove.

    Returns a view of the array `a` (not a copy).
    """
    dt = a.dtype
    keep_names = [name for name in dt.names if name not in names]
    return view_fields(a, keep_names)

For example,

In [297]: a
Out[297]: 
array([(10.0, 13.5, 1248, -2), (20.0, 0.0, 0, 0), (30.0, 0.0, 0, 0),
       (40.0, 0.0, 0, 0), (50.0, 0.0, 0, 999)], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

In [298]: b = remove_fields(a, ['i', 'j'])

In [299]: b
Out[299]: 
array([(10.0, 13.5), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (50.0, 0.0)], 
      dtype={'names':['x','y'], 'formats':['<f8','<f8'], 'offsets':[0,8], 'itemsize':32})

Verify that b is a view (not a copy) of a by changing b[0]['x']...

In [300]: b[0]['x'] = 3.14

and seeing that a is also changed:

In [301]: a[0]
Out[301]: (3.14, 13.5, 1248, -2)
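Another way to check this, sketched here under the assumption of a small zero-filled array with the same field layout as the example, is to ask NumPy directly whether the two arrays share memory. The reduced data type is built inline, using the same idea as view_fields above, so that the snippet runs on its own.

```python
import numpy as np

# Illustrative array with the same field layout as the example above.
a = np.zeros(5, dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

# Build the reduced dtype inline (same idea as view_fields).
keep = ['x', 'y']
newdt = np.dtype({'names': keep,
                  'formats': [a.dtype.fields[n][0] for n in keep],
                  'offsets': [a.dtype.fields[n][1] for n in keep],
                  'itemsize': a.dtype.itemsize})
b = a.view(newdt)

shares = np.shares_memory(a, b)    # True when b overlaps a's buffer
same_size = (b.nbytes == a.nbytes)  # the hidden fields still occupy memory
print(shares, same_size)
```

Note that the view hides the removed fields but does not free their memory, because the itemsize is unchanged. Also, in NumPy 1.16 and later, multi-field indexing such as a[['x', 'y']] returns a view of this kind by itself, and numpy.lib.recfunctions.repack_fields can be used when a packed copy is wanted instead.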

answered Oct 15 '22 14:10

Warren Weckesser
