numpy -- Transform non-contiguous data to contiguous data in place

Tags: numpy

Consider the following code:

import numpy as np
a = np.zeros(50)
a[10:20:2] = 1
b = c = a[10:40:4]
print(b.flags)  # b and c are neither C_CONTIGUOUS nor F_CONTIGUOUS

My question:

Is there a way (with only a reference to b) to make both b and c contiguous? It is completely fine if np.may_share_memory(b,a) returns False after this operation.

np.ascontiguousarray and np.asfortranarray come close, but they don't quite work because they return a new array instead of changing b in place.
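To illustrate why these functions fall short of an in-place conversion, here is a minimal sketch (the name `d` is introduced only for this example). `np.ascontiguousarray` hands back a separate contiguous array, while `b`, and therefore `c`, remains a strided view of `a`.

```python
import numpy as np

a = np.zeros(50)
a[10:20:2] = 1
b = c = a[10:40:4]               # b and c name the same strided view of a

d = np.ascontiguousarray(b)      # a new, contiguous array holding b's values

print(b.flags['C_CONTIGUOUS'])   # b itself is unchanged by the call
print(d.flags['C_CONTIGUOUS'])   # only the returned array is contiguous
print(np.shares_memory(b, a))    # b still references a's buffer
print(np.shares_memory(d, a))    # d has its own buffer
```

Rebinding `b = np.ascontiguousarray(b)` would fix `b`, but `c` would still point at the old view.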

My use case is that I have very large 3D fields stored in a subclass of numpy.ndarray. To save memory, I would like to chop those fields down to the portion of the domain that I am actually interested in processing:

a = a[ix1:ix2,iy1:iy2,iz1:iz2]

Slicing for the subclass is somewhat more restricted than slicing of ndarray objects, but this should work and it will "do the right thing" -- the various custom meta-data attached on the subclass will be transformed/preserved as expected. Unfortunately, since this returns a view, numpy won't free the big array afterward so I don't actually save any memory here.

To be completely clear, I'm looking to accomplish 2 things:

1. Preserve the metadata on my class instance. Slicing will work, but I'm not sure about other forms of copying.
2. Make it so that the original array is free to be garbage collected.
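As a rough illustration of both goals together, the sketch below uses a hypothetical subclass `Field` with a single hypothetical metadata attribute `units`; the real subclass is not shown in the question and may do more work when sliced. It assumes the metadata is propagated through `__array_finalize__`, which NumPy calls for views and for copies made with `.copy()`.

```python
import numpy as np

class Field(np.ndarray):
    """Hypothetical ndarray subclass carrying one metadata attribute."""

    def __new__(cls, data, units=None):
        obj = np.asarray(data).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called for views and copies; propagate metadata from the source.
        if obj is not None:
            self.units = getattr(obj, "units", None)

big = Field(np.zeros((50, 50, 50)), units="K")
view = big[10:20, 10:20, 10:20]   # a view: keeps the big buffer alive
small = view.copy()               # same subclass, but owns its own data

print(type(small).__name__, small.units)
print(np.shares_memory(view, big), np.shares_memory(small, big))
print(small.flags["C_CONTIGUOUS"], small.flags["OWNDATA"])

# Once no name refers to the big array or its views, CPython can free it.
del big, view
```

This rebinds a name to a new object rather than converting anything in place, so every other reference to the old view has to be dropped as well.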


asked Mar 15 '13 00:03

mgilson

1 Answer

According to Alex Martelli:

"The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates."
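The subprocess idea in the quote could be sketched roughly as follows. The function name `load_and_crop`, the shapes, and the use of `np.zeros` as a stand-in for loading the real field are all assumptions made for illustration. Only the cropped, contiguous result is sent back to the parent process.

```python
import multiprocessing as mp
import numpy as np

def load_and_crop(shape, region):
    """Hypothetical worker: build the big field, return only the cropped part."""
    big = np.zeros(shape)                      # stand-in for loading the real field
    return np.ascontiguousarray(big[region])   # contiguous copy of the region

if __name__ == "__main__":
    region = (slice(10, 20), slice(10, 20), slice(10, 20))
    # The big allocation lives only in the worker process, which is shut
    # down when the pool's context manager exits.
    with mp.Pool(processes=1) as pool:
        small = pool.apply(load_and_crop, ((100, 100, 100), region))
    print(small.shape, small.flags["C_CONTIGUOUS"])
```

The result is pickled on its way back to the parent, so custom metadata on an ndarray subclass would survive only if the subclass supports pickling it.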

However, the following appears to free at least some of the memory. Warning: my way of measuring free memory is Linux-specific.

import time
import numpy as np

def free_memory():
    """
    Return free memory available, including buffer and cached memory
    """
    total = 0
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            line = line.strip()
            if any(line.startswith(field) for field in ('MemFree', 'Buffers', 'Cached')):
                field, amount, unit = line.split()
                amount = int(amount)
                if unit != 'kB':
                    raise ValueError(
                        'Unknown unit {u!r} in /proc/meminfo'.format(u=unit))
                total += amount
    return total

def gen_change_in_memory():
    """
    https://stackoverflow.com/a/14446011/190597 (unutbu)
    """
    f = free_memory()
    diff = 0
    while True:
        yield diff
        f2 = free_memory()
        diff = f - f2
        f = f2
change_in_memory = gen_change_in_memory().__next__

Before allocating the large array:

print(change_in_memory())
# 0

a = np.zeros(500000)
a[10:20:2] = 1
b = c = a[10:40:4]

After allocating the large array:

print(change_in_memory())
# 3844 # KiB

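a[:len(b)] = b   # copy the wanted values to the front of a
b = a[:len(b)]   # rebind b to the contiguous front of a
a.resize(len(b), refcheck=False)  # shrink a in place, skipping the reference check
time.sleep(1)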

Free memory increases after resizing:

print(change_in_memory())
# -3708 # KiB
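A variant of the steps above, offered as a sketch rather than a tested recipe: it works on `a` directly and keeps no view alive across the `resize` call. NumPy's documentation describes `refcheck=False` as safe only when the array's memory is not shared with another object, so any views needed afterward are better re-created from the resized array. It assumes `a` owns its data; the name `n` is introduced only here.

```python
import numpy as np

a = np.zeros(500000)
a[10:20:2] = 1
n = len(a[10:40:4])            # number of elements to keep

a[:n] = a[10:40:4]             # move the wanted values to the front of the buffer
a.resize(n, refcheck=False)    # shrink in place; skips the reference check

print(a.shape, a.flags["C_CONTIGUOUS"], a.flags["OWNDATA"])
```

After this, `a` itself is the small contiguous array. Whether the freed memory is returned to the operating system still depends on the allocator.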


answered Sep 19 '22 12:09

unutbu
