How to extend an array in-place in Numpy?

Tags:

python

arrays

numpy

scipy

Currently, I have some code like this

import numpy as np

ret = np.array([])
for i in range(100000):
    tmp = get_input(i)
    ret = np.append(ret, np.zeros(len(tmp)))
    ret = np.append(ret, np.ones(fixed_length))

I think this code is inefficient, because np.append has to return a copy of the array instead of modifying ret in place.

I was wondering whether there is something like extend for a NumPy array that I could use like this:

import numpy as np
from somewhere import np_extend

ret = np.array([])
for i in range(100000):
    tmp = get_input(i)
    np_extend(ret, np.zeros(len(tmp)))
    np_extend(ret, np.ones(fixed_length))

That way, extending the array would be much more efficient. Does anyone have ideas about this? Thanks!

asked Nov 04 '12 02:11

Hanfei Sun

2 Answers

Imagine a numpy array as occupying one contiguous block of memory. Now imagine other objects, say other numpy arrays, which are occupying the memory just to the left and right of our numpy array. There would be no room to append to or extend our numpy array. The underlying data in a numpy array always occupies a contiguous block of memory.

So any request to append to or extend our numpy array can only be satisfied by allocating a whole new larger block of memory, copying the old data into the new block and then appending or extending.

So:

1. It will not occur in-place.
2. It will not be efficient.
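To make the copying concrete, here is a minimal sketch. The first part only checks that np.append hands back a new array and leaves its input alone. The second part shows one common workaround, which is not something the answer above proposes: collect the pieces in a Python list and call np.concatenate once at the end. The names lengths and fixed_length are stand-ins for the question's get_input(i) and fixed_length.

```python
import numpy as np

a = np.zeros(3)
b = np.append(a, np.ones(2))  # allocates a new array; a is left untouched

print(a.shape, b.shape)        # shapes of the original and the appended result
print(np.shares_memory(a, b))  # whether the two arrays share a buffer

# Workaround sketch: gather the pieces in a list, then copy once at the end.
fixed_length = 2      # stand-in for the question's fixed_length
lengths = [3, 1, 4]   # stand-ins for len(get_input(i))

chunks = []
for n in lengths:
    chunks.append(np.zeros(n))
    chunks.append(np.ones(fixed_length))

ret = np.concatenate(chunks)  # a single allocation and a single copy
```

Appending to a Python list is cheap, so the total work is one pass over the data instead of one full copy per loop iteration.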

answered Sep 30 '22 16:09

unutbu

You can use the .resize() method of ndarrays. It requires that the memory is not referred to by other arrays/variables.

import numpy as np

ret = np.array([])
for i in range(100):
    tmp = np.random.rand(np.random.randint(1, 100))
    # ret is not referred to by anything else, so this works
    ret.resize(len(ret) + len(tmp))
    ret[-len(tmp):] = tmp

The efficiency can be improved by using the usual array memory overallocation schemes, for example growing the capacity geometrically so that reallocations become rare.
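As a rough illustration of what such an overallocation scheme could look like, here is a sketch of a growable buffer that doubles its capacity when it runs out of room. GrowableArray is a hypothetical helper, not part of NumPy, and it assumes 1-D float data. Note that it allocates a fresh buffer with np.empty and copies, rather than calling .resize(), which sidesteps the reference restriction mentioned above.

```python
import numpy as np


class GrowableArray:
    """Hypothetical helper: a 1-D float buffer with amortized O(1) appends."""

    def __init__(self, capacity=16):
        self._buf = np.empty(max(capacity, 1), dtype=float)
        self._size = 0  # number of valid elements at the front of _buf

    def extend(self, values):
        values = np.asarray(values, dtype=float).ravel()
        needed = self._size + len(values)
        if needed > len(self._buf):
            # Grow geometrically, so only O(log n) reallocations ever happen.
            new_cap = max(needed, 2 * len(self._buf))
            new_buf = np.empty(new_cap, dtype=float)
            new_buf[:self._size] = self._buf[:self._size]
            self._buf = new_buf
        self._buf[self._size:needed] = values
        self._size = needed

    def to_array(self):
        # Copy out only the filled part, so later growth cannot affect it.
        return self._buf[:self._size].copy()


fixed_length = 3  # stand-in for the question's fixed_length
acc = GrowableArray()
for n in (2, 0, 4):  # stand-ins for len(get_input(i))
    acc.extend(np.zeros(n))
    acc.extend(np.ones(fixed_length))
result = acc.to_array()
```

The buffer still gets copied now and then, but because the capacity doubles, the total copying stays proportional to the final size rather than growing quadratically.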

answered Sep 30 '22 17:09

pv.
