
fastest way to populate a 1D numpy array

Tags: python, numpy

I have seen similar questions, but none that address this issue directly. I timed the following two ways of populating an array: about half the time the np.zeros() approach is faster, and half the time creating the array directly is faster. Is one way preferable? I am quite new to numpy arrays and started using them to speed up my code, without giving much thought to readability.

import numpy as np
import time

lis = range(100000)

timer = time.time()
list1 = np.array(lis)
print('normal array creation', time.time() - timer, 'seconds')

timer = time.time()
list2 = np.zeros(len(lis))
list2.fill(lis)
print('zero, fill - array creation', time.time() - timer, 'seconds')

Thank you

Anake asked Dec 02 '11 11:12


2 Answers

If you have a list of floats a=[x/10. for x in range(100000)], then you can create an array with:

np.array(a) # 9.92ms
np.fromiter(a, dtype=float) # 5.19ms

Your approach

list2 = np.zeros(len(lis))
list2.fill(lis)

won't work as expected: .fill() sets every element of the array to one scalar value, so it cannot copy the contents of lis into the array.
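
To illustrate the difference, here is a minimal sketch, assuming the goal was to copy the values of lis into a preallocated array. The names filled, copied and direct are only illustrative and do not come from the question.

```python
import numpy as np

lis = list(range(100000))

# fill() writes one scalar into every element of the array.
filled = np.zeros(len(lis))
filled.fill(7.0)

# Slice assignment copies the sequence element by element
# into the preallocated (float64) array.
copied = np.zeros(len(lis))
copied[:] = lis

# Direct conversion holds the same values, here with an integer dtype.
direct = np.array(lis)

print(filled[:3], copied[:3], direct[:3])
```

Preallocating and then assigning still converts the list once, so it is unlikely to beat np.array(lis) by much; it is mainly useful when the array already exists.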

eumiro answered Oct 12 '22 11:10


np.fromiter will pre-allocate the output array if given the number of elements:

a = [x/10. for x in range(100000)] # 10.3ms
np.fromiter(a, dtype=float) # 3.33ms
np.fromiter(a, dtype=float, count=100000) # 3.03ms
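
Single time.time() measurements like those in the question are noisy, which may explain why neither approach won consistently. The following is a rough sketch of how the three calls could be compared with the standard timeit module; the candidates and results dictionaries and the repeat counts are arbitrary choices for illustration, and the absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np

a = [x / 10.0 for x in range(100000)]

# Each candidate builds the same float64 array from the list.
candidates = {
    "np.array(a)": lambda: np.array(a),
    "np.fromiter(a, dtype=float)": lambda: np.fromiter(a, dtype=float),
    "np.fromiter(a, dtype=float, count=len(a))": lambda: np.fromiter(
        a, dtype=float, count=len(a)
    ),
}

number = 10  # calls per timing run
results = {}
for label, func in candidates.items():
    # repeat() returns one total time per run; the minimum is the
    # measurement least disturbed by other activity on the machine.
    best = min(timeit.repeat(func, number=number, repeat=3)) / number
    results[label] = best
    print(f"{label:45s} {best * 1e3:.3f} ms per call")
```

Taking the minimum over several runs gives a more stable comparison than timing each call once.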

kwgoodman answered Oct 12 '22 11:10