NumPy: Alternative to `vectorize` that lets me access the array

Tags: python, pypy, numpy

I have this code:

output_array = np.vectorize(f, otypes='d')(input_array)

And I'd like to replace it with this code, which is supposed to give the same output:

output_array = np.ndarray(input_array.shape, dtype='d')
for i, item in enumerate(input_array):
    output_array[i] = f(item)

The reason I want the second version is that I can then start iterating on output_array in a separate thread, while it's being calculated. (Yes, I know about the GIL, that part is taken care of.)

Unfortunately, the for loop is very slow, even when I'm not processing the data on a separate thread. I benchmarked it on both CPython and PyPy3 (PyPy3 is my target platform). On CPython it's 3 times slower than vectorize, and on PyPy3 it's 67 times slower than vectorize!

That's despite the fact that the NumPy documentation says: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."

Any idea why my implementation is slow, and how to make a fast implementation that still allows me to use output_array before it's finished?

asked Jul 10 '20 06:07

Ram Rachum

1 Answer

Sebastian Berg gave me a solution: when iterating over items from the input array, pass item.item() to f rather than just item. This converts each numpy.float64 object to a plain Python float, which makes everything much faster and solves my particular problem :)
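For illustration, here is a minimal sketch of the loop from the question with that change applied. The function f below is a hypothetical stand-in for the real per-element function, fill_elementwise is a name made up for this example, and the sketch assumes a one-dimensional float64 input_array. No timings are claimed here; the speedup depends on the interpreter and on what f does.

```python
import numpy as np


def f(x):
    # Hypothetical stand-in for the real per-element function.
    return x * x + 1.0


def fill_elementwise(func, input_array, output_array):
    """Fill output_array in place, one element at a time (1-D input assumed)."""
    for i, item in enumerate(input_array):
        # Iterating a float64 array yields numpy.float64 objects;
        # .item() converts each one to a plain Python float before calling func.
        output_array[i] = func(item.item())
    return output_array


input_array = np.linspace(0.0, 1.0, 1000)
output_array = np.empty(input_array.shape, dtype='d')
fill_elementwise(f, input_array, output_array)

# Reference result from the original one-liner, for comparison.
expected = np.vectorize(f, otypes='d')(input_array)
```

Because output_array is allocated up front and filled in index order, another thread can read the already-written prefix while the loop is still running, which is the property the question asks for. Coordinating that reader (for example, sharing the index of the last written element) is left out of this sketch.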

answered Oct 25 '22 18:10

Ram Rachum
