I am looking for the most efficient and shortest way to compute the square root of a sum of squares of two or more numbers. I am currently using numpy
and this code:
np.sqrt(i**2+j**2)
That seems about five times faster than:
np.sqrt(sum(np.square([i,j])))
(i and j are two numbers!)
I was wondering if there is already a built-in function that performs this very common task more efficiently and with even less code.
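For reference, numpy does ship a built-in for exactly this operation: np.hypot computes sqrt(i**2 + j**2) element-wise and is written to avoid overflow for very large inputs. Whether it is faster than the hand-written expression depends on input size, so time it on your own data; a minimal sketch:

```python
import numpy as np

# np.hypot(i, j) returns sqrt(i**2 + j**2), for scalars or arrays
i, j = 3.0, 4.0
print(np.hypot(i, j))  # → 5.0

# it also works element-wise on arrays
a = np.array([3.0, 5.0])
b = np.array([4.0, 12.0])
print(np.hypot(a, b))  # → [ 5. 13.]
```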
For the case of i != j, this is not directly possible with np.linalg.norm, so I recommend the following:
(i*i + j*j)**0.5
If i and j are single floats, this is about 5 times faster than np.sqrt(i**2+j**2). If i and j are numpy arrays, this is about 20% faster (due to replacing the squares with i*i and j*j). If you do not replace the squares, the performance is equal to np.sqrt(i**2+j**2).
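A quick sanity check (my assumption: the two expressions agree to floating-point precision) that the recommended expression really computes the same values:

```python
import numpy as np

i = np.linspace(0.0, 100.0, 1000)
j = np.linspace(100.0, 0.0, 1000)

# recommended expression vs. the original one
a = (i*i + j*j)**0.5
b = np.sqrt(i**2 + j**2)
print(np.allclose(a, b))  # → True
```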
Some timings using single floats:
i = 23.7
j = 7.5e7
%timeit np.sqrt(i**2 + j**2)
# 1.63 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit (i*i + j*j)**0.5
# 336 ns ± 7.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit math.sqrt(i*i + j*j)
# 321 ns ± 8.21 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
math.sqrt is slightly faster than (i*i + j*j)**0.5, but this comes at the cost of losing flexibility: (i*i + j*j)**0.5 will work on single floats AND arrays, whereas math.sqrt will only work on scalars.
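To make the flexibility difference concrete, here is a small demonstration of the same expression on an array and on plain floats, and of math.sqrt rejecting an array input:

```python
import math
import numpy as np

a = np.array([3.0, 8.0])
b = np.array([4.0, 6.0])

print((a*a + b*b)**0.5)          # works on arrays: [ 5. 10.]
print((3.0*3.0 + 4.0*4.0)**0.5)  # and on plain floats: 5.0

try:
    math.sqrt(a*a + b*b)  # raises TypeError on a size-2 array
except TypeError:
    print("math.sqrt only accepts scalars")
```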
And some timings for medium-sized arrays:
i = np.random.rand(100000)
j = np.random.rand(100000)
%timeit np.sqrt(i**2 + j**2)
# 1.45 ms ± 314 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit (i*i + j*j)**0.5
# 1.21 ms ± 78.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Instead of optimizing this fairly simple function call, you could try to rewrite your program so that i and j are arrays instead of single numbers (assuming that you need to call the function on a lot of different inputs). See this small benchmark:
import numpy as np
i = np.arange(10000)
j = np.arange(10000)
%%timeit
np.sqrt(i**2+j**2)
# 74.1 µs ± 2.74 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
for idx in range(len(i)):
    np.sqrt(i[idx]**2+j[idx]**2)
# 25.2 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
As you can see, the first variant (using arrays of numbers as input) is ~300x faster than the second one using a Python for loop. The reason for this is that in the first example, all computation is performed by numpy (which is implemented in C internally and therefore really fast), whereas in the second example, numpy code and regular Python code (the for loop) interleave, making the execution much slower.
If you really want to improve the performance of your program, I'd suggest rewriting it so that you can perform your function once on two numpy arrays instead of calling it for each pair of numbers.
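As a sketch of that rewrite (the pair values here are made up for illustration): collect all input pairs first, convert them to two arrays once, and apply the expression a single time instead of once per pair.

```python
import numpy as np

pairs = [(3.0, 4.0), (5.0, 12.0), (8.0, 15.0)]

# per-pair (slow) version: one numpy call per pair
per_pair = [np.sqrt(x**2 + y**2) for x, y in pairs]

# vectorized (fast) version: build the arrays once, call once
i, j = np.array(pairs).T
vectorized = (i*i + j*j)**0.5

print(vectorized)  # → [ 5. 13. 17.]
```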