I have to calculate the distance along a Hilbert curve from 2D coordinates. Using the hilbertcurve package, I built my own "hilbert" function to do so. The coordinates are stored in a dataframe (col_1 and col_2). As you can see, my function works when applied to two values (test).
However, it does not work when applied row-wise via the apply function. Why is this? What am I doing wrong here? I need an additional column "hilbert" with the Hilbert distances for the x- and y-coordinates given in columns "col_1" and "col_2".
import pandas as pd
from hilbertcurve.hilbertcurve import HilbertCurve
df = pd.DataFrame({'ID': ['1', '2', '3'],
                   'col_1': [0, 2, 3],
                   'col_2': [1, 4, 5]})
def hilbert(x, y):
    n = 2
    p = 7
    hilcur = HilbertCurve(p, n)
    dist = hilcur.distance_from_coordinates([x, y])
    return dist
test = hilbert(df.col_1[2], df.col_2[2])
df["hilbert"] = df.apply(hilbert(df.col_1, df.col_2), axis=0)
The last command ends in error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thank you for your help!
Since you have hilbert(df.col_1, df.col_2) in the apply, that's immediately trying to call your function with the full pd.Series objects for those two columns, triggering that error. What you should be doing is:
df.apply(lambda x: hilbert(x['col_1'], x['col_2']), axis=1)
so that the lambda function given will be applied to each row.
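To see the mechanism without needing the hilbertcurve package, here is a minimal sketch. `scalar_only` is a hypothetical stand-in for `hilbert` (it returns a placeholder value, not a Hilbert distance); the only thing it shares with the real function is that it expects two plain numbers and uses them in an `if`-style check, which is the kind of operation that fails on a whole Series.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'],
                   'col_1': [0, 2, 3],
                   'col_2': [1, 4, 5]})

def scalar_only(x, y):
    # Hypothetical stand-in for hilbert(): expects two plain numbers.
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    return x + y  # placeholder result, NOT a Hilbert distance

# Passing whole columns: the call runs before apply() is involved at all,
# and `x < 0` produces a boolean Series that `if` cannot reduce to True/False.
try:
    scalar_only(df.col_1, df.col_2)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing a function instead: apply() calls it once per row with scalar values.
row_wise = df.apply(lambda row: scalar_only(row['col_1'], row['col_2']), axis=1)

print(error_message)
print(row_wise.tolist())
```

The first print shows the same "truth value of a Series is ambiguous" message as in the question; the second shows one result per row.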
You have to set axis=1 because you want to apply your function to each row, not to each column.
You can define a lambda function that passes only the two coordinate columns of each row to hilbert, like this:
df['hilbert'] = df.apply(lambda row: hilbert(row['col_1'], row['col_2']), axis=1)
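One thing to be aware of: because `hilbert` creates a new HilbertCurve object on every call, the apply above rebuilds it once per row. If that turns out to be slow on a large dataframe, a possible alternative is to build the object once and loop over the two columns with `zip`. The sketch below assumes a hypothetical helper `add_distance_column` and uses a placeholder `stand_in_distance` so that it runs without the package; neither name comes from the question.

```python
import pandas as pd

def add_distance_column(df, distance_fn, x_col='col_1', y_col='col_2',
                        out_col='hilbert'):
    # Hypothetical helper: distance_fn takes one [x, y] list and returns a number.
    out = df.copy()
    out[out_col] = [
        distance_fn([int(x), int(y)])  # pass plain Python ints, one pair per row
        for x, y in zip(out[x_col], out[y_col])
    ]
    return out

def stand_in_distance(point):
    # Placeholder only, NOT a Hilbert distance; swap in the real method.
    x, y = point
    return 10 * x + y

df = pd.DataFrame({'ID': ['1', '2', '3'],
                   'col_1': [0, 2, 3],
                   'col_2': [1, 4, 5]})

result = add_distance_column(df, stand_in_distance)
print(result)
```

With the real package you would create `hilcur = HilbertCurve(7, 2)` once and pass `hilcur.distance_from_coordinates` (the method name used in the question) as `distance_fn`.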