I have a pandas DataFrame with the following structure:

And I have an array of tuples:
arr_tuples = [(0,3),(1,1),(1,3),(2,1)]
Each tuple in the array represents the row and column index of the above DataFrame, respectively.
I can find all the values in the DataFrame for the indices in arr_tuples using a for loop like this:
value_array = []
for item in arr_tuples:
    row = item[0]
    col = item[1]
    value = df.iloc[row, col]  # I also tried df.get_value here with similar result
    value_array.append(value)
But this seems to be a very slow method. If there are a lot of tuples in arr_tuples, this will take a long time.
Is there a better and faster way to achieve the same result? Is there any way in pandas to use a list/array of tuples (each containing a row and column index) to get values from a DataFrame?
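You can use pd.DataFrame.lookup with some zip and unpacking trickery (note that lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so this applies to older versions):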
df.lookup(*zip(*arr_tuples))
array([ 4, 5, 7, 12])
list(zip(*arr_tuples)) creates two tuples out of the list of tuples
[(0, 1, 1, 2), (3, 1, 3, 1)]
That's perfect, because the first tuple holds the row indices and the second holds the column indices, which is exactly what pd.DataFrame.lookup accepts as arguments. So if I unpack those, it just works:
df.lookup(*zip(*arr_tuples))
array([ 4, 5, 7, 12])
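On pandas versions where DataFrame.lookup is not available, the same zip-and-unpack idea can probably be applied to the underlying NumPy array instead. The following is only a minimal sketch: the DataFrame here is a hypothetical stand-in (the original df is not shown above), and it assumes the tuples hold positional indices, as in the df.iloc loop from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame; NOT the df from the question, which is not shown.
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("abcd"))
arr_tuples = [(0, 3), (1, 1), (1, 3), (2, 1)]

# Split the (row, col) pairs into one sequence of rows and one of columns.
rows, cols = zip(*arr_tuples)

# Positional integer-array indexing on the underlying array, no Python-level loop.
values = df.to_numpy()[list(rows), list(cols)]
print(values)  # [3 5 7 9]
```

This is positional (like df.iloc), not label-based like lookup. If the columns have mixed dtypes, df.to_numpy() returns an object array, so the speedup may be smaller in that case.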