Imagine I have a pandas.DataFrame like:
from pandas import DataFrame, Series
x = DataFrame({'a': [7, 6, 8, 0, 2, 5],
               'b': [3, 4, 5, 6, 7, 8],
               'c': [3, 8, 5, 6, 0, 1]}, index=[1, 2, 3, 4, 5, 6])
Then I have a pandas.Series that gives, for each column key, the index label I want to select:
y = Series([4,1,6], index=['a','b','c'])
Is there some way to look up these values in an idiomatic pandas way? I want to avoid looping over the pandas.Series or the pandas.DataFrame, and I would prefer to use commands like .loc, .query and so on.
You can use a combination of loc and np.diagonal to achieve this:
In [26]:
np.diagonal(x.loc[y])
Out[26]:
array([0, 3, 1], dtype=int64)
Here loc performs a row label lookup, using the values of y as the labels:
In [27]:
x.loc[y]
Out[27]:
   a  b  c
4  0  6  6
1  7  3  3
6  5  8  1
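np.diagonal then returns the values on the main diagonal of that result, that is, the i-th selected row paired with the i-th column. These are the requested values as long as the columns of x are in the same order as the index of y.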
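To see why the diagonal holds the requested values, here is a minimal self-contained sketch of the same steps. The names block, picked and expected are illustrative only and not part of the original answer, and .to_numpy() is used just to make the array conversion explicit.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({'a': [7, 6, 8, 0, 2, 5],
                  'b': [3, 4, 5, 6, 7, 8],
                  'c': [3, 8, 5, 6, 0, 1]}, index=[1, 2, 3, 4, 5, 6])
y = pd.Series([4, 1, 6], index=['a', 'b', 'c'])

# Rows 4, 1 and 6 (in that order) with all three columns: a 3x3 block.
block = x.loc[y.to_numpy()]

# Entry (i, i) of the block pairs the i-th requested row with the i-th column.
picked = np.diagonal(block.to_numpy())

# The same values fetched one (row, column) label pair at a time, for comparison.
expected = [x.at[4, 'a'], x.at[1, 'b'], x.at[6, 'c']]

print(picked.tolist())  # [0, 3, 1]
```

Note that the intermediate block has one row and one column per entry of y, so it grows quadratically with the number of lookups; for a handful of columns this is not a concern.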
To make this robust to column order, we can explicitly pass y.values as the row labels and y.index as the column labels to select:
In [30]:
np.diagonal(x.loc[y.values, y.index])
Out[30]:
array([0, 3, 1], dtype=int64)
The above will work even if the labels in y are in a different order from the columns of x.
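The difference between the two forms can be checked with a reordered Series. The sketch below assumes a hypothetical y_shuffled that holds the same lookups as y but lists the columns as c, a, b; the names naive, robust and result are illustrative as well.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({'a': [7, 6, 8, 0, 2, 5],
                  'b': [3, 4, 5, 6, 7, 8],
                  'c': [3, 8, 5, 6, 0, 1]}, index=[1, 2, 3, 4, 5, 6])

# Hypothetical reordering: same (column -> row label) pairs as y, different order.
y_shuffled = pd.Series([6, 4, 1], index=['c', 'a', 'b'])

# Row-only lookup keeps x's column order (a, b, c), so the diagonal
# pairs row 6 with 'a', row 4 with 'b' and row 1 with 'c': not what was asked.
naive = np.diagonal(x.loc[y_shuffled.to_numpy()].to_numpy())

# Selecting the columns in y_shuffled's order lines the diagonal up again.
robust = np.diagonal(x.loc[y_shuffled.to_numpy(), y_shuffled.index].to_numpy())

# Optionally label the result by column name.
result = pd.Series(robust, index=y_shuffled.index)

print(naive.tolist())   # [5, 6, 3]
print(robust.tolist())  # [1, 0, 3]
```

Wrapping the array in a Series indexed by y_shuffled.index is optional, but it keeps each value attached to the column it came from.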