Task: Search a multi-column dataframe for a value (all values are unique) and return the index of that row.
Currently: using get_loc, but it only seems to accept a single column at a time, resulting in a rather inefficient set of nested try/except statements. Although it works, is anyone aware of a more effective way to do this?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(4, 4)), columns=list('ABCD'))
try:
    unique_index = pd.Index(df['A'])
    print(unique_index.get_loc(20))
except KeyError:
    try:
        unique_index = pd.Index(df['B'])
        print(unique_index.get_loc(20))
    except KeyError:
        unique_index = pd.Index(df['C'])
        print(unique_index.get_loc(20))
Loops don't seem to work because of the KeyError that is raised if a column doesn't contain the value. I've looked at functions such as .contains or .isin, but it's the location index that I'm interested in.
Consider this example instead, using np.random.seed:
np.random.seed([3, 1415])
df = pd.DataFrame(
    np.random.randint(200, size=(4, 4)),
    columns=list('ABCD'))
df
     A    B    C    D
0   11   98  123   90
1  143  126   55  141
2  139  141  154  115
3   63  104  128  120
We can find where the values match what you're looking for using np.where and slicing. Notice that I used a value of 55 because that's what I had in the data I got from the seed I chose. This will work just fine for 20 if it is in your data set. In fact, it'll work even if the value appears more than once.
i, j = np.where(df.values == 55)
list(zip(df.index[i], df.columns[j]))
[(1, 'C')]
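If only the row index is needed, as in the question, the same idea can be wrapped in a small helper. This is just a sketch: find_row_label is a hypothetical name, not a pandas function, and it swaps np.where for DataFrame.eq with any(axis=1), which builds the equivalent row mask. The frame is typed in by hand from the output above so the result doesn't depend on the seed.

```python
import pandas as pd


def find_row_label(df, value):
    """Return the index label of the single row containing `value`.

    Hypothetical helper (not part of pandas): raises KeyError if the
    value is absent and ValueError if it appears in more than one row.
    """
    # Boolean Series: True for each row where any column equals `value`.
    row_mask = df.eq(value).any(axis=1)
    matches = df.index[row_mask]
    if len(matches) == 0:
        raise KeyError(value)
    if len(matches) > 1:
        raise ValueError(f"{value!r} found in {len(matches)} rows")
    return matches[0]


# Same values as the example frame above, written out by hand.
df = pd.DataFrame(
    [[11, 98, 123, 90],
     [143, 126, 55, 141],
     [139, 141, 154, 115],
     [63, 104, 128, 120]],
    columns=list('ABCD'))

row_label = find_row_label(df, 55)
print(row_label)  # 1
```

Raising KeyError for a missing value mirrors what get_loc does, so existing except KeyError handling should keep working. The ValueError branch is an assumption on my part about how duplicates should be treated; in this frame 141 sits in two rows, so it would trigger it.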