
More effective way to use pandas get_loc?

Task: Search a multi-column dataframe for a value (all values are unique) and return the index of that row.

Currently: I'm using get_loc, but it only seems to allow passing a single column at a time, resulting in quite an ineffective set of try/except statements. Although it works, is anyone aware of a more effective way to do this?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(4, 4)), columns=list('ABCD'))
try:
    unique_index = pd.Index(df['A'])
    print(unique_index.get_loc(20))
except KeyError:
    try:
        unique_index = pd.Index(df['B'])
        print(unique_index.get_loc(20))
    except KeyError:
        unique_index = pd.Index(df['C'])
        print(unique_index.get_loc(20))

Loops don't seem to work because of the KeyError that is raised if a column doesn't contain the value. I've looked at functions such as .contains or .isin, but it's the location index that I'm interested in.

F.D asked Feb 21 '18 16:02


1 Answer

Consider this example instead, using np.random.seed:

np.random.seed([3, 1415])
df = pd.DataFrame(
    np.random.randint(200, size=(4, 4)),
    columns=list('ABCD'))

df

     A    B    C    D
0   11   98  123   90
1  143  126   55  141
2  139  141  154  115
3   63  104  128  120

We can find where the values match what you're looking for by using np.where on the underlying array and then indexing into df.index and df.columns with the result. Notice that I used a value of 55 because that's what was in the data I got from the seed I chose. This will work just fine for 20 if it is in your data set. In fact, it'll work if the value occurs more than once.

i, j = np.where(df.values == 55)
list(zip(df.index[i], df.columns[j]))

[(1, 'C')]
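
As a rough sketch of how this could be wrapped up for reuse: the helper name find_locations below is hypothetical, and the frame is typed in by hand from the table above so the example does not depend on the random seed. It uses df.to_numpy(), which should behave like df.values here. A value that is absent gives an empty list rather than a KeyError, and a value that occurs more than once (141 in this data) gives one pair per occurrence.

```python
import numpy as np
import pandas as pd


def find_locations(df, value):
    """Return (row label, column label) pairs for every cell equal to value.

    Hypothetical helper for illustration; not part of pandas.
    """
    # np.where on a 2-D boolean array returns two arrays of positions:
    # row positions and column positions of the True cells, in row-major order.
    rows, cols = np.where(df.to_numpy() == value)
    # Translate the positions back into the frame's own labels.
    return list(zip(df.index[rows], df.columns[cols]))


# Same values as the table shown above, entered by hand.
df = pd.DataFrame(
    [[11, 98, 123, 90],
     [143, 126, 55, 141],
     [139, 141, 154, 115],
     [63, 104, 128, 120]],
    columns=list('ABCD'))

hits = find_locations(df, 55)
print(hits)

# If only the row label is wanted, take the first element of the first pair.
row_label = hits[0][0] if hits else None
print(row_label)

# A value that appears twice yields two pairs; a missing value yields [].
print(find_locations(df, 141))
print(find_locations(df, 20))
```

With unique values, as in the question, hits holds at most one pair, so hits[0][0] is the row index being asked for.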
piRSquared answered Sep 27 '22 19:09