
Return the column name(s) for a specific value in a pandas dataframe

Tags: python, pandas

I have found this option in other languages such as R or SQL, but I am not quite sure how to go about it in pandas.

I have a file with 1262 columns and 1 row, and I need to return the column header for every column in which a specific value appears.

Say for example this test dataframe:

Date               col1    col2    col3    col4    col5    col6    col7 
01/01/2016 00:00   37.04   36.57   35.77   37.56   36.79   35.90   38.15 

I need to locate the column name where, for example, the value is 38.15. What is the best way of doing so?

Thanks

Helena K, asked Jul 12 '16 14:07



1 Answer

Since you only have a single row, you can compare the dataframe to the value, call iloc[0] on the result to get a boolean Series, and use that to mask the columns:

In [47]:
df.columns[(df == 38.15).iloc[0]]

Out[47]:
Index(['col7'], dtype='object')

Breaking down the above:

In [48]:
df == 38.15

Out[48]:
             Date   col1   col2   col3   col4   col5   col6  col7
01/01/2016  False  False  False  False  False  False  False  True

In [49]:
(df == 38.15).iloc[0]

Out[49]:
Date    False
col1    False
col2    False
col3    False
col4    False
col5    False
col6    False
col7     True
Name: 01/01/2016, dtype: bool
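
For anyone who wants to run the steps above end to end, here is a minimal sketch. The frame is a hypothetical reconstruction of the question's test data, with Date assumed to be the index, and the np.isclose variant is an addition of mine for values that may carry floating-point noise after being read from a file; it is not part of the approach above.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's one-row frame (Date as index).
df = pd.DataFrame(
    {
        "col1": [37.04], "col2": [36.57], "col3": [35.77], "col4": [37.56],
        "col5": [36.79], "col6": [35.90], "col7": [38.15],
    },
    index=pd.Index(["01/01/2016 00:00"], name="Date"),
)

target = 38.15

# Exact comparison: boolean Series for the single row, indexed by column name.
mask = (df == target).iloc[0]
matches = df.columns[mask.to_numpy()].tolist()
print(matches)

# Tolerant comparison (assumption: every column is numeric).
close = np.isclose(df.iloc[0].to_numpy(dtype=float), target)
matches_close = df.columns[close].tolist()
print(matches_close)
```

Because the mask selects every True position, this returns all matching column names when the value appears more than once.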

You can also use idxmax with axis=1, which returns the label of the first column where the mask is True:

In [52]:
(df == 38.15).idxmax(axis=1).iloc[0]

Out[52]:
'col7'
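
One caveat with idxmax is worth sketching: it reports only the first True column, and when nothing matches it still returns the first column label, since every entry of the mask is False. The small frame and the value 99.0 below are hypothetical, chosen only to show both cases; guarding with any(axis=1) is one possible check, not the only one.

```python
import pandas as pd

# Hypothetical frame in which the value appears twice.
df = pd.DataFrame(
    {"col1": [37.04], "col2": [38.15], "col3": [38.15]},
    index=["01/01/2016 00:00"],
)

hits = df == 38.15
first = hits.idxmax(axis=1).iloc[0]        # only the first matching column
all_cols = df.columns[hits.iloc[0].to_numpy()].tolist()  # every matching column

misses = df == 99.0                        # value that is absent from the frame
fallback = misses.idxmax(axis=1).iloc[0]   # still a column label, despite no match
has_match = bool(misses.any(axis=1).iloc[0])  # guard to check before trusting idxmax

print(first, all_cols, fallback, has_match)
```

If the value can be missing or repeated, the boolean-mask approach is the safer of the two.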

EdChum, answered Oct 09 '22 08:10