where I have found this option in other languages such as R or SQL but I am not quite sure how to go about this in Pandas.
So I have a file with 1262 columns and 1 row and need the column headers to return for every time that a specific value appears.
Say for example this test dataframe:
Date col1 col2 col3 col4 col5 col6 col7
01/01/2016 00:00 37.04 36.57 35.77 37.56 36.79 35.90 38.15
And I need to locate the column name for e.g. where value = 38.15. What is the best way of doing so?
Thanks
You can get the column names from pandas DataFrame using df. columns. values , and pass this to python list() function to get it as list, once you have the data you can print it using print() statement.
You can refer to variables in the environment by prefixing them with an '@' character like @a + b . You can refer to column names that contain spaces or operators by surrounding them in backticks. This way you can also escape names that start with a digit, or those that are a Python keyword.
Selecting columns based on their name This is the most basic way to select a single column from a dataframe, just put the string name of the column in brackets. Returns a pandas series. Passing a list in the brackets lets you select multiple columns at the same time.
Seeing as you only have a single row then you can call iloc[0]
on the result and use this to mask the columns:
In [47]:
df.columns[(df == 38.15).iloc[0]]
Out[47]:
Index(['col7'], dtype='object')
Breaking down the above:
In [48]:
df == 38.15
Out[48]:
Date col1 col2 col3 col4 col5 col6 col7
01/01/2016 False False False False False False False True
In [49]:
(df == 38.15).iloc[0]
Out[49]:
Date False
col1 False
col2 False
col3 False
col4 False
col5 False
col6 False
col7 True
Name: 01/01/2016, dtype: bool
You can also use idxmax
with param axis=1
:
In [52]:
(df == 38.15).idxmax(axis=1)[0]
Out[52]:
'col7'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With