Python pandas -> select by condition in columns name

I have a DataFrame with column names 'a', 'b', 'c', ..., 'z'.

print(my_df.columns)
Index(['a', 'b', 'c', ... 'y', 'z'],
  dtype='object', name=0)

I have a function that determines which columns should be displayed. For example:

start = con_start()
stop = con_stop()
print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False  True  True
True  True False False]

My goal is to display the DataFrame with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

0                                      a              b         
index1       index2                                                  
New York     New York           0.000000       0.000000          
California   Los Angeles   207066.666667  214466.666667     
Illinois     Chicago       138400.000000  143633.333333     
Pennsylvania Philadelphia   53000.000000   53633.333333      
Arizona      Phoenix       111833.333333  114366.666667

You can use label-based slicing with .loc to achieve this; a label slice includes both endpoints:

 df.loc[:,'a':'b']
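
As a minimal sketch of how this plays out with variable bounds: the frame below is made-up data with columns 'a' through 'e', and start/stop are stand-ins for whatever con_start() and con_stop() return. It assumes both labels are present in the columns.

```python
import pandas as pd

# Hypothetical data: five columns 'a'..'e' (not the asker's real frame).
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

# Stand-ins for the values con_start() and con_stop() would return.
start, stop = "b", "d"

# Label-based slices in .loc include BOTH endpoints,
# unlike positional slices, which exclude the stop position.
subset = df.loc[:, start:stop]
print(list(subset.columns))
```

The slice works on labels, so start and stop can be passed in directly without converting them to positions first.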

I want to make this robust and with as few assumptions as possible.

option 1
use iloc with array slicing
Assumptions:

my_df.columns.is_unique evaluates to True
columns are already in order

start = df.columns.get_loc(con_start())
stop = df.columns.get_loc(con_stop())

df.iloc[:, start:stop + 1]
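
A small runnable sketch of option 1, under the assumptions listed above. The frame and the bodies of con_start() and con_stop() are hypothetical stand-ins, since the real ones are not shown in the question.

```python
import pandas as pd

# Hypothetical data with unique, ordered columns 'a'..'e'.
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

def con_start():
    # Stand-in for the asker's function; assumed to return a column label.
    return "b"

def con_stop():
    # Stand-in for the asker's function; assumed to return a column label.
    return "d"

# get_loc returns a single integer only when the label is unique,
# which is why the uniqueness assumption matters here.
assert df.columns.is_unique

start = df.columns.get_loc(con_start())
stop = df.columns.get_loc(con_stop())

# Positional slices exclude the stop position, hence the + 1.
subset = df.iloc[:, start:stop + 1]
print(list(subset.columns))
```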

option 2
use loc with boolean indexing
Assumptions:

column values are comparable

start = con_start()
stop = con_stop()

c = df.columns.values
m = (start <= c) & (stop >= c)

df.loc[:, m]
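
A sketch of option 2 on made-up data whose columns are deliberately out of order, to illustrate that the mask does not depend on column order. It compares the column Index directly rather than going through .values, which is a small variation on the code above; start and stop are stand-ins for con_start() and con_stop().

```python
import pandas as pd

# Hypothetical data: columns are NOT sorted.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=list("daebc"))

# Stand-ins for the values con_start() and con_stop() would return.
start, stop = "b", "d"

# Elementwise comparison yields one boolean per column.
cols = df.columns
mask = (cols >= start) & (cols <= stop)

# Boolean indexing keeps the columns where the mask is True,
# in their original order.
subset = df.loc[:, mask]
print(list(subset.columns))
```

Because the selection is by value comparison, it keeps every column whose name falls in the range, wherever that column sits in the frame.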

Tags: python, python-3.x, pandas, data-science

CezarySzulc

2 Answers

Scott Boston

piRSquared
