I'm just getting started with Pandas as a tool for munging two-dimensional arrays of data. It's super overwhelming, even after reading the docs. You can do so much that I can't figure out how to do anything, if that makes any sense.
My dataframe (simplified):
Date       Stock1  Stock2   Stock3
2014.10.10  74.75  NaN     NaN
2014.9.9    NaN    100.95  NaN 
2010.8.8    NaN    NaN     120.45
So each column only has one value.
I want to remove all columns that have a max value less than x. For example, if x = 80, I want a new DataFrame:
Date        Stock2   Stock3
2014.10.10   NaN     NaN
2014.9.9     100.95  NaN 
2010.8.8     NaN     120.45
How can this be achieved? I've looked at dataframe.max(), which gives me a Series. Can I use that, or somehow use a lambda function in select()?
You can drop columns by index using the DataFrame.drop() method or DataFrame.iloc[].
drop() can also delete rows based on a column value: as part of data cleansing, you may need to drop rows where a column value matches a static value or the value of another column.
Use pandas.DataFrame.drop() to delete rows that meet one or more conditions.
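As a rough sketch of the drop() and iloc[] approaches mentioned above: the small frame, the cutoff of 6, and the variable names here are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical example data, not the asker's frame
df = pd.DataFrame(
    {"a": [1.0, 5.0, 3.0], "b": [10.0, 20.0, 30.0], "c": [7.0, 8.0, 9.0]}
)

# Drop columns by label: collect the columns whose max is below a cutoff
cutoff = 6  # assumed threshold for this sketch
low_cols = df.columns[df.max() < cutoff]
without_low = df.drop(columns=low_cols)

# Drop a column by position: iloc keeps every column except position 0
without_first = df.iloc[:, 1:]

# Drop rows by condition: pass the index labels of the matching rows
without_rows = df.drop(df[df["b"] == 20.0].index)
```

Note that drop() returns a new DataFrame by default and leaves df itself unchanged.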
Use df.max() to build a boolean index over the columns.
In [19]: from pandas import DataFrame
In [20]: import numpy as np
In [23]: df = DataFrame(np.random.randn(3,3), columns=['a','b','c'])
In [36]: df
Out[36]: 
          a         b         c
0 -0.928912  0.220573  1.948065
1 -0.310504  0.847638 -0.541496
2 -0.743000 -1.099226 -1.183567
In [24]: df.max()
Out[24]: 
a   -0.310504
b    0.847638
c    1.948065
dtype: float64
Next, we make a boolean expression out of this:
In [31]: df.max() > 0
Out[31]: 
a    False
b     True
c     True
dtype: bool
Next, you can index df.columns by this (this is called boolean indexing):
In [34]: df.columns[df.max() > 0]
Out[34]: Index([u'b', u'c'], dtype='object')
Finally, you can pass this to df to select just those columns:
In [35]: df[df.columns[df.max() > 0]]
Out[35]: 
          b         c
0  0.220573  1.948065
1  0.847638 -0.541496
2 -1.099226 -1.183567
Of course, instead of 0, you can use any value you want as the cutoff for dropping.
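Applied to the frame from the question, the same idea might look like the sketch below. It assumes Date is set as the index (otherwise max() would also have to compare the date strings against x), and it uses .loc with the boolean Series directly, which is a swapped-in shorthand for indexing df.columns first. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# The question's data, with Date as the index (an assumption of this sketch)
df = pd.DataFrame(
    {
        "Stock1": [74.75, np.nan, np.nan],
        "Stock2": [np.nan, 100.95, np.nan],
        "Stock3": [np.nan, np.nan, 120.45],
    },
    index=pd.Index(["2014.10.10", "2014.9.9", "2010.8.8"], name="Date"),
)

x = 80

# max() skips NaN by default, so each column reduces to its single value
col_max = df.max()

# Keep only the columns whose max is at least x
filtered = df.loc[:, col_max >= x]
print(filtered)
```

Stock1 is removed because its max (74.75) is below 80, while Stock2 and Stock3 are kept.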