I'll preface this by saying this is a toy example - I do have motivations for doing this, as it sits in the middle of some other chained operations.
I have a DataFrame something like
df
Out[234]:
host1 host2 host3
dates
2014-02-02 1 3 4
2014-02-03 5 2 1
2014-02-04 2 5 6
2014-02-05 4 6 1
2014-02-06 3 2 1
I am trying to produce a new DataFrame consisting of two columns with the hosts being the index - one column being the values in the last row, the second being whether those values in the last row are greater than 1. My corresponding output should then look like:
newdf
Out[235]:
dates 2014-02-06 passes
host1 3 True
host2 2 True
host3 1 False
How can I do this with chained operations?
Producing the output by itself is pretty easy, I think. I just did
newdf = df.tail(1).T
newdf['passes'] = newdf.iloc[:, 0] > 1
The reason I'm struggling mightily to do it with chained operations is that as soon as I transpose the tail, the column name becomes of type pandas.tslib.Timestamp,
df.tail(1).T
Out[236]:
dates 2014-02-06
host1 3
host2 2
host3 1
which I can't seem to access to rename with rename, and so I then can't access it in some boolean operation in assign to create the new "passes" column.
Data:
My toy DataFrame can be generated with
import pandas as pd

df = pd.DataFrame(dict(dates=pd.date_range('2014-02-02', periods=5),
                       host1=[1, 5, 2, 4, 3],
                       host2=[3, 2, 5, 6, 2],
                       host3=[4, 1, 6, 1, 1])).set_index('dates')
You can use a lambda expression in assign, where the parameter is the result of the previous chained operation:
df.tail(1).T.assign(passes = lambda x: x.iloc[:,0] > 1)
#dates 2014-02-06 00:00:00 passes
#host1 3 True
#host2 2 True
#host3 1 False
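If the Timestamp column label is awkward for later steps in the chain, one possible variation (a sketch, not part of the original answer) is to select the last row as a Series and give it an explicit column name with to_frame before calling assign. The column name 'last' here is a hypothetical choice made only for illustration.

```python
import pandas as pd

# Rebuild the toy DataFrame from the question.
df = pd.DataFrame(dict(dates=pd.date_range('2014-02-02', periods=5),
                       host1=[1, 5, 2, 4, 3],
                       host2=[3, 2, 5, 6, 2],
                       host3=[4, 1, 6, 1, 1])).set_index('dates')

# df.iloc[-1] is the last row as a Series indexed by host.
# to_frame(name='last') turns it into a one-column DataFrame with a
# plain string label ('last' is an illustrative name), so the lambda
# in assign can refer to the column by name instead of by position.
newdf = (
    df.iloc[-1]
      .to_frame(name='last')
      .assign(passes=lambda x: x['last'] > 1)
)
```

This trades the date label for a fixed name, so it only fits if the date itself does not need to survive as the column header.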