Different fill methods for different columns in pandas

Tags: python, pandas

I'm reindexing a dataframe in the standard way, i.e.

df.reindex(newIndex,method='ffill')

But I realized I need to handle missing data differently on a column-by-column basis. That is, for some columns I want to ffill, but for others I want missing values recorded as NAs.

For simplicity, let's say I have column X that I want ffilled, and column Y that I want NA-filled. How can I call .reindex to accomplish this?

moustachio asked Oct 30 '13 23:10



1 Answer

You can call reindex() first without a fill method, and then call ffill() on only the columns you want filled:

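import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [100, 200, 300],
                   "C": [100, 200, 300]}, index=[2, 6, 8])
# Reindex without a fill method, so new labels get NaN in every column
df2 = df.reindex([2, 4, 6, 8, 10])

# Forward-fill only the selected columns; assign the result back rather than
# using inplace=True on a column selection, which may not modify df2
for col in ["A", "B"]:
    df2[col] = df2[col].ffill()
print(df2)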

output:

    A    B    C
2   10  100  100
4   10  100  NaN
6   20  200  200
8   30  300  300
10  30  300  NaN
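
A possible variation, offered as a sketch rather than the only way to do it: wrap the idea in a small helper that applies reindex(method="ffill") to the chosen columns only. The helper name reindex_with_column_fill and its ffill_cols parameter are made up for this illustration, and the sketch assumes the original index is sorted, since reindex with a fill method needs a monotonic index. One difference worth knowing: reindex(method="ffill") only fills the labels introduced by the new index, while calling ffill() after reindexing also fills any NaN that was already in the original data.

```python
import pandas as pd


def reindex_with_column_fill(df, new_index, ffill_cols):
    """Reindex df, forward-filling only the columns listed in ffill_cols.

    Hypothetical helper for illustration; assumes df.index is sorted.
    """
    # Plain reindex: every column gets NaN at labels missing from df.index.
    out = df.reindex(new_index)
    # Replace the chosen columns with a forward-filled reindex of the originals.
    out[ffill_cols] = df[ffill_cols].reindex(new_index, method="ffill")
    return out


df = pd.DataFrame({"X": [10, 20, 30], "Y": [100, 200, 300]}, index=[2, 6, 8])
result = reindex_with_column_fill(df, [2, 4, 6, 8, 10], ["X"])
print(result)
```

Here X is forward-filled at the new labels 4 and 10, while Y is left as NaN at those labels.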
HYRY answered Nov 10 '22 01:11
