Python Pandas - Fill down text value in column where following cells are blank

Question

I have a DataFrame and I'm trying to fill down the value in the 'Date' column (which is text), as follows:

The dataframe is generated using dfs=pd.read_html(pageUrl,infer_types=False) then df=dfs[0]

            Date     Time datetime  Year
    0               None     None  2007
    1     May 1     0:58     None  2007
    2               1:00     None  2007
    3               1:30     None  2007
    4               1:45     None  2007
    5               3:45     None  2007
    6               4:45     None  2007
    7               6:30     None  2007
    8               7:15     None  2007
    9               7:45     None  2007

df.dtypes shows:

    Date        object
    Time        object
    datetime    object
    Year         int64
    dtype: object

First I tried filling on a per-row basis, shifting back one row to get the previous value if the current 'Date' is empty:

    def fillDate(r):
        if r['Date']=="":
            p=r.shift(-1)
            r['Date']=p['Date']
        return r

then

    df.apply(fillDate,axis=1)

This populates the 'Date' column with the value from 'Time'.
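
The reason for this result is that, with axis=1, the function receives each row as a Series whose index is the column names, so shift(-1) moves values across the columns of that one row and never reaches a neighbouring row. A minimal sketch with a hand-built row (the values are copied from row 2 of the table above; the variable names are illustrative):

```python
import pandas as pd

# One row, as df.apply(..., axis=1) would pass it: the index holds the column names.
row = pd.Series({"Date": "", "Time": "1:00", "datetime": None, "Year": 2007})

# shift(-1) moves every value one position toward the start of this row,
# so the slot labelled 'Date' now holds what was under 'Time'.
shifted = row.shift(-1)

print(shifted["Date"])  # 1:00
```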

So then I tried applying with axis=0 (on a per-column basis) and modifying the function so that it only touches the 'Date' column (I can't see how to apply it to just one column):

    def fillDate(r):
        if r.name=='Date':
            if r['Date']=="":
                p=r.shift(-1)
                r['Date']=p['Date']
        return r

then

    df.apply(fillDate,axis=0)

gives the error

    KeyError: ('Date', u'occurred at index Date')
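
The KeyError arises because, with axis=0, each column arrives as a Series indexed by the row labels (0 to 9 here), so r['Date'] looks for a row labelled 'Date', which does not exist. A small sketch, assuming a cut-down three-row frame with the default integer index (the describe helper is hypothetical, used only to show what the function receives):

```python
import pandas as pd

df = pd.DataFrame({"Date": ["", "May 1", ""], "Time": [None, "0:58", "1:00"]})

def describe(col):
    # With axis=0 the function is called once per column;
    # col.name is the column label and col.index holds the row labels.
    return f"{col.name}: index={list(col.index)}"

summary = df.apply(describe, axis=0)
print(summary["Date"])

# Looking up a column name inside a single column fails the same way as in the question.
try:
    df["Date"]["Date"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

To work on just one column, it can be selected directly with df['Date'] instead of going through apply.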

The aim is to fill down the 'Date' column, so that each blank cell takes the value from the previous cell.

How can I do this?

Jeff · Accepted Answer

In [16]: df = pd.read_fwf(StringIO(data),widths=[5,12,8,8,6],header=0,names=['idx','date','time','datetime','year'])

# simulate what the OP actually has (though this doesn't happen upon read in)

In [30]: df['date'] = df['date'].fillna('')

In [31]: df
Out[31]: 
   idx   date  time datetime  year
0    0         None     None  2007
1    1  May 1  0:58     None  2007
2    2         1:00     None  2007
3    3         1:30     None  2007
4    4         1:45     None  2007
5    5         3:45     None  2007
6    6         4:45     None  2007
7    7         6:30     None  2007
8    8         7:15     None  2007
9    9         7:45     None  2007

In [32]: df.loc[df.date=='','date'] = np.nan

In [33]: df
Out[33]: 
   idx   date  time datetime  year
0    0    NaN  None     None  2007
1    1  May 1  0:58     None  2007
2    2    NaN  1:00     None  2007
3    3    NaN  1:30     None  2007
4    4    NaN  1:45     None  2007
5    5    NaN  3:45     None  2007
6    6    NaN  4:45     None  2007
7    7    NaN  6:30     None  2007
8    8    NaN  7:15     None  2007
9    9    NaN  7:45     None  2007

In [34]: df['date']  = df['date'].ffill()

In [35]: df
Out[35]: 
   idx   date  time datetime  year
0    0    NaN  None     None  2007
1    1  May 1  0:58     None  2007
2    2  May 1  1:00     None  2007
3    3  May 1  1:30     None  2007
4    4  May 1  1:45     None  2007
5    5  May 1  3:45     None  2007
6    6  May 1  4:45     None  2007
7    7  May 1  6:30     None  2007
8    8  May 1  7:15     None  2007
9    9  May 1  7:45     None  2007
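
For readers who want to run these steps without the original `data` string, here is a self-contained sketch. The frame is rebuilt by hand from the first rows of the question's table (an assumption, since the real frame comes from read_html), and the column is named 'Date' as in the question rather than 'date'.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the scraped table (first four rows only).
df = pd.DataFrame({
    "Date": ["", "May 1", "", ""],
    "Time": [None, "0:58", "1:00", "1:30"],
    "Year": [2007, 2007, 2007, 2007],
})

# Step 1: turn blank strings into NaN so pandas treats them as missing.
df.loc[df["Date"] == "", "Date"] = np.nan

# Step 2: forward-fill, copying the last non-missing value downward.
df["Date"] = df["Date"].ffill()

print(df)
```

Row 0 stays NaN because there is no earlier value to copy, which matches Out[35] above.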

Minura Punchihewa · Answer

If I am understanding the problem correctly, it should be as easy as:

df['Date'] = df['Date'].ffill(axis=0)

This fills any missing value (NaN or None) with the last valid value above it in the same column. Blank strings do not count as missing, so they need to be converted to NaN first.

Here are some links that can be used to understand the method, including the documentation:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.ffill.html
https://www.studytonight.com/pandas/pandas-dataframe-ffill-method
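
A quick check of how ffill treats blank strings, as a sketch on a made-up Series. Series.mask is used here as one possible way to turn the blanks into NaN; it is an alternative chosen for this illustration, not something taken from the answers.

```python
import pandas as pd

s = pd.Series(["", "May 1", "", ""], name="Date")

# Blank strings are ordinary values, so ffill leaves them alone.
unchanged = s.ffill()

# mask() replaces entries where the condition is True with NaN,
# after which ffill has something to fill.
filled = s.mask(s == "").ffill()

print(unchanged.tolist())
print(filled.tolist())
```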

