
Pandas (Python): Fill empty cells with previous row value?

I want to fill empty cells with the previous row's value if that value starts with a number. For example, I have:

    Text    Text
    30      Text    Text
            Text    Text
            Text    Text
    31      Text    Text
    Text    Text
    31      Text    Text
            Text    Text
            Text    Text
    32      Text    Text
    Text    Text
            Text    Text
            Text    Text
            Text    Text
            Text    Text

However, I want to have:

    Text    Text
    30      Text    Text
    30      Text    Text
    30      Text    Text
    31      Text    Text
    Text    Text
    31      Text    Text
    31      Text    Text
    31      Text    Text
    32      Text    Text
    Text    Text
            Text    Text
            Text    Text
            Text    Text
            Text    Text

I tried to achieve this with the following code:

    data = pd.read_csv('DATA.csv', sep='\t', dtype=object, error_bad_lines=False)
    data = data.fillna(method='ffill', inplace=True)
    print(data)

but it did not work.

Is there any way to do this?

i2_ asked Dec 18 '16 19:12



2 Answers

First, replace your empty cells with NaNs:

    # Only touch column 0; np.NaN was removed in NumPy 2.0, so use np.nan
    df.loc[df[0] == "", 0] = np.nan

Now use ffill():

    df.ffill()  # same as df.fillna(method='ffill'), which recent pandas deprecates
    #       0
    #0  Text
    #1    30
    #2    30
    #3    30
    #4    31
    #5  Text
    #6    31
    #7    31
    #8    31
    #9    32
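
The two steps above can be put together as a minimal, self-contained sketch. The one-column frame below is a hypothetical stand-in built to mirror the output shown above; it is not the asker's real file, and it assumes the empty cells are empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame mirroring the output shown above;
# "" marks an empty cell (an assumption about the real data)
df = pd.DataFrame({0: ["Text", "30", "", "", "31", "Text", "31", "", "", "32"]})

# Step 1: turn empty strings in column 0 into NaN so they count as missing
df.loc[df[0] == "", 0] = np.nan

# Step 2: forward-fill, so each NaN takes the last non-missing value above it
filled = df.ffill()
print(filled)
```

Note that `ffill()` returns a new frame here. One likely reason the code in the question did not work is that `fillna(..., inplace=True)` returns `None`, so assigning its result back to `data` leaves `data` as `None`.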

DYZ answered Sep 21 '22 22:09


I think you can first replace the whitespace-only cells with NaN:

    # Rows filtered out here come back as NaN through index alignment
    df.Text = df.Text[df.Text.str.strip() != '']
    print (df)
        Text Text.1      
    0     30   Text  Text
    1    NaN   Text  Text
    2    NaN   Text  Text
    3     31   Text  Text
    4   Text   Text   NaN
    5     31   Text  Text
    6    NaN   Text  Text
    7    NaN   Text  Text
    8     32   Text  Text
    9   Text   Text   NaN
    10   NaN   Text  Text
    11   NaN   Text  Text
    12   NaN   Text  Text
    13   NaN   Text  Text

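Then forward-fill with ffill (same as fillna with method='ffill'), use to_numeric to find which filled values are numeric, use where to keep the filled value only in those rows and restore the original value elsewhere, and finally replace the remaining NaN with an empty string using fillna: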

    orig = df.Text.copy()
    df.Text = df.Text.ffill()
    # notna() turns the coerced numbers into a boolean mask, which where() expects
    mask1 = pd.to_numeric(df.Text, errors='coerce').notna()
    df.Text = df.Text.where(mask1, orig).fillna('')
    print (df)
        Text Text.1      
    0     30   Text  Text
    1     30   Text  Text
    2     30   Text  Text
    3     31   Text  Text
    4   Text   Text   NaN
    5     31   Text  Text
    6     31   Text  Text
    7     31   Text  Text
    8     32   Text  Text
    9   Text   Text   NaN
    10         Text  Text
    11         Text  Text
    12         Text  Text
    13         Text  Text
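
For reference, here is a self-contained sketch of the same idea that runs without the original file. The sample values and the single `Text` column are assumptions made for illustration, and the variable names `filled` and `is_numeric` are hypothetical.

```python
import pandas as pd

# Hypothetical first column from the question; "" marks an empty cell
df = pd.DataFrame(
    {"Text": ["30", "", "", "31", "Text", "31", "", "", "32", "Text", "", ""]}
)

# Whitespace-only cells become NaN; everything else is kept as-is
orig = df["Text"].mask(df["Text"].str.strip() == "")

# Forward-fill every gap, then check which filled values parse as numbers
filled = orig.ffill()
is_numeric = pd.to_numeric(filled, errors="coerce").notna()

# Keep the filled value only where it is numeric; otherwise restore the
# original (possibly NaN) value, then show the remaining NaN as ""
df["Text"] = filled.where(is_numeric, orig).fillna("")
print(df)
```

Cells that follow a numeric value are filled, while cells that follow a non-numeric value such as `Text` stay empty, which matches the behaviour asked for in the question.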

jezrael answered Sep 20 '22 22:09