Pandas - Insert blank row for each group in pandas

I have a dataframe

import pandas as pd
import numpy as np
df1=pd.DataFrame({'group':[1,1,2,2,2],
             'value':[2,3,np.nan,5,4]})
df1

    group   value
0   1       2
1   1       3
2   2       NaN
3   2       5
4   2       4

I want to add a row after each group in which value is NaN. The desired output is:

   group   value
0   1       2
1   1       3
2   1       NaN
3   2       NaN
4   2       5
5   2       4
6   2       NaN

My real dataset has many groups and more columns besides value; I want all of those columns to be NaN in the newly added row.

Thanks a lot for the help

Giang Do asked Aug 16 '18 20:08


1 Answer

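concat with append (requires pandas < 2.0; DataFrame.append was removed in pandas 2.0)

s = df1.groupby('group')
# Append a NaN row to each group, then stitch the groups together with a fresh 0..n-1 index
out = pd.concat([i.append({'value': np.nan}, ignore_index=True) for _, i in s], ignore_index=True)
# The appended rows have no group yet; forward-fill it from the row above
out.group = out.group.ffill().astype(int)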

apply with append[1]

df1.groupby('group').apply(
    lambda d: d.append({'group': d.name}, ignore_index=True).astype({'group': int})
).reset_index(drop=True)

Both produce:

   group  value
0      1    2.0
1      1    3.0
2      1    NaN
3      2    NaN
4      2    5.0
5      2    4.0
6      2    NaN
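
Both snippets above rely on DataFrame.append, which was removed in pandas 2.0. A minimal sketch of the same idea using only pd.concat follows; the helper name add_blank_row and its key parameter are made up for illustration and are not part of the original answer. It assumes the blank row should keep the group key and leave every other column as NaN.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'group': [1, 1, 2, 2, 2],
                    'value': [2, 3, np.nan, 5, 4]})

def add_blank_row(d, key='group'):
    # Hypothetical helper: build a one-row frame that holds only the group key.
    # pd.concat aligns on column names, so every other column becomes NaN.
    blank = pd.DataFrame({key: [d[key].iloc[0]]})
    return pd.concat([d, blank], ignore_index=True)

# groupby yields (key, sub-frame) pairs, sorted by key by default
out = pd.concat(
    [add_blank_row(d) for _, d in df1.groupby('group')],
    ignore_index=True,
)
print(out)
```

Because the blank frame carries only the key column, this should extend to frames with more columns without listing them. Integer columns other than the key would be upcast to float to hold the NaN.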

[1] This solution brought to you by your local @piRSquared

user3483203 answered Oct 13 '22 10:10