
Pandas: Add an empty row after every index in a MultiIndex dataframe

Consider the DataFrame below:

              IA1  IA2  IA3
Name Subject               
Abc  DS        45   43   34
     DMS       43   23   45
     ADA       32   46   36
Bcd  BA        45   35   37
     EAD       23   45   12
     DS        23   35   43
Cdf  EAD       34   33   23
     ADA       12   34   25

How can I add an empty row after each Name index?

Expected output:

              IA1  IA2  IA3
Name Subject               
Abc  DS        45   43   34
     DMS       43   23   45
     ADA       32   46   36

Bcd  BA        45   35   37
     EAD       23   45   12
     DS        23   35   43

Cdf  EAD       34   33   23
     ADA       12   34   25
     
Mayank Porwal asked Jan 12 '21 07:01




2 Answers

Use a custom function with GroupBy.apply to add an empty row to each group:

def f(x):
    x.loc[('', ''), :] = ''
    return x

Or:

def f(x):
    # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
    blank = pd.DataFrame('', columns=df.columns, index=[(x.name, '')])
    return pd.concat([x, blank])

df = df.groupby(level=0, group_keys=False).apply(f)
print(df)
             IA1 IA2 IA3
Name Subject            
Abc  DS       45  43  34
     DMS      43  23  45
     ADA      32  46  36
                        
Bcd  BA       45  35  37
     EAD      23  45  12
     DS       23  35  43
                        
Cdf  EAD      34  33  23
     ADA      12  34  25
                        
jezrael answered Oct 23 '22 16:10
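
The GroupBy-based idea above can also be written without apply, which avoids mutating groups inside the applied function. This is only a minimal sketch: the helper name with_blank_rows is hypothetical, and the frame is rebuilt from the values shown in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_tuples(
    [("Abc", "DS"), ("Abc", "DMS"), ("Abc", "ADA"),
     ("Bcd", "BA"), ("Bcd", "EAD"), ("Bcd", "DS"),
     ("Cdf", "EAD"), ("Cdf", "ADA")],
    names=["Name", "Subject"],
)
df = pd.DataFrame(
    {"IA1": [45, 43, 32, 45, 23, 23, 34, 12],
     "IA2": [43, 23, 46, 35, 45, 35, 33, 34],
     "IA3": [34, 45, 36, 37, 12, 43, 23, 25]},
    index=index,
)


def with_blank_rows(frame):
    """Hypothetical helper: one empty-string row after each first-level group."""
    pieces = []
    # sort=False keeps the groups in their order of first appearance.
    for name, group in frame.groupby(level=0, sort=False):
        blank = pd.DataFrame(
            [[""] * frame.shape[1]],
            columns=frame.columns,
            index=pd.MultiIndex.from_tuples(
                [(name, "")], names=frame.index.names
            ),
        )
        pieces.extend([group, blank])
    return pd.concat(pieces)


out = with_blank_rows(df)
print(out)
```

Because the separator rows hold empty strings, the numeric columns become object dtype, so a frame like this is suited to display or export rather than further arithmetic.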


Another way is to build the full target index with pd.MultiIndex.from_product and Index.union, sort it, and then call df.reindex with fill_value='' so the new rows are filled with empty strings.

idx = df.index.union(pd.MultiIndex.from_product((df.index.levels[0], [''])), sort=False)
out = df.reindex(sorted(idx, key=lambda x: x[0]), fill_value='')

print(out)

             IA1 IA2 IA3
Name Subject            
Abc  DS       45  43  34
     DMS      43  23  45
     ADA      32  46  36
                        
Bcd  BA       45  35  37
     EAD      23  45  12
     DS       23  35  43
                        
Cdf  EAD      34  33  23
     ADA      12  34  25
 

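Passing sort=False to Index.union retains the original order of the index. Because Python's sorted is stable, sorting on the first element of each tuple then places every empty entry at the end of its Name group without reordering the rows inside the group: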

sorted(idx,key=lambda x:x[0])

[('Abc', 'DS'),
 ('Abc', 'DMS'),
 ('Abc', 'ADA'),
 ('Abc', ''),
 ('Bcd', 'BA'),
 ('Bcd', 'EAD'),
 ('Bcd', 'DS'),
 ('Bcd', ''),
 ('Cdf', 'EAD'),
 ('Cdf', 'ADA'),
 ('Cdf', '')]
anky answered Oct 23 '22 17:10
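
The stable-sort step in the reindex answer can be sketched with plain Python lists. Note that this variant deliberately swaps Index.union for list concatenation, so the ordering does not depend on how union handles sort=False; the variable names are illustrative, and the frame is rebuilt from the values in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_tuples(
    [("Abc", "DS"), ("Abc", "DMS"), ("Abc", "ADA"),
     ("Bcd", "BA"), ("Bcd", "EAD"), ("Bcd", "DS"),
     ("Cdf", "EAD"), ("Cdf", "ADA")],
    names=["Name", "Subject"],
)
df = pd.DataFrame(
    {"IA1": [45, 43, 32, 45, 23, 23, 34, 12],
     "IA2": [43, 23, 46, 35, 45, 35, 33, 34],
     "IA3": [34, 45, 36, 37, 12, 43, 23, 25]},
    index=index,
)

# Existing keys come first and separator keys second; the stable sort keeps
# that relative order inside each Name group.
existing = df.index.tolist()
separators = [(name, "") for name in df.index.get_level_values(0).unique()]
ordered = sorted(existing + separators, key=lambda key: key[0])

# Keys missing from df (the separators) are filled with empty strings.
target = pd.MultiIndex.from_tuples(ordered, names=df.index.names)
out = df.reindex(target, fill_value="")
print(out)
```

One caveat of sorting on the first element: the groups come out in alphabetical order of Name, which matches the original order here only because Abc, Bcd, Cdf were already sorted.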