 

Pandas split/group dataframe by row values

Tags: python, pandas

I have a dataframe of the following form:

In [1]: df
Out [1]:
      A    B    C    D
 1    0    2    6    0
 2    6    1    5    2
 3  NaN  NaN  NaN  NaN
 4    9    3    2    2
 ...
15    2   12    5   23
16  NaN  NaN  NaN  NaN
17    8    1    5    3

I'm interested in splitting the dataframe into multiple dataframes (or grouping it), using the all-NaN rows as separators.

The result should look something like this:

In [2]: df1
Out [2]: 
      A    B    C    D
 1    0    2    6    0
 2    6    1    5    2

In [3]: df2
Out [3]:
      A    B    C    D
 1    9    3    2    2
 ...
12    2   12    5   23

In [4]: df3
Out [4]:
      A    B    C    D
 1    8    1    5    3
Lim Jing asked Feb 13 '26 03:02



1 Answer

You could use the compare-cumsum-groupby pattern: build a boolean mask that marks the all-null rows, take its cumulative sum so that the group number increases by one at each separator row, then group by that number and drop the separator rows from each group:

In [114]: breaks = df.isnull().all(axis=1)

In [115]: groups = [group.dropna(how='all') for _, group in df.groupby(breaks.cumsum())]

In [116]: for group in groups:
     ...:     print(group)
     ...:     print("--")
     ...:     
     A    B    C    D
1  0.0  2.0  6.0  0.0
2  6.0  1.0  5.0  2.0
--
      A     B    C     D
4   9.0   3.0  2.0   2.0
15  2.0  12.0  5.0  23.0
--
      A    B    C    D
17  8.0  1.0  5.0  3.0
--
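
For reference, here is a self-contained sketch of the same pattern that can be run outside an IPython session. The helper name split_on_blank_rows is made up for illustration, the sample frame contains only the rows visible in the question (the elided "..." rows are not filled in), and renumbering each piece from 1 is an assumption based on the desired output shown in the question:

```python
import numpy as np
import pandas as pd


def split_on_blank_rows(df):
    """Split df into a list of frames, treating all-NaN rows as separators.

    Hypothetical helper for illustration; not part of pandas.
    """
    # True for every row in which all columns are NaN
    breaks = df.isnull().all(axis=1)
    # Running count of separator rows seen so far, used as a group label
    group_ids = breaks.cumsum()

    pieces = []
    # Drop the separator rows before grouping so no empty groups appear
    for _, group in df[~breaks].groupby(group_ids[~breaks]):
        piece = group.copy()
        # Assumption: renumber rows from 1, as in the question's df1/df2/df3
        piece.index = range(1, len(piece) + 1)
        pieces.append(piece)
    return pieces


# Only the rows shown in the question
df = pd.DataFrame(
    {
        "A": [0, 6, np.nan, 9, 2, np.nan, 8],
        "B": [2, 1, np.nan, 3, 12, np.nan, 1],
        "C": [6, 5, np.nan, 2, 5, np.nan, 5],
        "D": [0, 2, np.nan, 2, 23, np.nan, 3],
    },
    index=[1, 2, 3, 4, 15, 16, 17],
)

pieces = split_on_blank_rows(df)
for piece in pieces:
    print(piece)
    print("--")
```

Filtering out the separator rows before the groupby, rather than calling dropna on each group afterwards, also avoids empty pieces when two all-NaN rows are adjacent. The values stay floats because the original columns contained NaN; if integers are wanted, each piece can be converted with astype(int).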
DSM answered Feb 15 '26 18:02



