
Can I split a pandas DataFrame based on row values?

I have a pandas dataframe that effectively contains several different datasets. Between each dataset is a row full of NaN. Can I split the dataframe on the NaN row to make two dataframes? Thanks in advance.

SquatLicense asked Apr 14 '26 06:04


1 Answer

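You can split the DataFrame into several data frames at every row in which all values are NaN. In this version each all-NaN row stays as the first row of the piece that follows it: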

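# positions of the all-NaN rows (+ beginning and end of df);
# positions rather than index labels, because iloc slices by position
nan_pos = df.isnull().all(axis=1).to_numpy().nonzero()[0].tolist()
idx = [0] + nan_pos + [df.shape[0]]
# list of data frames split at each all-NaN row
list_of_dfs = [df.iloc[idx[n]:idx[n+1]] for n in range(len(idx)-1)]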

To leave the all-NaN rows out of the split data frames, start each slice one position after the separator row:

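# positions (not labels) of the all-NaN rows; -1 and len(df) mark the two ends
nan_pos = df.isnull().all(axis=1).to_numpy().nonzero()[0].tolist()
idx = [-1] + nan_pos + [df.shape[0]]
list_of_dfs = [df.iloc[idx[n]+1:idx[n+1]] for n in range(len(idx)-1)]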
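The two variants above can be folded into one function. This is only a minimal sketch: the name `split_on_nan_rows` and the `keep_separator` flag are hypothetical, not part of pandas, and the demo frame is made up for illustration. It locates the all-NaN rows by position, so it should also work when the index is not the default 0..n-1 range.

```python
import numpy as np
import pandas as pd


def split_on_nan_rows(df, keep_separator=False):
    """Split df at every all-NaN row (hypothetical helper, not a pandas API)."""
    # Positions (not labels) of the all-NaN rows, so any index type works.
    nan_pos = np.flatnonzero(df.isnull().all(axis=1).to_numpy()).tolist()
    if keep_separator:
        # Each separator row becomes the first row of the following piece.
        bounds = [0] + nan_pos + [len(df)]
        return [df.iloc[bounds[n]:bounds[n + 1]] for n in range(len(bounds) - 1)]
    # Skip the separator row by starting one position after it.
    bounds = [-1] + nan_pos + [len(df)]
    return [df.iloc[bounds[n] + 1:bounds[n + 1]] for n in range(len(bounds) - 1)]


# Made-up demo frame: string index, one all-NaN row ("y") in the middle.
demo = pd.DataFrame(
    {"a": [1.0, 2.0, np.nan, 3.0], "b": [4.0, np.nan, np.nan, 5.0]},
    index=["w", "x", "y", "z"],
)
pieces = split_on_nan_rows(demo)                     # separator row dropped
kept = split_on_nan_rows(demo, keep_separator=True)  # separator row kept
```

Row "x" has only one NaN, so it is not treated as a separator; only row "y", where every column is NaN, splits the frame.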

Example:

df:

     0    1
0  1.0  1.0
1  NaN  1.0
2  1.0  NaN
3  NaN  NaN
4  NaN  NaN
5  1.0  1.0
6  1.0  1.0
7  NaN  1.0
8  1.0  NaN
9  1.0  NaN

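list_of_dfs (second version, without the all-NaN rows; the consecutive all-NaN rows 3 and 4 leave an empty DataFrame between them):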

[     0    1
0  1.0  1.0
1  NaN  1.0
2  1.0  NaN, 

Empty DataFrame
Columns: [0, 1]
Index: [],   

     0    1
5  1.0  1.0
6  1.0  1.0
7  NaN  1.0
8  1.0  NaN
9  1.0  NaN]
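
The empty DataFrame in the middle of that output comes from rows 3 and 4 both being all-NaN. If those empty pieces are not wanted, they can be filtered out afterwards. The sketch below rebuilds the example frame and applies the same slicing idea by row position; the final filtering step and the name `non_empty` are additions for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame shown above.
df = pd.DataFrame({
    0: [1.0, np.nan, 1.0, np.nan, np.nan, 1.0, 1.0, np.nan, 1.0, 1.0],
    1: [1.0, 1.0, np.nan, np.nan, np.nan, 1.0, 1.0, 1.0, np.nan, np.nan],
})

# Positions of the all-NaN rows, with -1 and len(df) marking the two ends.
nan_pos = np.flatnonzero(df.isnull().all(axis=1).to_numpy()).tolist()
idx = [-1] + nan_pos + [df.shape[0]]

# Split and leave out the all-NaN rows themselves.
list_of_dfs = [df.iloc[idx[n] + 1:idx[n + 1]] for n in range(len(idx) - 1)]

# Extra step (an assumption about what is wanted): drop the empty pieces.
non_empty = [piece for piece in list_of_dfs if not piece.empty]
```

Here `list_of_dfs` holds three pieces, the middle one empty, while `non_empty` keeps only rows 0 to 2 and rows 5 to 9.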
Ehsan answered Apr 16 '26 18:04