Split pandas DataFrame into n equal parts + 1

I have a pandas DataFrame containing 44150 rows.

I want to split it into sub-DataFrames of 100 rows each, except the last one, which should contain the remaining 50.

I've tried numpy.array_split, but it splits the data into 392 DataFrames of size 100 and 50 DataFrames of size 99.

Is there any way to split it the way I want?

MehdiOua asked Jan 17 '19 21:01

1 Answer

You can use iloc and a list comprehension:

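import numpy as np
import pandas as pd

df = pd.DataFrame({
    'x': np.random.randn(44150),
    'y': np.random.randn(44150),
})

S = 100  # rows per chunk
# Step through the start offsets in strides of S; the final slice holds the leftover rows.
# Iterating over offsets also avoids an empty trailing frame when len(df) is a multiple of S.
frames = [df.iloc[i:i + S].copy() for i in range(0, len(df), S)]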

This yields 442 DataFrames. The last one, frames[-1], has 50 rows, while all the others have 100.
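To check the sizes, the same slicing idea can be wrapped in a small helper. This is only a sketch: the names split_by_rows, chunk_sizes and even_sizes are made up for illustration and are not part of pandas or NumPy. It also shows, on plain row positions, why numpy.array_split behaves differently: it takes a number of sections rather than a chunk size, so it evens out the section lengths.

```python
import numpy as np
import pandas as pd


def split_by_rows(df, size):
    """Return consecutive slices of `df`, each with at most `size` rows.

    Hypothetical helper: only the last slice can be shorter than `size`.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [df.iloc[start:start + size] for start in range(0, len(df), size)]


# Same row count as in the question; the column contents do not matter here.
df = pd.DataFrame({"x": np.arange(44150)})

chunks = split_by_rows(df, 100)
chunk_sizes = [len(chunk) for chunk in chunks]

# For contrast: array_split receives a section count (442), not a chunk size,
# so the row positions are spread as evenly as possible across the sections.
even_sizes = [len(part) for part in np.array_split(np.arange(len(df)), 442)]
```

Here chunk_sizes holds 441 entries of 100 followed by a single 50, whereas even_sizes reproduces the 392 sections of 100 and 50 sections of 99 described in the question. Note that the helper returns slices without .copy(); add .copy() if you plan to modify the pieces independently.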

Abramodj answered Oct 11 '22 13:10
