
Efficient splitting of data in Python

Consider the following code:

one, two = sales.random_split(0.5, seed=0)
set_1, set_2 = one.random_split(0.5, seed=0)
set_3, set_4 = two.random_split(0.5, seed=0)

What I am trying to do in this code is randomly split the data in my sales SFrame (which is similar to a pandas DataFrame) into 4 roughly equal parts.

What is a Pythonic/Efficient way to achieve this?

Khurram Majeed asked Oct 30 '22 13:10



1 Answer

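import numpy as np

# arr is assumed to be your data as a NumPy array
np.random.seed(0)              # fix the seed so the split is reproducible
np.random.shuffle(arr)         # shuffles in place, along the first axis
sets = np.array_split(arr, 4)  # 4 chunks whose sizes differ by at most one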
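
If the data is not already a NumPy array, the same shuffle-then-chunk idea can be applied to row positions instead of the rows themselves. The following is only a minimal sketch using the standard library; the helper name split_indices and its parameters are hypothetical, chosen for illustration, and are not part of NumPy or SFrame.

```python
import random


def split_indices(n_rows, n_parts=4, seed=0):
    """Shuffle row positions 0..n_rows-1 and cut them into n_parts chunks.

    Chunk sizes differ by at most one, the same sizing rule np.array_split uses.
    (Hypothetical helper, for illustration only.)
    """
    indices = list(range(n_rows))
    # Use a local generator so the global random state is left untouched.
    random.Random(seed).shuffle(indices)

    base, extra = divmod(n_rows, n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional row.
        size = base + (1 if i < extra else 0)
        parts.append(indices[start:start + size])
        start += size
    return parts


parts = split_indices(10, n_parts=4, seed=0)
print([len(p) for p in parts])  # [3, 3, 2, 2]
```

Each list in parts can then be used to pick rows by position, for example arr[part] for a NumPy array or df.iloc[part] for a pandas DataFrame. Every row lands in exactly one part, and the same seed gives the same split.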
John Zwinck answered Nov 02 '22 23:11
