
Shuffling Several DataFrames Together

Is it possible to shuffle several DataFrames together?

For example, I have a DataFrame df1 and a DataFrame df2. I want to shuffle the rows randomly, but in the same order for both DataFrames.

Example

df1:

|___|_______|
| 1 |  ...  |
| 2 |  ...  |
| 3 |  ...  |
| 4 |  ...  |

df2:

|___|_______|
| 1 |  ...  |
| 2 |  ...  |
| 3 |  ...  |
| 4 |  ...  |

After shuffling, a possible order for both DataFrames could be:

|___|_______|
| 2 |  ...  |
| 3 |  ...  |
| 4 |  ...  |
| 1 |  ...  |
ScientiaEtVeritas asked Dec 18 '22 06:12



1 Answer

I think you can reindex both DataFrames with the same permutation of the index, generated with numpy.random.permutation. This requires that both DataFrames have the same length and the same unique index values:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a':range(5)})
print (df1)
   a
0  0
1  1
2  2
3  3
4  4

df2 = pd.DataFrame({'a':range(5)})
print (df2)
   a
0  0
1  1
2  2
3  3
4  4

idx = np.random.permutation(df1.index)
print (df1.reindex(idx))
   a
2  2
4  4
1  1
3  3
0  0

print (df2.reindex(idx))
   a
2  2
4  4
1  1
3  3
0  0

Alternative with reindex_axis (deprecated since pandas 0.21 and removed in pandas 1.0, so it only works on older versions):

print (df1.reindex_axis(idx, axis=0))
print (df2.reindex_axis(idx, axis=0))
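
If the two DataFrames do not share the same unique index values, a positional variant may be easier. The following is only a minimal sketch, assuming the DataFrames have the same number of rows; the helper name shuffle_together and its seed parameter are made up for illustration and are not part of pandas. It draws one permutation of row positions and applies it to every DataFrame with iloc:

```python
import numpy as np
import pandas as pd


def shuffle_together(*frames, seed=None):
    """Shuffle the rows of equally long DataFrames with one shared permutation.

    Hypothetical helper for illustration; `seed` is optional and only
    makes the shuffle reproducible.
    """
    if not frames:
        return []
    n = len(frames[0])
    if any(len(frame) != n for frame in frames):
        raise ValueError("all DataFrames must have the same number of rows")
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)  # row positions, not index labels
    # iloc selects by position, so the index labels do not have to match
    return [frame.iloc[order] for frame in frames]


df1 = pd.DataFrame({'a': range(5)})
df2 = pd.DataFrame({'b': list('vwxyz')})

s1, s2 = shuffle_together(df1, df2, seed=0)
print(s1)
print(s2)
```

Both results keep their original index labels in the shuffled order, so the rows of s1 and s2 still line up. Call reset_index(drop=True) on each result if a fresh 0..n-1 index is wanted.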
jezrael answered Dec 20 '22 21:12
