Python Pandas - Reshape Dataframe

Tags: python, pandas

Given the following data frame:

import pandas as pd

df = pd.DataFrame({"A":[1,2,3,11,12,13],"B":[4,5,6,14,15,16],"C":[6,7,8,16,17,18]})

   A   B   C
0  1   4   6
1  2   5   7
2  3   6   8
3  11  14  16
4  12  15  17
5  13  16  18

I would like to reshape it so that it looks like this:

   A   B   C   A_1   B_1   C_1   A_2   B_2   C_2
0  1   4   6     2     5     7     3     6     8
1  11  14  16    12    15    17    13    16    18

That is, every 3 consecutive rows are collapsed into 1 row.

How can I achieve this with pandas?

Shlomi Schwartz asked Jun 07 '20 12:06


2 Answers

One idea is to create a MultiIndex from integer division and modulo of the row positions, then reshape with DataFrame.unstack:

import numpy as np

a = np.arange(len(df))
# a // 3 is the target row, a % 3 is the position inside each group of 3
df.index = [a // 3, a % 3]
df = df.unstack().sort_index(axis=1, level=1)
df.columns = [f'{a}_{b}' for a, b in df.columns]
print(df)
   A_0  B_0  C_0  A_1  B_1  C_1  A_2  B_2  C_2
0    1    4    6    2    5    7    3    6    8
1   11   14   16   12   15   17   13   16   18
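
To make the integer division / modulo step easier to follow, here is a self-contained sketch of the same idea. The names n, group, offset and wide are only for illustration. It also drops the suffix from the first block of columns so the names match the question (A, B, C, A_1, ...), on the assumption that this exact naming is wanted.

```python
import numpy as np
import pandas as pd

# Sample data matching the 6-row table in the question.
df = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                   "B": [4, 5, 6, 14, 15, 16],
                   "C": [6, 7, 8, 16, 17, 18]})

n = 3                      # rows per group (the question groups every 3 rows)
pos = np.arange(len(df))   # 0, 1, 2, 3, 4, 5
group = pos // n           # 0, 0, 0, 1, 1, 1 -> which output row
offset = pos % n           # 0, 1, 2, 0, 1, 2 -> position inside the group

wide = (df.set_index(pd.MultiIndex.from_arrays([group, offset]))
          .unstack()                      # move the offset level into the columns
          .sort_index(axis=1, level=1))   # order columns by offset, then by name

# Keep the plain name for offset 0 so the result reads A, B, C, A_1, ...
wide.columns = [col if k == 0 else f"{col}_{k}" for col, k in wide.columns]
print(wide)
```

Changing n should fold a different number of rows into each output row, provided len(df) is a multiple of n.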

For the reverse operation, it is possible to split the column names with str.split and reshape with DataFrame.stack (here df is the original 6-row frame again):

a = np.arange(len(df))
df1 = (df.set_index(pd.MultiIndex.from_arrays([a // 3, a % 3]))
         .unstack().sort_index(axis=1, level=1))
df1.columns = [f'{a}_{b}' for a, b in df1.columns]
print (df1)
   A_0  B_0  C_0  A_1  B_1  C_1  A_2  B_2  C_2
0    1    4    6    2    5    7    3    6    8
1   11   14   16   12   15   17   13   16   18

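# split 'A_0' into ('A', '0') to get a two-level column index
df1.columns = df1.columns.str.split('_', expand=True)
# stack moves the suffix level back into the rows
df2 = df1.stack().reset_index(drop=True)
print(df2)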
    A   B   C
0   1   4   6
1   2   5   7
2   3   6   8
3  11  14  16
4  12  15  17
5  13  16  18
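
As a quick sanity check, the round trip can be sketched in one self-contained script. The names original, wide and restored are illustrative. The sketch assumes the original column names contain no underscore of their own, since str.split('_') splits on every underscore.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                         "B": [4, 5, 6, 14, 15, 16],
                         "C": [6, 7, 8, 16, 17, 18]})

# Forward: fold every 3 rows into one wide row with columns A_0 ... C_2.
pos = np.arange(len(original))
wide = (original.set_index(pd.MultiIndex.from_arrays([pos // 3, pos % 3]))
                .unstack()
                .sort_index(axis=1, level=1))
wide.columns = [f"{col}_{k}" for col, k in wide.columns]

# Reverse: 'A_0' -> ('A', '0'), then move the suffix level back into the rows.
wide.columns = wide.columns.str.split("_", expand=True)
restored = wide.stack().reset_index(drop=True)

# Compare values and column names with the starting frame.
same = (restored.values.tolist() == original.values.tolist()
        and list(restored.columns) == list(original.columns))
print(same)
```

Recent pandas versions may emit a FutureWarning about the stack implementation here; the reshaped values should be the same either way.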
jezrael answered Sep 28 '22 04:09


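# df[a::3] takes every 3rd row starting at offset a; the 3 slices are placed side by side
new = pd.concat([df[a::3].reset_index(drop=True) for a in range(3)], axis=1)
# label each block of columns with its offset: A_0, B_0, C_0, A_1, ...
new.columns = ['{}_{}'.format(a,b) for b in range(3) for a in 'ABC']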
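
The same slicing idea can be wrapped up so it does not hard-code 3 or 'ABC'. This is only a sketch: widen is a hypothetical helper name, it uses iloc for explicit positional slicing, and it assumes the number of rows is a multiple of n (otherwise concat would pad the last row with NaN, so the sketch raises instead).

```python
import pandas as pd

def widen(df, n):
    """Hypothetical helper: fold every n consecutive rows into one wide row."""
    if len(df) % n != 0:
        raise ValueError("number of rows must be a multiple of n")
    # df.iloc[k::n] picks rows k, k+n, k+2n, ... (the k-th row of every group).
    parts = [df.iloc[k::n].reset_index(drop=True) for k in range(n)]
    out = pd.concat(parts, axis=1)
    # Suffix each block of columns with its offset inside the group.
    out.columns = [f"{col}_{k}" for k in range(n) for col in df.columns]
    return out

df = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                   "B": [4, 5, 6, 14, 15, 16],
                   "C": [6, 7, 8, 16, 17, 18]})
new = widen(df, 3)
print(new)
```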
warped answered Sep 28 '22 05:09