I'm trying to arrange a pandas dataframe that looks like this:
{'Sample': {0: 'Mark', 1: 'Mark', 2: 'Mark', 3: 'Mark', 4: 'Mark', 5: 'John', 6: 'John', 7: 'John', 8: 'John', 9: 'John', 10: 'Lisa', 11: 'Lisa', 12: 'Lisa', 13: 'Lisa', 14: 'Lisa'}, 'CO': {0: 3, 1: 3, 2: 2, 3: 4, 4: 5, 5: 10, 6: 11, 7: 11, 8: 14, 9: 11, 10: 23, 11: 25, 12: 26, 13: 23, 14: 45}, 'Argon': {0: 7, 1: 8, 2: 9, 3: 5, 4: 7, 5: 45, 6: 65, 7: 45, 8: 67, 9: 43, 10: 34, 11: 78, 12: 79, 13: 87, 14: 67}, 'Carbon': {0: 35, 1: 43, 2: 78, 3: 54, 4: 34, 5: 56, 6: 58, 7: 50, 8: 53, 9: 55, 10: 87, 11: 77, 12: 88, 13: 89, 14: 85}}
| Sample | CO | Argon | Carbon |
|---|---|---|---|
| Mark | 3 | 7 | 35 |
| Mark | 3 | 8 | 43 |
| Mark | 2 | 9 | 78 |
| Mark | 4 | 5 | 54 |
| Mark | 5 | 7 | 34 |
| John | 10 | 45 | 56 |
| John | 11 | 65 | 58 |
| John | 11 | 45 | 50 |
| John | 14 | 67 | 53 |
| John | 11 | 43 | 55 |
| Lisa | 23 | 34 | 87 |
| Lisa | 25 | 78 | 77 |
| Lisa | 26 | 79 | 88 |
| Lisa | 23 | 87 | 89 |
| Lisa | 45 | 67 | 85 |
To this:
{'Chemical': {0: 'Argon', 1: 'Argon', 2: 'Argon', 3: 'Argon', 4: 'Argon', 5: 'Carbon', 6: 'Carbon', 7: 'Carbon', 8: 'Carbon', 9: 'Carbon', 10: 'CO', 11: 'CO', 12: 'CO', 13: 'CO', 14: 'CO'}, 'Mark': {0: 7, 1: 8, 2: 9, 3: 5, 4: 7, 5: 35, 6: 43, 7: 78, 8: 54, 9: 34, 10: 3, 11: 3, 12: 2, 13: 4, 14: 5}, 'John': {0: 45, 1: 65, 2: 45, 3: 67, 4: 43, 5: 56, 6: 58, 7: 50, 8: 53, 9: 55, 10: 10, 11: 11, 12: 11, 13: 14, 14: 11}, 'Lisa': {0: 34, 1: 78, 2: 79, 3: 87, 4: 67, 5: 87, 6: 77, 7: 88, 8: 89, 9: 85, 10: 23, 11: 25, 12: 26, 13: 23, 14: 45}}
| Chemical | Mark | John | Lisa |
|---|---|---|---|
| Argon | 7 | 45 | 34 |
| Argon | 8 | 65 | 78 |
| Argon | 9 | 45 | 79 |
| Argon | 5 | 67 | 87 |
| Argon | 7 | 43 | 67 |
| Carbon | 35 | 56 | 87 |
| Carbon | 43 | 58 | 77 |
| Carbon | 78 | 50 | 88 |
| Carbon | 54 | 53 | 89 |
| Carbon | 34 | 55 | 85 |
| CO | 3 | 10 | 23 |
| CO | 3 | 11 | 25 |
| CO | 2 | 11 | 26 |
| CO | 4 | 14 | 23 |
| CO | 5 | 11 | 45 |
The data is body measurements of three people, and each chemical was measured 5 times per person
It looks like a simple case of transposing, but not surprisingly, it does exactly what it says on the tin! I've also tried melting the dataframe and groupby, but I'm really not getting anywhere with it. Any help would be most welcome.
Thank you!
Use melt with pivot_table in one-liner version:
>>> df.melt('Sample', var_name='Chemical') \
.assign(Idx=lambda x: x.groupby('Sample').cumcount()) \
.pivot_table('value', ['Idx', 'Chemical'], 'Sample') \
.droplevel(0).rename_axis(columns=None).reset_index()
Chemical John Lisa Mark
0 CO 10 23 3
1 CO 11 25 3
2 CO 11 26 2
3 CO 14 23 4
4 CO 11 45 5
5 Argon 45 34 7
6 Argon 65 78 8
7 Argon 45 79 9
8 Argon 67 87 5
9 Argon 43 67 7
10 Carbon 56 87 35
11 Carbon 58 77 43
12 Carbon 50 88 78
13 Carbon 53 89 54
14 Carbon 55 85 34
You can execute all steps independently to see the transformation:
>>> out = df.melt('Sample', var_name='Chemical')
>>> out = out.assign(Idx=lambda x: x.groupby('Sample').cumcount())
>>> out = out.pivot_table('value', ['Idx', 'Chemical'], 'Sample')
>>> out = out.droplevel(0).rename_axis(columns=None).reset_index()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With