I have a pandas dataframe, with 4 rows and 4 columns - here is asimple version:
import pandas as pd
import numpy as np
rows = np.arange(1, 4, 1)
values = np.arange(1, 17).reshape(4,4)
df = pd.DataFrame(values, index=rows, columns=['A', 'B', 'C', 'D'])
what I am trying to do is to convert this to a 2 * 8 dataframe, with B, C and D alligng for each array - so it would look like this:
1 2
1 3
1 4
5 6
5 7
5 8
9 10
9 11
9 12
13 14
13 15
13 16
reading on pandas documentation I tried this:
df1 = pd.pivot_table(df, rows = ['B', 'C', 'D'], cols = 'A')
but gives me an error that I cannot identify the source (ends with
DataError: No numeric types to aggregate
)
following that I want to split the dataframe based on A values, but I think the .groupby command is probably going to take care of it
What you are looking for is the melt
function
pd.melt(df,id_vars=['A'])
A variable value
0 1 B 2
1 5 B 6
2 9 B 10
3 13 B 14
4 1 C 3
5 5 C 7
6 9 C 11
7 13 C 15
8 1 D 4
9 5 D 8
10 9 D 12
11 13 D 16
A final sorting according to A
is then necessary
pd.melt(df,id_vars=['A']).sort('A')
A variable value
0 1 B 2
4 1 C 3
8 1 D 4
1 5 B 6
5 5 C 7
9 5 D 8
2 9 B 10
6 9 C 11
10 9 D 12
3 13 B 14
7 13 C 15
11 13 D 16
Note: pd.DataFrame.sort
has been deprecated in favour of pd.DataFrame.sort_values
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With