I have a data frame that looks as follows:
import pandas as pd
d = {'decil': ['1. decil','1. decil','2. decil','2. decil','3. decil','3. decil'],
'kommune': ['AA','BB','AA','BB','AA','BB'],'2010':[44,25,242,423,845,962],
'2011':[64,26,239,620,862,862]}
df = pd.DataFrame(data=d)
Printing it gives:
decil kommune 2010 2011
1. decil AA 44 64
1. decil BB 25 26
2. decil AA 242 239
2. decil BB 423 620
3. decil AA 845 862
3. decil BB 962 862
My desired output is something like this:
kommune year 1. decil 2. decil 3. decil
AA 2010 44 242 845
AA 2011 64 239 862
BB 2010 25 423 962
BB 2011 26 620 862
That is, I'm searching for a way to change the 'decil' column from long to wide format while at the same time changing the year columns from wide to long format. I have tried pd.pivot_table, loops and unstack without any luck. Is there any smart way around this? In advance, thanks for the help.
Use set_index with stack and unstack:
df = (df.set_index(['decil','kommune'])
.stack()
.unstack(0)
.reset_index()
.rename_axis(None, axis=1))
print (df)
kommune level_1 1. decil 2. decil 3. decil
0 AA 2010 44 242 845
1 AA 2011 64 239 862
2 BB 2010 25 423 962
3 BB 2011 26 620 862
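The result above labels the year column level_1, because the original column index has no name when it is stacked. If you want it to be called year, as in the desired output, one possible tweak is to name the column axis before stacking. This is only a sketch of that variation: the name out is made up for illustration, and the years stay strings because they come from the column labels.

```python
import pandas as pd

d = {'decil': ['1. decil', '1. decil', '2. decil', '2. decil', '3. decil', '3. decil'],
     'kommune': ['AA', 'BB', 'AA', 'BB', 'AA', 'BB'],
     '2010': [44, 25, 242, 423, 845, 962],
     '2011': [64, 26, 239, 620, 862, 862]}
df = pd.DataFrame(data=d)

out = (df.set_index(['decil', 'kommune'])
         .rename_axis('year', axis=1)   # name the column axis so the stacked level is 'year'
         .stack()                       # year columns: wide -> long
         .unstack(0)                    # 'decil' index level: long -> wide
         .reset_index()
         .rename_axis(None, axis=1))    # drop the leftover 'decil' columns name
print(out)
```

Naming the axis up front avoids a separate rename(columns={'level_1': 'year'}) step afterwards; either approach should give the same frame.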