I want to create duplicate rows for a DataFrame in Python, one row per week column. The DataFrame looks like this:
SKU Ids wk_1 wk_2 wk_3 wk_4 wk_5 wk_6
10 20 1 2 3 4 5 6
30 40 6 5 4 3 2 1
I want the output to look like this:
SKU Ids wk value
10 20 wk_1 1
10 20 wk_2 2
10 20 wk_3 3
10 20 wk_4 4
10 20 wk_5 5
10 20 wk_6 6
30 40 wk_1 6
30 40 wk_2 5
30 40 wk_3 4
30 40 wk_4 3
30 40 wk_5 2
30 40 wk_6 1
I am trying to use pivot_table, but it gives me an error:
hqp = hq.pivot_table(columns=['sku', 'ids','value'],
index= ['sku', 'ids'],
values = ['wk_1', 'wk_2', 'wk_3', 'wk_4','wk_5', 'wk_6'])
This is what wide_to_long is built for:
pd.wide_to_long(df,['wk'],i=['SKU','Ids'],j='value',sep='_').reset_index()
Out[28]:
SKU Ids value wk
0 10 20 1 1
1 10 20 2 2
2 10 20 3 3
3 10 20 4 4
4 10 20 5 5
5 10 20 6 6
6 30 40 1 6
7 30 40 2 5
8 30 40 3 4
9 30 40 4 3
10 30 40 5 2
11 30 40 6 1
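Note that in the output above the column named value holds the week number and the column named wk holds the measurement. A minimal sketch of how the same wide_to_long call could be adapted to match the layout requested in the question follows. The names long_df and week are illustrative choices, not part of the original answer, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "SKU": [10, 30],
    "Ids": [20, 40],
    "wk_1": [1, 6], "wk_2": [2, 5], "wk_3": [3, 4],
    "wk_4": [4, 3], "wk_5": [5, 2], "wk_6": [6, 1],
})

long_df = (
    # j="week" receives the numeric suffix; the stub column "wk" receives the measurements.
    pd.wide_to_long(df, stubnames="wk", i=["SKU", "Ids"], j="week", sep="_")
    .reset_index()
    .rename(columns={"wk": "value"})
    .sort_values(["SKU", "Ids", "week"])
    .reset_index(drop=True)
)

# Rebuild the original column label (wk_1, wk_2, ...) from the numeric suffix.
long_df["wk"] = "wk_" + long_df["week"].astype(str)
long_df = long_df[["SKU", "Ids", "wk", "value"]]
print(long_df)
```

Sorting on the numeric week column before rebuilding the label keeps the weeks in numeric order even if there were ten or more of them.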
Set SKU and Ids as the index, stack, then reset_index and rename:
df = df.set_index(['SKU','Ids'])\
.stack().reset_index()\
.rename(columns={'level_2':'wk',0:'value'})
Or:
df = df.set_index(['SKU','Ids'])\
.stack().reset_index(name='value')\
.rename(columns={'level_2':'wk'})
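A self-contained sketch of the stack approach above, for anyone who wants to run it directly. As a variation that is not in the original answer, it names the column axis with rename_axis before stacking, so the new index level is already called wk and the level_2 rename is not needed. The name long_df is an illustrative choice.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "SKU": [10, 30],
    "Ids": [20, 40],
    "wk_1": [1, 6], "wk_2": [2, 5], "wk_3": [3, 4],
    "wk_4": [4, 3], "wk_5": [5, 2], "wk_6": [6, 1],
})

long_df = (
    df.set_index(["SKU", "Ids"])
    .rename_axis(columns="wk")      # name the column axis so the stacked level is called 'wk'
    .stack()                        # week columns become an inner index level
    .reset_index(name="value")      # index levels become columns; the data column is 'value'
)
print(long_df)
```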
Or, as per W-B's suggestion in the comments, one more method using melt and sort_values:
df = df.melt(id_vars=['SKU','Ids'])\
.rename(columns={'variable':'wk'})\
.sort_values(['SKU','Ids'])
print(df)
SKU Ids wk value
0 10 20 wk_1 1
1 10 20 wk_2 2
2 10 20 wk_3 3
3 10 20 wk_4 4
4 10 20 wk_5 5
5 10 20 wk_6 6
6 30 40 wk_1 6
7 30 40 wk_2 5
8 30 40 wk_3 4
9 30 40 wk_4 3
10 30 40 wk_5 2
11 30 40 wk_6 1
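The melt approach can also be written without the rename step, since melt accepts var_name and value_name directly. The sketch below is a variation on the answer above, not the original code. It also sorts on wk and passes ignore_index=True so the result gets a clean 0..11 index. The name long_df is an illustrative choice.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "SKU": [10, 30],
    "Ids": [20, 40],
    "wk_1": [1, 6], "wk_2": [2, 5], "wk_3": [3, 4],
    "wk_4": [4, 3], "wk_5": [5, 2], "wk_6": [6, 1],
})

long_df = (
    # var_name / value_name label the two new columns, so no rename is needed.
    df.melt(id_vars=["SKU", "Ids"], var_name="wk", value_name="value")
    # Sorting on wk as well makes the row order explicit; ignore_index renumbers the rows.
    .sort_values(["SKU", "Ids", "wk"], ignore_index=True)
)
print(long_df)
```

One caveat: wk is sorted as a string here, so with ten or more weeks wk_10 would sort before wk_2. In that case sorting on a numeric week number would be the safer choice.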