I have the following structure of dataframe:
| Number | From | To |D1_value |D2_value|D3_value |
| 111 | A | B | 10 | 12 | NaN |
| 222 | B | A | NaN | 4 | 6 |
How to convert it into:
|Number | From | To | Type | Value |
|111 | A | B | D1 | 10 |
|111 | A | B | D2 | 12 |
|222 | B | A | D2 | 4 |
|222 | B | A | D3 | 6 |
I will be beyond grateful if you can help me!
You basically need unpivot or melt: https://pandas.pydata.org/docs/reference/api/pandas.melt.html
pd.melt(df,
id_vars=['Number','From','To'],
value_vars = ['D1_value','D2_value'])\
.rename({'variable':'Type'},axis=1)\
.dropna(subset=['value'],axis=0)
You can also use pd.wide_to_long, after reordering the column positions:
temp = df.rename(columns = lambda col: "_".join(col.split("_")[::-1])
if col.endswith("value") else col)
pd.wide_to_long(temp,
stubnames = 'value',
i=['Number', 'From', 'To'],
j='Type', sep='_', suffix=".+").dropna().reset_index()
Out[19]:
Number From To Type value
0 111 A B D1 10.0
1 111 A B D2 12.0
2 222 B A D2 4.0
3 222 B A D3 6.0
You could also use pivot_longer from pyjanitor :
# pip install pyjanitor
import janitor
import pandas as pd
df.pivot_longer(index = slice('Number', 'To'),
#.value keeps column labels associated with it
# as column headers
names_to=('Type', '.value'),
names_sep='_').dropna()
Out[22]:
Number From To Type value
0 111 A B D1 10.0
2 111 A B D2 12.0
3 222 B A D2 4.0
5 222 B A D3 6.0
You can also use stack:
df = df.set_index(['Number', 'From', 'To'])
# this creates a MultiIndex column
df.columns = df.columns.str.split("_", expand = True)
df.columns.names = ['Type', None]
# stack has dropna=True as default
df.stack(level = 0).reset_index()
Number From To Type value
0 111 A B D1 10.0
1 111 A B D2 12.0
2 222 B A D2 4.0
3 222 B A D3 6.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With