Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to convert dataframe from horizontal to vertical

I have the following structure of dataframe:

| Number   | From | To |D1_value |D2_value|D3_value |
| 111      | A    | B  |    10   |   12   | NaN     |
| 222      | B    | A  |   NaN   |   4    | 6       |

How to convert it into:

|Number  | From | To | Type | Value |
|111     | A    | B  | D1   | 10    |
|111     | A    | B  | D2   | 12    |
|222     | B    | A  | D2   | 4     |
|222     | B    | A  | D3   | 6     |

I will be beyond grateful if you can help me!

like image 404
Anakin Skywalker Avatar asked Dec 07 '25 07:12

Anakin Skywalker


2 Answers

You basically need unpivot or melt: https://pandas.pydata.org/docs/reference/api/pandas.melt.html

pd.melt(df, 
        id_vars=['Number','From','To'], 
        value_vars = ['D1_value','D2_value'])\
.rename({'variable':'Type'},axis=1)\
.dropna(subset=['value'],axis=0)
like image 117
Stavros Davaris Avatar answered Dec 08 '25 19:12

Stavros Davaris


You can also use pd.wide_to_long, after reordering the column positions:

temp = df.rename(columns = lambda col: "_".join(col.split("_")[::-1]) 
                                       if col.endswith("value") else col)


pd.wide_to_long(temp, 
                stubnames = 'value', 
                i=['Number', 'From', 'To'], 
                j='Type', sep='_', suffix=".+").dropna().reset_index()
Out[19]: 
   Number From To Type  value
0     111    A  B   D1   10.0
1     111    A  B   D2   12.0
2     222    B  A   D2    4.0
3     222    B  A   D3    6.0

You could also use pivot_longer from pyjanitor :

# pip install pyjanitor
import janitor
import pandas as pd
df.pivot_longer(index = slice('Number', 'To'), 
                #.value keeps column labels associated with it
                # as column headers
                names_to=('Type', '.value'), 
                names_sep='_').dropna()
Out[22]: 
   Number From To Type  value
0     111    A  B   D1   10.0
2     111    A  B   D2   12.0
3     222    B  A   D2    4.0
5     222    B  A   D3    6.0

You can also use stack:

df = df.set_index(['Number', 'From', 'To'])

# this creates a MultiIndex column
df.columns = df.columns.str.split("_", expand = True)
df.columns.names = ['Type', None]

# stack has dropna=True as default
df.stack(level = 0).reset_index()
 
   Number From To Type  value
0     111    A  B   D1   10.0
1     111    A  B   D2   12.0
2     222    B  A   D2    4.0
3     222    B  A   D3    6.0
like image 43
sammywemmy Avatar answered Dec 08 '25 20:12

sammywemmy