
Convert columns into multiple rows in pandas dataframe

I have a DataFrame that looks something like this:

   Deal  Year  Quarter_1  Quarter_2  Quarter_3  Financial_Data
h     1  1991          1          2          3             120
i     2  1992          4          5          6              80
j     3  1993          7          8          9             100

I want to combine all the quarters into one new column and copy the deal number, year and financial data. The end result should then look like this:

   Deal  Year  Quarter  Financial_Data
h     1  1991        1             120
i     1  1991        2             120
j     1  1991        3             120
k     2  1992        4              80
l     2  1992        5              80
m     2  1992        6              80
n     3  1993        7             100
o     3  1993        8             100
p     3  1993        9             100
Elias K. asked Apr 30 '18 09:04


1 Answer

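You can use the melt method. It keeps the columns listed in id_vars and stacks every remaining column into a single value column; the helper column named 'variable', which holds the source column names, is then dropped.

import pandas as pd

# d is the original DataFrame; the reshaped result is stored in df
df = pd.melt(d, id_vars=["Deal", "Year", "Financial_Data"],
             value_name="Quarter").drop(['variable'], axis=1).sort_values('Quarter')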

Output

   Deal  Year  Financial_Data  Quarter
0     1  1991             120        1
3     1  1991             120        2
6     1  1991             120        3
1     2  1992              80        4
4     2  1992              80        5
7     2  1992              80        6
2     3  1993             100        7
5     3  1993             100        8
8     3  1993             100        9
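
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same melt. The name melted, the reset_index call and the final column reordering are my own additions for illustration, used only to match the layout asked for in the question.

```python
import pandas as pd

# Rebuild the example frame from the question (index labels h, i, j).
d = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    },
    index=["h", "i", "j"],
)

# Stack the three quarter columns into one "Quarter" column.
melted = (
    pd.melt(d, id_vars=["Deal", "Year", "Financial_Data"], value_name="Quarter")
    .drop(columns="variable")      # source column names are not needed here
    .sort_values("Quarter")
    .reset_index(drop=True)        # assumption: a fresh 0..n-1 index is acceptable
)

# Reorder the columns to match the layout requested in the question.
melted = melted[["Deal", "Year", "Quarter", "Financial_Data"]]
print(melted)
```

Note that melt does not carry over the original index labels (h, i, j), so the result gets a plain integer index.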

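If you have many columns, you can build the id_vars list from the column names of the original frame with d.columns.tolist() instead of typing each name. Here the first two columns and the last one are the identifiers:

# take the names from the original frame d, not from the melted df
column_list = d.columns.tolist()
id_vars_list = column_list[:2] + column_list[-1:]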

The statement then becomes:

df = pd.melt(d, id_vars=id_vars_list, 
             value_name="Quarter").drop(['variable'],axis=1).sort_values('Quarter')
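
If the column positions are not stable, another option is to pick the quarter columns by name instead of slicing by position. This is only a sketch: it assumes the columns to stack all start with "Quarter_", and the names quarter_cols, id_cols, long_df and the "Source" column are made up for illustration.

```python
import pandas as pd

# Same example frame as in the question.
d = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    }
)

# Assumption: every column to be stacked starts with "Quarter_".
quarter_cols = [c for c in d.columns if c.startswith("Quarter_")]
# Everything else is treated as an identifier column.
id_cols = [c for c in d.columns if c not in quarter_cols]

long_df = d.melt(
    id_vars=id_cols,
    value_vars=quarter_cols,
    var_name="Source",       # keeps track of which quarter column a value came from
    value_name="Quarter",
).sort_values("Quarter")

print(long_df)
```

Keeping the "Source" column is optional; drop it with long_df.drop(columns="Source") if only the values are needed.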
Mihai Alexandru-Ionut answered Sep 21 '22 12:09