Pandas Data Frame how to merge columns

I have a pandas DataFrame like the one in the picture. How can I turn it into a table like the one below? (The demonstration is in Excel, but that is only to illustrate what the table looks like; this question is not related to importing or exporting a DataFrame from/to Excel.)

Thank you.

Phuong Duyen Huynh Ngoc asked Mar 28 '18 11:03



1 Answer

This is not possible.

A pandas.DataFrame is backed by NumPy arrays, which store one value per row and column and do not group data in the way you suggest. Therefore, an arbitrary column cannot be displayed as grouped data.
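
A small sketch of why this is the case, using a hypothetical three-row frame (the column names are assumptions chosen to mirror the example in Option 1). Every row stores its own copy of the value, so there is no "merged cell" for pandas to display:

```python
import pandas as pd

# Hypothetical data; the names mirror the example used in Option 1.
df = pd.DataFrame({'Name': ['AAA', 'AAA', 'AAA'],
                   'Score1': [8, 9, 10]})

# The values come back as a plain 2-D array: one entry per row and column.
values = df.to_numpy()
print(values.shape)

# The repeated label is stored once per row, not once per group.
print(df['Name'].tolist())
```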

Option 1

It is possible to partially replicate your desired output by using MultiIndex:

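import pandas as pd

# Sample data: Name and PM repeat on every row.
df = pd.DataFrame([['AAA', 8, 2, 'BBB'],
                   ['AAA', 9, 5, 'BBB'],
                   ['AAA', 10, 6, 'BBB']],
                  columns=['Name', 'Score1', 'Score2', 'PM'])

# Move Name and PM into a two-level MultiIndex; Score1 and Score2 stay as columns.
res = df.set_index(['Name', 'PM'])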

Result:

          Score1  Score2
Name PM                 
AAA  BBB       8       2
     BBB       9       5
     BBB      10       6
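
The grouping in this output is a display effect only: pandas hides repeated labels in the outer index levels, while each row still carries its own label. A minimal sketch, assuming the same df and res as above, that compares the two renderings through the sparsify argument of to_string:

```python
import pandas as pd

df = pd.DataFrame([['AAA', 8, 2, 'BBB'],
                   ['AAA', 9, 5, 'BBB'],
                   ['AAA', 10, 6, 'BBB']],
                  columns=['Name', 'Score1', 'Score2', 'PM'])
res = df.set_index(['Name', 'PM'])

# Default-style rendering: repeated labels in outer levels are hidden.
sparse_text = res.to_string(sparsify=True)

# Same data with every index label printed on every row.
full_text = res.to_string(sparsify=False)

# The index itself always holds one label per row.
names = res.index.get_level_values('Name').tolist()

print(full_text)
print(names)
```

The innermost level (PM here) is always printed in full, which is why BBB still repeats in the result above.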

Option 2

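Alternatively, add a dummy column and call set_index on three columns. PM is then no longer the innermost index level, so its repeated values are hidden as well:

# Constant column that serves only as the innermost index level.
df['dummy'] = 0
res = df.set_index(['Name', 'PM', 'dummy'])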

Result:

                Score1  Score2
Name PM  dummy                
AAA  BBB 0           8       2
         0           9       5
         0          10       6
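
If the dummy column should not remain in the original frame, one possible variant (a sketch, not part of the approach above) is to build the index from a temporary copy with assign, and to return to the flat layout later with reset_index. The names flat and res are used here only for illustration:

```python
import pandas as pd

df = pd.DataFrame([['AAA', 8, 2, 'BBB'],
                   ['AAA', 9, 5, 'BBB'],
                   ['AAA', 10, 6, 'BBB']],
                  columns=['Name', 'Score1', 'Score2', 'PM'])

# assign returns a new frame, so df itself never gains the dummy column.
res = df.assign(dummy=0).set_index(['Name', 'PM', 'dummy'])
print(res)

# Undo the indexing: bring the levels back as columns, drop the helper,
# and restore the original column order.
flat = res.reset_index().drop(columns='dummy')[list(df.columns)]
print(flat)
```
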
jpp answered Oct 20 '22 12:10