
Pandas - transpose one column

I'm having difficulty using transpose with pandas.

I have the following df:

date         name    quantity
1/1/2018     A       5
1/1/2018     B       6
1/1/2018     C       7
1/2/2018     A       9
1/2/2018     B       8
1/2/2018     C       6

I eventually want to compute pairwise correlations between the names, based on their quantities on each date. To that end, I'm first trying to reshape this df into the following output:

 date       A    B    C
 1/1/2018   5    6    7
 1/2/2018   9    8    6

The transpose is difficult for me since I can get duplicate column headers, but I also don't want to lose any data by dropping them first. I have a feeling the answer may lie in a pandas utility that I don't really use, and that I may be tunneling on transpose...

JesusMonroe asked Sep 27 '18 17:09



2 Answers

Since you aren't performing an aggregation, pd.DataFrame.pivot should be preferred to groupby / pivot_table:

res = df.pivot(index='date', columns='name', values='quantity')

print(res)

name      A  B  C
date             
1/1/2018  5  6  7
1/2/2018  9  8  6

If you wish, you can use reset_index to elevate date to a column.
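For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's df by hand (the construction of `df` and the variable names `flat` and `corr` are my own assumptions, not part of the original answer), pivots it, elevates date with reset_index, and then computes the pairwise correlation the asker was ultimately after:

```python
import pandas as pd

# Sample data copied from the question (built by hand here as an assumption).
df = pd.DataFrame({
    "date": ["1/1/2018"] * 3 + ["1/2/2018"] * 3,
    "name": ["A", "B", "C"] * 2,
    "quantity": [5, 6, 7, 9, 8, 6],
})

# Reshape long -> wide: one row per date, one column per name.
res = df.pivot(index="date", columns="name", values="quantity")

# Optional: turn the 'date' index back into an ordinary column.
flat = res.reset_index()
flat.columns.name = None  # drop the leftover 'name' label on the column axis

# Pairwise correlation between the name columns (the asker's stated end goal).
corr = res.corr()

print(flat)
print(corr)
```

Note that pivot raises a ValueError if the same (date, name) pair appears more than once; in that case pivot_table with an explicit aggfunc would probably be the way to go, since you then have to decide how duplicates are combined.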

jpp answered Oct 02 '22 22:10


By no means is my proposed solution better than jpp's. I just happened to run into the same problem and solved it differently.

df.set_index(['date', 'name']).unstack()

The result looks a little messier too, since the columns become a two-level MultiIndex with quantity as the top level, but it worked in my case.
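In case it helps, here is a rough sketch of the set_index / unstack approach on the question's data. The hand-built `df` and the names `stacked` and `wide` are assumptions for illustration. Selecting the quantity column before unstacking is one way to avoid the extra column level:

```python
import pandas as pd

# Sample data copied from the question (built by hand here as an assumption).
df = pd.DataFrame({
    "date": ["1/1/2018"] * 3 + ["1/2/2018"] * 3,
    "name": ["A", "B", "C"] * 2,
    "quantity": [5, 6, 7, 9, 8, 6],
})

# Unstacking the whole frame keeps 'quantity' as an outer column level,
# so the columns are a two-level MultiIndex: (quantity, A), (quantity, B), ...
stacked = df.set_index(["date", "name"]).unstack()

# Picking the 'quantity' Series first gives plain A / B / C columns instead.
wide = df.set_index(["date", "name"])["quantity"].unstack()

print(stacked)
print(wide)
```

Either form can be used for the correlation step afterwards; `stacked["quantity"]` selects the same single-level table as `wide`.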

Bowen Liu answered Oct 02 '22 23:10