Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas (key, value) data, move duplicate instance of key's value into a new column

Tags:

python

pandas

I have a dataframe with columns 'NAME' and 'VALUE':

NAME       VALUE
apple        6
banana       3
cherry       7
apple        2
banana       5
cherry       3

I would like to move the duplicate instances' VALUE to a new column as such:

NAME       VALUE VALUE2
apple        6     2
banana       3     5
cherry       7     3

I have done this in Python previously by iterating through CSV rows and maintaining a unique list of NAME's read so far, checking for the existence of NAME in the list...if it is, then it's now the 2nd time I've seen it so I add its VALUE to the new column VALUE2. By the way, the rows are not assumed to be in any particular order.

Is there a quick way of doing this in Pandas?

like image 419
Robert Avatar asked Dec 24 '22 06:12

Robert


2 Answers

We'll need two set_index calls plus an unstack:

df = df.set_index('NAME').set_index(
    df.groupby('NAME').cumcount(), append=True
).unstack()   

df.columns = df.columns.map('{0[0]}{0[1]}'.format)

df
        VALUE0  VALUE1
NAME                  
apple        6       2
banana       3       5
cherry       7       3
like image 85
cs95 Avatar answered Dec 28 '22 23:12

cs95


By using pivot

df.assign(key=df.groupby('NAME').cumcount()+1).pivot('NAME','key','VALUE').add_prefix('VALUE_')
Out[397]: 
key     VALUE_1  VALUE_2
NAME                    
apple         6        2
banana        3        5
cherry        7        3
like image 35
BENY Avatar answered Dec 28 '22 23:12

BENY