How to create a new column for each unique component in a given column of a dataframe in Pandas?

I am relatively new to Pandas, so my apologies if the question is not framed properly. I have the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8)})



     A      B         C
0  foo    one  0.469112
1  bar    one -0.282863
2  foo    two -1.509059
3  bar  three -1.135632
4  foo    two  1.212112
5  bar    two -0.173215
6  foo    one  0.119209
7  foo  three -1.044236

What I want to achieve is the following:

   foo_B      foo_C  bar_B      bar_C
0    one   0.469112      -          -
1      -          -    one  -0.282863
2    two  -1.509059      -          -
3      -          -  three  -1.135632
4    two   1.212112      -          -
5      -          -    two  -0.173215
6    one   0.119209      -          -
7  three  -1.044236      -          -

I don't know exactly which pandas function to use to obtain such a result. Kindly help.

asked Apr 15 '20 21:04

mubas007

1 Answer

You can do it by calling set_index on column A with append=True, which keeps the original index, and then unstack, which moves the values of A into the columns. Then rename the columns to match your desired output.

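# Move A into the index alongside the original row labels, then pivot it into columns
df_f = df.set_index('A', append=True).unstack()
# Flatten the (column, A value) MultiIndex into names such as 'foo_B'
df_f.columns = [f'{col[1]}_{col[0]}' for col in df_f.columns]
print(df_f)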
   bar_B  foo_B     bar_C     foo_C
0    NaN    one       NaN -0.230467
1    one    NaN  0.230529       NaN
2    NaN    two       NaN  1.633847
3  three    NaN -0.307068       NaN
4    NaN    two       NaN  0.130438
5    two    NaN  0.459630       NaN
6    NaN    one       NaN -0.791269
7    NaN  three       NaN  0.016670
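
The output above has the columns in the order unstack produces (bar_B, foo_B, bar_C, foo_C) and NaN for the missing cells, whereas the question asked for foo_B, foo_C, bar_B, bar_C with '-' as a placeholder. A minimal self-contained sketch of closing that gap follows. The seed, the names wide, order and shown, and the choice to build the display copy through object dtype are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': rng.standard_normal(8)})

# Same technique as the answer: append A to the index, then unstack it.
wide = df.set_index('A', append=True).unstack()

# Flatten the (original column, A value) MultiIndex into 'foo_B'-style names.
wide.columns = [f'{a}_{col}' for col, a in wide.columns]

# Reorder: group by A value in order of first appearance, then by column.
order = [f'{a}_{col}' for a in df['A'].unique() for col in ['B', 'C']]
wide = wide[order]

# Display-only copy with '-' in place of NaN. Casting to object first avoids
# mixing strings into float columns; keep `wide` itself for any numeric work.
shown = wide.astype(object).where(wide.notna(), '-')
print(shown)
```

Keeping NaN in wide and substituting '-' only in a separate copy means the C columns stay numeric, so sums or means on them still work.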

answered Sep 28 '22 00:09

Ben.T


Tags: python, python-3.x, pandas
