Groupby, transpose and append in Pandas?

Tags: python-3.x, pandas, group-by, pandas-groupby

I have a dataframe which looks like this (screenshot: https://i.stack.imgur.com/PuLyM.png):

Each user has 10 records. Now, I want to create a dataframe which looks like this:

userid  name1  name2  ... name10

which means I need to transpose each user's 10 records of the column name into a single row and append that row to a new dataframe.

So, how do I do it? Is there any way I can do it in Pandas?


asked Jul 14 '16 08:07

Dawny33

2 Answers

Group by userid, then call reset_index(drop=True) within each group so that the rows are numbered 0, 1, 2, ... consistently across groups. Then unstack to turn those positions into columns.

df.groupby('userid')['name'].apply(lambda s: s.reset_index(drop=True)).unstack()

Demonstration

import pandas as pd

df = pd.DataFrame([
    [123, 'abc'],
    [123, 'abc'],
    [456, 'def'],
    [123, 'abc'],
    [123, 'abc'],
    [456, 'def'],
    [456, 'def'],
    [456, 'def'],
], columns=['userid', 'name'])

df.sort_values('userid').groupby('userid')['name'].apply(lambda s: s.reset_index(drop=True)).unstack()


If you don't want userid as the index, add reset_index() to the end.

df.sort_values('userid').groupby('userid')['name'].apply(lambda s: s.reset_index(drop=True)).unstack().reset_index()
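The question asks for columns named name1, name2, and so on, whereas unstack labels them 0, 1, 2. One possible way to get those labels is sketched below. It swaps in a different technique from the one in this answer: groupby(...).cumcount() to number each user's rows, followed by pivot. The sample data and the helper names pos and wide are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: three records per user, rows interleaved.
df = pd.DataFrame({
    "userid": [123, 123, 456, 123, 456, 456],
    "name": ["a", "b", "x", "c", "y", "z"],
})

# Number each row within its user, starting at 1, in order of appearance.
pos = df.groupby("userid").cumcount() + 1

# Pivot on the integer position first, so the columns sort numerically
# (string labels would sort "name10" before "name2").
wide = df.assign(pos=pos).pivot(index="userid", columns="pos", values="name")

# Rename the integer positions to name1, name2, ... and restore userid as a column.
wide.columns = [f"name{i}" for i in wide.columns]
wide = wide.reset_index()

print(wide)
```

With the sample data above, wide has one row per userid and the columns userid, name1, name2, name3. If users have different numbers of records, the missing positions come out as NaN.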


answered Sep 19 '22 09:09

piRSquared

You may also be interested in pandas.DataFrame.pivot.

See this example dataframe:

df
   userid name  values
0     123    A       1
1     123    B       2
2     123    C       3
3     456    A       4
4     456    B       5
5     456    C       6

Using df.pivot:

df.pivot(index='userid', columns='name', values='values')
name    A  B  C
userid
123     1  2  3
456     4  5  6
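For anyone who wants to run the pivot example, here is a self-contained sketch that rebuilds the frame shown above. It also illustrates one caveat worth knowing: pivot expects each (userid, name) pair to appear only once, and raises ValueError otherwise. The names wide, dup, and raised are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from this answer.
df = pd.DataFrame({
    "userid": [123, 123, 123, 456, 456, 456],
    "name": ["A", "B", "C", "A", "B", "C"],
    "values": [1, 2, 3, 4, 5, 6],
})

# One row per userid, one column per distinct name.
wide = df.pivot(index="userid", columns="name", values="values")
print(wide)

# Append a copy of the first row so that the pair (123, "A") occurs twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
try:
    dup.pivot(index="userid", columns="name", values="values")
    raised = False
except ValueError:
    # pivot cannot reshape when an index/column pair is duplicated.
    raised = True
print("duplicate pair rejected:", raised)
```

This matters for the original question, where the same name may repeat for a user. In that case the rows need a unique per-user position before pivoting.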

answered Sep 18 '22 09:09

Jan_ewazz
