Moving every other row to a new column and grouping in pandas (Python)

Tags:

python

pandas

I have an example data set that is much smaller than my actual data set. The real data is a text file, and I want to read it in as a pandas DataFrame and reshape it:

import pandas as pd
d = {
     'one': ['title1', 'R2G', 'title2', 'K5G', 'title2','R14G', 'title2','R2T','title3', 'K10C', 'title4', 'W7C', 'title4', 'R2G', 'title5', 'K8C']
    }
df = pd.DataFrame(d)

Example dataset looks like this:

df

Out[20]:  
      one
0   title1
1      R2G
2   title2
3      K5G
4   title2
5     R14G
6   title2
7      R2T
8   title3
9     K10C
10  title4
11     W7C
12  title4
13     R2G
14  title5
15     K8C

I added a second column called 'value':

df.insert(1,'value','')
df

Out[22]: 
      one      value
0   title1
1      R2G
2   title2
3      K5G
4   title2
5     R14G
6   title2
7      R2T
8   title3
9     K10C
10  title4
11     W7C
12  title4
13     R2G
14  title5
15     K8C

I want to first move every other row over to the 'value' column:

      one    value
0   title1    R2G          
1   title2    K5G  
2   title2    R14G 
3   title2    R2T    
4   title3    K10C          
5   title4    W7C            
6   title4    R2G           
7   title5    K8C

I then want to group by the title name, since there might be more than one value for the same title:

     one     value
0   title1    R2G          
1   title2    K5G, R14G, R2T   
2   title3    K10C          
3   title4    W7C, R2G
4   title5    K8C

How can this be achieved?


asked Mar 23 '16 14:03

Jessica

2 Answers

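Construct a new df by slicing the column using iloc and a step arg: the even positions hold the titles and the odd positions hold the values. Taking .values drops the original index so the two slices pair up by position: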

In [185]:
new_df = pd.DataFrame({'one':df['one'].iloc[::2].values, 'value':df['one'].iloc[1::2].values})
new_df

Out[185]:
      one value
0  title1   R2G
1  title2   K5G
2  title2  R14G
3  title2   R2T
4  title3  K10C
5  title4   W7C
6  title4   R2G
7  title5   K8C
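To see why the .values call matters, here is a minimal sketch on a four-row sample (the variable names misaligned and paired are only for illustration, and to_numpy() is used as the current equivalent of .values). Without it, the two slices keep their original index labels and pandas aligns them by label instead of by position:

```python
import pandas as pd

s = pd.Series(['title1', 'R2G', 'title2', 'K5G'], name='one')

# The slices keep their original labels (0, 2 and 1, 3), so pandas aligns
# on the index and fills the positions that do not match with NaN.
misaligned = pd.DataFrame({'one': s.iloc[::2], 'value': s.iloc[1::2]})

# Converting to plain arrays discards the labels and pairs rows by position.
paired = pd.DataFrame({'one': s.iloc[::2].to_numpy(),
                       'value': s.iloc[1::2].to_numpy()})

print(misaligned.shape)  # four rows, half of each column is NaN
print(paired.shape)      # two rows, one per title/value pair
```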

You can then groupby on 'one' and apply ','.join to the 'value' column to concatenate each group's values (no lambda is needed, since ','.join is already a callable):

In [188]:
new_df.groupby('one')['value'].apply(','.join).reset_index()

Out[188]:
      one         value
0  title1           R2G
1  title2  K5G,R14G,R2T
2  title3          K10C
3  title4       W7C,R2G
4  title5           K8C
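The two steps can also be wrapped in one helper. This is only a sketch: the function name pair_and_join and the even-length check are assumptions added for illustration, not part of the answer above. It passes sort=False so titles stay in order of first appearance, which for this data matches the sorted order anyway.

```python
import pandas as pd

def pair_and_join(values, sep=','):
    """Pair alternating title/value entries, then join the values per title."""
    values = list(values)
    # An odd number of rows means some title has no matching value.
    if len(values) % 2:
        raise ValueError('expected an even number of rows (title, value pairs)')
    paired = pd.DataFrame({'one': values[::2], 'value': values[1::2]})
    # sort=False keeps the titles in the order they first appear.
    return paired.groupby('one', sort=False)['value'].agg(sep.join).reset_index()

rows = ['title1', 'R2G', 'title2', 'K5G', 'title2', 'R14G', 'title2', 'R2T',
        'title3', 'K10C', 'title4', 'W7C', 'title4', 'R2G', 'title5', 'K8C']
result = pair_and_join(rows)
print(result)
```

Passing sep=', ' instead would give the comma-plus-space form shown in the question.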


answered Sep 20 '22 19:09

EdChum

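Alternatively, because the rows strictly alternate between title and value, you can reshape the single column into two columns and then aggregate each group by joining its values into a string.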

import pandas as pd
d = {
     'one': ['title1', 'R2G', 'title2', 'K5G', 'title2','R14G', 'title2','R2T','title3', 'K10C', 'title4', 'W7C', 'title4', 'R2G', 'title5', 'K8C']
    }
df = pd.DataFrame(d)
# because you have a simple alternating pattern, you can just reshape
# the single column into rows of (title, value) pairs
df = pd.DataFrame(df.values.reshape(-1, 2), columns=['one', 'value'])
# group by 'one' and aggregate by joining the values into a string
df = df.groupby('one')['value'].apply(', '.join).reset_index()
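If real Python lists per title are more useful than a joined string, the same reshape can feed agg(list) instead. This is a hedged variation on the answer above, shown on a shortened sample; the names wide and as_lists are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'one': ['title1', 'R2G', 'title2', 'K5G', 'title2', 'R14G']})

# Reshape the single alternating column into (title, value) rows.
wide = pd.DataFrame(df.to_numpy().reshape(-1, 2), columns=['one', 'value'])

# Collect each title's values into a list rather than joining them.
as_lists = wide.groupby('one')['value'].agg(list).reset_index()
print(as_lists)
```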

answered Sep 21 '22 19:09

hilberts_drinking_problem
