Python Pandas: merge same-named columns in a DataFrame

Tags:

python

pandas

So I have a few CSV files I'm trying to work with, but some of them have multiple columns with the same name.

For example I could have a csv like this:

ID   Name   a    a    a     b    b
1    test1  1    NaN  NaN   "a"  NaN
2    test2  NaN  2    NaN   "a"  NaN
3    test3  2    3    NaN   NaN  "b"
4    test4  NaN  NaN  4     NaN  "b"

Loading it into pandas gives me this:

ID   Name   a    a.1  a.2   b    b.1
1    test1  1    NaN  NaN   "a"  NaN
2    test2  NaN  2    NaN   "a"  NaN
3    test3  2    3    NaN   NaN  "b"
4    test4  NaN  NaN  4     NaN  "b"
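For anyone who wants to reproduce the renaming, here is a minimal sketch. The comma-separated `csv_text` below is a hypothetical stand-in for the real file, which is not shown in the question; the point is only that `read_csv` appends `.1`, `.2`, ... to repeated header names.

```python
import io

import pandas as pd

# Hypothetical CSV text mirroring the example above (comma-separated for simplicity).
csv_text = """ID,Name,a,a,a,b,b
1,test1,1,,,a,
2,test2,,2,,a,
3,test3,2,3,,,b
4,test4,,,4,,b
"""

df = pd.read_csv(io.StringIO(csv_text))

# Repeated header names come back with numeric suffixes.
print(list(df.columns))
```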

What I would like to do is merge those same-named columns into one column, joining the values with ";" when a row has more than one. My ideal output would be this:

ID   Name   a      b  
1    test1  "1"    "a"   
2    test2  "2"    "a"
3    test3  "2;3"  "b"
4    test4  "4"    "b"

Is this possible?

436

asked Jun 24 '14 15:06

Wizuriel

2 Answers

You could use groupby on axis=1 and experiment with something like:

>>> def sjoin(x): return ';'.join(x[x.notnull()].astype(str))
>>> df.groupby(level=0, axis=1).apply(lambda x: x.apply(sjoin, axis=1))
  ID   Name        a  b
0  1  test1      1.0  a
1  2  test2      2.0  a
2  3  test3  2.0;3.0  b
3  4  test4      4.0  b

where instead of using .astype(str), you could use whatever formatting operator you wanted.
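Two caveats apply to the snippet above. Grouping by `level=0` only merges columns whose labels are identical, so it will not catch the `a.1`, `a.2` names that `read_csv` produces, and as far as I know `groupby(..., axis=1)` is deprecated in recent pandas releases. The following is a sketch of the same idea that avoids both issues. It assumes every trailing `.<digits>` suffix came from `read_csv`'s renaming (a column legitimately named `x.1` would be merged too), and the sample data plus the helper names `fmt`, `sjoin`, `base_names` and `merged` are illustrative, not part of the original answer.

```python
import io
import re

import pandas as pd

# Hypothetical sample data shaped like the question's CSV.
csv_text = """ID,Name,a,a,a,b,b
1,test1,1,,,a,
2,test2,,2,,a,
3,test3,2,3,,,b
4,test4,,,4,,b
"""
df = pd.read_csv(io.StringIO(csv_text))


def fmt(value):
    # Render whole floats such as 2.0 as "2"; everything else via str().
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


def sjoin(row):
    # Join the non-null values of one row with ";".
    return ';'.join(fmt(v) for v in row.dropna())


# Strip a trailing ".<digits>" suffix to recover each column's base name.
base_names = [re.sub(r'\.\d+$', '', col) for col in df.columns]

merged = pd.DataFrame(index=df.index)
for base in dict.fromkeys(base_names):  # unique base names, original order
    cols = [c for c, b in zip(df.columns, base_names) if b == base]
    if len(cols) == 1:
        merged[base] = df[cols[0]]  # nothing to merge, keep the column as-is
    else:
        merged[base] = df[cols].apply(sjoin, axis=1)

print(merged)
```

Swapping out `fmt` is the place to change how individual values are rendered.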

153

answered Nov 14 '22 21:11

DSM

It is probably not a good idea to have duplicate column names, but this will work:

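df2 = df[['ID', 'Name']].copy()  # copy so the assignments below do not trigger SettingWithCopyWarning
# the 'a' columns are float dtype: format each value as an integer and wrap the result in quotes
df2['a'] = '"' + df.T[df.columns.values == 'a'].apply(lambda x: ';'.join(["%i" % item for item in x[x.notnull()]])) + '"'
# the 'b' columns are object dtype (strings): join them as-is
df2['b'] = df.T[df.columns.values == 'b'].apply(lambda x: ';'.join([item for item in x[x.notnull()]]))
print(df2)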
   ID   Name      a    b
0   1  test1    "1"  "a"
1   2  test2    "2"  "a"
2   3  test3  "2;3"  "b"
3   4  test4    "4"  "b"

[4 rows x 4 columns]
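The same masking idea can be wrapped in a small helper so each merged column does not need its own one-liner. This is only a sketch: `merge_duplicates` is a hypothetical name, and the frame is built by hand with genuinely duplicated labels, which is the situation this answer assumes.

```python
import numpy as np
import pandas as pd

# Hand-built frame with truly duplicated column labels (illustrative data).
df = pd.DataFrame(
    [[1, 'test1', 1.0, np.nan, np.nan, 'a', np.nan],
     [2, 'test2', np.nan, 2.0, np.nan, 'a', np.nan],
     [3, 'test3', 2.0, 3.0, np.nan, np.nan, 'b'],
     [4, 'test4', np.nan, np.nan, 4.0, np.nan, 'b']],
    columns=['ID', 'Name', 'a', 'a', 'a', 'b', 'b'],
)


def merge_duplicates(frame, name, fmt=str):
    # The boolean mask selects every column carrying exactly this label.
    block = frame.loc[:, frame.columns == name]
    # Join each row's non-null values with ";", formatting them with fmt.
    return block.apply(lambda row: ';'.join(fmt(v) for v in row.dropna()), axis=1)


df2 = df[['ID', 'Name']].copy()
df2['a'] = merge_duplicates(df, 'a', fmt=lambda v: '%i' % v)  # floats shown as ints
df2['b'] = merge_duplicates(df, 'b')
print(df2)
```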

answered Nov 14 '22 23:11

CT Zhu
