So I have a few CSV files I'm trying to work with, but some of them have multiple columns with the same name.
For example I could have a csv like this:
ID Name a a a b b
1 test1 1 NaN NaN "a" NaN
2 test2 NaN 2 NaN "a" NaN
3 test3 2 3 NaN NaN "b"
4 test4 NaN NaN 4 NaN "b"
Loading it into pandas gives me this:
ID Name a a.1 a.2 b b.1
1 test1 1 NaN NaN "a" NaN
2 test2 NaN 2 NaN "a" NaN
3 test3 2 3 NaN NaN "b"
4 test4 NaN NaN 4 NaN "b"
What I would like to do is merge the columns that share a name into one column (keeping the values separate when a row has more than one). My ideal output would be this:
ID Name a b
1 test1 "1" "a"
2 test2 "2" "a"
3 test3 "2;3" "b"
4 test4 "4" "b"
Is this possible?
You could use groupby on axis=1, and experiment with something like
>>> def sjoin(x): return ';'.join(x[x.notnull()].astype(str))
>>> df.groupby(level=0, axis=1).apply(lambda x: x.apply(sjoin, axis=1))
ID Name a b
0 1 test1 1.0 a
1 2 test2 2.0 a
2 3 test3 2.0;3.0 b
3 4 test4 4.0 b
where instead of using .astype(str), you could use whatever formatting operator you wanted.
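To illustrate the formatting remark, here is a minimal sketch of the same idea with a custom formatter that prints whole-number floats as "2" instead of "2.0". It groups on the transposed frame rather than passing axis=1, since recent pandas releases deprecate the axis argument of groupby. The helper fmt and the hand-built df are assumptions made for the example, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hand-built frame with genuinely duplicated column labels (assumed data,
# mirroring the table in the question).
df = pd.DataFrame(
    [
        [1, "test1", 1, np.nan, np.nan, "a", np.nan],
        [2, "test2", np.nan, 2, np.nan, "a", np.nan],
        [3, "test3", 2, 3, np.nan, np.nan, "b"],
        [4, "test4", np.nan, np.nan, 4, np.nan, "b"],
    ],
    columns=["ID", "Name", "a", "a", "a", "b", "b"],
)


def fmt(value):
    # Hypothetical formatter: whole-number floats become "2", not "2.0".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


def sjoin(values):
    # Join the non-null values of one group with ";".
    return ";".join(fmt(v) for v in values if pd.notna(v))


# Transpose so the duplicated column labels become index labels, group the
# rows that share a label, join them per original row, then transpose back.
merged = df.T.groupby(level=0, sort=False).agg(sjoin).T
print(merged)
```

Note that every column comes back as a string here, including ID, because each cell passes through sjoin.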
Having duplicated column names is probably not a good idea, but this will work:
In [72]:
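df2 = df[['ID', 'Name']].copy()  # copy so the assignments below do not trigger SettingWithCopyWarning
# the 'a' columns are float dtype: format as integers, join with ';', wrap in quotes
df2['a'] = '"' + df.T[df.columns.values == 'a'].apply(lambda x: ';'.join(["%i" % item for item in x[x.notnull()]])) + '"'
# the 'b' columns are object dtype (strings): join as-is
df2['b'] = df.T[df.columns.values == 'b'].apply(lambda x: ';'.join([item for item in x[x.notnull()]]))
print(df2)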
ID Name a b
0 1 test1 "1" "a"
1 2 test2 "2" "a"
2 3 test3 "2;3" "b"
3 4 test4 "4" "b"
[4 rows x 4 columns]
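The snippet above assumes the frame still carries the duplicated labels. As the question shows, read_csv renames them to a.1, a.2 and so on, so a mask such as df.columns.values=='a' would match only the first column. A rough sketch that works directly on the renamed columns could look like this. The names CSV_TEXT, base_name and merge_duplicate_columns are made up for the example, and it assumes no real column name legitimately ends in ".<digits>".

```python
import io
import re

import pandas as pd

# Assumed comma-separated version of the file from the question.
CSV_TEXT = """ID,Name,a,a,a,b,b
1,test1,1,,,a,
2,test2,,2,,a,
3,test3,2,3,,,b
4,test4,,,4,,b
"""


def base_name(col):
    # Strip the ".1", ".2", ... suffix pandas appends to repeated headers.
    return re.sub(r"\.\d+$", "", col)


def sjoin(values):
    # Join the non-null values of one row with ";".
    return ";".join(str(v) for v in values if pd.notna(v))


def merge_duplicate_columns(df):
    # Collect the renamed columns under their shared base name,
    # preserving the order in which each base name first appears.
    groups = {}
    for col in df.columns:
        groups.setdefault(base_name(col), []).append(col)

    out = pd.DataFrame(index=df.index)
    for name, cols in groups.items():
        if len(cols) == 1:
            out[name] = df[cols[0]]  # unique column: copy through unchanged
        else:
            out[name] = df[cols].apply(sjoin, axis=1)
    return out


# dtype=str keeps "1" as text, so no "1.0" shows up in the joined values.
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
result = merge_duplicate_columns(df)
print(result)
```

Reading everything as text sidesteps the float formatting issue, at the cost of ID also being a string column; convert it back afterwards if you need numbers.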