Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to merge two rows in a dataframe pandas

I have a dataframe with two rows and I'd like to merge the two rows to one row. The df Looks as follows:

              PC           Rating CY   Rating PY    HT
0             DE101           NaN            AA     GV
0             DE101           AA+           NaN     GV

I have tried to create two seperate dataframes and Combine them with df.merge(df2) without success. The result should be the following

              PC           Rating CY   Rating PY    HT
0             DE101           AA+            AA     GV

Any ideas? Thanks in advance Could df.update be a possible solution?

EDIT:

df.head(1).combine_first(df.tail(1))

This works for the example above. However, for columns containing numerical values, this approach doesn't yield the desired output, e.g. for

              PC           Rating CY   Rating PY    HT    MV1   MV2
0             DE101           NaN            AA     GV    0     20 
0             DE101           AA+           NaN     GV    10    0

The output should be:

              PC           Rating CY   Rating PY    HT   MV1    MV2
0             DE101           AA+            AA     GV   10     20

The formula above doesn't sum up the values in the last two columns, but takes the values in the first row of the dataframe.

              PC           Rating CY   Rating PY    HT   MV1    MV2
0             DE101           AA+            AA     GV   0     20

How could this problem be fixed?

like image 914
Carmen Avatar asked Jan 17 '17 09:01

Carmen


2 Answers

You could use max with transpose like

In [2103]: df.max().to_frame().T
Out[2103]:
      PC Rating CY Rating PY  HT MV1 MV2
0  DE101       AA+        AA  GV  10  20
like image 54
Zero Avatar answered Sep 24 '22 05:09

Zero


You can make use of DF.combine_first() method after separating the DF into 2 parts where the null values in the first half would be replaced with the finite values in the other half while keeping it's other finite values untouched:

df.head(1).combine_first(df.tail(1))
# Practically this is same as → df.head(1).fillna(df.tail(1))

enter image description here


Incase there are columns of mixed datatype, partitioning them into it's constituent dtype columns and then performing various operations on it would be feasible by chaining them across.

obj_df = df.select_dtypes(include=[np.object])
num_df = df.select_dtypes(exclude=[np.object])

obj_df.head(1).combine_first(obj_df.tail(1)).join(num_df.head(1).add(num_df.tail(1)))

enter image description here

like image 40
Nickil Maveli Avatar answered Sep 23 '22 05:09

Nickil Maveli