I have two different DataFrames that I want to merge on the date and hours columns. I looked at some existing threads, but I could not find a solution for my issue. I also read this document and tried different combinations, but none of them worked well.
An example of my two DataFrames:
DF1
date hours var1 var2
0 2013-07-10 00:00:00 150.322617 52.225920
1 2013-07-10 01:00:00 155.250917 53.365296
2 2013-07-10 02:00:00 124.918667 51.158249
3 2013-07-10 03:00:00 143.839217 53.138251
.....
9 2013-09-10 09:00:00 148.135818 86.676341
10 2013-09-10 10:00:00 147.833517 53.658016
11 2013-09-10 12:00:00 149.580233 69.745368
12 2013-09-10 13:00:00 163.715317 14.524894
13 2013-09-10 14:00:00 168.856650 10.762779
DF2
date hours myvar1 myvar2
0 2013-07-10 09:00:00 1.617 98.56
1 2013-07-10 10:00:00 2.917 23.60
2 2013-07-10 12:00:00 19.667 36.15
3 2013-07-10 13:00:00 14.217 45.16
.....
20 2013-09-10 20:00:00 1.517 53.56
21 2013-09-10 21:00:00 5.233 69.47
22 2013-09-10 22:00:00 13.717 14.25
23 2013-09-10 23:00:00 18.850 10.69
As you can see, DF2 starts at 09:00:00, and I want to join it with DF1 at 09:00:00, i.e. on the matching dates and times. So far, I have tried many different combinations based on previous threads and the documentation mentioned above. An example:
merged_df = DF2.merge(DF1, how = 'left', on = ['date', 'hours'])
This introduces NaN values for the right DataFrame. I know I do not have to use both the date and hours columns; however, I still get the same result. I quickly tried the following in R, which works perfectly fine:
merged_df <- left_join(DF1, DF2, by = 'date')
Is there any way in pandas to merge DataFrames on just the matching values, without getting NaN values?
It is also possible to join columns using the pd.concat() method. Its main arguments are the DataFrames to combine; axis, where 0 refers to the row axis and 1 refers to the column axis; and join, the type of join ('inner' or 'outer').
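As a minimal sketch of how axis and join interact in pd.concat, the frames below are hypothetical (the names left and right and their values are illustrative, not taken from the question). Note that pd.concat aligns on the index, not on column values such as date and hours.

```python
import pandas as pd

# Hypothetical frames; names and values are illustrative only.
left = pd.DataFrame({"var1": [1.0, 2.0, 3.0]}, index=[0, 1, 2])
right = pd.DataFrame({"myvar1": [10.0, 20.0]}, index=[1, 2])

# axis=1 places the frames side by side, aligned on the index.
# join="outer" (the default) keeps every index label and fills gaps with NaN.
outer = pd.concat([left, right], axis=1, join="outer")

# join="inner" keeps only the index labels present in both frames.
inner = pd.concat([left, right], axis=1, join="inner")

print(outer)
print(inner)
```

Because the alignment is index-based, this only helps with the question's problem if the key columns are first moved into the index, for example with set_index.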
To prevent duplicated columns when joining two DataFrames, use the pd.merge() function with an inner join and pass the names of the columns to join on from the left and right DataFrames.
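A small sketch of the point about duplicated columns, using hypothetical frames a and b with illustrative values: when the shared key names are passed via on, each key column appears once in the result.

```python
import pandas as pd

# Hypothetical frames; values are illustrative only.
a = pd.DataFrame({
    "date": ["2013-07-10", "2013-07-10"],
    "hours": ["09:00:00", "10:00:00"],
    "var1": [1.0, 2.0],
})
b = pd.DataFrame({
    "date": ["2013-07-10", "2013-07-10"],
    "hours": ["10:00:00", "12:00:00"],
    "myvar1": [3.0, 4.0],
})

# Passing the shared key names via `on` yields a single copy of each key column.
merged = pd.merge(a, b, how="inner", on=["date", "hours"])

print(merged.columns.tolist())
```

If the two frames also shared a non-key column name, pandas would keep both copies and distinguish them with suffixes (by default _x and _y).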
Using the + operator, you can combine two or more text/string columns in a pandas DataFrame. Note that when you apply the + operator to numeric columns, it performs addition instead of concatenation.
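To illustrate the difference, here is a short sketch on a hypothetical frame (column names follow the question, the values and the new timestamp and total columns are made up for illustration):

```python
import pandas as pd

# Hypothetical frame; values are illustrative only.
df = pd.DataFrame({
    "date": ["2013-07-10", "2013-07-10"],
    "hours": ["09:00:00", "10:00:00"],
    "var1": [1.5, 2.5],
    "var2": [10.0, 20.0],
})

# String columns: + concatenates element-wise.
df["timestamp"] = df["date"] + " " + df["hours"]

# Numeric columns: + adds element-wise.
df["total"] = df["var1"] + df["var2"]

print(df[["timestamp", "total"]])
```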
Use how='inner' in pd.merge:
merged_df = DF2.merge(DF1, how = 'inner', on = ['date', 'hours'])
This will perform an "inner join", thereby omitting rows in each DataFrame that do not match. Hence, there will be no NaN values in either the right or the left part of the merged DataFrame.
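The difference between the two join types can be sketched on a cut-down, hypothetical version of the question's data (df1 and df2 below are small stand-ins with rounded, illustrative values, assuming the key columns are stored as strings in both frames):

```python
import pandas as pd

# Small stand-ins for DF1 and DF2; values are illustrative only.
df1 = pd.DataFrame({
    "date": ["2013-07-10"] * 3,
    "hours": ["00:00:00", "09:00:00", "10:00:00"],
    "var1": [150.3, 148.1, 147.8],
})
df2 = pd.DataFrame({
    "date": ["2013-07-10"] * 3,
    "hours": ["09:00:00", "10:00:00", "12:00:00"],
    "myvar1": [1.617, 2.917, 19.667],
})

# Left join: every row of df2 is kept; rows with no match in df1 get NaN.
# indicator=True adds a _merge column showing where each row came from.
left = df2.merge(df1, how="left", on=["date", "hours"], indicator=True)

# Inner join: only rows whose (date, hours) pair exists in both frames.
inner = df2.merge(df1, how="inner", on=["date", "hours"])

print(left)
print(inner)
```

In the left join, the 12:00:00 row has no partner in df1, so its var1 is NaN and its _merge value is left_only. The inner join drops that row entirely.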