
How to join two dataframes for which column time values are within a certain range and are not datetime or timestamp objects?

I have two dataframes as shown below:

     time browncarbon blackcarbon
 181.7335    0.105270         NaN
 181.3809    0.166545    0.001217
 181.6197    0.071581         NaN

 422 rows x 3 columns

   start       end    toc 
179.9989  180.0002  155.0
180.0002  180.0016  152.0
180.0016  180.0030  151.0

1364 rows x 3 columns

The first dataframe has a time column with instants every four minutes. The second dataframe has two time columns, start and end, spaced every two minutes. The two dataframes do not start and end at the same time, but they contain data collected over the same day. How could I make another dataframe containing:

time browncarbon blackcarbon toc

422 rows x 4 columns

There is a related answer on Stack Overflow, but it applies only when the time columns are datetime or timestamp objects. The question is: "How to join two dataframes for which column values are within a certain range?"

Addendum 1: Several start/end rows can fall within a single time row. Such a time row should still get exactly one toc value, as it does right now, but that value should be the average of the matching toc rows, which is not the case at present.

Addendum 2: Merging two pandas dataframes with complex conditions

Sujai Banerji asked Jan 27 '26 22:01


1 Answer

We create an artificial key column and do an outer merge on it to get the Cartesian product (every row of df1 paired with every row of df2). Then we use .query to keep only the rows where time falls within the range from start to end, both bounds included.

Note: I edited the value of one row so that we get a match (see row 0 of df1 in the example dataframes at the bottom).

# Cross join on a constant key, keep rows with start <= time <= end, drop helper columns
(df1.assign(key=1)
    .merge(df2.assign(key=1), on='key', how='outer')
    .query('(time >= start) & (time <= end)')
    .drop(['key', 'start', 'end'], axis=1))

output

       time  browncarbon  blackcarbon    toc
1  180.0008      0.10527          NaN  152.0

Example dataframes used:

df1:

       time  browncarbon  blackcarbon
0  180.0008     0.105270          NaN
1  181.3809     0.166545     0.001217
2  181.6197     0.071581          NaN

df2:

      start       end    toc
0  179.9989  180.0002  155.0
1  180.0002  180.0016  152.0
2  180.0016  180.0030  151.0
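
For anyone who wants to reproduce this, here is a self-contained sketch that builds the two example dataframes and applies the same cross join and filter. The function name range_join is just a label I picked, not a pandas API.

```python
import pandas as pd

# Example dataframes from this answer (row 0 of df1 was edited to produce a match).
df1 = pd.DataFrame({
    "time": [180.0008, 181.3809, 181.6197],
    "browncarbon": [0.105270, 0.166545, 0.071581],
    "blackcarbon": [float("nan"), 0.001217, float("nan")],
})
df2 = pd.DataFrame({
    "start": [179.9989, 180.0002, 180.0016],
    "end": [180.0002, 180.0016, 180.0030],
    "toc": [155.0, 152.0, 151.0],
})


def range_join(left, right):
    """Pair every row of left with every row of right, then keep the
    pairs where left.time lies within [right.start, right.end]."""
    pairs = left.assign(key=1).merge(right.assign(key=1), on="key", how="outer")
    matched = pairs.query("(time >= start) & (time <= end)")
    return matched.drop(columns=["key", "start", "end"])


result = range_join(df1, df2)
print(result)
```

Keep in mind that the intermediate frame has len(df1) * len(df2) rows, so for the sizes in the question (422 and 1364 rows) it holds 575,608 rows before filtering. That is fine here, but it grows quickly with larger inputs.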
Erfan answered Jan 30 '26 20:01


