Merging two pandas dataframes by interval

Question

I have two pandas dataframes with the following format:

import pandas as pd

df_ts = pd.DataFrame([
        [10, 20, 1,  'id1'],
        [11, 22, 5,  'id1'],
        [20, 54, 5,  'id2'],
        [22, 53, 7,  'id2'],
        [15, 24, 8,  'id1'],
        [16, 25, 10, 'id1']
    ], columns = ['x', 'y', 'ts', 'id'])


df_statechange = pd.DataFrame([
        ['id1', 2, 'ok'],
        ['id2', 4, 'not ok'],
        ['id1', 9, 'not ok']
    ], columns = ['id', 'ts', 'state'])

I am trying to get them into a format such as:

df_out = pd.DataFrame([
        [10, 20, 1,  'id1', None    ],
        [11, 22, 5,  'id1', 'ok'    ],
        [20, 54, 5,  'id2', 'not ok'],
        [22, 53, 7,  'id2', 'not ok'],
        [15, 24, 8,  'id1', 'ok'    ],
        [16, 25, 10, 'id1', 'not ok']
    ], columns = ['x', 'y', 'ts', 'id', 'state'])

I understand how to accomplish this iteratively by grouping by id, then iterating through each row and changing the state whenever a state change appears. Is there a more scalable, built-in pandas way of doing this?

Dimgold · Accepted Answer

Unfortunately, pandas merge supports only equality joins. See the following thread for more details: merge pandas dataframes where one value is between two others. If you want to merge by interval, you'll need to work around this limitation, for example by adding a filter after the merge:

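# a and b are placeholder frames: a has columns id and ts, b has id, ts1 and ts2 (interval bounds)
joined = a.merge(b, on='id')
joined = joined[joined.ts.between(joined.ts1, joined.ts2)]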
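
Applied to the frames from the question, the merge-then-filter idea might look like the sketch below. It is only one possible reading of the answer: the column names ts_start and ts_end are made up for illustration, and it assumes each state stays valid until the next state change for the same id (half-open intervals, so a row whose ts equals a state change takes the new state).

```python
import pandas as pd

df_ts = pd.DataFrame([
    [10, 20, 1,  'id1'],
    [11, 22, 5,  'id1'],
    [20, 54, 5,  'id2'],
    [22, 53, 7,  'id2'],
    [15, 24, 8,  'id1'],
    [16, 25, 10, 'id1'],
], columns=['x', 'y', 'ts', 'id'])

df_statechange = pd.DataFrame([
    ['id1', 2, 'ok'],
    ['id2', 4, 'not ok'],
    ['id1', 9, 'not ok'],
], columns=['id', 'ts', 'state'])

# Turn each state change into an interval [ts_start, ts_end) per id.
# ts_start and ts_end are illustrative names, not from the original post.
intervals = (df_statechange
             .sort_values(['id', 'ts'])
             .rename(columns={'ts': 'ts_start'}))
intervals['ts_end'] = intervals.groupby('id')['ts_start'].shift(-1)
# Assumption: the last state of each id stays valid indefinitely.
intervals['ts_end'] = intervals['ts_end'].fillna(float('inf'))

# Equality join on id (keeping the original row labels in an 'index' column),
# then keep only the pairs where ts falls inside the interval.
joined = df_ts.reset_index().merge(intervals, on='id')
inside = joined[(joined['ts'] >= joined['ts_start']) &
                (joined['ts'] < joined['ts_end'])]

# Map the matched state back onto the original rows; rows that precede
# the first state change of their id have no match and stay missing.
df_out = df_ts.copy()
df_out['state'] = inside.set_index('index')['state'].reindex(df_ts.index)

print(df_out)
```

Note that the intermediate joined frame holds one row per (row, interval) pair within each id, so it can grow large when an id has many state changes. For this particular "most recent earlier row" pattern, pd.merge_asof with its by argument may also be worth a look.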

Tags: python, merge, pandas, time-series

ymoiseev
