I am trying to add a new boolean column (True/False) to a pandas DataFrame that reflects whether each value lies between two data points from another file.
I have two files which give the following info:
File A (x):
   'time'
0       1
1       3
2       5
3       7
4       9
5      11
6      13
File B (y):
   'time_A'  'time_B'
0         1         3
1         5         6
2         8        10
I tried to do it with the .map function; however, it gives a True/False result for every interval in each row instead of a single True/False column.
x['Event'] = x['time'].map((lambda x: x < y['time_A']), (lambda x: x > y['time_B']))
This would be the expected result ->
File A:
'time' 'Event'
0 1 True
1 3 True
2 5 True
3 7 False
4 9 True
5 11 False
6 13 False
However, what I get is something like this ->
File A:
'time'
0 1 "0 True
1 True
2 True"
Name:1, dtype:bool"
2 3 "0 True
1 True
2 True
Name:1, dtype:bool"
This should do it:
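(x.assign(key=1)                       # constant key to build a cross join
  .merge(y.assign(key=1), on='key')    # pair every time with every interval
  .drop(columns='key')
  .assign(Event=lambda v: (v['time_A'] <= v['time']) &
                          (v['time'] <= v['time_B']))
  .groupby('time', as_index=False)['Event']
  .any())                              # True if any interval contains the time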
time Event
0 1 True
1 3 True
2 5 True
3 7 False
4 9 True
5 11 False
6 13 False
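For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same cross-join idea. It rebuilds x and y from the sample data in the question and assumes pandas 1.2 or newer, where merge accepts how='cross', so the helper key column is not needed. The names pairs and result are only illustrative.

```python
import pandas as pd

# Sample data from the question (File A -> x, File B -> y)
x = pd.DataFrame({'time': [1, 3, 5, 7, 9, 11, 13]})
y = pd.DataFrame({'time_A': [1, 5, 8], 'time_B': [3, 6, 10]})

# Pair every time with every interval (assumes pandas >= 1.2 for how='cross')
pairs = x.merge(y, how='cross')

# Row-wise check: time_A <= time <= time_B (between is inclusive by default)
pairs['Event'] = pairs['time'].between(pairs['time_A'], pairs['time_B'])

# A time is an Event if at least one interval contains it
result = pairs.groupby('time', as_index=False)['Event'].any()
print(result)
```

Note that the groupby collapses duplicate time values into one row, and the cross join builds len(x) * len(y) rows, so this is probably best kept for small inputs.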
Use pd.IntervalIndex here (A and B are the DataFrames for File A and File B):
idx = pd.IntervalIndex.from_arrays(B['time_A'], B['time_B'], closed='both')
# output -> IntervalIndex([[1, 3], [5, 6], [8, 10]], closed='both', dtype='interval[int64]')
A['Event'] = B.set_index(idx).reindex(A['time']).notna().all(axis=1).to_numpy()
print(A)
time Event
0 1 True
1 3 True
2 5 True
3 7 False
4 9 True
5 11 False
6 13 False
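A slightly shorter variant of the same idea, as a sketch: instead of reindexing B, ask the IntervalIndex directly which interval each time falls into. IntervalIndex.get_indexer returns -1 where no interval matches. The DataFrames are rebuilt here from the question's sample data so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question (File A -> A, File B -> B)
A = pd.DataFrame({'time': [1, 3, 5, 7, 9, 11, 13]})
B = pd.DataFrame({'time_A': [1, 5, 8], 'time_B': [3, 6, 10]})

# Closed on both sides, so the endpoints count as inside the interval
idx = pd.IntervalIndex.from_arrays(B['time_A'], B['time_B'], closed='both')

# get_indexer gives the position of the containing interval, or -1 if none
A['Event'] = idx.get_indexer(A['time'].to_numpy()) != -1
print(A)
```

As far as I know, get_indexer only works when the intervals do not overlap, which holds for the sample data; with overlapping intervals the cross-join approach is likely the safer choice.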