I have a dataframe A
ip_address
0 13
1 5
2 20
3 11
.. ........
and another dataframe B
lowerbound_ip_address upperbound_ip_address country
0 0 10 Australia
1 11 20 China
based on this I need to add a column in A
such that
ip_address country
13 China
5 Australia
I have an idea that I should write define a function and then call map on each row of A. But how would I search through each row of B for this. Is there a better way to do this.
The Fastest Ways As it turns out, join always tends to perform well, and merge will perform almost exactly the same given the syntax is optimal.
merge() It should takes around 1.75 sec.
It can be done using the merge() method. Below are some examples that depict how to merge data frames of different lengths using the above method: Example 1: Below is a program to merge two student data frames of different lengths.
Use pd.IntervalIndex
In [2503]: s = pd.IntervalIndex.from_arrays(dfb.lowerbound_ip_address,
dfb.upperbound_ip_address, 'both')
In [2504]: dfa.assign(country=dfb.set_index(s).loc[dfa.ip_address].country.values)
Out[2504]:
ip_address country
0 13 China
1 5 Australia
2 20 China
3 11 China
Details
In [2505]: s
Out[2505]:
IntervalIndex([[0, 10], [11, 20]]
closed='both',
dtype='interval[int64]')
In [2507]: dfb.set_index(s)
Out[2507]:
lowerbound_ip_address upperbound_ip_address country
[0, 10] 0 10 Australia
[11, 20] 11 20 China
In [2506]: dfb.set_index(s).loc[dfa.ip_address]
Out[2506]:
lowerbound_ip_address upperbound_ip_address country
[11, 20] 11 20 China
[0, 10] 0 10 Australia
[11, 20] 11 20 China
[11, 20] 11 20 China
Setup
In [2508]: dfa
Out[2508]:
ip_address
0 13
1 5
2 20
3 11
In [2509]: dfb
Out[2509]:
lowerbound_ip_address upperbound_ip_address country
0 0 10 Australia
1 11 20 China
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With