Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert the IP Address range to two BIGINT for faster lookup

I am working on a project in which we need to do IP Address lookups. So for this we have got the datasets which will look like below and now the next steps is to populate this datasets (may be a csv file) in to the MySql table. Below is the dataset example containing columns-

ip_address      country region  city           metro-code   latitude    longitude   postal-code   country-code   region-code    city-code   continent-code  country-code

24.32.116.116   usa       tx    clarksville    623           33.6103    -95.0498    75426            840           44             34918            6            us
65.29.108.232   usa       mi    novi           505           42.4637    -83.4604    48375            840           23             4339             6            us
66.169.99.69    usa       tx    ft worth       623           32.7016    -97.3823    76109            840           44             771              6            us
72.70.100.111   usa       ma    peabody        506           42.5332    -70.9726    1960             840           22             1392             6            us
174.252.116.137 usa       va    oakton         511           38.8892    -77.3222    22124            840           47             3860             6            us
68.55.144.242   usa       md    pasadena       512           39.1276    -76.5125    21122            840           21             4358             6            us
174.252.83.27   usa       pa    lancaster      566           40.0459    -76.3542    17603            840           39             333              6            us
174.78.192.31   usa       ga    warner robins  503           32.5958    -83.6384    31088            840           11             5052             6            us
98.192.146.235  usa       fl    panama city    656           30.1804    -85.5598    32404            840           10             671              6            us
71.192.181.20   usa       ma    springfield    543           42.1187    -72.5483    1109             840           22             967              6            us
76.183.54.227   usa       tx    dallas         623           32.7202    -96.6769    75217            840           44             77               6            us
69.62.143.140   usa       ca    granite bay    862           38.7442    -121.191    95746            840           5              49451            6            us
69.142.76.32    usa       nj    penns grove    504           39.707     -75.4467    8069             840           31             2335             6            us
70.173.54.93    usa       nv    las vegas      839           36.2059    -115.225    89108            840           29             173              6            us
98.242.8.222    usa       ca    fresno         866           36.7968    -119.883    93722            840           5              19               6            us

Problem Statement:-

I am planning to store the START_IP_NUM and END_IP_NUM as BIGINT in the MySql database instead of storing IP Address in the table as doing a lookup against a BIGINT is far faster than searching a string (dotted ip notation). Just like we have database setup for Maxmind.

So my question is we will be having separate Java Program that will populate the above datasets in the MySql table. So I need to device some sort of logic which can convert the above datasets into like below-

start_ip_num
end_ip_num
country 
region  
city           
metro-code  
latitude    
longitude   
postal-code   
country-code     
region-code 
city-code   
continent-code  
country-code

I am confuse given a IP Address how should I device start_ip_num and end_ip_num here and store it into the MySql table. As there will be a file I am assuming which will contain the datasets like above and then I need to read that file and device some sort of logic to convert the IP Address to two BIGINT and store it into the MySql table.

And after storing it into MySql database. I can do lookup something like this in MySql-

SELECT country
FROM geoip
WHERE
INET_ATON('174.36.207.186') BETWEEN start_ip_num AND end_ip_num
LIMIT 1

UPDATE:-

Suppose I have around couple of Thousands IP Address like 100,000 IP Addresses in a text file with the format I mentioned above.

Now my main Objective is to do the IP Address lookup. Corresponding to this IP Address get me all the other required fields.

So to make this thing work, I was initially planning to dump the text file data as it is in the MySql table. So MySql table will contain IP Address column and other columns just like I have in the above example. But doing a lookup on the String is expensive.

So I thought I should convert those IP Address in start_ip_num and end_ip_num range and then dump the data in MySql table so now it will look something like this-

start_ip_num
end_ip_num
country 
region  
city           
metro-code  
latitude    
longitude   
postal-code   
country-code     
region-code 
city-code   
continent-code  
country-code

And now if I need to do the lookups, I can have this SQL query embedded in my WebService eventually-

SELECT country
FROM geoip
WHERE
INET_ATON('174.36.207.186') BETWEEN start_ip_num AND end_ip_num
LIMIT 1

So question is- Given a Text file which will have list of IP Addresses and other fields corresponding to that IP Address how should I massage it in such a way such that I can store the same datasets in MySql in the format I mentioned above such as start_ip_num and end_ip_num, country and other fields.

Basically I am trying to follow the pattern of database such as Maxmind have currently. They also have start_ip_num and end_ip_num and other fields. And when you need to do the lookups you can do the lookups basis on the SQL I mentioned above by converting the IP Address to BIGINT using INET_ATON.

like image 640
arsenal Avatar asked Mar 12 '13 04:03

arsenal


1 Answers

IPv4 addresses contain 4 numbers, each of which can range from 0...255

You can convert an address A.B.C.D to a 32 bit integer ( or bigint if you prefer ) with this logic

Result = (A<<24) | (B<<16) | (C<<8) | D

Where A,B,C,D are integers. This is the method we use, and I think this was even originally tested against MaxMind. (apologies if not exactly java example)

like image 124
rlb Avatar answered Nov 02 '22 15:11

rlb