
Where does MaxMind get its data and how can you access it?

Tags:

geoip

maxmind

Where do geolocation databases like http://www.maxmind.com/ get their data? As I understand it, IP registries such as ARIN and RIPE only hold information about which company is assigned an IP range, so the location data has to come from the ISPs, right? If so, there has to be some way of accessing it.

Sultanen asked Aug 07 '13 22:08


1 Answer

I had the same question and found the following information.

Since I use MaxMind data, I wanted to know how reliable it is. Their website states: "MaxMind tests the accuracy of the GeoIP2 and GeoIP Legacy Databases on a periodic basis. In our recent tests, the downloadable databases were 99.8% accurate on a country level, 90% accurate on a state level in the US, and 86% accurate for cities in the US within a 50 kilometer radius. For more details, see GeoIP2 City Coverage and Accuracy. MaxMind periodically tests the accuracy of the data used in GeoIP2 products and services. Accuracy is calculated by checking known web user IP address and location pairs against the data within MaxMind's GeoIP2 Precision Web service as well as the GeoIP2 City and GeoLite2 City database offerings."

source: https://support.maxmind.com/geoip-faq/geoip2-and-geoip-legacy-databases/how-accurate-are-your-geoip2-and-geoip-legacy-databases/
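
To make the "within a 50 kilometer radius" figure concrete, here is a minimal sketch of how such an accuracy number could be computed from known IP/location pairs. This is not MaxMind's actual test procedure; the function names, the sample IPs (taken from reserved documentation ranges), and the coordinates are all made up for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def accuracy_within_radius(known, predicted, radius_km=50.0):
    """Fraction of known IPs whose predicted location is within radius_km.

    known and predicted both map an IP string to a (lat, lon) tuple.
    An IP missing from predicted counts as a miss.
    """
    if not known:
        return 0.0
    hits = 0
    for ip, (lat, lon) in known.items():
        guess = predicted.get(ip)
        if guess is not None and haversine_km(lat, lon, *guess) <= radius_km:
            hits += 1
    return hits / len(known)

# Hypothetical ground truth: where the users really are.
KNOWN = {
    "192.0.2.1": (40.7128, -74.0060),     # New York
    "192.0.2.50": (47.6062, -122.3321),   # Seattle
    "198.51.100.7": (34.0522, -118.2437), # Los Angeles
    "203.0.113.9": (41.8781, -87.6298),   # Chicago
}

# Hypothetical database answers for the same IPs.
PREDICTED = {
    "192.0.2.1": (40.7357, -74.1724),     # Newark, close to New York
    "192.0.2.50": (47.6101, -122.2015),   # Bellevue, close to Seattle
    "198.51.100.7": (37.7749, -122.4194), # San Francisco, far from Los Angeles
    # 203.0.113.9 is not in the database at all
}

if __name__ == "__main__":
    print(accuracy_within_radius(KNOWN, PREDICTED))
```

With this sample data, two of the four IPs fall inside the radius, so the computed accuracy is 0.5.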

For ISP information they state: "The ISP name is about 95% accurate in the US. Outside the US, accuracy ranges from 50% to 80%, depending on the country. The data is generally more accurate for countries with more Internet users."

source: https://www.maxmind.com/en/geoip2-isp-database

As for the process, the following answer seemed informative:

https://www.quora.com/How-does-IP-geolocation-service-providers-collect-data-or-how-does-IP-geolocation-databases-are-filled

IP geolocation databases are generally gathered based on the following:

  1. IP spidering--traceroutes and other automated methods designed to map the routing infrastructure of the Internet. These techniques can be fairly complex and time consuming, given the task (4+billion IP addresses that constantly are allocated, deallocated, or moved). Plus, with IPv6, this becomes orders of magnitude more difficult.

  2. Data supplied by users tied to IP addresses--some companies take anonymous user data (postal codes/city) tied to IP addresses and use that to help populate their databases. Obviously, this data needs to be carefully scrubbed to make sure it's reliable.

  3. Sharing relationships with ISPs. Companies such as mine (Digital Element...http://www.digitalelement.com/) are often contacted by ISPs to make sure our data is accurate, because they don't want their users to be incorrectly targeted by services such as Hulu or ESPN and possibly blocked from content when they should otherwise be able to get it. This data is usually highly accurate, assuming it is kept up to date, because ISPs have perfect knowledge of the location of their own IP addresses.

  4. Registry data--looking at ARIN, RIPE, etc. [Generally not that accurate.]
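
To illustrate why registry data (item 4) only gets you so far, here is a rough sketch of what it gives you: a mapping from an allocated range to the organization holding it, with nothing about where the end user sits. The table below is entirely hypothetical (the prefixes are reserved documentation ranges and the organization names are invented); real data would come from the registries themselves.

```python
import ipaddress

# Hypothetical registry extract: allocated prefix -> organization name.
REGISTRY = {
    "192.0.2.0/24": "Example ISP A",
    "192.0.2.128/25": "Example Customer C",  # sub-allocation inside ISP A's block
    "198.51.100.0/24": "Example Hosting B",
}

def lookup_org(ip, registry=REGISTRY):
    """Return the organization for the most specific prefix containing ip,
    or None if no prefix matches."""
    addr = ipaddress.ip_address(ip)
    best_net, best_org = None, None
    for cidr, org in registry.items():
        net = ipaddress.ip_network(cidr)
        # Skip prefixes of the other IP version, then test membership.
        if addr.version != net.version or addr not in net:
            continue
        # Prefer the longest (most specific) matching prefix.
        if best_net is None or net.prefixlen > best_net.prefixlen:
            best_net, best_org = net, org
    return best_org

if __name__ == "__main__":
    for ip in ("192.0.2.10", "192.0.2.200", "203.0.113.5"):
        print(ip, "->", lookup_org(ip))
```

The result is an organization name per range, which is why the other sources in the list (traceroutes, user-supplied data, ISP relationships) are needed to turn a range into a city.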

mBo answered Oct 20 '22 10:10