I have a list of about 200,000 IP addresses. I would like to link these to geographic location and get any other data that an IP address can give as well.
The best I've found so far is a service provided by infochimps: http://www.infochimps.com/datasets/digital-element-ip-intelligence-demographics. There's also an R package for infochimps. But infochimps requires you to pay, and for 200,000 IP addresses this could get expensive.
Is there any R package that can do something like this?
Thank you
Try using the RDSTK package, which provides an R interface to the Data Science Toolkit API. Here is a presentation by the author of the package that should help you get started.
From Xu Wang's comments (moved here to increase future findability):
For reference purposes: To install that package, one must install RCurl and rjson. Before installing RCurl, on Ubuntu I had to install two packages: sudo apt-get install curl libcurl4-gnutls-dev
The function that I needed was ip2coordinates, which accepts an IP address as input.
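With around 200,000 addresses, it probably pays to look up each distinct address only once and to carry on when a single request fails. Below is a minimal sketch of that idea. The names `lookup_ips`, `lookup_fun` and `fake_lookup` are made up for illustration, and I am assuming the lookup function takes one IP string and returns a one-row data frame or list, which I have not verified for `ip2coordinates`.

```r
# Hypothetical helper: apply a lookup function to each distinct IP address.
# `lookup_fun` is assumed to take one IP string and return a one-row data
# frame (or list); RDSTK::ip2coordinates could be passed here if it fits.
lookup_ips <- function(ips, lookup_fun, pause = 0) {
  unique_ips <- unique(ips)
  results <- lapply(unique_ips, function(ip) {
    if (pause > 0) Sys.sleep(pause)  # optional delay between requests
    # A failed lookup yields NULL instead of aborting the whole run
    tryCatch(lookup_fun(ip), error = function(e) NULL)
  })
  names(results) <- unique_ips
  results
}

# Offline demonstration with a stand-in lookup that rejects malformed input
fake_lookup <- function(ip) {
  if (!grepl("^[0-9]+(\\.[0-9]+){3}$", ip)) stop("not an IPv4 address")
  data.frame(ip = ip, latitude = NA_real_, longitude = NA_real_,
             stringsAsFactors = FALSE)
}
res <- lookup_ips(c("74.88.200.52", "74.88.200.52", "bogus"), fake_lookup)
```

Here `res` has one entry per distinct address, and the entry for the malformed address is `NULL`, so failures can be retried or dropped afterwards.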
The function IPtoXY (http://thebiobucket.blogspot.com/2011/12/function-to-collect-geographic.html) uses the same API but does not need additional packages.
Edit, 26th of Sept: Thanks to @Peter M I became aware that my function mentioned above was no longer working. Here is the edited version, which should work (the link above was also updated):
# Purpose: Get geographic coordinates for a given IP-address
# Author: Kay Cichini
# Date: 2011-12-18
# Output: A string holding latitude and longitude in the format "lat;lon"
IPtoXY <- function(x) {
  URL_IP <- paste("http://www.datasciencetoolkit.org//ip2coordinates/",
                  x, sep = "")
  api_return <- readLines(URL_IP, warn = FALSE)
  # Keep digits, the decimal point and the minus sign, so that negative
  # coordinates (western longitudes, southern latitudes) keep their sign
  lon1 <- api_return[grep("longitude", api_return)]
  lon <- gsub("[^[:digit:].-]", "", lon1)
  lat1 <- api_return[grep("latitude", api_return)]
  lat <- gsub("[^[:digit:].-]", "", lat1)
  return(paste(lat, lon, sep = ";"))
}
# Example:
> IPtoXY("74.88.200.52")
[1] "40.951301574707;-73.78759765625"
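Because the function returns both coordinates in one string (latitude first, as in the `paste(lat, lon, ...)` call), the results need to be split before they can be mapped or merged. A minimal sketch of one way to do that follows. `split_latlon` is a made-up helper name and the input values are invented for illustration, not real API output.

```r
# Hypothetical helper: turn "lat;lon" strings into a numeric data frame.
split_latlon <- function(x) {
  parts <- strsplit(x, ";", fixed = TRUE)
  data.frame(
    latitude  = as.numeric(vapply(parts, `[`, character(1), 1)),
    longitude = as.numeric(vapply(parts, `[`, character(1), 2))
  )
}

# Invented example strings in the same "lat;lon" layout
coords <- split_latlon(c("40.95;-73.79", "51.5;0.12"))
```

The result has one row per input string, with `latitude` and `longitude` as numeric columns.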
The function from http://thebiobucket.blogspot.com/2011/12/function-to-collect-geographic.html does not work.
But the idea still does, so this should do:
iplocation <- function(ip = "") {
  response <- readLines(paste("http://www.datasciencetoolkit.org//ip2coordinates/",
                              ip, sep = ""))
  # Treat any "null" in the response as a failed lookup
  success <- !any(grepl("null", response))
  # Pull the dotted address back out of the response
  ip_pattern <- "[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*"
  ip <- grep(ip_pattern, response, value = TRUE)
  match <- regexpr(ip_pattern, ip)
  ip <- substr(ip, match, attr(match, "match.length") + match - 1)
  if (success) {
    # Return the value that follows "label": on the matching response line
    extract <- function(label, response) {
      text <- grep(label, response, value = TRUE)
      match <- regexpr(paste('"', label, '"', ": ", sep = ""), text)
      text <- substr(text, match + attr(match, "match.length"), nchar(text))
      if (grepl("[[:digit:]]", text)) {
        # Value contains digits: drop the trailing two characters
        text <- substr(text, 1, nchar(text) - 2)
      } else {
        # Otherwise also drop the leading quote
        text <- substr(text, 2, nchar(text) - 2)
      }
      if (regexpr('"', text) != -1) {
        text <- substr(text, 2, nchar(text))
      }
      text
    }
  }
  RESULT <- list()
  RESULT$success <- success
  RESULT$ip <- ip
  if (success) {
    RESULT$latitude <- as.numeric(extract("latitude", response))
    RESULT$longitude <- as.numeric(extract("longitude", response))
    RESULT$country <- extract("country_name", response)
    RESULT$locality <- extract("locality", response)
    RESULT$postalcode <- extract("postal_code", response)
    RESULT$region <- extract("region", response)
    RESULT$countrycode <- extract("country_code3", response)
  }
  RESULT
}
iplocation()
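Since `iplocation` returns one list per address, and a failed lookup only carries `success` and `ip`, collecting many results into a table needs a little care. The following is a minimal sketch of one way to do it. `results_to_df` is a made-up helper name, only a few of the fields are shown, and the two example results are invented rather than real API responses.

```r
# Hypothetical helper: stack iplocation()-style lists into one data frame,
# filling fields that are missing (e.g. after a failed lookup) with NA.
results_to_df <- function(results) {
  pick <- function(field, na_value) {
    vapply(results, function(r) {
      value <- r[[field]]
      if (is.null(value) || length(value) != 1) na_value else value
    }, FUN.VALUE = na_value, USE.NAMES = FALSE)
  }
  data.frame(
    ip        = pick("ip", NA_character_),
    success   = pick("success", NA),
    latitude  = pick("latitude", NA_real_),
    longitude = pick("longitude", NA_real_),
    country   = pick("country", NA_character_),
    stringsAsFactors = FALSE
  )
}

# Invented results: one successful lookup, one failed lookup
ok  <- list(success = TRUE, ip = "74.88.200.52",
            latitude = 40.95, longitude = -73.79, country = "United States")
bad <- list(success = FALSE, ip = "10.0.0.1")
df <- results_to_df(list(ok, bad))
```

Each list becomes one row, and the failed lookup keeps its `ip` while the location columns are `NA`.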