
Trying to loop through a dataframe and reference multiple fields

Tags: dataframe, r

I have a data frame with Address, City, State, and Zip columns. From there, I'm trying to use the Yahoo APIs to geocode each address.

I'm basing this off the code in O'Reilly's Data Mashups using R Tutorial. The original example takes a vector of street addresses and uses a hard-coded city. I'm trying to make a dynamic example that supports multiple cities.

The abbreviated version of the code is:

    geocodeAddresses <- function(myStreets)
    {
      appid <- '<put your appid here>'
      baseURL <- "http://local.yahooapis.com/MapsService/V1/geocode?appid="
      myGeoTable <- data.frame(address=character(), lat=numeric(), long=numeric(), EID=numeric())
      for (myStreet in myStreets) {
        requestUrl <- paste(baseURL, appid, "&street=", URLencode(myStreet$address), "&city=", URLencode(myStreet$city), "&state=", URLencode(myStreet$state), sep="")
        xmlResult <- xmlTreeParse(requestUrl, isURL=TRUE, addAttributeNamespaces=TRUE)
        geoResult <- xmlResult$doc$children$ResultSet$children$Result
        lat <- xmlValue(geoResult[['Latitude']])
        long <- xmlValue(geoResult[['Longitude']])
        myGeoTable <- rbind(myGeoTable, data.frame(address=myStreet, Y=lat, X=long, EID=NA))
      }
    }

When I try to reference myStreet$city and myStreet$address inside the loop, I receive this error:

$ operator is invalid for atomic vectors

Other than looping through data frame myStreets, I don't know how I can make the call to the Yahoo API only once for each row and store both the long/lat for each member.

Neil Kodner asked Dec 30 '22 14:12



1 Answer

If myStreets is a data.frame, the for loop iterates over its columns, not its rows. On the first pass myStreet is the entire address column, which is an atomic vector, so myStreet$city doesn't make sense.
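A quick way to see this is to record what the loop variable holds on each pass. The sketch below uses a made-up two-row data frame; the addresses are invented for illustration and only the column names come from the question.

```r
# Hypothetical stand-in for myStreets (values are invented)
myStreets <- data.frame(
  address = c("1 Main St", "2 Oak Ave"),
  city    = c("Springfield", "Portland"),
  state   = c("IL", "OR"),
  stringsAsFactors = FALSE
)

# Collect what the loop variable holds on each pass
seen <- list()
for (myStreet in myStreets) {
  seen[[length(seen) + 1]] <- myStreet
}

length(seen)          # one pass per column, not one per row
is.atomic(seen[[1]])  # each element is a plain vector
seen[[1]]             # the whole address column

# Using $ on such a vector raises the error from the question
msg <- tryCatch(seen[[1]]$city, error = function(e) conditionMessage(e))
msg
```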

You could change the loop to iterate over rows instead:

    for (i in seq_len(nrow(myStreets))) {
      myStreet <- myStreets[i,]   # a one-row data frame, so myStreet$city works
      # rest is the same
    }
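The URL-building step does not need a loop at all, because paste0 works on whole columns at once. A minimal sketch, assuming the column names address, city, and state from the question; buildGeocodeUrls is a hypothetical helper name, and the sample rows and "APPID" value are placeholders:

```r
# Hypothetical helper: builds one request URL per row of `streets`
buildGeocodeUrls <- function(streets, baseURL, appid) {
  # Encode every value in a column; reserved = TRUE also escapes
  # characters such as "&" and "=" that would break the query string
  enc <- function(x) {
    vapply(as.character(x), URLencode, character(1),
           reserved = TRUE, USE.NAMES = FALSE)
  }
  paste0(baseURL, appid,
         "&street=", enc(streets$address),
         "&city=",   enc(streets$city),
         "&state=",  enc(streets$state))
}

# Made-up sample data for illustration
myStreets <- data.frame(
  address = c("1 Main St", "2 Oak Ave"),
  city    = c("Springfield", "Portland"),
  state   = c("IL", "OR"),
  stringsAsFactors = FALSE
)

urls <- buildGeocodeUrls(
  myStreets,
  "http://local.yahooapis.com/MapsService/V1/geocode?appid=",
  "APPID"
)
urls
```

Each element of urls lines up with the corresponding row of myStreets, so the requests themselves can still be issued one row at a time.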

To optimize your code, you can also preallocate the result table once and fill it in row by row, instead of growing it with rbind:

    myGeoTable <- data.frame(address=myStreets$address, lat=NA_real_, long=NA_real_, EID=NA_real_)
    for (i in seq_len(nrow(myStreets))) {
      myStreet <- myStreets[i,]
      requestUrl <- ...
      ...
      # xmlValue() returns character strings, so convert before storing
      myGeoTable[i,2:4] <- c(as.numeric(lat), as.numeric(long), NA)
    }
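The preallocate-and-fill pattern can be tried without any network access by swapping the request for a stub. In this sketch fakeGeocode is a hypothetical stand-in that returns fixed strings (mimicking the character values that xmlValue returns); it is not part of the original code, and the sample rows are invented.

```r
# Hypothetical stub replacing the XML request; returns character values
fakeGeocode <- function(street) {
  list(lat = "41.0", long = "-87.5")
}

geocodeAddresses <- function(myStreets, geocodeOne = fakeGeocode) {
  n <- nrow(myStreets)
  # Preallocate one row per address; rep() keeps this valid when n is 0
  myGeoTable <- data.frame(
    address = myStreets$address,
    lat     = rep(NA_real_, n),
    long    = rep(NA_real_, n),
    EID     = rep(NA_real_, n),
    stringsAsFactors = FALSE
  )
  for (i in seq_len(n)) {
    res <- geocodeOne(myStreets[i, ])          # one call per row
    myGeoTable$lat[i]  <- as.numeric(res$lat)  # convert text to numbers
    myGeoTable$long[i] <- as.numeric(res$long)
  }
  myGeoTable   # return the filled table
}

# Made-up sample data for illustration
myStreets <- data.frame(
  address = c("1 Main St", "2 Oak Ave"),
  city    = c("Springfield", "Portland"),
  state   = c("IL", "OR"),
  stringsAsFactors = FALSE
)
result <- geocodeAddresses(myStreets)
result
```

Passing the lookup in as an argument is just one way to keep the loop testable; the real request code from the question would go in place of the stub.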
Marek answered May 27 '23 02:05
