Objective: Pass R a single vector of street addresses and have a three-vector dataframe returned where the first vector is the street address ("Street.Address"), the second vector is the latitude ("Lat"), and the third vector is the longitude ("Lng"). For simplicity, I'm only using four addresses; that is, the length of the vector is 4.
Approach: I'm using Jitender Aswani's code to create a geocode function using Google Maps' API. The function works brilliantly, and I'm able to find the lat/long of any address I choose. The code:
getGeoCode <- function(address)
{
  # Load library
  library("RJSONIO")
  # Encode URL parameters
  address <- gsub(' ', '%20', address)
  # Open connection
  connectStr <- paste('http://maps.google.com/maps/api/geocode/json?sensor=false&address=', address, sep="")
  con <- url(connectStr)
  data.json <- fromJSON(paste(readLines(con), collapse=""))
  close(con)
  # Flatten the received JSON
  data.json <- unlist(data.json)
  lat <- data.json["results.geometry.location.lat"]
  lng <- data.json["results.geometry.location.lng"]
  gcodes <- c(lat, lng)
  names(gcodes) <- c("Lat", "Lng")
  return(gcodes)
}
geocodes <- getGeoCode("Palo Alto, California")
geocodes
Lat Lng
"37.4418834" "-122.1430195"
My difficulty comes when I try to call the function in subsequent code. Let's call the original one-column object data.object. When I use the following code supplied by Aswani...
data.object <- with(data.object, data.frame(Street.Address, lapply(Street.Address, function(val){getGeoCode(val)})))
...I expect the function to return a three-column dataframe of length four, with column1 being the street address, column2 being the latitude, and column3 being the longitude:
Street.Address Lat Lng
[1] 3625 1ST AVE S SEATTLE WA 98134 47.571010 -122.334447
[2] 2119 RAINIER AVE S SEATTLE WA 98144 47.584136 -122.302744
[3] 9660 16TH AVE SW SEATTLE WA 98106 47.516180 -122.355138
[4] 8300 RAINIER AVE S SEATTLE WA 98118 47.529750 -122.270010
Instead, I'm getting a five-column dataframe where the values in the second column alternate between the first address' latitude and the first address' longitude, the values in the third column alternate between the second address' latitude and the second address' longitude, and so on:
Street.Address column2 column3 column4 column5
[1] 3625 1ST AVE S SEATTLE WA 98134 47.571010 47.584136 47.516180 47.529750
[2] 2119 RAINIER AVE S SEATTLE WA 98144 -122.334447 -122.302744 -122.355138 -122.270010
[3] 9660 16TH AVE SW SEATTLE WA 98106 47.571010 47.584136 47.516180 47.529750
[4] 8300 RAINIER AVE S SEATTLE WA 98118 -122.334447 -122.302744 -122.355138 -122.270010
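To reproduce the shape problem without calling the Google Maps API, here is a minimal sketch. It assumes a hypothetical stub, fakeGeoCode(), that returns the coordinates listed above as a named character vector, the same kind of value getGeoCode() returns. The stub and the name result are illustrative only and are not part of Aswani's code.

```r
# Hypothetical stand-in for getGeoCode(): returns canned coordinates
# (copied from the expected output above) instead of calling the API.
fakeGeoCode <- function(address) {
  canned <- list(
    "3625 1ST AVE S SEATTLE WA 98134"    = c("47.571010", "-122.334447"),
    "2119 RAINIER AVE S SEATTLE WA 98144" = c("47.584136", "-122.302744"),
    "9660 16TH AVE SW SEATTLE WA 98106"   = c("47.516180", "-122.355138"),
    "8300 RAINIER AVE S SEATTLE WA 98118" = c("47.529750", "-122.270010")
  )
  gcodes <- canned[[address]]
  names(gcodes) <- c("Lat", "Lng")
  gcodes
}

# One-column data frame of addresses, kept as character strings
data.object <- data.frame(
  Street.Address = c("3625 1ST AVE S SEATTLE WA 98134",
                     "2119 RAINIER AVE S SEATTLE WA 98144",
                     "9660 16TH AVE SW SEATTLE WA 98106",
                     "8300 RAINIER AVE S SEATTLE WA 98118"),
  stringsAsFactors = FALSE
)

# Same call as above, with the stub swapped in for getGeoCode().
# suppressWarnings() hides a row-name warning that data.frame() may emit.
result <- suppressWarnings(
  with(data.object,
       data.frame(Street.Address,
                  lapply(Street.Address, function(val) fakeGeoCode(val))))
)

dim(result)   # 4 rows, 5 columns
result[[2]]   # first address' latitude and longitude, repeated
```

It looks as though each element of the list returned by lapply() is a length-two vector that data.frame() turns into its own column and recycles to four rows, which matches the alternating pattern in the output above.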
I've tried rewriting the command using different combinations of the with(), within(), apply(), and lapply() functions, and I can't get R to return a simple three-column dataframe. I know I'm overlooking something obvious, but I can't seem to figure it out.