Here's another option that uses the httr package. (BTW: you don't need RJSONIO.) Replace your wiki.tables(...) function with this:
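wiki.tables <- function(towns) {
  library(httr)
  library(XML)
  # Fetch a page once; return the parsed HTML on success, NULL otherwise
  get.HTML <- function(url) {
    resp <- GET(url)
    if (resp$status_code == 200) {
      # as = "text" returns the body as a character string
      return(htmlParse(content(resp, as = "text"), asText = TRUE))
    }
    NULL
  }
  # Build one URL per row from columns 1 and 2, joined by ",_"
  u <- paste('http://en.wikipedia.org/wiki/',
             sep = '', towns[,1], ',_', towns[,2])
  res <- lapply(u, get.HTML)
  res <- res[sapply(res, function(x) !is.null(x))]  # remove NULLs
  # Extract each page's infobox and read it as a table
  tabs <- lapply(sapply(res, getNodeSet, path = '//*[@class="infobox vcard"]'),
                 readHTMLTable)
  return(tabs)
}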
This sends one GET request per URL and checks the status code. The disadvantage of url.exists(...) is that you have to query every URL twice: once to see if it exists, and again to get the data.
Incidentally, when I tried your code, the Yunderup URL does in fact exist.
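If you want to see for yourself which URLs respond, something along these lines might help. It is only a rough sketch: the towns data frame below is made up for illustration (I'm assuming town names in column 1 and states in column 2, as in your code), and check.urls is a hypothetical helper name, not part of httr.

```r
library(httr)

# Hypothetical input; substitute your own data frame
towns <- data.frame(
  town  = c("Perth", "Yunderup"),
  state = c("Western_Australia", "Western_Australia"),
  stringsAsFactors = FALSE
)

# Same URL construction as in wiki.tables(...)
u <- paste0("http://en.wikipedia.org/wiki/", towns[, 1], ",_", towns[, 2])

# Hypothetical helper: one GET per URL, recording the HTTP status code
check.urls <- function(urls) {
  status <- vapply(urls,
                   function(url) as.integer(status_code(GET(url))),
                   integer(1))
  data.frame(url = urls, status = status, row.names = NULL)
}

# check.urls(u)  # needs network access; uncomment to run
```

A status of 200 means the page was retrieved; anything else is what get.HTML(...) above would turn into a NULL and drop.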