Question

I am parsing a lot of websites, and I wrote a script that loops through thousands of links read from a separate file. However, sometimes R can't load a link and the script stops in the middle of the loop, leaving many of the other URLs unparsed. So I tried using tryCatch so that the script ignores these cases and keeps parsing the next URLs. Recently, though, tryCatch has been generating the error below.

gethelp.url = 'http://forums.autodesk.com/t5/Vault-General/bd-p/101'
gethelp.df =tryCatch(htmlTreeParse(gethelp.url, useInternalNodes = T), error = function() next) 

Error in value[[3L]](cond) : unused argument (cond)
Calls: withRestarts ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted

The confusing thing is that sometimes it works and sometimes it throws this error message, even though the same script is parsing the same URLs.

Can anyone give me guidance on how to interpret this error message? I read the documentation but couldn't find much insight.

Solution

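I think your handler function has to take cond as an argument; at least that's how I've used tryCatch() in the past, and your error message seems to point to that as the problem. tryCatch() calls the handler with the condition object only when an error actually occurs, which would explain why the script works whenever every URL loads and fails only when one doesn't.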
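To see the difference without touching the network, here is a minimal sketch in base R. The variable names (bad, ok, fixed) are made up for illustration, and stop("boom") merely stands in for a URL that fails to load.

```r
# A handler with no parameters is only a problem once an error occurs,
# because that is the only time tryCatch() calls it (with the condition).
# The outer tryCatch() captures the message of that secondary failure.
bad <- tryCatch(
  tryCatch(stop("boom"), error = function() NA),
  error = function(cond) conditionMessage(cond)
)

# No error in the expression: the faulty handler is never called.
ok <- tryCatch(1 + 1, error = function() NA)

# A handler that accepts the condition object works as intended.
fixed <- tryCatch(stop("boom"), error = function(cond) NA)

print(bad)
print(ok)
print(fixed)
```

The middle case is why the same script can appear to work on one run and fail on the next: it depends on whether any page failed to load.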

Try the following:

gethelp.df = tryCatch(htmlTreeParse(gethelp.url, useInternalNodes = T), error = function(cond) next)

Note that the above line will still throw an error when run on its own, because the example code is not in a loop. So I just replaced next with NA, and it worked fine.

Edit: In response to OP's comment, I suggest trying the following:

gethelp.df = tryCatch(htmlTreeParse(gethelp.url, useInternalNodes = T), error = function(cond) "skip")
if (identical(gethelp.df, "skip")) { next }
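Put together inside a loop, the pattern might look like the sketch below. Note that fetch_page() and the example.com URLs are hypothetical stand-ins so the code runs without network access; in the real script the call would be htmlTreeParse(url, useInternalNodes = TRUE). It returns NULL from the handler instead of "skip", which is just one possible sentinel.

```r
# Hypothetical stand-in for htmlTreeParse(): fails on URLs containing "bad".
fetch_page <- function(url) {
  if (grepl("bad", url, fixed = TRUE)) stop("could not load ", url)
  paste("parsed", url)
}

# Made-up URLs for illustration only.
urls <- c("http://example.com/a", "http://example.com/bad", "http://example.com/c")
results <- list()

for (url in urls) {
  # The handler takes the condition object and returns a sentinel value.
  page <- tryCatch(fetch_page(url), error = function(cond) {
    message("Skipping ", url, ": ", conditionMessage(cond))
    NULL
  })
  # next is called in the loop body, not inside the handler function.
  if (is.null(page)) next
  results[[url]] <- page
}

print(names(results))
```

Keeping next outside the handler means the skip logic lives in the loop itself, and the handler only has to report the failure and return a value.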
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow