Question

I want to scrape the top 10 search result links from a Google results page for a given keyword.

I am using Web-Harvest. I am planning to scrape the href links and filter out the top 10 using some attribute pattern. Is that the right approach? It is not working at the moment. Is there a simpler way to do it?

Solution

How about just using the Google search REST API, as described here?
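
The link in the answer above did not survive, so it is unclear which API was meant. As a rough sketch only, assuming something like Google's Custom Search JSON API (which needs an API key and a search engine ID of your own), the request and the link extraction could look like this. The function names and the `KEY`/`CX` placeholders are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Custom Search JSON API endpoint; the original answer's link is unknown.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, count=10):
    """Build the request URL; the API returns at most 10 results per call."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": count}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_links(response, limit=10):
    """Pull the result URLs out of a decoded JSON response."""
    return [item["link"] for item in response.get("items", [])[:limit]]


def top_links(query, api_key, engine_id):
    """Fetch one page of results and return its links (performs a network call)."""
    with urlopen(build_search_url(query, api_key, engine_id)) as reply:
        return extract_links(json.load(reply))
```

Calling `top_links("some keyword", "KEY", "CX")` with real credentials would return up to ten URLs without any HTML scraping.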

OTHER TIPS

It's easier to use Google Sheets (you can even monitor changes), but you probably have your reasons for choosing an external tool.

In general, you need three functions to get the results:

Extract title: "//h3[@class='r']"
Extract URL: "//h3/a/@href"
Clean URL: "\/url\?q=(.+)&sa" (all external URLs in Google search results have tracking enabled, so a regular expression is used to extract the clean URL)
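
The three steps above can be tried out with the Python standard library alone. This is only a sketch: `SAMPLE` is a hypothetical, well-formed stand-in for a results page (real Google markup differs and changes often), `extract_results` is a made-up helper name, and the regular expression uses a non-greedy `(.+?)` instead of the `(.+)` shown above so that it stops at the first `&sa`.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Hypothetical, simplified markup; not a real Google response.
SAMPLE = """
<div>
  <h3 class="r"><a href="/url?q=https://example.com/a&amp;sa=U&amp;ved=1">First result</a></h3>
  <h3 class="r"><a href="/url?q=https://example.org/b%3Fx%3D1&amp;sa=U&amp;ved=2">Second result</a></h3>
</div>
"""

# Step 3: capture the target URL between "/url?q=" and the first "&sa".
CLEAN_URL = re.compile(r"/url\?q=(.+?)&sa")


def extract_results(markup, limit=10):
    """Return up to `limit` (title, clean_url) pairs from the markup."""
    root = ET.fromstring(markup.strip())
    results = []
    # Step 1: the title headings, i.e. //h3[@class='r']
    for h3 in root.iterfind(".//h3[@class='r']"):
        # Step 2: the link inside the heading, i.e. //h3/a/@href
        link = h3.find("a")
        if link is None:
            continue
        href = link.get("href", "")
        match = CLEAN_URL.search(href)
        # Fall back to the raw href when there is no tracking wrapper.
        url = unquote(match.group(1)) if match else href
        title = "".join(link.itertext()).strip()
        results.append((title, url))
        if len(results) == limit:
            break
    return results
```

`extract_results(SAMPLE)` yields one `(title, url)` pair per heading. For real pages, which are rarely well-formed XML, an HTML-aware parser would be needed in place of `ElementTree`.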
Licensed under: CC-BY-SA with attribution