Question

The Google Web Search API has been deprecated and replaced with the Custom Search API (see http://code.google.com/apis/websearch/).

I want to search the whole web, but it looks like the new API can only search custom sites.

Is there a way to search the whole web programmatically? I was able to query the old API using JSON from a Java program.

Solution

You could send the queries the same way a browser does and then parse the HTML that comes back. That is what I have always done, even for things like YouTube.
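As a rough illustration of the "parse the HTML" half of that approach, here is a minimal sketch that pulls links out of a page with Python's standard `html.parser`. The `LinkExtractor` class, the `extract_links` helper and the `SAMPLE_HTML` fragment are made up for this example; the markup of a real results page is not known here and would need its own selection logic.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, text) pairs found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical fragment standing in for a page fetched like a browser would.
SAMPLE_HTML = """
<div><a href="http://example.com/a">First <b>result</b></a></div>
<div><a href="http://example.org/b">Second result</a></div>
<a name="anchor-without-href">ignored</a>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, "|", text)
```

Fetching the page itself (for example with `urllib.request`) is left out so the sketch runs without network access.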

OTHER TIPS

Yes, Google Custom Search has now replaced the old Search API, but you can still use Google Custom Search to search the entire web, although the steps are not obvious from the Custom Search setup.

To create a Google Custom Search engine that searches the entire web:

  1. From the Google Custom Search homepage ( http://www.google.com/cse/ ), click Create a Custom Search Engine.
  2. Type a name and description for your search engine.
  3. Under Define your search engine, in the Sites to Search box, enter at least one valid URL. (For now, just put www.anyurl.com to get past this screen. More on this later.)
  4. Select the CSE edition you want and accept the Terms of Service, then click Next. Select the layout option you want, and then click Next.
  5. Click any of the links under the Next steps section to navigate to your Control panel.
  6. In the left-hand menu, under Control Panel, click Basics.
  7. In the Search Preferences section, select Search the entire web but emphasize included sites.
  8. Click Save Changes.
  9. In the left-hand menu, under Control Panel, click Sites.
  10. Delete the site you entered during the initial setup process.

Now your custom search engine will search the entire web.
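Once the engine exists, it can be queried over HTTP. The following is a minimal sketch, assuming the Custom Search JSON API endpoint and its `key`, `cx`, `q` and `start` parameters as described in Google's public documentation (they are not spelled out in this answer). The helper names and the placeholder credentials are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the Custom Search JSON API (assumed from Google's public docs).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1):
    """Build the request URL; `cx` is the ID of the custom search engine."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_results(payload):
    """Return (title, link) pairs from a decoded JSON response."""
    return [(item.get("title"), item.get("link"))
            for item in payload.get("items", [])]


def search(api_key, engine_id, query):
    """Run one query against the API and return (title, link) pairs."""
    url = build_search_url(api_key, engine_id, query)
    with urllib.request.urlopen(url) as response:
        return parse_results(json.load(response))


if __name__ == "__main__":
    # Placeholder credentials: substitute your own API key and engine ID.
    print(build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "whole web search"))
```

Calling `search(...)` performs a real request and counts against the daily quota, so the sketch only prints the URL by default.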

Pricing

  • Google Custom Search gives you 100 queries per day for free.
  • After that you pay $5 per 1000 queries.
  • There is a maximum of 10,000 queries per day.

Source: https://developers.google.com/custom-search/json-api/v1/overview#Pricing
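To see what those numbers mean for a given volume, here is a small sketch of the arithmetic. It assumes that billing is prorated per query beyond the free 100 and that the 10,000 daily maximum includes the free queries; neither detail is stated above, so treat both as assumptions.

```python
FREE_QUERIES_PER_DAY = 100      # free tier quoted above
PRICE_PER_1000_QUERIES = 5.00   # US dollars, quoted above
MAX_QUERIES_PER_DAY = 10_000    # daily cap quoted above


def daily_cost(queries):
    """Estimated cost in US dollars for one day's worth of queries.

    Assumes per-query proration beyond the free tier (an assumption,
    not something the pricing list above confirms).
    """
    if queries < 0:
        raise ValueError("queries must be non-negative")
    if queries > MAX_QUERIES_PER_DAY:
        raise ValueError("exceeds the daily maximum of 10,000 queries")
    billable = max(0, queries - FREE_QUERIES_PER_DAY)
    return billable * PRICE_PER_1000_QUERIES / 1000


if __name__ == "__main__":
    for n in (100, 1_100, 10_000):
        print(n, "queries/day ->", daily_cost(n), "USD")
```

Under these assumptions, running at the daily cap every day costs 49.50 USD per day.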


  • The search quality is much lower than normal Google search (no synonyms, no "intelligence", etc.).
  • It seems that Google is even planning to shut down this service completely.

Google Custom Search (as advocated in the top-rated answers) works well, but it is very expensive compared to its competitors (below) and to other Google APIs. It has a small free tier (100 queries/day) and a very high price of $5 per 1000 queries.

They offer the option to upgrade to Site Search, which has slightly better prices, but that is meant for searching one site (your own), so it is really something quite different - not an upgrade.

The main alternatives seem to be:

Bing Search API
https://datamarket.azure.com/dataset/5BA839F1-12CE-4CCE-BF57-A49D98D29A44
It has a free tier of 5,000 queries/month, prices starting at 5 queries per penny, and no hard limit.

UPDATE: At the end of 2016 this API was shut down in favour of its Azure counterpart, "Cognitive Services Bing Search API":
https://azure.microsoft.com/en-us/services/cognitive-services/search/

See here for a pricing chart, which starts at US$3/m for 1,000 transactions. Unless I'm missing something, it is quite expensive.

Yahoo BOSS Search API
http://developer.yahoo.com/boss/search/
It had prices starting at about 12 queries per penny for whole-web searches.
UPDATE: It was discontinued on March 31, 2016.
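The prices quoted so far use different units (dollars per 1000 queries versus queries per penny). The following sketch puts them on one scale. It only uses the starting prices quoted in this answer, ignores free tiers and volume discounts, and the function name is made up for this example.

```python
def dollars_per_1000(queries_per_penny):
    """Convert a 'queries per penny' price into US dollars per 1000 queries."""
    cents = 1000 / queries_per_penny
    return cents / 100


# Starting prices as quoted in this answer, in US dollars per 1000 queries.
PRICES = {
    "Google Custom Search": 5.00,
    "Bing Search API (Azure DataMarket)": dollars_per_1000(5),
    "Yahoo BOSS": dollars_per_1000(12),
}

if __name__ == "__main__":
    for name, price in sorted(PRICES.items(), key=lambda kv: kv[1]):
        print(f"{name}: ${price:.2f} per 1000 queries")
```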

And some I haven't heard of before:

http://www.gigablast.com/searchfeed.html

http://www.faroo.com/hp/api/api.html

http://www.commoncrawl.org/

http://www.entireweb.com/search_api/implementation/
[discontinued - as pointed out below]

There is a bit of discussion of some of these on this SO post.
[got closed for being off-topic and is now gone]

There is an option at the bottom of the "Sites to search" section of the Custom Search Control Panel: you can choose "Search the entire web but emphasize included sites".

[screenshot: Custom Search Control Panel - Sites to search]

Faroo has a free Web Search API.

I have just come across this from Common Crawl.

http://www.commoncrawl.org/

Might be the answer we are all looking for!!

There's a note at the top of the docs:

Note: The Google Web Search API has been officially deprecated as of November 1, 2010. It will continue to work as per our deprecation policy, but the number of requests you may make per day will be limited. Therefore, we encourage you to move to the new Custom Search API.

The deprecation policy says that they will continue to run the API for 3 years. So if you already have an application that uses the old API, you don't have to rush to change things just yet. If you're writing a new application, use the Custom Search API. See my answer here for how to do this in Python, but the idea's the same for any language.

There's a free Java API called JFreeWebSearch which uses the already mentioned Faroo: http://www.ke.tu-darmstadt.de/resources/jfreewebsearch

You can create an "everywhere" custom search engine right from the Google Custom Search homepage ( http://www.google.com/cse/ ). Just click 'advanced' while adding a new engine. There you can provide a Schema.org site type. 'Thing' is the most generic type, which covers the whole web.

Gigablast offers a cheap web search API: http://www.gigablast.com/searchfeed.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow