Question

Would having the following robot.txt work?

User-agent: *
Disallow: /

User-agent: Googlebot-Image
Allow: /

My idea is to prevent Google from crawling my CDN domain while still allowing Googlebot-Image to crawl and index my images.


Solution

The file has to be called robots.txt, not robot.txt.

Note that User-agent: * targets all bots that are not matched by a more specific User-agent record, not only Googlebot. So if you want to allow other bots to crawl your site, you would use User-agent: Googlebot instead.

So this robots.txt would allow "Googlebot-Image" everything, and disallow everything for all other bots:

User-agent: Googlebot-Image
Disallow:

User-agent: *
Disallow: /

(Note that Disallow: with an empty value is equivalent to Allow: /. The Allow field is not part of the original robots.txt specification, but some parsers support it, among them Google's.)
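As a quick sanity check, you can feed these records to Python's standard-library urllib.robotparser, which understands empty Disallow values and per-agent record matching. This is only a sketch; the CDN host cdn.example.com and the bot name SomeOtherBot are placeholders, and Google's own parser may differ in edge cases:

```python
from urllib import robotparser

# The robots.txt proposed in the answer: allow everything for
# Googlebot-Image, disallow everything for all other bots.
rules = """\
User-agent: Googlebot-Image
Disallow:

User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Googlebot-Image matches its own record (empty Disallow = allow all).
print(rp.can_fetch("Googlebot-Image", "https://cdn.example.com/img/cat.jpg"))  # True

# Any other bot falls through to the * record (Disallow: / = deny all).
print(rp.can_fetch("SomeOtherBot", "https://cdn.example.com/img/cat.jpg"))  # False
```

This confirms the intended behavior: only the image bot may fetch URLs on the host, and everything else is blocked.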

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow