Question

Let's say I have a test folder (test.domain.com) and I don't want search engines to crawl it. Do I need a robots.txt in the test folder, or can I just place a robots.txt in the root and disallow the test folder there?


Solution

Each subdomain is generally treated as a separate site and requires its own robots.txt file.
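
For example, to tell all compliant crawlers to stay out of the entire test subdomain, the file served at test.domain.com/robots.txt would contain just this (standard robots.txt syntax):

    # Block all compliant crawlers from everything on this subdomain
    User-agent: *
    Disallow: /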

OTHER TIPS

When the crawler fetches test.domain.com/robots.txt, that is the only robots.txt file it will see; it will not consult any other.

If your test folder is configured as a virtual host, you need a robots.txt in the test folder as well (this is the most common setup). But if you route the subdomain's traffic through an .htaccess file, you can modify it to always serve a robots.txt from the root of your main domain.
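
As a rough sketch of that second approach (assuming Apache with mod_rewrite enabled; the file name robots-test.txt is hypothetical), the rules in the root .htaccess could look like this:

    RewriteEngine On
    # Requests for the test subdomain's robots.txt...
    RewriteCond %{HTTP_HOST} ^test\.domain\.com$ [NC]
    # ...are answered with a dedicated disallow-all file kept in the main root (hypothetical name)
    RewriteRule ^robots\.txt$ /robots-test.txt [L]

This keeps a single document root while still giving the subdomain its own effective robots.txt.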

Anyway, from my experience it's better to be safe than sorry: put a robots.txt file (especially one denying access) on every domain you need to protect. And double-check that you're getting the right file when accessing:

http://yourrootdomain.com/robots.txt
http://subdomain.yourrootdomain.com/robots.txt
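
A quick way to run that check from the command line (assuming curl is available):

    curl -s http://yourrootdomain.com/robots.txt
    curl -s http://subdomain.yourrootdomain.com/robots.txt

Each request should return the file you expect for that host.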
Licensed under: CC-BY-SA with attribution