How to write an .htaccess rule specific to a given subdomain? - Preventing some files from being indexed

StackOverflow https://stackoverflow.com/questions/12942902

  •  08-07-2021

Question

I have the following in my .htaccess file:

Options +FollowSymlinks
#+FollowSymLinks must be enabled for any rules to work, this is a security 
#requirement of the rewrite engine. Normally it's enabled in the root and we 
#shouldn't have to add it, but it doesn't hurt to do so.

RewriteEngine on
#Apache scans all incoming URL requests, checks for matches in our .htaccess file
#and rewrites those matching URLs to whatever we specify.

#block image hotlinking from other sites, but allow blank referrers.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?site\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?site\.dev [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?dev\.site\.com [NC]
RewriteRule \.(jpg|jpeg|png|gif)$ - [NC,F,L]

# if a directory or a file exists, use it directly;
# otherwise forward the request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php

site.com is the production site.

site.dev is a localhost dev environment.

dev.site.com is a subdomain where we test live.

I'm aware that this will prevent the site from being indexed:

Header set X-Robots-Tag "noindex, nofollow"

cf. http://yoast.com/prevent-site-being-indexed/

My question, however, is perhaps fairly simple:

Is there a way to apply this line ONLY on dev.site.com, so that it doesn't get indexed?


Solution

Is there a way to apply this line ONLY on dev.site.com, so that it doesn't get indexed?

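Yes. The simplest way is to put the Header line in the vhost config for dev.site.com, so that it only ever applies to that host. Header set has no host check of its own; inside an .htaccess file it would have to be made conditional by other means, such as an environment variable set with SetEnvIf and tested through the env= clause of Header, or an <If> block on Apache 2.4.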
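As a rough sketch of the vhost approach, the block below shows where the Header line would go. It assumes mod_headers is enabled and that dev.site.com has its own VirtualHost; the port and the DocumentRoot path are placeholders, not taken from the question.

```apache
# Hypothetical vhost for the test subdomain.
# The port and DocumentRoot below are assumptions; adjust to the real setup.
<VirtualHost *:80>
    ServerName dev.site.com
    DocumentRoot /var/www/dev.site.com

    # Sent with every response from this host only, so the
    # production vhost for site.com is left untouched.
    <IfModule mod_headers.c>
        Header set X-Robots-Tag "noindex, nofollow"
    </IfModule>
</VirtualHost>
```

Because the directive lives in the dev vhost, the shared .htaccess file can stay identical across production and dev.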

The other possibility, if you want to block bots by user agent, is to remove the Header set line and add some rules:

# request is for http://dev.site.com
RewriteCond %{HTTP_HOST} ^dev\.site\.com$ [NC]
# user-agent is a search engine bot
RewriteCond %{HTTP_USER_AGENT} (Googlebot|yahoo|msnbot) [NC]
# return forbidden
RewriteRule ^ - [L,F]

Note that the list of user agents isn't complete. You can try to go through the massive list of User-Agents and look for all of the index robots, or at least the more popular ones.
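If editing the vhost config is not an option, a host-conditional header can probably still be approximated from .htaccess by combining mod_setenvif with the env= clause of the Header directive. This is a minimal sketch, assuming both mod_setenvif and mod_headers are enabled and that the overrides in effect allow them; the variable name IS_DEV_HOST is made up for illustration.

```apache
# Flag requests whose Host header is dev.site.com (optionally with a port).
# IS_DEV_HOST is a hypothetical variable name, not something Apache defines.
SetEnvIfNoCase Host ^dev\.site\.com(:[0-9]+)?$ IS_DEV_HOST

# Send the header only when the flag above was set for this request.
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex, nofollow" env=IS_DEV_HOST
</IfModule>
```

Unlike the user-agent rules, this does not depend on keeping a bot list up to date: every response from dev.site.com carries the noindex hint, and responses from site.com do not.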

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow