Question

I'm looking for advice and a method for the following. I have a folder on my domain where I am testing a landing page. If it goes well, I might build a new website and domain around this landing page, which is the main reason I don't want it crawled: I don't want Google to penalize me for duplicate content. I also don't want unwanted bots to scrape the landing page, as no good can come of it. Does this make sense?

If so, how can I do this? I don't think robots.txt is the best method, as I understand that not all crawlers respect it, and even Google may not fully respect it. I can't add a password, since the landing page should be open to all humans (so the solution must not cause any problems for human visitors). Does that leave the .htaccess file? If so, what code should I add there? Are there any downsides I have missed?

Thanks!


Solution

Use a robots.txt file in the root of your domain with the following content:

User-agent: *
Disallow: /some-folder/
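
To sanity-check the rule before deploying it, a minimal sketch along these lines may help. It uses Python's standard `urllib.robotparser` to show how a compliant crawler would interpret the file. The `example.com` host, the page paths, and the `is_allowed` helper are assumptions made for illustration. This only models well-behaved crawlers; it does not stop bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above; /some-folder/ is the placeholder folder name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some-folder/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-compliant crawler may fetch the URL.

    Hypothetical helper for illustration; it parses the rules from a string
    instead of downloading them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and both paths are assumed values, not from the question.
    for path in ("/some-folder/landing.html", "/index.html"):
        url = "https://example.com" + path
        print(path, "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked")
```

With these rules, the page inside `/some-folder/` is reported as blocked and the page outside it as allowed, for any user agent, because the `*` group applies to all crawlers.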
Licensed under: CC-BY-SA with attribution