If I respond to requests for robots.txt with HTTP code 418 AKA “I'm a teapot”, will this make search engines dislike me?

https://stackoverflow.com/questions/8147522

02-03-2021

Question

I have a very simple webapp that runs within HTML5's Canvas and doesn't have any public files that need to be indexed by search engines (beyond the front-page HTML file that includes calls to all the necessary resources). As such, I don't really need a robots.txt file, since crawlers will just see the public files and that's it.

Now, as a joke, I'd like to return an HTTP 418 AKA "I'm a teapot" response every time a web-crawler asks for robots.txt. However, if this will end up screwing me over in terms of my position in search results, then this is not a joke that would be very worthwhile for me.

Does anybody know anything about how different web-crawlers will respond to non-standard (though in this case it technically is standard) HTTP codes?

Also, on a more serious note, is there any reason to have a robots.txt file that says "everything is indexable!" instead of just not having a file?

Solution

Having a blank robots.txt file will also tell crawlers that you want all of your content indexed. This is good to do because it keeps 404 errors from piling up in your access logs whenever a search engine requests a robots.txt that doesn't exist on your site. There is an Allow directive for robots.txt, but it is not part of the original robots.txt standard and should not be relied upon.
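As a rough illustration of the blank-file point, the sketch below feeds three robots.txt bodies to Python's standard-library urllib.robotparser and asks whether a page may be fetched. The helper name allows, the three constants, and the example.com URL are made up for this example. It shows how one parser reads these files, not how any particular search engine behaves.

```python
import urllib.robotparser


def allows(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Parse robots.txt text with the stdlib parser and check a single URL."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical file contents used only for this comparison.
EMPTY = ""                                   # a blank robots.txt
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # an empty Disallow blocks nothing
BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # blocks every path

url = "https://example.com/index.html"       # placeholder URL
for name, body in (("empty", EMPTY), ("allow all", ALLOW_ALL), ("block all", BLOCK_ALL)):
    print(name, allows(body, url))
```

With this parser, the blank file and the explicit "User-agent: * / Disallow:" file both permit the fetch, and neither needs the Allow directive.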
Sending out non-standard HTTP codes is not a good idea, as you have no way of knowing how search engines will respond to them. If they don't accept the code, they may fall back to treating the response as a 404, and that's obviously not what you want to happen. Basically, this is a bad place to make a joke.
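To see how at least one client handles an unusual status on robots.txt, here is a minimal sketch using Python's standard-library urllib.robotparser. It is only one crawler library and says nothing definite about Google, Bing, or any other search engine. The function name can_fetch_after_status and the example.com host are assumptions for illustration. No real request is sent, because urlopen is patched to raise the error response directly.

```python
import io
import urllib.error
import urllib.robotparser
from email.message import Message
from unittest import mock


def can_fetch_after_status(status: int, path: str = "/index.html") -> bool:
    """Report whether the stdlib parser would allow `path` when the
    robots.txt request fails with the given HTTP status code."""
    # Placeholder host; the network call is replaced by a simulated error.
    robots_url = "https://example.com/robots.txt"
    error = urllib.error.HTTPError(
        robots_url, status, "simulated response", Message(), io.BytesIO()
    )
    parser = urllib.robotparser.RobotFileParser(robots_url)
    with mock.patch("urllib.request.urlopen", side_effect=error):
        parser.read()  # catches the HTTPError and records how to treat the site
    return parser.can_fetch("*", "https://example.com" + path)


for status in (403, 404, 418):
    print(status, can_fetch_after_status(status))
```

This particular parser handles a 418 the same way it handles a 404 (everything may be fetched), while a 403 makes it refuse everything. Other crawlers are free to choose differently, which is the risk described above.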

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow