What HTTP response code should be used to throttle a badly behaved web crawler?

Should any explanation be returned in the headers or in the body?

Solution

Actually, the HTTP status recommended by RFC 6585 is 429 Too Many Requests. It is used, for example, by the Twitter REST API rate limiter.

However, GSA will internally return 503 Service Unavailable if you flood it with requests, so IMO it's a safe assumption that it also expects external sites to behave in the same manner.

I went with 503 Service Unavailable for my throttling solution.
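As for headers versus body: RFC 6585 says a 429 response SHOULD include details explaining the condition and MAY include a Retry-After header, and Retry-After is also valid on a 503. Here is a rough sketch of what that could look like, not my actual throttling code. The names `check_request`, `WINDOW_SECONDS` and `MAX_REQUESTS` are made up for illustration, and the sliding-window counter is just one simple way to decide when to throttle.

```python
import math
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 10  # hypothetical window length
MAX_REQUESTS = 5     # hypothetical per-client budget inside one window

_hits = defaultdict(deque)  # client id -> timestamps of accepted requests


def check_request(client_id, now=None, status=429):
    """Return (status, headers, body) for one request from client_id.

    Pass status=503 to answer with Service Unavailable instead of 429.
    """
    now = time.monotonic() if now is None else now
    hits = _hits[client_id]
    # Forget requests that have fallen out of the window.
    while hits and now - hits[0] >= WINDOW_SECONDS:
        hits.popleft()
    if len(hits) < MAX_REQUESTS:
        hits.append(now)
        return 200, {}, "ok"
    # Over budget: the client may retry once the oldest request expires.
    retry_after = max(1, math.ceil(WINDOW_SECONDS - (now - hits[0])))
    reason = "Too Many Requests" if status == 429 else "Service Unavailable"
    headers = {"Retry-After": str(retry_after)}
    body = f"{status} {reason}: rate limit exceeded, retry in {retry_after}s"
    return status, headers, body
```

A well-behaved crawler can read Retry-After and back off. The body is mostly there for a human looking at the crawler's logs.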

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow