How to detect whether a network timeout is caused by the request endpoint or by something intermediate (such as an HTTP proxy)

StackOverflow https://stackoverflow.com/questions/17380201

Question

As part of writing a crawler (in Node.js, but that is not really the point), I sometimes receive timeouts and other network exceptions. Some exceptions (like HTTP error codes) can be correctly attributed to the targeted request endpoint. Others, like timeouts that I configure myself, are harder (impossible?) to attribute.

For instance, when crawling through HTTP proxies, how can I check whether exceptions (like the timeouts mentioned above) are due to the proxy or due to the request endpoint?

Solution

You should be able to rely on the proxy relaying whatever it gets as fast as possible, unless it is a home-grown program, in which case anything is possible. You should therefore treat all timeouts as originating from the upstream server, i.e. the request endpoint.
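
As a rough illustration of that rule, here is a minimal sketch of how a crawler might label failures. The function name `attributeFailure`, its input fields, and the returned labels are all hypothetical and not part of any library. It assumes the caller records whether the TCP connection to the proxy was established (for example by listening for the `connect` event on the request's socket) and whether its own timeout fired. The status-code handling relies only on standard HTTP semantics: 407 is issued by a proxy, while 502 and 504 are issued by some gateway, which may be your proxy or a reverse proxy in front of the target.

```javascript
// Hypothetical helper: the name, input fields, and labels are illustrative only.
// Status codes that, by HTTP semantics, are generated by an intermediary.
const GATEWAY_STATUS_CODES = new Set([502, 504]);

function attributeFailure({ statusCode, timedOut, proxyConnected }) {
  // The connection to the proxy itself never came up,
  // so the request endpoint was never contacted.
  if (proxyConnected === false) return 'proxy';

  if (statusCode !== undefined) {
    // 407 Proxy Authentication Required is sent by the proxy.
    if (statusCode === 407) return 'proxy';
    // 502/504 come from some gateway on the path: either the proxy
    // or a reverse proxy on the target's side. Not distinguishable here.
    if (GATEWAY_STATUS_CODES.has(statusCode)) return 'gateway';
    // Any other HTTP status was relayed from the request endpoint.
    return 'endpoint';
  }

  // Assumption taken from the answer above: a proxy relays data promptly,
  // so a timeout after connecting is attributed to the upstream server.
  if (timedOut) return 'endpoint';

  return 'unknown';
}

// Example usage with made-up failure records.
console.log(attributeFailure({ proxyConnected: true, timedOut: true }));
console.log(attributeFailure({ proxyConnected: true, statusCode: 407 }));
```

This only encodes the heuristic; with a misbehaving or home-grown proxy the "endpoint" label for timeouts can still be wrong.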

Licensed under: CC-BY-SA with attribution