Question

Is there a way to find out if anyone is calling the image located on my website directly on their website?

I have a website and I just want to make sure no one is using my bandwidth.


Solution

Sure, there are several methods, though some can be trusted a little more than others.


Using Referer-Header

There is an HTTP header named Referer which usually contains the URL of the page the user was on when the current request was made.

You can see it as a "I came from here"-header.

If it were guaranteed to always exist, it would be a piece of cake to prevent people from leeching your bandwidth. Since that is not the case (the header might be missing or stripped at times), relying on this value alone is pretty much a gamble.
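A minimal sketch of such a check in Python, assuming the server hands you the raw Referer value as a string (or None when it is absent). The host names and the function name is_allowed_referer are made up for illustration. Note the allow_missing switch: since the header may not exist, the sketch lets such requests through by default rather than punishing real visitors.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts that are allowed to embed the images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_allowed_referer(referer, allowed_hosts=ALLOWED_HOSTS, allow_missing=True):
    """Decide whether to serve a resource based on the Referer header value."""
    if not referer:
        # Header absent or empty: plenty of legitimate clients never send it.
        return allow_missing
    # hostname is None when the value is not a parseable absolute URL.
    host = urlsplit(referer).hostname
    return host is not None and host in allowed_hosts
```

A request with Referer "https://example.com/gallery" passes, while one coming from a page on another host does not.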


Using Cookies

Another way of telling whether a user is a true visitor of your website is to use cookies. A user that hasn't got a cookie and tries to access a specific resource (such as an image) could get a message saying "sorry, only real visitors of example.com get access to this image".

Too bad that nothing forces a client to implement and handle cookies.
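As a rough illustration, here is how the server side of that check could look in Python, assuming your pages set a cookie on every real visit. The cookie name "visited" and the function name are hypothetical, and the sketch only tests for presence; it does not verify the value.

```python
from http.cookies import SimpleCookie

# Hypothetical name of the cookie your pages set on a normal visit.
VISITOR_COOKIE = "visited"


def has_visitor_cookie(cookie_header, name=VISITOR_COOKIE):
    """Return True if the raw Cookie header contains the visitor cookie."""
    if not cookie_header:
        # No Cookie header at all: client is new or does not handle cookies.
        return False
    jar = SimpleCookie()
    jar.load(cookie_header)
    return name in jar
```

When this returns False for an image request, you would send the "only real visitors" message instead of the image.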


Using links with a set expiration time [RECOMMENDED]

This is probably the safest option, though it's the hardest to implement.

Using links that are only valid for N hours makes it impossible to leech your bandwidth without going to the trouble of implementing some sort of crawler which regularly crawls your site and retrieves the current access token required to get access to a resource (such as an image).

When a user visits the site, a token is generated and appended to the path of every protected resource in the page sent back to the visitor. This token is mandatory and only valid for N hours.

If the user tries to access an image with an invalid or missing token, you could send back either 404 or 403 as the HTTP status code (preferably the latter, since 403 is the code that means Forbidden).
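One possible way to build such tokens, sketched in Python under a few assumptions of mine: the token is an HMAC over the path and the expiry time, it travels in the query string, and SECRET_KEY is a secret known only to your server. The names make_link and status_for are hypothetical. Signing the expiry time means the server does not have to store issued tokens. The sketch answers bad requests with 403 Forbidden.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"   # hypothetical secret, keep it on the server only
TOKEN_LIFETIME = 6 * 3600   # the "N hours", here 6, expressed in seconds


def _sign(path, expires):
    """HMAC-SHA256 over the path and its expiry timestamp."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_link(path, now=None, lifetime=TOKEN_LIFETIME):
    """Return the path with an expiry time and token appended."""
    issued = int(time.time() if now is None else now)
    expires = issued + lifetime
    return f"{path}?expires={expires}&token={_sign(path, expires)}"


def status_for(path, expires, token, now=None):
    """Return the HTTP status code to answer a resource request with."""
    current = int(time.time() if now is None else now)
    try:
        expires = int(expires)
    except (TypeError, ValueError):
        return 403  # expiry missing or not a number
    if not token:
        return 403  # token missing
    expected = _sign(path, expires).encode()
    if not hmac.compare_digest(expected, token.encode()):
        return 403  # token forged, or issued for another path or expiry
    if current > expires:
        return 403  # token was valid once but has expired
    return 200
```

You would call make_link for every image URL while rendering a page, and status_for in the handler that serves the images.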

There are however some quirks worth mentioning:

  • Crawlers from search engines might not visit the whole site within the N hours; make sure that they can access the whole content of your site. Identify them by using the value of the User-Agent header.

  • Don't be tempted to lower the lifespan of your token below a reasonable time. Remember that some users are on slow connections; a token lifetime of 5 seconds might sound cool, but real users can get flagged erroneously.

  • Never put a token on a resource that people should be able to find from an external source (search engines, for one), such as the page containing the images you wish to protect.

    If you do this by accident you will most likely harm the reputation of your site.
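For the first point in the list, spotting a search-engine crawler by its User-Agent could look roughly like this in Python. The token list is my own short selection and far from complete, and the function name is hypothetical. Keep in mind that the header is supplied by the client, so anyone can fake it.

```python
# Lowercased substrings that appear in the User-Agent of some well-known
# crawlers. Illustrative only; extend it for the engines you care about.
CRAWLER_TOKENS = ("googlebot", "bingbot", "duckduckbot")


def looks_like_search_crawler(user_agent, tokens=CRAWLER_TOKENS):
    """Return True if the User-Agent value matches a known crawler token."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in tokens)
```

Requests for which this returns True could skip the token check so that the crawler can still reach your images.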


Additional thoughts...

Please remember that any method implemented to stop leechers from hotlinking your resources should never result in true visitors being flagged as bandwidth leeches. You probably want to ease up on the restrictions rather than make them stricter.

I'd rather have 10 normal visitors and 2 leechers than no leechers but only 5 normal users (because I accidentally flagged 5 of the real visitors as leechers without thinking too much).

Licensed under: CC-BY-SA with attribution