Introduction
This short tip demonstrates one way to prevent resource leeching (also known as hotlinking) on a website.
Background
For those unfamiliar with the term, let me first define resource leeching. I had a website containing many images and PDF files that users found quite useful. Some people running simple blogs decided to offer those resources on their own pages (to attract more visitors and earn a little more money). They wrote a few lines of text for each resource and linked directly to the images and PDFs on my site.
One issue is that the content is actually hosted on my site, and users receive it without ever knowing that. The bigger problem is bandwidth usage: why would I want to spend my bandwidth serving images and PDFs for other websites?
Using the Code
I thought about the problem and decided to write some simple code to prevent it. The solution is not very secure, since advanced users can still work around it by modifying the HTTP headers of their requests, but that is not something most people will do.
What we can do is simply:
- Handle the Application_BeginRequest method in the global handler (Global.asax).
- Find the host server's URL, i.e., my server's URL.
- Find the requesting server's URL.
- Check whether the requesting server belongs to my domain.
- If the requesting server does not belong to my domain, end the request without serving the resource.
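void Application_BeginRequest(object sender, EventArgs e)
{
    HttpApplication application = (HttpApplication)sender;
    HttpRequest request = application.Context.Request;

    // Host name (and port, if any) that this request was addressed to.
    string myServer = request.ServerVariables["HTTP_HOST"];
    // Page that linked to the resource; null when no Referer header was sent.
    string referringServer = request.ServerVariables["HTTP_REFERER"];

    // Requests without a referrer (typed URLs, bookmarks) are served as usual.
    if (referringServer == null)
    {
        return;
    }

    // Serve only when the referrer is on this host. The trailing "/" keeps
    // longer host names that merely begin with myServer from matching.
    bool sameSite =
        referringServer.StartsWith("http://" + myServer + "/", StringComparison.OrdinalIgnoreCase) ||
        referringServer.StartsWith("https://" + myServer + "/", StringComparison.OrdinalIgnoreCase);

    if (!sameSite)
    {
        // Skip the rest of the pipeline so the resource is never sent.
        application.CompleteRequest();
    }
}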
Calling CompleteRequest is perhaps not the most elegant solution, since it simply skips the remaining pipeline events and jumps straight to EndRequest, but it worked fine for me.
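An alternative to comparing string prefixes is to parse the referrer and compare host names exactly. The following is only a minimal sketch of that idea: the class name ReferrerCheck and the method IsSameSite are hypothetical names introduced here for illustration, and the sketch assumes the host header does not contain an IPv6 literal.

```csharp
using System;

// Hypothetical helper, not part of the original tip.
public static class ReferrerCheck
{
    // Returns true when the request should be served.
    public static bool IsSameSite(string hostHeader, string referrer)
    {
        // No Referer header: treat it as a direct visit and allow it.
        if (string.IsNullOrEmpty(referrer))
        {
            return true;
        }

        Uri referrerUri;
        if (!Uri.TryCreate(referrer, UriKind.Absolute, out referrerUri))
        {
            return false; // Malformed referrer: assume it is not ours.
        }

        // HTTP_HOST may carry a port ("example.com:8080"); compare host names only.
        // Assumption: the host is not an IPv6 literal such as "[::1]:80".
        string host = (hostHeader ?? string.Empty).Split(':')[0];

        return string.Equals(referrerUri.Host, host, StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside Application_BeginRequest, the call could look like ReferrerCheck.IsSameSite(request.ServerVariables["HTTP_HOST"], request.ServerVariables["HTTP_REFERER"]), with CompleteRequest called when it returns false.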
Points of Interest
As I said, this approach relies on HTTP header information, so advanced users can get around it by modifying the headers. Perhaps the ideal way to solve this problem is to register an HttpHandler for each resource type (e.g., .jpg, .pdf) and prevent leeching there.
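A handler-based version might look roughly like the sketch below. It is an illustration under assumptions, not a tested production handler: the class name LeechGuardHandler and the helper IsAllowed are hypothetical, only .jpg and .pdf are covered, and the handler still has to be mapped to those extensions in web.config before it receives any requests.

```csharp
using System;
using System.IO;
using System.Web;

// Hypothetical handler name; not part of the original tip.
public class LeechGuardHandler : IHttpHandler
{
    // The handler keeps no per-request state, so one instance can be reused.
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        HttpRequest request = context.Request;

        // Refuse requests that were referred from another host.
        if (!IsAllowed(request.Url.Host, request.Headers["Referer"]))
        {
            context.Response.StatusCode = 403;
            return;
        }

        string path = request.PhysicalPath;
        if (!File.Exists(path))
        {
            context.Response.StatusCode = 404;
            return;
        }

        // Assumption: the handler is mapped only to *.jpg and *.pdf.
        string extension = Path.GetExtension(path).ToLowerInvariant();
        context.Response.ContentType = extension == ".pdf" ? "application/pdf" : "image/jpeg";
        context.Response.TransmitFile(path);
    }

    // Allows requests with no referrer, or with a referrer on the same host.
    public static bool IsAllowed(string host, string referrer)
    {
        if (string.IsNullOrEmpty(referrer))
        {
            return true;
        }

        Uri uri;
        return Uri.TryCreate(referrer, UriKind.Absolute, out uri)
            && string.Equals(uri.Host, host, StringComparison.OrdinalIgnoreCase);
    }
}
```

Unlike the Application_BeginRequest approach, this runs only for the mapped resource types and can return an explicit 403 status instead of an empty response.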
History
- 9th March, 2012: First version