Introduction
This article describes how to access restricted web sites from behind a firewall or HTTP proxy.
The application discussed in this article lets you access a restricted web site from behind a firewall or HTTP web proxy by sending HTTP / HTTPS traffic over HTTP. This is also called HTTP tunnelling. Before using the sample application given above, you must read the details HERE.
Brief Introduction About HTTP / HTTPS / Proxy Server
HTTP is a protocol for sending and receiving web content. A request issued by an HTTP client (e.g. a web browser) can only be understood by an HTTP server (e.g. Apache HTTP Server), where the HTML pages and other resources are stored.
HTTPS, on the other hand, is the secure version of the HTTP protocol. All data transferred between client and server is encrypted, including images.
An HTTP proxy server is an intermediate server which understands the HTTP protocol.
A request issued by the browser goes to the proxy server first. The proxy server then forwards the request to the destination server and carries the response back to the client.
A proxy server is mainly used to cache web resources such as HTML pages and images, and to evaluate each request against the filtering rules configured by your network administrator. A filtering rule can be based on the IP address of the destination server or on a specific keyword contained in the request or response. For example, an HTTP proxy server which has listed “JOB search” as a bad keyword will block http://example.com/ITjob.html.
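A keyword rule of this kind can be sketched in a few lines of Python. This is only an illustration: the function name `is_blocked`, the rule list `BAD_KEYWORDS`, the single keyword "job" and the case-insensitive substring match are all assumptions, not the behaviour of any particular proxy product.

```python
# Hypothetical rule list; a real proxy would load this from its configuration.
BAD_KEYWORDS = ["job"]


def is_blocked(text: str, keywords=BAD_KEYWORDS) -> bool:
    """Return True if any bad keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


print(is_blocked("http://example.com/ITjob.html"))   # True
print(is_blocked("http://example.com/index.html"))   # False
```

The same function could be applied to a URL, a page title or the body text, since it only looks at a string.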
This filtering can be done at URL level, page title level and body content level.
A firewall is a network application which has no knowledge of the HTTP protocol, but it can drop any request made by the client according to its filtering rules. For example, you won't be able to access example.com if it is on the firewall's block list. Any attempt to connect to this server will be refused by the firewall.
Finally, a word about anonymous surfing. When you connect to a server, your IP address is visible to the server and you can easily be tracked. If you want to surf the web without the server knowing who you actually are, you may be interested in using a proxy server, because the destination server will see only the proxy server's IP address, not yours.
What the HTTP Protocol Looks Like
If a user wants to open www.example.com/index.html in a browser, the browser first connects to www.example.com.
It then sends the following lines to www.example.com. These are called the HTTP request header:
GET /index.html HTTP/1.1[CRLF]
Host: www.example.com[CRLF]
Connection: close[CRLF]
User-Agent: Mozilla[CRLF]
[CRLF]
The HTTP server receives the request and returns the content along with the HTTP response header:
HTTP/1.1 200 OK [CRLF]
Date: Tue, 21 Jul 2009 17:12:32 GMT [CRLF]
Expires: -1 [CRLF]
Content-Type: text/html [CRLF]
Server: gws [CRLF]
Connection: close [CRLF]
[CRLF]
<html>This is response</html>
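To make the message layout concrete, here is a small Python sketch that builds a request of this shape and splits a response into status code, headers and body. The helper names `build_request` and `parse_response` are hypothetical, and the parsing is deliberately minimal (no chunked encoding, no repeated headers).

```python
CRLF = "\r\n"


def build_request(host: str, target: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request.

    `target` is a path such as /index.html for a direct connection,
    or a full URL when the request is sent to a proxy.
    """
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "User-Agent: Mozilla",
        "",  # the empty line ends the header section
        "",
    ]
    return CRLF.join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split(CRLF)
    parts = status_line.split(" ", 2)      # version, status code, reason
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(parts[1]), headers, body


raw = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
       b"Connection: close\r\n\r\n<html>This is response</html>")
status, headers, body = parse_response(raw)
print(status, headers["content-type"])     # 200 text/html
```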
In the growing internet era, we also need to access secure sites, whose addresses generally start with https. A secure site looks like: https://securesite.com.
The proxy server which handles this type of secure request is called an HTTPS proxy server or secure proxy server. In terms of handling mechanism, it behaves completely differently from a normal HTTP proxy server. An HTTPS proxy server can only see the destination host (web server name and port), since the request and response data are completely encrypted. So an HTTPS proxy has no way to cache data or evaluate data against any filtering rule; it can only rely on the destination host and port. Here the HTTPS proxy server acts as a tunnel.
For example, when the user types the URL https://myfinance.com, the browser sends the following lines to the HTTPS proxy server:
CONNECT myfinance.com:443 HTTP/1.1
User-Agent: Mozilla
The proxy server now connects to the destination server. If the connection succeeds, it sends the following lines as a response to the client:
HTTP/1.1 200 OK
Proxy-Agent: agentname
From now on, the HTTPS proxy acts as a tunnel, just forwarding requests and responses between client and server. So an HTTPS proxy cannot block a site by its page name, title or contents.
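The CONNECT exchange can be sketched as follows. In the HTTP specification the CONNECT target is written as host:port rather than as a URL, and any 2xx status means the tunnel is open. The function names `build_connect` and `tunnel_established` are hypothetical helpers for illustration only.

```python
def build_connect(host: str, port: int = 443) -> bytes:
    """Build the CONNECT request a client sends to an HTTPS proxy."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "User-Agent: Mozilla\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(reply: bytes) -> bool:
    """Return True if the proxy answered CONNECT with a 2xx status."""
    status_line = reply.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = status_line.split(" ", 2)      # version, status code, reason
    return len(parts) >= 2 and parts[1].isdigit() and parts[1].startswith("2")


print(build_connect("myfinance.com").decode("ascii").splitlines()[0])
# CONNECT myfinance.com:443 HTTP/1.1
print(tunnel_established(b"HTTP/1.1 200 OK\r\nProxy-Agent: agentname\r\n\r\n"))
# True
```

After a successful reply, the client starts its TLS handshake over the same connection and the proxy sees only encrypted bytes.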
What is an SSL Tunnel?
An SSL tunnel is responsible for establishing the initial connection between the client and the destination server. Once the connection is established, it just forwards the encrypted requests and responses between client and server without any knowledge of what is going on. In fact, the protocol need not be SSL at all. Most proxies restrict outgoing connections to port 443, so the client can talk using any protocol as long as the destination server's port is 443. Using this blind feature of SSL tunnelling, any application (e.g. Yahoo Messenger) can connect to an outside server.
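The "blind" forwarding can be illustrated with a short Python sketch that copies bytes between two already connected sockets without looking at them. The names `pipe` and `relay` are hypothetical, and a real tunnel would also need timeouts and error handling, which are left out here.

```python
import socket
import threading


def pipe(source: socket.socket, destination: socket.socket) -> None:
    """Copy bytes from source to destination until source closes."""
    while True:
        chunk = source.recv(4096)
        if not chunk:              # an empty read means the peer closed its side
            break
        destination.sendall(chunk)
    try:
        # Pass the end-of-stream on to the other side
        destination.shutdown(socket.SHUT_WR)
    except OSError:
        pass                       # the socket was already closed


def relay(client: socket.socket, server: socket.socket) -> None:
    """Forward traffic in both directions without inspecting it."""
    upstream = threading.Thread(target=pipe, args=(client, server))
    downstream = threading.Thread(target=pipe, args=(server, client))
    upstream.start()
    downstream.start()
    upstream.join()
    downstream.join()
```

Because `relay` never parses the data, it works the same way for TLS traffic and for any other protocol carried over the connection.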
What is an SSL Bridge?
Unlike an SSL tunnel, an SSL bridge is completely aware of what is actually going on. In functionality, it is the same as a normal HTTP proxy server.
When the client requests a secure (HTTPS) site, the request first goes to the SSL bridge server, and the bridge server sends its own SSL certificate to the browser instead of the destination server's certificate. From then on, the browser encrypts all requests using the bridge server's certificate.
When the bridge server receives a request from the browser, it decrypts the encrypted messages using its own certificate, the one it sent during the handshake. It then scans the request header and sometimes the request body. If the request contents pass the proxy rules, it encrypts the message again using the destination server's certificate.
Following is the flow:
Browser (encrypt/decrypt message using bridge certificate)
<<->> SSL Bridge server (encrypt/decrypt using own certificate towards browser) (encrypt/decrypt message using main certificate towards outgoing server)
<<->> Main HTTP server.
How a Restrictive Proxy / Firewall Behaves
When a proxy sits between your browser and the web server, the browser sends almost the same HTTP request to the proxy server, plus some extra headers (e.g. Proxy-Connection: Keep-Alive).
If the proxy is restrictive in nature, it checks the request and, according to its rules, either forwards the request to the destination server or rejects it. Proxy filtering does not stop here. The proxy can also check the response header, the response body or the content type and, according to its rules, block the response from reaching the client.
For example, a school authority may want to block all web sites except educational sites. If the client browser issues a request for a social networking site, the proxy will receive the request and, after scanning it, drop the connection.
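The two checks described above, one on the request and one on the response, might look like this in Python. Everything here is an assumption made for the school example: the allowlist `ALLOWED_SUFFIXES`, the blocked content types and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical rules for the school example.
ALLOWED_SUFFIXES = (".edu",)           # only educational hosts may be reached
BLOCKED_CONTENT_TYPES = ("video/",)    # response types the proxy refuses to pass on


def request_allowed(url: str) -> bool:
    """Check the destination host of a request against the allowlist."""
    host = urlsplit(url).hostname or ""
    return host.endswith(ALLOWED_SUFFIXES)


def response_allowed(content_type: str) -> bool:
    """Check the Content-Type of a response against the blocked types."""
    return not content_type.lower().startswith(BLOCKED_CONTENT_TYPES)


print(request_allowed("http://www.example.edu/course.html"))   # True
print(request_allowed("http://social.example.com/profile"))    # False
print(response_allowed("video/mp4"))                           # False
```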
If the firewall/proxy does not do content filtering, encryption is not required.
What is the Solution?
The solution is very easy. Use an external HTTP tunnel server which is not blocked by your firewall/proxy, together with an HTTP tunnel client on your local PC.
A tunnel server is nothing but an HTTP server with a special script (e.g. PHP, ASP) which can receive a tunnel client request and process it. In fact, the HTTP tunnel client is also a server, one which runs locally. The browser first sends the request to the HTTP tunnel client. The HTTP tunnel client encrypts the main request header and body and sends them to the external HTTP tunnel server through the proxy. The external tunnel server then decrypts the message and finds out the main destination server, the port and which protocol to use (HTTP or HTTPS). It connects to the main destination server, receives the response, encrypts the response header and body and sends them to the HTTP tunnel client. The HTTP tunnel client then decrypts the message and sends it to the browser, which displays the HTML page. Easy!
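The step where the tunnel server works out the destination server, port and protocol could be sketched like this in Python. It assumes the browser treats the tunnel client as its proxy, so a plain request carries a full URL and a secure request arrives as CONNECT with a host:port target. The function name `find_destination` is hypothetical and is not taken from the sample application.

```python
from urllib.parse import urlsplit


def find_destination(request: bytes):
    """Return (scheme, host, port) for a decrypted browser request."""
    request_line = request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    method, target, _version = request_line.split(" ", 2)
    if method.upper() == "CONNECT":
        # CONNECT targets are written as host:port and imply a secure site
        host, _, port = target.rpartition(":")
        return "https", host, int(port)
    parts = urlsplit(target)           # e.g. http://example.com/index.html
    scheme = parts.scheme or "http"
    default_port = 443 if scheme == "https" else 80
    return scheme, parts.hostname, parts.port or default_port


print(find_destination(b"GET http://example.com/index.html HTTP/1.1\r\n\r\n"))
# ('http', 'example.com', 80)
print(find_destination(b"CONNECT myfinance.com:443 HTTP/1.1\r\n\r\n"))
# ('https', 'myfinance.com', 443)
```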
Encryption is not always required to bypass the firewall: it is unnecessary if the firewall/proxy does not block sites using content filtering. And even when encryption is necessary, a strong encryption algorithm is not required.
To fool the firewall, a simple encryption / decryption algorithm is enough to prevent content filtering or request header scanning.
Note that the main HTTP request (the one issued by the browser) is packed into the body of the HTTP request which the tunnel client sends to the tunnel server.
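As a rough illustration of both points, the Python sketch below scrambles the browser's request with a repeating XOR key and packs it into the body of a POST addressed to the tunnel server. The XOR scheme, the Base64 step, the key, the host tunnel.example.org, the path /tunnel.php and all function names are assumptions chosen for this example. XOR only hides keywords from a content filter; it offers no real security.

```python
import base64
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte with a repeating, non-empty key; applying it twice restores the data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def pack_request(original: bytes, tunnel_host: str, script_path: str, key: bytes) -> bytes:
    """Wrap the browser's request in the body of a POST to the tunnel server."""
    body = base64.b64encode(xor_bytes(original, key))
    head = (
        # Full URL in the request line, because this request goes through the proxy
        f"POST http://{tunnel_host}{script_path} HTTP/1.1\r\n"
        f"Host: {tunnel_host}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def unpack_request(post: bytes, key: bytes) -> bytes:
    """Recover the browser's original request on the tunnel server side."""
    _head, _, body = post.partition(b"\r\n\r\n")
    return xor_bytes(base64.b64decode(body), key)


original = b"GET http://example.com/ITjob.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
post = pack_request(original, "tunnel.example.org", "/tunnel.php", b"k3y")
print(unpack_request(post, b"k3y") == original)   # True
```

The proxy sees only a POST to the tunnel server with an opaque body, so the original URL and headers are not available to its request header scan.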
The following picture shows you the details:
Conclusion
Schools, colleges and organizations impose rules with various factors in mind.
You should abide by these rules. This article is only meant to enrich your knowledge if you are not familiar with the subject.
Before using this technique, think for a minute. Though the proxy server / firewall is completely unaware of which main destination server you are connecting to, the administrator can still find out whether you are running HTTP tunnel software on your local machine, which may create trouble for you.
You can download the sample application from the link given above. Before using the sample application, you must read the details HERE.