Introduction
This article describes how to defend against XSS attacks. For an introduction to what an XSS attack is, please see my previous article.
What the Server Side Can Do
1. HttpOnly
A cookie marked HttpOnly is only sent with HTTP/HTTPS requests; JavaScript is not allowed to read it. Supported browsers include Internet Explorer 6+, Firefox 2+, Google Chrome, and Safari 4+.
JavaEE code that adds HttpOnly to a cookie:
response.setHeader("Set-Cookie",
    "cookiename=value; Path=/; Domain=domainvalue; Max-Age=seconds; HttpOnly");
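P.S.: For HTTPS, we can also set the Secure attribute so that the cookie is only sent over encrypted connections. Note that HttpOnly only stops JavaScript from reading the cookie: it limits what an XSS attack can steal, but it does not prevent the attack itself.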
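As a rough sketch of the header above, the following plain-Java helper assembles a Set-Cookie value with HttpOnly and, optionally, Secure. The class and method names (CookieHeader, build) are hypothetical, and the name and value are assumed to be valid cookie tokens already. In a container that supports Servlet 3.0 or later, Cookie.setHttpOnly(true) and Cookie.setSecure(true) achieve the same result.

```java
// Hypothetical helper: builds the value of a Set-Cookie header.
// Assumes name and value are already valid cookie tokens.
final class CookieHeader {
    private CookieHeader() {}

    static String build(String name, String value, String path,
                        int maxAgeSeconds, boolean secure) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append('=').append(value);
        sb.append("; Path=").append(path);
        sb.append("; Max-Age=").append(maxAgeSeconds);
        sb.append("; HttpOnly");      // hide the cookie from document.cookie
        if (secure) {
            sb.append("; Secure");    // send the cookie over HTTPS only
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a servlet: response.setHeader("Set-Cookie", header);
        String header = build("sessionid", "abc123", "/", 1800, true);
        System.out.println(header);
        // prints: sessionid=abc123; Path=/; Max-Age=1800; HttpOnly; Secure
    }
}
```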
2. Handle Rich Text
Some data cannot simply be encoded on the server side because of how it is used. Rich text is a case in point: it is HTML by design and is output as markup rather than inside an attribute, so encoding it would destroy the formatting. Handle it on the server side with an XSS filter instead: define a whitelist of allowed tags and attributes, and reject everything else (e.g. script, iframe, form). Filter the data before you store it.
AntiSamy, an open-source Java project from OWASP, is a good XSS filter:
Policy policy = Policy.getInstance(POLICY_FILE_LOCATION);
AntiSamy as = new AntiSamy();
CleanResults cr = as.scan(dirtyInput, policy);
MyUserDao.storeUserProfile(cr.getCleanHTML());
P.S.: Of course, we could also filter the data just before it is displayed on the client side, but I think it is better to do it on the server side: it only needs to be done once, and it saves front-end developers a lot of time.
What the Client Side Can Do
1. Input Check
Input checks must also be done on the server side, because a user can easily bypass the JavaScript check. The most common approach today is to run the check in JavaScript on the client side and repeat the same logic on the server side. The JavaScript check stops most bad input before it reaches the server, which saves server resources.
In short, input checks need to be done on both the client side and the server side.
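To illustrate the server-side half of that check, here is a minimal whitelist validation sketch. The class name InputCheck, the method isValidUsername, and the rule itself (3 to 16 ASCII letters, digits, or underscores) are assumptions made for the example; a real application would mirror whatever rule its client-side JavaScript enforces.

```java
import java.util.regex.Pattern;

// Hypothetical server-side validator that repeats a client-side rule.
final class InputCheck {
    // Assumed rule: 3 to 16 ASCII letters, digits, or underscores.
    private static final Pattern USERNAME =
            Pattern.compile("[A-Za-z0-9_]{3,16}");

    private InputCheck() {}

    // Returns true only if the whole input matches the whitelist pattern.
    static boolean isValidUsername(String input) {
        return input != null && USERNAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidUsername("alice_01"));                  // prints: true
        System.out.println(isValidUsername("<script>alert(1)</script>")); // prints: false
    }
}
```

Whitelisting what is allowed, rather than blacklisting known payloads, keeps the check short and avoids chasing every new attack string.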
In addition, an attacker can inject XSS payloads through many entry points, e.g.
1. all input boxes on the page
2. window.location (href, hash, etc.)
3. window.name
4. document.referrer
5. document.cookie
6. localStorage
7. data returned by XMLHttpRequest
Of course, the list is not limited to these.
2. Output Check
In general, encode or escape variables when they are output to the page to prevent XSS attacks.
In essence, an XSS attack is HTML injection: the user's input is treated as part of the HTML, which distorts the original meaning of the page and gives it a new one.
The places that can trigger XSS:
1. document.write
2. xxx.innerHTML=
3. xxx.outerHTML=
4. innerHTML.replace
5. document.attachEvent
6. window.attachEvent
7. document.location.replace
8. document.location.assign
If you use jQuery, the equivalent places are calls such as 'append', 'html', 'before', and 'after'. Most MVC frameworks, such as AngularJS, handle XSS automatically in their view layer.
How to Encode Output
Usually with HTMLEncode and JavaScriptEncode; both the client side and the server side can do this.
HTMLEncode converts a string into HTML entities, usually for these characters: & < > " ' /. JavaScriptEncode uses '\' to escape special characters.
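HTMLEncode and JavaScriptEncode are names for techniques, not standard Java APIs, so the sketch below shows one way they might look. The class OutputEncoder and both method names are hypothetical. The JavaScript encoder is a deliberately strict variant (an assumption on my part): it keeps ASCII letters and digits and writes everything else as a \uXXXX escape. In production, a vetted encoding library is a safer choice than hand-rolled code.

```java
// Hypothetical helper class; a sketch, not a vetted encoding library.
final class OutputEncoder {
    private OutputEncoder() {}

    // HTMLEncode: replace the six characters & < > " ' / with HTML entities.
    static String htmlEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // JavaScriptEncode (strict variant, an assumption): keep ASCII letters
    // and digits, write every other UTF-16 unit as a 4-digit hex escape.
    static String javaScriptEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(htmlEncode("<script>alert('x')</script>"));
        // prints: &lt;script&gt;alert(&#x27;x&#x27;)&lt;&#x2F;script&gt;
        System.out.println(javaScriptEncode("a'b"));
    }
}
```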
Where Is Encoding Needed?
- Output in HTML tags and attributes -- use HTMLEncode
- Output in <script> tags -- use JavaScriptEncode
- Output in event handlers -- use JavaScriptEncode, e.g. <a href="#" onclick="funcA('$var')">test</a>
- Output in CSS -- use an approach similar to JavaScript: encode all characters except letters and digits in the hexadecimal format "\HH"
- Output in URLs -- usually, if a parameter is a whole URL, check whether it starts with 'http'; if not, prepend 'http' to block pseudo-protocol XSS attacks (such as javascript:), then encode the parameter with URLEncode.
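The CSS and URL cases from the list can be sketched as follows. The class ContextEncoder and its method names are hypothetical. ensureHttp is slightly stricter than a plain 'http' prefix test (it requires "http://" or "https://"), which is my assumption. urlEncode relies on java.net.URLEncoder, which applies form encoding and is therefore suited to query parameter values. The CSS encoder uses the CSS escape syntax of a backslash followed by hex digits and a terminating space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Hypothetical helpers for the CSS and URL output contexts.
final class ContextEncoder {
    private ContextEncoder() {}

    // Prepend "http://" unless the value already uses http or https,
    // so a pseudo-protocol such as javascript: cannot act as the scheme.
    static String ensureHttp(String url) {
        String trimmed = url.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed;
        }
        return "http://" + trimmed;
    }

    // URLEncode: percent-encode a value for use as a query parameter.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // CSS encode: keep ASCII letters and digits, write every other code
    // point as a backslash, its hex value, and a terminating space.
    static String cssEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            boolean alnum = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9');
            if (alnum) {
                out.appendCodePoint(cp);
            } else {
                out.append('\\').append(Integer.toHexString(cp)).append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(urlEncode(ensureHttp("javascript:alert(1)")));
        // prints: http%3A%2F%2Fjavascript%3Aalert%281%29
        System.out.println(cssEncode("a;b"));
    }
}
```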
P.S.: URLEncode converts characters into "%HH" format.
Summary
Front-end developers must know which encoding method to use in which place. Sometimes a single place needs both HTMLEncode and JavaScriptEncode, not just one of them.
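An event-handler attribute is one place where both encodings apply. The value sits inside a JavaScript string, and that string sits inside an HTML attribute. The browser decodes the HTML first, so the value is JavaScript-encoded first and HTML-encoded second. The sketch below is illustrative only: the class NestedEncoding, its methods, and the minimal backslash-style JavaScript encoder are all assumptions, and funcA is just the handler name from the earlier onclick example.

```java
// Hypothetical sketch of layered encoding for an onclick attribute.
final class NestedEncoding {
    private NestedEncoding() {}

    // Minimal JavaScriptEncode: backslash-escape the characters that can
    // end or break a quoted JavaScript string.
    static String jsEncode(String s) {
        return s.replace("\\", "\\\\")
                .replace("'", "\\'")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    // Minimal HTMLEncode: the ampersand must be replaced first so the
    // entities produced by later replacements are not encoded again.
    static String htmlEncode(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#x27;")
                .replace("/", "&#x2F;");
    }

    // Inner context first (JavaScript string), outer context second (HTML).
    static String forOnclickArg(String value) {
        return htmlEncode(jsEncode(value));
    }

    static String renderLink(String value) {
        return "<a href=\"#\" onclick=\"funcA('" + forOnclickArg(value)
                + "')\">test</a>";
    }

    public static void main(String[] args) {
        // JavaScriptEncode alone would leave this payload untouched, and the
        // browser would decode the entity to a quote that ends the string.
        System.out.println(forOnclickArg("&#39;);alert(1);//"));
        // prints: &amp;#39;);alert(1);&#x2F;&#x2F;
        System.out.println(renderLink("x'y"));
    }
}
```

Reversing the order would not be equivalent: the escaping has to mirror, from the inside out, the order in which the browser decodes each layer.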