What is Cross Site Scripting (XSS)?
Cross site scripting (XSS) is a web application vulnerability that lets an attacker target the visitors/users of a site. The attacker injects client-side script into a web page, and when someone visits the page, the script executes and that user becomes the victim of the attack. Unlike other injection attacks (SQL Injection, Command Injection etc.), XSS does not attack the web application server or database. It uses the web application as a medium/platform to execute malicious script in the user's browser, which gives the attacker unauthorized access to things that would not be allowed under normal conditions. JavaScript is used for most XSS attacks, but VBScript, ActiveX and Flash can also be used.
What can be done using XSS?
XSS can be used to do a lot of things; it depends on how the attacker wants to use it. The points below are just a few examples.
- Stealing user cookies:- The attacker can use the injected script to collect the user's cookie details and set the same cookies in their own browser. Since most web applications identify a user by the session id present in the cookie, the web application will let the attacker into that user's account.
- Scraping confidential information:- Injected script can scrape the web page to collect confidential information like credit card details, SSN etc.
- Posting data/performing actions on behalf of the logged in user:- XSS can be used to submit a form, such as posting a comment, or to trigger events, such as rating a post, without the logged in user's consent.
- Malicious redirects:- Injected script can redirect the user to any URL the attacker wants, such as a fake login page asking the user for login details. Once the user provides the details, they are sent to the attacker.
- Malware attack:- The attacker can use the script to trigger a browser vulnerability, install malware on the user's system and use it to control the system.
- Social engineering:- The attacker can inject new HTML into the page to ask the user for personal information like address, billing details etc., or show fake error messages to trick the user into downloading and installing malicious software.
How can someone inject script into a web page?
Script can be injected into a web page through any of the ways a web application collects input. XSS can be performed by passing script in the following.
- Input fields (textbox/textarea):- Script can be inserted in form fields which are used to collect information from the user.
- Query string:- The attacker can also pass script as a query string parameter.
- Cookies:- If a web app uses a cookie to store data temporarily before saving it in the database, the attacker can change the cookie value to inject script.
- Data received from an external source:- Data received from an external source can contain script, and if that data is displayed in the browser as it is, the script executes and lets the attacker do whatever they want.
- DOM:- Sometimes data available in the DOM also contains script/malicious code which, if not handled properly, may be the cause of an XSS attack. We will see an example of it below.
XSS Attack Vectors
Attackers have many vectors for injecting script and getting it executed, for example a plain <script> tag or an event handler attribute such as onerror on an <img> tag.
For the complete list of XSS attack vectors, see the XSS Filter Evasion Cheat Sheet page of OWASP.
Types of XSS
An XSS attack can be one of three types: Stored XSS (or Persistent XSS), Reflected XSS (or Non-persistent XSS) and DOM based XSS. In Stored XSS and Reflected XSS, the malicious data (injected script) passes through the server, but in DOM based XSS the data never goes through the server and is rendered in the browser by client-side script only.
Stored XSS(or Persistent XSS):-
Stored XSS occurs when script injected through an input field (or any other channel the web app uses to collect data from users) is saved on the web server (in a database, file etc.) and the same data is later displayed to other visitors/users.
Reflected XSS(or Non-persistent XSS):-
In Reflected XSS, the injected script is not stored on the web server. It is sent straight back to the browser for display, and when the browser renders the data, the script gets executed.
DOM Based XSS:-
DOM based XSS occurs when client-side script uses data from the DOM to display content or render a section of the web page. In DOM based XSS, server-side interaction is not needed. The main sources of DOM based XSS attacks are
- document.URL
- document.baseURI
- document.location.href
- document.location.hash
- document.location.search
- document.location.pathname
- window.name
- document.referrer
and functions/html attributes which help in DOM based XSS are
- document.write()
- (element).innerHTML
- eval()
- setTimeout()
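Of the functions listed above, eval() is a common culprit when a page parses data taken from the URL or window.name. The sketch below is only an illustration, assuming the data is expected to be JSON; parseUntrusted is a hypothetical helper name, not something from a library. It parses the text as data with JSON.parse, so script supplied by an attacker is rejected instead of being run.

```javascript
// Hypothetical helper: treat untrusted text as data, never as code.
function parseUntrusted(text) {
  try {
    // JSON.parse only accepts JSON syntax; it never evaluates expressions.
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // Anything that is not valid JSON (for example script) ends up here.
    return { ok: false, value: null };
  }
}

console.log(parseUntrusted('{"id": 112}'));
console.log(parseUntrusted("alert(1)"));
```

Passing the same strings to eval() would run alert(1) as code, which is exactly what makes it dangerous with data from the DOM.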
We will see examples of all the above types of XSS attack and the steps to stop them.
How can we prevent XSS?
XSS can be prevented by filtering or HTML encoding data received from the user or an external source. Which method to implement depends on the requirement/situation. Let's see what happens when we use each of these two methods to prevent XSS.
Filtering data for XSS:
The simplest way to protect against XSS is to pass data through a filter which removes everything dangerous from user provided data: keywords like <script>, dangerous style properties like background-image:url(), and HTML markup containing event handlers that are triggered without user interaction, like <img src="non_existent_path" onerror="alert(1)"/>. Most web-oriented programming languages provide a function/method to remove HTML tags from a string (user provided data).
If the user provides <script>alert(1);</script> in any input field, after filtering it becomes alert(1); and when this data is sent to the browser, it is displayed as alert(1);.
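The filtering step described above can be sketched in JavaScript as follows. This is a deliberately naive illustration, not a complete filter: stripTags is a hypothetical name, and the regular expression only removes things that look like HTML tags, much as a built-in tag stripping function would.

```javascript
// Hypothetical helper: remove anything that looks like an HTML tag,
// keeping the text between the tags.
function stripTags(input) {
  return String(input).replace(/<\/?[a-zA-Z][^>]*>/g, "");
}

console.log(stripTags("<script>alert(1);</script>"));
// prints: alert(1);
console.log(stripTags('Hello <img src="non_existent_path" onerror="alert(1)"/>world'));
// prints: Hello world
```

A production filter should come from a well tested library, because hand-written patterns like this one are easy to bypass.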
Since filtering removes HTML tags from the data, it is not always suitable, for example when writing a technical article about web design. Such an article needs to include code so that the reader gets a better idea of it. But if the article saving functionality removes all HTML tags to prevent XSS, the code samples lose their tags and the article no longer serves its purpose. To handle this situation, we encode the user provided data instead.
Encoding Data for XSS:
This is the preferred method to prevent XSS. When we encode user provided data and send it to the browser, we are telling the browser to treat it as data, not as HTML. In this method, all characters which have HTML character entity equivalents are translated into those entities. So when encoded data is sent to the browser, the browser renders each entity as the character it stands for instead of interpreting it as markup. Let's see a few HTML characters and their encoded values.
Character | Entity Name | Entity Code |
< | Less than | &lt; |
> | Greater than | &gt; |
" | Double quote | &quot; |
& | Ampersand | &amp; |
If the user provides <script>alert(1);</script> in any input field, after encoding it is converted to &lt;script&gt;alert(1);&lt;/script&gt; and when this data is sent to the browser, it is displayed as <script>alert(1);</script> instead of being executed as script.
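The encoding step can be sketched in JavaScript like this. It is a minimal illustration rather than a reference implementation: escapeHtml and HTML_ENTITIES are hypothetical names, and the single quote entry is an addition that the table above does not list.

```javascript
// Characters with special meaning in HTML and the entities that replace them.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#039;", // not in the table above; added so quoted attributes stay safe
};

// Hypothetical helper: replace every special character with its entity.
function escapeHtml(input) {
  return String(input).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml("<script>alert(1);</script>"));
// prints: &lt;script&gt;alert(1);&lt;/script&gt;
```

Because the ampersand is replaced in the same single pass as the other characters, entities produced by the function are not encoded a second time.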
Now that we know the options for preventing XSS in an application, a question arises: when should we filter or encode data? Let's look at something else before answering this question.
Normally more than one developer works on a web application. The application is split into modules, and one or two developers work on each module. Data collected in one module may also be displayed in some other related module, and the developers of the module which collects the data may not be the ones working on the module which displays it. If the developers of the collecting module do not filter or encode the data, then all developers who use that data have to be told about it and have to filter or encode the data while displaying it. Sooner or later a developer may forget to do so while using the data in another module, which results in an XSS vulnerability. Even if every developer does filter or encode the data, it does not make sense to repeat in many places what can be done in one place: while saving the data in the database. It is also easier for developers working on other modules, as they do not need to worry about data from other modules causing security issues and can concentrate on their own module. So the best place to filter or encode data is while saving it in the database, so that other modules also get filtered/encoded data.
Examples
Now let's see some real-world examples. I will use PHP to show them, but the same thing can be done in other web-oriented programming languages.
1. Signup confirmation by admin
Let's assume that on a site, once a user fills in the sign up form, it goes to an admin for verification, and the account gets activated after the admin confirms it. During the sign up process an attacker inserts script in the address field like
E Drachman St, Tucson, AZ 85705 <script type="text/javascript">alert('XSS');</script>
and the address field data is saved in the database as it is. When the admin visits the Pending Signup Confirmation page, the list of pending signup confirmations is pulled from the database and displayed. When the address of this user is displayed under the Address column, the script gets executed, so the admin sees an alert with the message XSS.
To prevent XSS in such cases, we can pass the input data through the htmlspecialchars or htmlentities function, which converts all HTML special characters to their equivalent HTML entities.
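// Address as submitted by the attacker through the sign up form
$address = "E Drachman St, Tucson, AZ 85705 <script type=\"text/javascript\">alert('XSS');</script>";
// ENT_QUOTES also encodes single quotes; this is the default from PHP 8.1 onwards
$address = htmlspecialchars($address, ENT_QUOTES, 'UTF-8');
echo $address;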
After passing the address field through the htmlspecialchars function before saving it in the database, the data in the address field looks like this
E Drachman St, Tucson, AZ 85705 &lt;script type=&quot;text/javascript&quot;&gt;alert(&#039;XSS&#039;);&lt;/script&gt;
and when this is displayed in the browser, it looks like
E Drachman St, Tucson, AZ 85705 <script type="text/javascript">alert('XSS');</script>
Here <script> is plain text, not an HTML tag, so it does not get executed and there is no alert message.
2. XSS attack using query string
Let's assume your employer shows employee profiles on their web site. The URL for an employee profile is something similar to http://www.employersite.com/profile.php?id=112 and the content displayed on the profile page is shown below.
Employee ID:- 112
Name:- John Smith
Address:- E Drachman St, Tucson, AZ 85705
Designation:- Sales Manager
The Employee ID shown on the page is pulled from the query string and displayed using $_GET['id'] without filtering or encoding the data. So if the id value in the URL is replaced with a script such as <script>alert('XSS');</script>, the page will show an alert with the message XSS when an employee visits it. This is an example of Reflected XSS because the data is not saved on the server.
To prevent XSS here, we filter the id value received in the query string. Applying the filter to the employee id supplied above removes the HTML tag <script> and returns only alert('XSS'); which is treated as simple text and displayed in the browser instead of showing an alert message. After filtering the input parameter, the output looks like this.
Employee ID:- alert('XSS');
Sorry!!! No employee exists with that ID.
The Employee ID alert('XSS'); shown above is the query string value after passing it through the strip_tags function, which removes the HTML tags and leaves the text inside them. Since alert('XSS'); is not a valid id, there is no employee in the database with this id, so the error message "Sorry!!! No employee exists with that ID." is shown on the profile detail page.
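Since an employee id is expected to be a number, another option is to validate the value and reject anything else. The sketch below illustrates that idea in JavaScript under stated assumptions: readEmployeeId is a hypothetical helper, and the limit of nine digits is an arbitrary choice for the example.

```javascript
// Hypothetical helper: return the id as a number, or null when the
// query string value is missing or is not a plain run of digits.
function readEmployeeId(pageUrl) {
  const raw = new URL(pageUrl).searchParams.get("id");
  if (raw === null || !/^[0-9]{1,9}$/.test(raw)) {
    return null;
  }
  return Number(raw);
}

console.log(readEmployeeId("http://www.employersite.com/profile.php?id=112"));
// prints: 112
console.log(readEmployeeId("http://www.employersite.com/profile.php?id=%3Cscript%3Ealert('XSS')%3C/script%3E"));
// prints: null
```

With validation in place, the page can show the "No employee exists" message without ever echoing the supplied value back to the browser.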
In the above two examples we stopped XSS using server-side methods, first by encoding data and then by filtering it. Now let's see an example where XSS occurs when data is rendered by client-side script only, and after that how to stop it using client-side methods.
3. An example of DOM based XSS attack
Imagine the following page http://www.somewebsite.com/xss-test.html contains the following code:
<script>
document.write("<strong>URL</strong> : " + document.baseURI);
</script>
If a request like http://www.somewebsite.com/xss-test.html#<script>alert(1)</script> is sent, the JavaScript code will be executed, because the page writes whatever is typed in the URL into the page with the document.write function. If you look at the source code of the page, you will not see <script>alert(1)</script> because it all happens in the DOM and is done by the injected JavaScript. Once code is injected into the page, the attacker can exploit this DOM based cross-site scripting vulnerability to steal the user's cookies, change the page content etc.
So how can we fix this? The best way is to write the value into an element like a div or span, and when writing, use textContent or innerText instead of innerHTML. Let's see how we can solve this by modifying our initial code.
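<strong>URL</strong> : <span id="xss_text"></span>
<script type="text/javascript">
// textContent inserts the value as plain text, so markup in the URL is not parsed.
document.getElementById('xss_text').textContent = document.baseURI;
// innerText behaves the same way here; either assignment alone is enough.
document.getElementById('xss_text').innerText = document.baseURI;
</script>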
The above code writes the URL to the page, but it displays the text <script>alert(1)</script> instead of executing it.
4. A special case - Filtering HTML element attributes
Let's see another situation. Assume a website allows you to write articles and format the content, set text color, add images etc. Such functionality is usually implemented with an editor which can set font size, add images and format text: almost everything we need to make our article look good and readable. When the article is saved, the web application stores the HTML content as it is in the database so that it is easy to render the exact design the writer created. But an attacker can change the HTML using the source view provided by the editor, add event handling attributes like onclick, onmouseover and onerror, and attach script to these event handlers. When the article is saved, all the event handler attributes are saved too. After the article is rendered, the attached script executes whenever a user does what is needed to trigger the event. All event handling attributes are dangerous in such a case, but events like onerror are more dangerous because they are triggered automatically, here when an image fails to load.
<img src="http://falseurl.com/false123.jpg" onerror="alert(1);"/>
If the article contains an image with a non-existent source and an onerror attribute, users will see an alert message when the article is rendered. In such cases we can't filter all HTML tags, as the article would look different from what the writer intended, and we can't encode the HTML characters either, as that would show all the HTML content as text in the browser. So we need something which retains the design created by the article writer and removes the dangerous attributes.
To stop XSS in such cases, we need to create a custom filter which removes the dangerous attributes, such as event handlers, from the HTML tags of the article so that only safe HTML code remains in the article source.
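A custom filter of this kind could start out like the sketch below. It is a rough illustration only: stripEventHandlers and EVENT_ATTR are hypothetical names, the regular expressions assume reasonably well-formed markup with no ">" inside attribute values, and other dangerous constructs (for example javascript: URLs) are not handled. A real application would normally rely on a well tested HTML sanitizer library instead.

```javascript
// Matches one event handler attribute (onclick, onerror, ...) with a
// double-quoted, single-quoted or unquoted value.
const EVENT_ATTR = /\s+on[a-z]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi;

// Hypothetical helper: drop event handler attributes but keep the tags
// and all other attributes, so the article keeps its formatting.
function stripEventHandlers(html) {
  // Only look inside opening tags so ordinary article text is left untouched.
  return String(html).replace(/<[a-zA-Z][^>]*>/g, (tag) => tag.replace(EVENT_ATTR, ""));
}

console.log(stripEventHandlers('<img src="http://falseurl.com/false123.jpg" onerror="alert(1);"/>'));
// prints: <img src="http://falseurl.com/false123.jpg"/>
```

The image tag and its src attribute survive, so the article still renders as the writer designed it, while the onerror handler is gone.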
In all of the above examples we used alert('XSS') or alert(1), which are not harmful. But they can be replaced with code which loads an external JavaScript file and executes the code in it. The examples show a few possible scenarios where an attacker can inject code and run their own script. The steps used to stop XSS in these examples are for illustration only; they are not the only option in those situations. It depends on your requirement how you want to handle user provided data that contains malicious code. You can also validate the data and show an error message for invalid data.
Conclusion
Cross site scripting exists in web applications as a result of developer negligence while handling data, or of a false belief that users will not provide malicious data. Sometimes it results from the developer knowing too little about secure web application programming. We can get rid of XSS by putting in some extra effort while building the application and keeping in mind that the web application we build is accessible to normal users and attackers alike.