The OWASP Top Ten and ESAPI – Part 1 – Cross Site Scripting (XSS)

This article will describe how to protect your J2EE application from XSS using ESAPI. As with all of the detail articles in this series, if you need a refresher on OWASP or ESAPI, please see the intro article The OWASP Top Ten and ESAPI.

OK, so on to XSS. Here is a slightly modified definition of XSS from OWASP:
XSS flaws occur whenever an application takes untrusted (typically user supplied) data and sends it to a web browser without first validating or encoding that content. XSS allows attackers to execute script in the victim’s browser which can hijack user sessions, deface web sites, possibly introduce worms, etc.

As you can see, XSS essentially allows an attacker to splash whatever they want on the screen since the application doesn’t do any input validation or output encoding. This is not a big deal when you have benevolent users, but an attacker could, say, input some nasty JavaScript and cause quite a few problems. This is typically what happens – JavaScript is output and generally executes in the background so the user is unaware of what’s occurring.

In general there are 3 types of XSS:
1. Stored – The dangerous data is stored in a permanent data store and shown repeatedly on the site – think online forums. If an attacker were to save a forum comment that had an XSS exploit in it, that comment would then be displayed to anyone who visited the page, without requiring any further interaction from the attacker.
2. Reflected – The dangerous data is typically sent along with the URL and is not stored in a data store, but is displayed (reflected) back to the user, which launches the attack. If an attacker sent a URL in an email, and a user clicked on it because the site was “trusted” (say, that user’s bank), but the site had an XSS flaw, the attacker could cause many issues up to and including processing transactions on behalf of the user at the bank site (more on this in a follow-on article).
3. DOM – This is the most recently named type of attack, and simply represents an entirely client side issue. While the previous 2 types of XSS have to do with the server outputting data to the browser, this has to do with the browser manipulating the DOM and data being moved in and out of context. This can result in an XSS attack if data is not properly validated or encoded. This type of XSS issue has become more prevalent with web 2.0 and the heavy proliferation of JavaScript frameworks being used to do both serious functionality and DOM manipulation in web pages.

XSS, by some accounts, is the most common vulnerability and definitely is one of the most dangerous in the wild. It is extremely prevalent due to a lack of education for the most part, but it can at times be tricky to solve correctly. There are essentially 2 options for how to deal with XSS. Both could be used individually to solve the problem entirely … in theory … but that would require detailed knowledge of every possible input/output vector both now and in the future. Since this is rarely feasible, it is recommended to use both approaches. These approaches are Input Validation and Output Encoding.

Input Validation
Input validation is simply that – checking each input for validity. This can mean many things, but in the typical and simplest case, it means to check the type and length of the data. For instance, if you are accepting a standard US zip code from a text box, you would know that the only valid type is a digit (0-9) and that the length should be 5, no more and no less. Not all cases are this simple, but many are similar.
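As a minimal sketch of the ZIP code check described above, the plain-Java snippet below whitelists exactly five digits with a regular expression. The names ZipCodeCheck and isValidZip are made up for illustration and are not part of ESAPI; the snippet only shows the type-and-length idea.

```java
import java.util.regex.Pattern;

final class ZipCodeCheck {
    // Whitelist: exactly five ASCII digits and nothing else.
    private static final Pattern US_ZIP = Pattern.compile("[0-9]{5}");

    // Returns true only when the whole input matches the whitelist.
    static boolean isValidZip(String input) {
        return input != null && US_ZIP.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidZip("30301"));     // true
        System.out.println(isValidZip("3030"));      // false (too short)
        System.out.println(isValidZip("30301<b>"));  // false (wrong type and length)
    }
}
```

Note that matches() tests the entire string, so a valid prefix followed by markup is still rejected.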

Consider this example for what can go wrong. There is a simple search engine and on the page is the search box. A user types in a query only to find on the results page that his search terms are printed on the screen, something like – You searched for “free stuff”. That’s all well and good, but what happens when the user inputs a bit of JavaScript into the search box? If the application doesn’t handle the input/output properly, the screen will print the JavaScript out in all its glory, and it will get put inline in the web page response, and treated just as if the web page developer had put that bit of code in there. That’s where the problems come in – now how do we solve these issues?
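To make the search example concrete, here is a small sketch in plain Java that builds the “You searched for” line both ways. The class SearchEchoDemo and its hand-rolled escapeHtml helper are assumptions for illustration only; the helper covers just a handful of characters and is not a substitute for a real encoder such as ESAPI’s.

```java
final class SearchEchoDemo {
    // Very small HTML escaper (illustrative only): replaces the characters
    // that let text break out of ordinary HTML content.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Vulnerable: the query is concatenated into the page as-is.
    static String unsafeResultsHeader(String query) {
        return "You searched for \"" + query + "\"";
    }

    // Safer: the query is escaped before it is placed in the page.
    static String saferResultsHeader(String query) {
        return "You searched for \"" + escapeHtml(query) + "\"";
    }

    public static void main(String[] args) {
        String query = "<script>alert(1)</script>";
        System.out.println(unsafeResultsHeader(query));
        System.out.println(saferResultsHeader(query));
    }
}
```

With the first method the browser receives a real script element; with the second it receives text that merely looks like one.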

Here’s an image from OWASP showing their architecture for input validation. The key here is that everything is validated: all input that doesn’t originate within the application (including user input, request headers, cookies, database data, LDAP, really everything …).

ESAPI Input Validation

So how do we use this validation framework to actually validate our data? There are 2 basic types of methods in the Validator interface that can be used. They are listed below:

getValidInput(java.lang.String context, java.lang.String input, java.lang.String type,
	int maxLength, boolean allowNull, ValidationErrorList errors)
isValidInput(java.lang.String context, java.lang.String input, java.lang.String type,
	int maxLength, boolean allowNull)

The first, getValidInput, returns canonicalized and validated input data, and records any validation issues that occurred in the ValidationErrorList passed to it. The second is similar, but does not report errors; it just returns a boolean indicating whether or not the input is valid. Here are a couple of real examples of these being used.

String validatedFirstName = ESAPI.validator().getValidInput("FirstName",
	myForm.getFirstName(), "FirstNameRegex", 255, false, errorList);
boolean isValidFirstName = ESAPI.validator().isValidInput("FirstName",
	myForm.getFirstName(), "FirstNameRegex", 255, false);

Both of the samples above deal with the first name field from a typical web form, but this could just as easily be a request header or parameter, or cookie value, or anything else.
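For readers who want to see the difference between the two styles without pulling in ESAPI, here is a rough plain-Java sketch of the same idea. Everything in it is hypothetical (the class TwoValidationStyles, the FIRST_NAME pattern, the Map used to collect errors, and the choice to return null on failure); it is not how ESAPI implements these methods, and it skips canonicalization entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class TwoValidationStyles {
    // Hypothetical whitelist for a first name; a real rule depends on your requirements.
    static final Pattern FIRST_NAME = Pattern.compile("[A-Za-z' -]+");

    // Style 1: a plain yes/no answer.
    static boolean isValid(String input, Pattern whitelist, int maxLength, boolean allowNull) {
        if (input == null || input.isEmpty()) {
            return allowNull;
        }
        return input.length() <= maxLength && whitelist.matcher(input).matches();
    }

    // Style 2: return the input when valid; otherwise record an error
    // under the given context name and return null.
    static String getValid(String context, String input, Pattern whitelist,
                           int maxLength, boolean allowNull, Map<String, String> errors) {
        if (isValid(input, whitelist, maxLength, allowNull)) {
            return input;
        }
        errors.put(context, "Invalid input for " + context);
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> errors = new LinkedHashMap<>();
        String first = getValid("FirstName", "<script>", FIRST_NAME, 255, false, errors);
        System.out.println(first);   // null
        System.out.println(errors);  // {FirstName=Invalid input for FirstName}
    }
}
```

The error-collecting style is handy for web forms, where you usually want to validate every field and then report all the problems at once.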

In the end, there is great value in general in validating ALL of your application inputs. It will help solve the XSS issue, but will also solve other problems, including some we probably haven’t even thought up yet.

Output Encoding
On the flip side of input validation is output encoding (also known as “escaping”). Here’s a quick definition from OWASP: “Escaping” is a technique used to ensure that characters are treated as data, not as characters that are relevant to the interpreter’s parser. There are lots of different types of escaping, sometimes confusingly called output “encoding.” Some of these techniques define a special “escape” character, and other techniques have a more sophisticated syntax that involves several characters. Escaping is the primary means to make sure that untrusted data can’t be used to convey an injection attack. There is no harm in escaping data properly – it will still render in the browser properly. Escaping simply lets the interpreter know that the data is not intended to be executed, and therefore prevents attacks from working.

What does all of this mean to the developer? In order to prevent “bad” data from causing XSS issues on the screen when rendered, we can’t just out.println them to the screen – we have to “encode/escape” them. There are many libraries out now that do some form of encoding, but most are minimal. JSTL’s c:out does encoding, as do some of the Struts tags, JSF, and Spring; all of them do some minimal encoding. However, ESAPI takes this to a different, but necessary, level.

There are 2 issues with the previously mentioned frameworks when it comes to their encoding schemes. 1 – They don’t encode enough characters – they miss some things. 2 – They only encode for 1 context and miss the other 4. As described in the OWASP XSS Prevention Cheat Sheet, there are actually 5 output contexts for the browser:
1. HTML entity (this is the standard HTML output that the above frameworks at least partially handle)
2. HTML Attribute
3. JavaScript
4. CSS
5. URL
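To illustrate why the context matters, the sketch below encodes the same untrusted string for two of the contexts listed above, using the general approach of escaping everything except ASCII letters and digits. The class ContextEncodingSketch and both method names are hypothetical, and the code is deliberately simplified (for example, it does not treat control characters specially), so take it as an illustration of the idea rather than a replacement for ESAPI’s encoders.

```java
final class ContextEncodingSketch {
    private static boolean isAsciiAlnum(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // HTML attribute context: everything except ASCII letters and digits
    // becomes a hexadecimal character reference such as &#x22;
    static String forHtmlAttribute(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (isAsciiAlnum(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        });
        return out.toString();
    }

    // JavaScript string context: everything except ASCII letters and digits
    // becomes a hex escape (two digits below 256, four digits otherwise).
    static String forJavaScriptString(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (isAsciiAlnum(c)) {
                out.append(c);
            } else if (c < 256) {
                out.append(String.format("\\x%02X", (int) c));
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String untrusted = "\" onmouseover=\"alert(1)";
        System.out.println(forHtmlAttribute(untrusted));
        System.out.println(forJavaScriptString(untrusted));
    }
}
```

The two outputs look nothing alike, which is the point: an encoding that is safe in one context gives no protection in another.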

This image from OWASP shows their architecture for output validation. The important notion here is that before any data is output in the application, the context for output is considered, and the data is encoded.

ESAPI Output Encoding

All of these contexts have some special handling rules. I suggest you reference the cheat sheet in order to learn those. I’ll give a couple of simple examples here just for reference.

First, a simple example to output to an HTML entity.

//performing input validation
String cleanComment = ESAPI.validator().getValidInput("comment",
	request.getParameter("comment"), "CommentRegex", 300, false, errorList);

//check the errorList here ...

//performing output encoding for the HTML context
String safeOutput = ESAPI.encoder().encodeForHTML( cleanComment );

Now, an example of creating a URL that is safe for output.

//performing input validation
String cleanUserName = ESAPI.validator().getValidInput("userName",
	request.getParameter("userName"), "userNameRegex", 50, false, errorList);

//check the errorList here ...

//performing output encoding for the url context
String safeOutput = "/admin/"
	+ ESAPI.encoder().encodeForURL(cleanUserName);

Above, you can see that it is very simple to encode output for a given context, as long as you know the context you’re going to. It does take discipline to use this throughout your application, but it will pay you back many-fold in rewards. Additionally, ESAPI does have tag libraries that wrap each of the output encoding mechanisms available for use.
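The URL example encodes only the untrusted value, not the whole URL. Here is a small sketch of why that matters, using the JDK’s java.net.URLEncoder as a stand-in (the class UrlParamEncodingSketch and its method names are made up for illustration). One caveat: URLEncoder produces form-style encoding, so a space becomes “+”, which is fine for query parameter values but may not be what you want in a path segment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlParamEncodingSketch {
    // Encode a single untrusted value so it cannot add separators to the URL.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Intended usage: fixed URL structure plus an encoded value.
    static String adminLink(String userName) {
        return "/admin/" + encodeValue(userName);
    }

    // Mistake: encoding the whole URL also encodes its structural characters.
    static String wronglyEncodedLink(String userName) {
        return encodeValue("/admin/" + userName);
    }

    public static void main(String[] args) {
        System.out.println(adminLink("a/b?c=d"));       // /admin/a%2Fb%3Fc%3Dd
        System.out.println(wronglyEncodedLink("bob"));  // %2Fadmin%2Fbob
    }
}
```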

One final note regarding output encoding: you should always explicitly set the character encoding for all your pages (ISO-8859-1 or UTF-8 are popular choices).
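As a small illustration of what can go wrong when the two sides disagree about the character encoding, the sketch below writes text as UTF-8 bytes and reads it back as ISO-8859-1. The class name CharsetMismatchDemo is hypothetical, and this shows only the general mismatch problem, not a specific attack.

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Encode text as UTF-8 bytes, then decode those bytes as ISO-8859-1,
    // which is what happens when sender and receiver disagree on the charset.
    static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // four characters, the last is e with an acute accent
        String garbled = decodeUtf8BytesAsLatin1(original);
        System.out.println(original.length()); // 4
        System.out.println(garbled.length());  // 5: the accented letter became two characters
    }
}
```

If the browser has to guess the encoding, the characters it sees may not be the ones your encoder reasoned about, so declaring the encoding explicitly removes that guesswork.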

As you can see, there is quite a bit to solving XSS. It can become tricky, especially in the instances where multiple contexts are involved, like trying to safely pass a parameter to a JavaScript function inside an HTML event handler attribute, but it is doable. The big takeaway should be that input validation and output encoding, while they do solve XSS, are in general excellent practices that should be exercised for ALL input to an application.

A final XSS note: A great reference for understanding what is occurring in XSS and how to protect against it is the OWASP XSS Prevention Cheat Sheet. Various references in this article drew from the info on this site.

Other articles in this series:
Part 0: The OWASP Top Ten and ESAPI
Part 1: The OWASP Top Ten and ESAPI – Part 1 – Cross Site Scripting (XSS)
Part 2: The OWASP Top Ten and ESAPI – Part 2 – Injection Flaws
Part 3: The OWASP Top Ten and ESAPI – Part 3 – Malicious File Execution
Part 4: The OWASP Top Ten and ESAPI – Part 4 – Insecure Direct Object Reference
Part 5: The OWASP Top Ten and ESAPI – Part 5 – Cross Site Request Forgery (CSRF)
Part 6: The OWASP Top Ten and ESAPI – Part 6 – Information Leakage and Improper Error Handling
Part 7: The OWASP Top Ten and ESAPI – Part 7 – Broken Authentication and Session Management
Part 8: The OWASP Top Ten and ESAPI – Part 8 – Insecure Cryptographic Storage
Part 9: The OWASP Top Ten and ESAPI – Part 9 – Insecure Communications
Part 10: The OWASP Top Ten and ESAPI – Part 10 – Failure to Restrict URL Access

Note: Article updated on 11/18 per Jim Manico’s catch of improper url output encoding since ESAPI’s url encoding is intended for parameter values, not the entire url. Also added input validation as Jim’s comment pointed out was missing.


17 thoughts on “The OWASP Top Ten and ESAPI – Part 1 – Cross Site Scripting (XSS)”

  1. String safeOutput = ESAPI.encoder().encodeForURL( "/admin/" + request.getParameter( "dangerousInput" ) );

    is a little off. The URL would actually “break” since ? would get encoded. I’d go this route:

    String safeURlToDisplay = "/admin/" + ESAPI.encoder().encodeForURL(request.getParameter("dangerousInput"));

    Of course, I’d also do input validation (at least) as well.

    – Jim Manico
    ESAPI Project Manager

  2. John, the article is great. I would suggest renaming the titles of the posts to start from zero; that way the post part numbers will match those of the OWASP Top 10 vulnerabilities.

  3. Hi John,
    Great article, one of the few out there showing how to actually use the ESAPI API.

    In regards to output escaping, do you have any techniques for somehow escaping the outputs without having to manually wrap each output with an ESAPI tag?

    It’s not too bad if you’re just outputting to screen, but if you’re outputting into text fields using JSTL or Struts tags, you can’t really take advantage of these tag libraries if you have to manually escape their outputs with ESAPI.

    It would be great to get your input.


  4. @ Brett
    As to your question, that’s a really good point. A couple of thoughts.
    – The option you mention is what I’ve used in the past, and I agree – rather irritating, but that’s what it is for now as far as I know
    – Another option posed on the ESAPI message boards has been to extend some of the framework classes (like the struts/spring/etc input/output tags), and hardwire ESAPI calls into some sort of an “extended framework” – this is an option that would limit the amount of code for the UI, but also means you have more code to maintain now
    – Yet another option would be to convince the framework developers to use ESAPI or ESAPI-strength output encoding for their tags, but good luck convincing them of that
    – Finally, Google and others are doing some interesting work in some of their templating frameworks that are J2EE compatible, or so I’ve heard, that do some contextual analysis so that the underlying framework actually detects the output context you’re in and encodes properly. Probably not an immediate solution, but something you might want to keep apprised of.

    Thanks for the note.

  5. Thanks for your response John,

    We’ve just retrofitted ESAPI into an existing application, and applying all the escaping was not a fun job; there’s always a chance that you miss an input.

    I’m considering extending the struts2 tags myself for a new application we’re about to start building. The idea would be to use the delegation pattern to wrap the tag library class and extend functionality as required.

    However, I have noticed in struts there are additional attributes (escapeHtml, escapeJavaScript, escapeCsv) to escape values for the property tag. I can only assume that this type of escaping is not strong enough.


  6. @ Brett,
    As for the struts2 tags, good luck with that process. I’m sure that’s something the ESAPI team would be interested in, so when you complete it, if possible, you might want to send an email to the ESAPI user and/or dev lists about what you did, and possibly contribute the code. I know there was someone a while back who planned to do something similar for the JSF tag set.
    As for those attributes you mentioned, I’m not sure what all of them are, but I know some of the struts2 tags (and those of other frameworks) have added the escJS=true/false type setup for deciding the output context. If you have that setup, you can just use the appropriate output encoding method from ESAPI. Otherwise, you’d just have to document what the safe context(s) is/are for your given tag. For instance, if your tag only html entity encodes, you’d have to document that that specific tag should never be used in the javascript context. Certain tags will make sense one place or another. That’s one of the development tasks you’ll have to work with.
    Good luck!

  7. Hey John,

    It’s a great article.

    I have a query regarding output encoding.
    How do I go about implementing it in our J2EE application?
    I mean, do I need to use tags to use escaping in all of the JSPs that we have?
    And I still don’t quite understand “context”.

    Or can I use a filter to achieve escaping?

    Gosh, development seems to be the easier part. Securing the application is an altogether different paradigm.


  8. @Giriraj
    Thanks for the comment.
    As for how to implement output encoding, the general recommendation is to put it where your output is actually being rendered, so yes, that’s typically done in a JSP, whether it’s in a tag or a scriptlet or something like that. Struts (1 and 2), SpringMVC, and other web frameworks have tags that usually do encoding (though not all of their tags do) that is sufficient for the HTML entity context, but nothing more.

    As for context, I really can’t describe it any better than the cheat sheet, so see that. In short, though, it just means the browser interprets the text differently depending on whether you’re between, say, <p> tags versus between <script> tags.

    I suppose a filter could be used to do output encoding via some parsing of the response or something along those lines, but it’s much more likely that it would be used for something like input validation. Something like checking all inputs (parameters, headers, etc) against some whitelist character set and for length might get you where you need to go, but that’s often not complete, because there will almost always be business requirements that force exceptions to that validation for proper usage of the app.

    As to your last point, I certainly understand where you are coming from. Development has been separate from security for a long time in the mind of most developers. Some in the educational system are trying to change that, and folks at owasp (and other groups that provide some sort of training/education) are trying as well, but it’s a long road ahead. I do know, however, learning the security piece well will make you a generally better developer, so it’s a worthy thing to spend your efforts on.

  9. Thanks a lot John,

    I think I am getting a feel of what this is all about.

    I can do validation in conjunction with Struts validator.(that’s what we use)
    As for output escaping/encoding I guess the best approach would be to change tag libraries and depending upon the 5 different contexts mentioned in the Prevention Cheat Sheet, start using encoding scheme selectively.

    I thought it would be straightforward but it isn’t. I would have to deviate from the usual developer’s mentality.

    What do you think?

    Thanks again for your time…

  10. @Giriraj
    That’s pretty much right. However, I will point out that it doesn’t necessarily mean that you must use a different tag library, though that is an option, and ESAPI does have good output encoding libraries via ESAPI.encoder()… Many of the struts tags do provide sufficient html entity output encoding (by far the most popular output context). In the end, you just have to think about where the data is going and see if the protection is good enough.

    XSS is rather simple to solve in *most* cases that I see (though that’s just the examples I see), but can be difficult to solve in others, especially in cases where you put data into locations that process url and or javascript content, because browsers often try to “help” and will actually process multiple contexts at once, so you may have to end up wrapping multiple contexts like javascriptencode(urlencode(data)), or something along those lines.

  11. Is there a forum for ESAPI? I need to disable strong password checks, and the complete lack of documentation is daunting.

  12. @Charles,
    Yes, you can look at the esapi-dev and esapi-user mailing lists (and their archives). Though it’s not a standard forum, it’s the mechanism you can use to get info. I’d start with the user list first.

  13. Hello, I have a question. When we use the Validator.getValidInput method, do we have to use a whitelist? In our case lots of input characters are allowed, and finding all of them is very difficult. Is there any other way validation can be done in ESAPI?

    Thanks for your early reply.

  14. Arivarasu,
    No, that’s how it’s set up for doing input validation. In general, you should use this mechanism, however. You could easily implement new validators if need be, or possibly structure your regex to reject bad input. If you’re doing validation on input where you don’t know what you’re receiving, it’s likely you just want output encoding for unsafe characters for whatever downstream interpreter you might use (i.e., HTML encoding for entities on a web page, prepared statements for database storage, etc.).
