John Melton's Weblog
Java, Security and Technology

Year Of Security for Java – Week 36 – Solve SQL Injection

No Gravatar

What is it and why should I care?
SQL Injection (SQLi) is an issue that is caused because of poor code/data separation. The general issue is that a developer intends the user input to be interpreted as data, but an attacker can manipulate the input to cause the database to interpret the input as commands.

There have been a lot of devastating attacks recently using SQLi. You don’t have to look very far to see the amount of damage it’s caused, involving great financial, political and reputational impact. You probably won’t have a tough time selling the C-Level exec on the idea that SQLi protection is important.

What should I do about it?
Note: One great resource which I drew from for both this post as well as the more in-depth post is the SQL Injection Prevention Cheat Sheet over at OWASP.

I’ve already written pretty extensively here on the basic problems with and solutions for SQLi. However, I’ll cover the solutions briefly here for clarity’s sake.

1. Use Parameterized Queries (Note that in Java, parameterized queries = prepared statements)
Parameterized queries are a great solution because they’re fairly simple to learn,are ubiquitous and don’t require your developers to learn another API – it’s just SQL. (One nice additional benefit: it will make your security guy happy)

2. Stored Procedures
Stored procedures are a good option in some environments where you want to have the query management done separately from the code, and are often helpful for performance if properly tuned.

3. Output Encoding/Escaping
Output Escaping is more of a last resort. It can be properly done, but it’s usually more work than the other 2 options, requires more customization, and is more error-prone for most developers. If you go this route, it’s definitely advisable to use a common security framework such as ESAPI so that your controls are consistent across all your applications.

Now that we’ve covered the basic technology solution for SQLi in Java, let’s consider the larger context of eradicating entire classes of vulnerabilities from our codebase as I discussed last week. Let’s look at each specific sub-point I mentioned in that post and consider what you might do regarding SQLi.

1. Understand the problem.
I’d read available documentation on SQLi to make sure I understood the issue as well as I *think* I do. I’d also go read about SQLi for the database platform(s) that is used in my environment. I’d look for any specific weaknesses in implementation that I need to be aware of.

2. Consider all use cases where the issue can occur.
I would think about all the different types of database interactions that occur in my application portfolio. How many languages am I developing in or supporting (Java, C#, Cobol)? What platforms am I developing for (web, services, desktop, mobile, mainframe, html5)? What interaction paradigms are they using (OLAP, OLTP, Warehousing, Batch)?

3. Evaluate solutions.
I would look at the SQLi cheat sheet from OWASP (and any associated weaknesses with the approaches that it espouses). Once I understood the recommended solution(s), I would consider whether said solution(s) would work in the environment in which I’m working. I would focus on using the best solution possible for everyone who can use it, then consider exceptions if it’s not supported by a given interaction paradigm based on business need.

4. Institutionalize the chosen solution.
After deciding on a solution (say parameterized queries), you have to work with development teams to determine the current status of database interactions (are they using dynamic sql now or are they already on parameterized queries?). This step is a good place to (re)train developers on this specific topic from a security perspective, since they’ll be implementing the solution. You then just have to do the work of fixing queries, and re-testing applications to make sure everything still works as expected. Note: In addition to solving SQLi from the “code” perspective, there are also additional steps you can implement here that increase assurance such as lowering the privileges/access of users so they can’t cause certain types of harm even if they were to be malicious.

5. Add technology and processes for verification.
At this point, I would add a few tools in to make sure the technology is being used. I would make sure that there were static and dynamic analysis tools in place to check for SQLi. I would also make sure to code review closely any queries made by the application. I might go so far as to write custom rules for my static toolset that “trusts” my safe output functions to reduce any false positives. Lastly, I would monitor logs for suspicious activity. Using something like AppSensor is a way to automate some of that process.

Hopefully the approach I described makes sense to you. There are certainly additional steps you can take, but this should be a good start. The idea is that you don’t just go and squash a few bugs. The hope is that you do solve the immediate problem, but make it impossible (or as close as you can get) for the problem to resurface.

References
———–
http://www.jtmelton.com/2009/12/01/the-owasp-top-ten-and-esapi-part-3-injection-flaws/
https://www.owasp.org/index.php/SQL_Injection
https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet

Technorati Tags: , ,

Preventing Log Forging in Java

No Gravatar

This article will provide a quick overview of log forging and discuss a couple simple solutions to prevent it.

First, what is log forging?

Logging is one of the most common things that an application does. Logging is a very generic term that can mean lots of different things, from debug style logging for the developer, up to and including functioning as the system audit log. The 3 most common logging mechanisms I see in real world use are:
1. System output (System.out.println)
2. 3rd party logging library (log4j, commons logging, slf4j, etc.)
3. DB Logging

System output
System output is a functional mechanism, but tends to provide spotty information at times, since developers often don’t log everything they should in order for the log to be functional. A simple example of this is that essentially all logging should include a timestamp and at least the logged in user if applicable, but System.out logging often doesn’t include these 2 basic pieces of information.

3rd party logging library
The 3rd party logging libraries are the workhorses of the java logging world. They are probably the most used option and for good reason. The popular frameworks provide very simple configuration, offer all the basic features (and even some advanced features), and can be setup to log the same “meta-information” for each log entry (such as timestamp, username, etc.).

DB Logging
Most often DB logging is used for log/audit data that needs to be maintained for a reasonable amount of time. If there is auditing going on for the application, a DB is a popular choice since the audit records can actually reference the data it is about using the relational DB mechanisms already built into the DB product. Also when logging to a DB, there is usually a reasonable amount of data that is stored. This is likely due to the fact that someone sat down and thought out what data should be stored, and created the DB schema to store that info – that way the standard for what data needs to be logged is perfectly clear to the developer.

Now that we’ve talked about a few basic types of logging, let’s see what log forging is. Let’s consider this bit of code.


String previousPage = request.getParameter("prev");
logger.info("Previous page from request was: " + previousPage);

This simple bit of code just writes a request parameter to the log. Pretty simple. Let’s consider two scenarios here. If the user is using the application as intended, something like the following might be sent in – “prev=viewCart”, telling the app the page the user was coming from was a view shopping cart page. This would result in the following being written in the log.


[2010-10-21 18:45:15] [bill] Previous page from request was: viewCart

Now what would happen if an attacker instead decided to modify the request parameter and instead entered “prev=viewCart\r\n[2010-10-21 18:45:15] [steve] Account successfully created” as the request parameter? (Note the data I entered above is not actually URL encoded for clarity, but would be in a real example) Now the log will look like the following:


[2010-10-21 18:45:15] [bill] Previous page from request was: viewCart
[2010-10-21 18:45:15] [steve] Account successfully created

Now the attacker has been able to create a forged entry in the application log which reduces the value of the logs, and frustrates any forensic type activities. This is the essence of log forging.

So how do I prevent it?

There are various options to prevent log forging. I’ll discuss just a few and my opinions of them.

1. Validate all input that could be put in the log file. This would involve doing strict whitelist validation to make sure that characters that pose issues in logs are not available to the attacker. This option, in my opinion, would be the least likely to work. The truth is that you’d have to validate lots of data, and if it failed validation, it wouldn’t get logged. That data could be valuable debugging information or even audit data. This is my least favorite option, and I’ve only actually seen it once in practice.

2. Log to a database. Essentially logging to a database does prevent log forging since carriage return / newline mean nothing in that context, but it introduces another possible exploit: SQL Injection. For information about how to prevent SQL Injection properly, please see: The OWASP Top Ten and ESAPI – Part 2 – Injection Flaws. This is an ok option if you need to use a database for auditing, but I wouldn’t use it for standard logging. If you do use it, be sure to use a proper SQL Injection prevention mechanism.

3. Encode all output. This is a simple and elegant solution, and my personal favorite. There’s not that much to it – just encode those characters you need to in order to prevent log forging, and everything else get’s logged the same. You can develop a simple wrapper around your favorite logging utility in order to do the log forging prevention work, and use the logging utility for all it’s other features. Luckily the good people who built ESAPI already did this for you. They have a logging component built into ESAPI that contains this log forging protection. An example for wrapping the popular Log4j library is here. Here is a snippet of code that shows their encoding in action.


// ensure no CRLF injection into logs for forging records
String clean = message.replace( '\n', '_' ).replace( '\r', '_' );
if ( ESAPI.securityConfiguration().getLogEncodingRequired() ) {
    clean = ESAPI.encoder().encodeForHTML(message);
    if (!message.equals(clean)) {
        clean += " (Encoded)";
    }
}

As you can see, the class simply replaces all carriage returns and line feeds with underscores, then, if configuration dictates, HTML encodes the message. If the original message to be logged is different from the encoded version, a simple ” (Encoded)” string is tacked onto the end of the log entry to denote that it required encoding of some type.

Other notes

Another possibility when writing data to a log that may or may not be true in your environment is XSS. Though it doesn’t sound clear at first, consider this possibility. You log all of your data to a flat file, then your team, an auditor, system admin, or anyone comes along and opens your log file in a fancy web based log viewer. If XSS attacks are stored in the log, then the log viewer would then be susceptible to XSS. The best solution, in my opinion, would be to ensure your log viewer is protected from XSS. I feel that when you output data to a particular location, you should protect against the attacks specific to that location, ie log output should protect against log forging. However, it is helpful to be aware of issues like this to understand the full situation. Also note in the ESAPI code above, HTML output encoding is performed, and could help prevent this situation as well.

Wrap-up

So, as you can see, log forging is an issue that affects pretty much every J2EE application out there. It’s fairly simple to understand and fairly simple to solve. Hope this helps. Let me know if you found it useful, or if you have dealt with log forging differently.

Technorati Tags: , , , , , ,

The OWASP Top Ten and ESAPI – Part 2 – Injection Flaws

No Gravatar

This article will describe how to protect your J2EE application from injection (SQL and others) attacks using ESAPI. As with all of the detail articles in this series, if you need a refresher on OWASP or ESAPI, please see the intro article The OWASP Top Ten and ESAPI.

What’s the problem

Now to discuss injection. Again, let’s begin the the definition of injection from OWASP:
“Injection flaws, particularly SQL injection, are common in web applications. Injection occurs when user-supplied data is sent to an interpreter as part of a command or query. The attacker’s hostile data tricks the interpreter into executing unintended commands or changing data.”

Here I should make a quick note about “input”. While the vast majority of attacks are usually due to malicious user-supplied input, often directly through the browser, this is not necessary for the attack to work. Dangerous (whether malicious or not) could come from a variety of sources, like the filesystem, or the database, or even other applications. The rule is, if you take in input, always understand it could be dangerous. OK, back to injection …

If you look at injection alongside XSS, you’ll notice that they are very similar types of vulnerabilities in spirit. Consider this working definition: “The application takes in input and without proper data validation or encoding, outputs that data to a destination where certain values or entities have special meaning”. This could be describing XSS or Injection. In the case of XSS, our output destination (we’ll use OWASP’s “interpreter” term from here on out) happens to be a browser. For injection, it could be a database, LDAP server, operating system or any other number of interpreters. Each of the interpreters has its’ own semantics and processing rules for what it understands and what symbols and characters have semantic meaning as input.

For the browser, for instance, the bold tag has semantic meaning, which results in making certain text bold-faced.

<b>bold-faced</b>

As for a SQL compliant database, the keyword SELECT has semantic meaning in that it performs a read of some portion of the database.


SELECT first_name, last_name FROM employees;

As we saw previously in the XSS article, it is possible for a user to insert (inject) data that will cause issues if the input is not properly validated and/or encoded (my strong suggestion is to do both). The same thing is true with injection attacks. If the input is not valid for the interpreter, or more specifically contains some commands that will be processed by the given interpreter, then the user can cause the interpreter to perform functions on the user’s behalf that were not intended to be allowed by the application.

Let’s take a classic SQL injection attack as an example to clarify the issue. Here’s some simple Java code creating a SQL string to perform an authentication check so that a user can login. Let’s assume the Java code gets the results back and tests to see if there are any results, and if so, allows the user to be authenticated.


String sqlString = "SELECT * FROM users WHERE fullname = '"
    + form.getFullName() + "' AND password = '" + form.getPassword() + "'";

So, let’s assume the user is presented with a simple login screen, and let’s also assume the user is benevolent. The user might enter something like John Melton for the username and 123pass for the password. That would create the following SQL string to be executed against the database after it is processed by Java (ie. this is what you would see as the SQL string if you watched it through the debugger, or printed it out after the values have been populated at runtime).


SELECT * FROM users WHERE username = 'John Melton' AND password = '123pass'

OK, so far, everything is great right? Sure, for those specific inputs no problem. Now, to pay homage to a good friend of mine, I’ll include his oft-quoted refrain – “The web wasn’t made for Irishmen”.

Let me explain that. For those who are unaware, many Irishmen have names like O’Brien or O’Connor. Note the apostrophe in the last name. Now, let’s show what the SQL string looks like after being processed now.


SELECT * FROM users WHERE username = 'Paul O'Malley' AND password = '789pass'

Are we still good? Nope, this time, we’re not so lucky. What will happen here is a SQLException will be thrown because the SQL cannot be processed by the database interpreter. The issue is in the extra apostrophe in the name. The apostrophe is used in SQL, among other reasons, to open and close a string representation. So, what the SQL interpreter sees in this case is that the username string is populated with Paul O and then closed, and the actual SQL string starts again, only <b> is not valid SQL syntax, so processing stops and an exception is thrown.

(Note: I’m using interpreter generically. In the cases I’m describing, it actually would get booted by the parser, but that’s quibbling :>. )

All right, now that we’ve covered the basic “it works” case and also seen there is a way to break it, how can we break it to our advantage if we wanted to attack the site? Let’s say an attacker sends these two crafted inputs.


Full Name: blah blah
Password: ' OR '1' = '1

Update: modified code above according to Ben’s comment below.

Now that’s a funny looking string to be entered in the password field – let’s see what the SQL string produced looks like.


SELECT * FROM users WHERE username = 'blah blah' AND password = ' ' OR '1' = '1'

What the above SQL actually does is returns all records from the users table that use the given username and password OR where 1 = 1, which is … EVERY RECORD!!!. This grants the attacker access to the application as an authenticated user. This was an actual issue in a real (fairly well-known at the time) web application many years ago. It would check the results that were returned and take the first one as the user to log in. In the case where they were being attacked and all records were returned, it took the first result, which like most other systems first user, was the admin account! Most decent sites have closed this issue on the login screen thankfully, but it is a very prevalent issue in systems across the world.

The above is a simple example of what can happen with SQL injection. The vast majority of reported issues with injection have to do with SQL injection. The others are not as popular, probably because they’re less common types of systems (b/c almost every app uses a DB) and SQL injection is so easy and prevalent, there’s no real driver for attackers to do anything else. Additionally, most of the sensitive data is likely stored in the DB anyhow, so we might as well attack it directly if possible.

Where do we go from here? ….

So, now that we see the problem, how do we fix it? Well there are essentially 2 reasonably reliable techniques as I see it. The first option has to do with using PreparedStatements with bound parameters, and the second option uses the ESAPI framework.

Update: From Jeff’s (the chair of OWASP) comment below, it should be clear that PreparedStatements and the ESAPI output encoding mechanisms are not mutually exclusive. They can be complementary and used together for increased assurance. It is a task left up to the development organization securing their app to make a decision over which option (or both) to choose. As with most choices, each option (PS, ESAPI, or both) has its’ advantages and disadvantages.

Note: with either technique, white-list input validation should be used since it’s relatively easy to do, and helps avoid so many types of problems. Additionally, standard security measures like least privilege for the user account connecting to the DB for your application should also be utilized.

Before getting into the detail of the 2 options, I want to point out that the ESAPI folks have put together another article called the SQL Injection Prevention Cheat Sheet and it is a great resource about this topic, specifically SQL injection.

PreparedStatements (with bound parameters)
——————
Those familiar with DB programming in java will recognize the 3 statement types in Java: Statement, PreparedStatement and CallableStatement. These 3 types represent varying levels of security. The standard Statement class is not a good idea – there are no security measures built-in and you are on your own if you try and use this class. As for the CallableStatement, it is used with stored procedures and functions. *Typically* stored procs are secure. However, it is very possible to make them insecure by not validating the input properly or even constructing dynamic sql within the procedure for execution. Finally, there is the PreparedStatement. When used properly (with a safe driver and bound variables), these are typically viewed as the most secure option.

There are 2 important things to remember about PreparedStatements when it comes to security.
1. SQL executed with PreparedStatements is only as safe as the drivers that implement the classes. The actual runtime class that implements the PreparedStatement interface is part of a 3rd party package. There are some out there that are generally thought to be safe, and some that aren’t. The most popular ones expectedly tend to be more respected (not necessarily earned, but true nonetheless). The only suggestion I can give here is that you can look at the code of popular open source drivers yourself and make up your mind about their safety. If you use a commercial driver, understand the support structure in case there is a security issue that must be resolved.
2. The code leveraging the PreparedStatements must use them properly. This involves the use of strongly typed bound parameters. This means that if you concatenate your SQL string together using dynamic input, the PreparedStatement does you no good. The safety comes into play when using the setXXX methods on the PreparedStatement class. The snippet below shows how to properly use dynamic input (which should already be validated) as input to the SQL string by doing parameter substitution. The setInt method below (if the driver is coded properly) should be DB specific and will escape any necessary characters for that DB to prevent breaking out of the interpreter and allowing the user to specify code that will be run by the DB.


myPrepStmt = conn.prepareStatement("SELECT name FROM users WHERE id = ?");
myPrepStmt.setInt(1, validatedUserId);
...
//execute statement and get results

and a query with multiple input parameters looks very similar …


myPrepStmt = conn.prepareStatement
    ("SELECT name FROM users WHERE id = ? AND date_created >= ?");
myPrepStmt.setInt(1, validatedUserId);		//first param
myPrepStmt.setDate(2, validatedStartDate);	//second param
...
//execute statement and get results

As you can see, using the PreparedStatement can solve this issue, but only when used properly. One very important thing to note here is about ORM frameworks. Most of these frameworks *can* be used securely, but you have to know what you’re doing. Hibernate, for instance, can be used securely or insecurely depending on which classes you choose. Be sure to do your homework, read solid tutorials, pick good frameworks, and above all understand what you’re using when you let a framework do some of the work for you. OK, now on to our other option …

ESAPI
——————
ESAPI takes a similar approach to solving this issue, but the code will look slightly different. The approach is very simple, and boils down to escaping/encoding all dynamic input to the SQL statement using an appropriate encoder. For this section of the article, I relied heavily on the ESAPI section of the SQL Injection Prevention Cheat Sheet at OWASP, so it’s a worthwhile time to go read it if you haven’t already. Let’s look at one quick example to show how it works. We’ll modify the second query above to use ESAPI instead of PreparedStatements.


//ESAPI version of query
Codec ORACLE_CODEC = new OracleCodec();		//we're using oracle
String query = "SELECT name FROM users WHERE id = " +
   ESAPI.encoder().encodeForSQL( ORACLE_CODEC, validatedUserId)
   + " AND date_created >= '"
   + ESAPI.encoder().encodeForSQL( ORACLE_CODEC, validatedStartDate) +"'";
myStmt = conn.createStatement(query);
...
//execute statement and get results

As you can see above, all that we did was encoded the input data (already validated) for SQL by using the OracleCodec, since we are using the Oracle database for our application. This has some obvious drawbacks if you are using multiple databases, and can use the ORM features of cross-database query generation, but that is rare for many apps. Most applications know the 1 database they are going to be running on, and I would argue that’s a good thing from a security perspective because then you can better understand the security ins and outs of that one DB.

Looking at the code, it’s fairly simple to encode most anything – you just have to be diligent to do so. As for the other types of injection, you may have to get your hands dirty and write an encoder for your specific application, or you may not. You’ll just have to check what the ESAPI team has already done. Additionally, it might be possible to get on the mailing list and request a new codec. Certainly, if you have any that you’ve written, I’m sure the dev team would love to at least see them, and possibly include them – you could be helping others out!.

Clearly, using the ESAPI framework, it’s possible to safely and securely prevent injection from the DB as well as other types of interpreters if you use the framework properly.

In summary, I’ll say that while I think either of these techniques work from a purely technical perspective, there are a couple of things you should consider before implementing protection. First, PreparedStatements are far more prevalent in industry. They are a solid solution to the SQL injection problem, and are widely understood and available. They’ve also been very widely tested both for performance and security. Essentially, they are a known entity. However, they do not solve any of the other types of injection. The equivalent to PreparedStatements do not exist for XML or XPath injection or others. You’d have to handle them differently. As for ESAPI, it will also work well, and has been tested for security and performance, but not nearly as much. Additionally, it does not have the industry reach that PreparedStatements do. However, ESAPI has a solution for the other types of injection as well, and a clear model for implementing new encoding mechanisms for new interpreter types. You can even write your own custom encoders if you choose. No matter which you choose (or both), my hope is that you’ve seen there is a need to be responsible and protect resources from injection. These solutions make it fairly trivial to implement solid protection for applications against injection.

Other articles in this series:
Part 0: The OWASP Top Ten and ESAPI
Part 1: The OWASP Top Ten and ESAPI – Part 1 – Cross Site Scripting (XSS)
Part 2: The OWASP Top Ten and ESAPI – Part 2 – Injection Flaws
Part 3: The OWASP Top Ten and ESAPI – Part 3 – Malicious File Execution
Part 4: The OWASP Top Ten and ESAPI – Part 4 – Insecure Direct Object Reference
Part 5: The OWASP Top Ten and ESAPI – Part 5 – Cross Site Request Forgery (CSRF)
Part 6: The OWASP Top Ten and ESAPI – Part 6 – Information Leakage and Improper Error Handling
Part 7: The OWASP Top Ten and ESAPI – Part 7 – Broken Authentication and Session Management
Part 8: The OWASP Top Ten and ESAPI – Part 8 – Insecure Cryptographic Storage
Part 9: The OWASP Top Ten and ESAPI – Part 9 – Insecure Communications
Part 10: The OWASP Top Ten and ESAPI – Part 10 – Failure to Restrict URL Access

Update: Added inline comments referencing changes made regarding Jeff and Ben’s comments below.

Technorati Tags: , , , , , , , ,