The OWASP Top Ten and ESAPI – Part 3 – Malicious File Execution

This article will describe how to protect your J2EE application from malicious file execution attacks using ESAPI. As with all of the detail articles in this series, if you need a refresher on OWASP or ESAPI, please see the intro article The OWASP Top Ten and ESAPI.

What’s the problem?

So what exactly is malicious file execution? It could mean different things to different people. Here, I want to limit the discussion to the execution, either immediate or delayed, of files or file handles that can be manipulated in some way by user input. This is a fairly broad definition that can cover a few different situations. Let’s first discuss a few of the different types of issues that can come up. This collection is by no means exhaustive, but should give us decent coverage of the types of issues that arise when dealing with file input.

1. Included file name is partially or wholly determined by dynamic input (think dynamic include files)
2. File is uploaded and written to disk “as is” (no validation of the uploaded file – just write to disk)
3. File name to display or retrieve is partially or wholly determined by dynamic input
4. Command / data file is uploaded (think a batch process being kicked off with data from an uploaded spreadsheet)

So, what types of issues do these particular situations present us with? In general, an attacker can manipulate this process to gain increased privileges, access data for which they are not authorized, and even execute code. Since these four examples aren’t the only issues that exist, we won’t delve into each one in detail. Instead, we’ll try to extract some common problems that are present in one or more of the scenarios listed above. This set of problems is related to common file upload / processing situations.

1. Missing or insufficient input validation – This is almost always true with file processing. It is also a broad topic, since it deals both with the file name and path and with the file contents. Many times, this expresses itself as uploading a file as is and simply writing it out with a filename that is either the name of the file as uploaded, or a name specified by the user. This could cause issues like directory traversal (“../../etc/passwd” – overwriting the passwd file – sorry for the old cliche example), or overwriting other files in the filesystem (“../../jsps/login.jsp” – overwriting the login page). Something like this could allow the attacker to break the site significantly, up to and including owning the site and even the server it’s hosted on. While most file processing code I’ve seen is not bad enough to allow something like this, some of it is just that bad ... scary.

2. No virus scanning – This goes along with input validation to some extent, but if you are uploading a file that is going to be executed in any way at any time, it should be virus scanned. There are engines out there, even free ones, that you can call out to in order to scan the file. If the scan fails, the file should be deleted, and the incident should be logged as a security incident. At that point, the normal security incident response process in your organization can be executed.

3. No size checks – One of the simplest tasks you can perform, and one that is often overlooked, is to check the size of the uploaded file. There is most likely a reasonable limit to the size of your accepted files. For instance, a spreadsheet would have to have quite a bit of data to be any more than a few MB, but an attacker could upload files many times larger than that to try to fill up the file system, or just tie up processing on the server to block out other valid users.

4. Invalid file type processing – This task is a bit more difficult, but makes sense in many cases. If you are an image hosting site, there’s no reason to accept Word files. This should involve creating a whitelist of the types of files you will accept and then verifying that only those are uploaded. This does NOT mean file extension checks. Although these can be used as a superficial first test, they are trivial to change and provide no security assurance. Often times, this check will involve output encoding (see #7).

5. Giving too much control over file name input – This occurs when users are allowed to influence the final filename that is output to the filesystem. This could happen by allowing a user to type in the desired filename or by just accepting their filename and using that on the server side. However, it is much safer to generate the filename that is used to save the file to the filesystem. Generally, this would have some securely random string of characters, as well as the author, date, time, etc. If there is a specific requirement to accept input from the user as to what the filename should be, just ensure that the name given is validated against an acceptable whitelist before writing the file out.

6. Direct object reference (DOR) problems – While the DOR issue affects more than just filenames, it is relevant here so we’ll quickly discuss it. DOR is simply an issue where the actual filename is pointed to directly. This filename could be in a hidden field, stored in a cookie, or some other place that is not directly seen by the user on the screen. Typically the filename is taken directly and used to retrieve or execute the file. This could cause issues of broken authorization (attacker changes filename to access another user’s file), elevation of privileges (touching a file belonging to a user with higher privileges or the system user), or many other problems. This topic is covered in greater detail in the next article in the series.

7. No output encoding – This issue doesn’t always apply, since not every filetype can be thought of in this way. The basic issue is a file that is written out without being properly validated first. Several valid filetypes can carry extra information beyond their expected content. For some of those, the file can be read in and then written out again to ensure the “bad stuff” is not still there. This does not catch all of the “bad stuff” that those files can contain, nor does it apply to every file type, but it is useful when the option is available, even though few applications ever use it.

8. Not authorizing access – This issue exists because many applications contain reasonably appropriate authentication, but horrible, if any, authorization. Once a user is authenticated, many times they can perform any function in the application. The classic example of this is not restricting access to a certain page, let’s say the admin screen. The home page might not show the link, but if they type the link into the URL bar, there’s nothing stopping them from getting there. Depending on the type of site, this could be especially harmful with file upload. Often, only “trusted” people are expected to be able to upload files, but if the authorization is broken, it could allow anyone to upload files at will. Again, this depends on the type of site you have – it may be perfectly fine for everyone to upload files (like flickr or youtube).

Where do we go from here? ….

We’ve discussed lots of issues that can crop up when dealing with uploading files, so how do ESAPI and general best practices say we should deal with them? Let’s take them one by one.

1. Missing or insufficient input validation – As discussed in part 2 of this series, ESAPI makes it fairly simple to do proper input validation through the framework. The code is very simple, and extensible through the addition of new regular expressions. Here is an example of how ESAPI checks the filename of the uploaded file to verify that it is valid.

if (!ESAPI.validator().isValidFileName("upload", filename, allowedExtensions, false)) {
	throw new ValidationUploadException("Upload only simple filenames with the following extensions " + allowedExtensions, "Upload failed isValidFileName check");
}


2. No virus scanning – ESAPI does not directly support virus scanning through the base APIs, but several of the antivirus vendors support API access for Java. I’ve heard that there are also free AV tools that offer this feature as well.

3. No size checks – ESAPI has no dedicated validator for this. However, several of the web frameworks do provide support for it, as do most if not all file upload libraries. Lastly, it’s trivial to perform your own check even if your specific library does not support it directly: just load the uploaded file into a File object and call the length() method on it to determine the size of the file in bytes. ESAPI’s own upload handling relies on the Apache Commons FileUpload library to prevent large files. The following is a snippet:

ServletFileUpload upload = new ServletFileUpload(factory);
upload.setSizeMax(maxBytes); // requests larger than maxBytes are rejected
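
If your upload library has no limit of its own, the manual check can be sketched with nothing but the JDK. This is a minimal illustration, not ESAPI code: the class UploadSizeCheck, its method names and the limit you pass in are all made up for the example. It shows the File.length() check described above, plus a variant that counts bytes while reading, which I would assume is the better fit when the upload arrives as a stream and you don't want an oversized file buffered in full.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of ESAPI.
final class UploadSizeCheck {

    // Simple check for a file that has already been written to disk.
    static boolean isWithinLimit(File file, long maxBytes) {
        return file.isFile() && file.length() <= maxBytes;
    }

    // Reads the stream, but gives up as soon as more than maxBytes have
    // arrived, so an oversized upload is never held in memory in full.
    static byte[] readWithLimit(InputStream in, long maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Upload exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

Whatever limit you choose, treat a rejected upload like any other failed validation: discard what was read and log it.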


4. Invalid file type processing – This task is supported by the input validation portion of ESAPI again. First, the filename could be validated by the application. Then, each input from the file that is processed could again be validated by the application. Think of processing a spreadsheet. The filename should first be tested, then not only each line, but each cell should be checked for validity before being used by the application.
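
To make the spreadsheet example a little more concrete, here is a rough sketch of cell-by-cell whitelist validation in plain Java. It does not use ESAPI; the class RowValidator and the three column rules (an account id, an amount and a date) are invented for the example, and in a real application each rule would come from your own data requirements (or from ESAPI validation rules).

```java
import java.util.regex.Pattern;

// Hypothetical example; the column layout and patterns are assumptions.
final class RowValidator {

    // One whitelist pattern per expected column, in order.
    private static final Pattern[] COLUMN_RULES = {
        Pattern.compile("[A-Z]{2}\\d{6}"),         // account id, e.g. AB123456
        Pattern.compile("\\d{1,7}(\\.\\d{2})?"),   // amount, e.g. 19.99
        Pattern.compile("\\d{4}-\\d{2}-\\d{2}")    // date, e.g. 2010-05-01
    };

    // A row is valid only if it has exactly the expected number of cells
    // and every cell matches the rule for its column.
    static boolean isValidRow(String[] cells) {
        if (cells == null || cells.length != COLUMN_RULES.length) {
            return false;
        }
        for (int i = 0; i < cells.length; i++) {
            if (cells[i] == null || !COLUMN_RULES[i].matcher(cells[i]).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

A row such as {"AB123456", "19.99", "2010-05-01"} passes, while a row carrying a formula or script fragment in any cell is rejected before the application ever uses it.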

5. Giving too much control over file name input – In order to at least partially solve this issue, ESAPI has the SafeFile object. This is a trivial extension of java.io.File that additionally checks the file and directory names for unsafe characters. However, this is a blacklist check. It is likely that you could do a better job if you know what characters are allowed in your filenames and only allow those (whitelist), though the characters in the ESAPI blacklist should minimally be disallowed.
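
Here is a small sketch of both options discussed for this issue: generating the stored filename on the server, and whitelisting a user-supplied name when you really must accept one. It is plain JDK code, not SafeFile; the class UploadNames and the exact whitelist pattern are assumptions you should adapt to your own naming rules.

```java
import java.security.SecureRandom;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ESAPI.
final class UploadNames {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Assumed whitelist: 1-64 letters, digits, '_' or '-', followed by a
    // single dot and a short alphanumeric extension. No path separators.
    private static final Pattern SAFE_NAME =
            Pattern.compile("[A-Za-z0-9_-]{1,64}\\.[A-Za-z0-9]{1,5}");

    // Builds a name from 16 random bytes (32 hex characters) plus an
    // extension chosen by the server, never by the user.
    static String generate(String extension) {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex + "." + extension;
    }

    // Whitelist check for the case where the user's name must be kept.
    static boolean isSafeUserSuppliedName(String name) {
        return name != null && SAFE_NAME.matcher(name).matches();
    }
}
```

With the generated name, details like the author, date and original filename can live in a database row keyed by that name rather than in the filename itself. Note that the whitelist above says nothing about which extensions are acceptable; that still needs its own list.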

6. Direct object reference (DOR) problems – Again, this will be covered in greater detail in the next article, but here I will just mention that ESAPI does have a mechanism to hide DORs. This involves essentially creating a map of direct -> indirect references and then using the indirect references for display, then mapping back during the processing. Come back for the next article for more details on how this is actually done in code.
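
Until then, the idea of the map can be sketched in a few lines of plain Java. To be clear, this is not ESAPI's implementation, just my illustration of the direct -> indirect mapping; the class IndirectFileMap and its method names are hypothetical, and I'm assuming one instance is kept per user session.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-session map for illustration only; not ESAPI code.
final class IndirectFileMap {

    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> indirectToDirect = new HashMap<>();
    private final Map<String, String> directToIndirect = new HashMap<>();

    // Registers a real filename and returns the opaque token to display.
    String addDirectReference(String direct) {
        String existing = directToIndirect.get(direct);
        if (existing != null) {
            return existing;
        }
        String token;
        do {
            byte[] bytes = new byte[16];
            random.nextBytes(bytes);
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            token = hex.toString();
        } while (indirectToDirect.containsKey(token));
        indirectToDirect.put(token, direct);
        directToIndirect.put(direct, token);
        return token;
    }

    // Maps a token from the request back to the real filename.
    // Returns null for any token this session was never given.
    String getDirectReference(String indirect) {
        return indirectToDirect.get(indirect);
    }
}
```

The page only ever shows the token. When a request comes back, an unknown token maps to null and the request can be refused, so the user cannot reach a file that was never registered for them by editing a hidden field or cookie.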

7. No output encoding – There is no built in support for this in ESAPI. However, there are ways of doing this that *may* work for different filetypes. The most obvious one is images. If an image is uploaded, you can open the file, load the contents into an image object, then save that object. This guarantees that the image is a valid object. Otherwise, loading the image will fail. (Note: this is the expected behavior, though there have been some defects that may cause this not to work – ensure your library version functions correctly here). It may also be possible to do this with the help of some 3rd party libraries depending on the file type you are processing.
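
For the image case, the load-then-save approach might look like the sketch below, which uses the JDK's javax.imageio package. The class ImageReencoder is hypothetical, and always writing the result as PNG is just a choice I made to keep the example short. As noted above, treat this as one layer of defense rather than a guarantee, and test it against the library version you actually run.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration only; not part of ESAPI.
final class ImageReencoder {

    // Decodes the uploaded bytes into pixels and writes those pixels back
    // out as a fresh PNG. Data that no image reader can decode is rejected.
    static byte[] reencodeAsPng(byte[] uploaded) throws IOException {
        // Keep ImageIO's intermediate data in memory instead of temp files.
        ImageIO.setUseCache(false);

        // ImageIO.read returns null when no registered reader recognizes
        // the data, and throws IOException when decoding fails part way.
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(uploaded));
        if (image == null) {
            throw new IOException("Upload is not a decodable image");
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }
}
```

Store the returned bytes, not the original upload. Only the decoded pixels are written back, so anything else the original file carried is not copied over.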

8. Not authorizing access – ESAPI does have authorization coverage in the framework, but it deserves its own discussion. The last article in this series, on restricting URL access, will discuss the authorization capabilities of ESAPI. Additionally, CSRF (Cross Site Request Forgery) protections can help deal with this issue as well, and this is also going to be covered in a later article in the series.

I’d like to mention a couple of other helpful things that are more systems management related when dealing with uploaded files.

– Use a safe base directory when saving your uploaded files that is outside the standard web directory that your site is served from. This will force your application to be the only thing that can access these files. That way, an attacker can’t drop a file in an accessible directory.

– Use the file-system access controls that are offered in your operating system and possibly your web/application server. Ensure that only the appropriate users have privileges and that they have the least privilege necessary to do their jobs. This is a necessary stopgap in case your other protections fail.

– Audit properly. This can’t be stressed enough. If you audit and log properly, and then *READ* those logs, then you can notice when strange behavior occurs. This can help you locate things that need to be resolved.
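
On the first point above, a safe base directory only helps if nothing can be written outside of it. Here is a minimal sketch of that containment check using java.nio.file; the class UploadPaths and its method are hypothetical names of mine. One assumption to flag: normalize() works on the path text only and does not follow symbolic links, so if links could exist under your upload directory you would want to compare real paths (Path.toRealPath) as well.

```java
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of ESAPI.
final class UploadPaths {

    // Resolves fileName under baseDir and refuses any result that would
    // land outside it, e.g. via "../" segments or an absolute path.
    static Path resolveInside(Path baseDir, String fileName) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path target = base.resolve(fileName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException(
                    "Path escapes upload directory: " + fileName);
        }
        return target;
    }
}
```

A rejected name here is exactly the kind of event worth writing to the audit log.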

Hopefully this article has been useful in helping you recognize the types of issues related to file upload and some best practices for dealing with those. As in all the other articles, I recommend trying to be as safe as possible, and only having your app do what is necessary. If you don’t need to upload files, don’t do it. If you have to do it, spend time making sure it’s as safe as you can make it. Let me know if you have other suggestions or best practices you use with file upload.

Editorial Note: I realize it’s been quite a while since I posted any articles in this series. However, I plan to finish up the series in the next few months by posting an article every week or two now that I can spend a bit more time on it. I also realize that during that gap, the new OWASP Top Ten (2010) was released, and it supersedes the 2007 list. However, every issue in the 2007 list either still exists in the 2010 list, is covered by it, or no longer ranks that high but is still important. For these reasons, and the fact that I’d already started using the 2007 list, I’ll finish up with the 2007 list. I may go back at the end and catch any new ones from the 2010 list if relevant.

Other articles in this series:
Part 0: The OWASP Top Ten and ESAPI
Part 1: The OWASP Top Ten and ESAPI – Part 1 – Cross Site Scripting (XSS)
Part 2: The OWASP Top Ten and ESAPI – Part 2 – Injection Flaws
Part 3: The OWASP Top Ten and ESAPI – Part 3 – Malicious File Execution
Part 4: The OWASP Top Ten and ESAPI – Part 4 – Insecure Direct Object Reference
Part 5: The OWASP Top Ten and ESAPI – Part 5 – Cross Site Request Forgery (CSRF)
Part 6: The OWASP Top Ten and ESAPI – Part 6 – Information Leakage and Improper Error Handling
Part 7: The OWASP Top Ten and ESAPI – Part 7 – Broken Authentication and Session Management
Part 8: The OWASP Top Ten and ESAPI – Part 8 – Insecure Cryptographic Storage
Part 9: The OWASP Top Ten and ESAPI – Part 9 – Insecure Communications
Part 10: The OWASP Top Ten and ESAPI – Part 10 – Failure to Restrict URL Access
