What is it and why should I care?
While trust spawns interesting philosophical discussions, here I want to discuss the implications of trust within the applications we build. Trust is a funny thing in that we frequently give it implicitly, without considering what we’re trusting. A simple example:
//bad bad do not use
executeDbQuery("select * from my_table where id = " + request.getParameter("my_id"));
//bad bad do not use
Here we’ve said that we trust that the user of the application has not tampered with the my_id request parameter in any way that may cause problems for our application. Obviously this is a poor assumption. We can do better by moving the above query to a prepared statement with parameter binding to prevent SQL injection and we can also validate the my_id parameter for appropriate input, but why do we do that?
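To make that concrete, here is a minimal sketch of the safer version in plain JDBC. The class name, the method names and the "1 to 18 digits" rule are my own assumptions for illustration; your id format and data access layer will differ. The query text is taken from the example above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.OptionalLong;

// Hypothetical helper, not part of any framework.
final class MyTableLookup {

    // Whitelist validation: accept only 1 to 18 ASCII digits (an assumed rule),
    // which always fits in a long. Anything else is rejected outright.
    static OptionalLong parseId(String raw) {
        if (raw == null || !raw.matches("[0-9]{1,18}")) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(raw));
    }

    // Parameter binding: the id is sent as a typed value, never spliced
    // into the SQL text, so it cannot change the shape of the query.
    static boolean rowExists(Connection conn, String rawId) throws SQLException {
        OptionalLong id = parseId(rawId);
        if (!id.isPresent()) {
            return false; // invalid input never reaches the database
        }
        String sql = "select * from my_table where id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id.getAsLong());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }

    public static void main(String[] args) {
        // Only the validation step is exercised here; rowExists needs a live Connection.
        System.out.println(parseId("42").isPresent());
        System.out.println(parseId("42 or 1=1").isPresent());
    }
}
```

Validation and binding are deliberately both present: the binding protects the query even if the validation rule turns out to be too loose, and the validation rejects junk before it costs a database round trip.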
It’s because we don’t trust the input to our system. We don’t (and shouldn’t) trust that a user or system is going to use our application in the way we expect, or even only in the ways we’ve thought of (a good reason not to rely on blacklisting for security). We must build systems that are not only functional (use) but also stand up under attack (abuse) and ignorant usage. Our systems must be robust or, as some have called it, rugged. Whatever term you prefer, trust is explicitly or implicitly central to it. We can’t trust the environment.
If we can’t trust the environment, what does that mean? Does that mean we deal with XSS and SQLi? Yes, but much more than that, it’s a different way of thinking about the application. It becomes that simple picture of input-processing-output at varying levels of scope. A single request has inputs (request parameters, headers, database input, etc.), processing (authn/z, logic, etc.) and outputs (DB, screen, file, etc.). The application as a whole has inputs, processing and outputs that are essentially the combination of all the individual components of the application, and then you can scale on up to systems and organizations.
The “environment” I’m referring to changes depending on your specific situation, and it’s difficult to say that you simply can’t trust anything, because that’s usually a non-starter. You may have to trust your configuration files or your external SSO system, or any number of other entities. The idea is that you specifically label those things as trusted (an assumption) and treat everything else as being tainted.
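One way to make the "label what is trusted, treat the rest as tainted" idea visible in code is a small wrapper type. This is only a sketch under my own assumptions: `Tainted`, `TaintDemo` and the page-size rule are hypothetical names and rules, not something from a library.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical wrapper: holds a value from the untrusted environment and
// offers no getter, so the only way to reach the value is through a validator.
final class Tainted<T> {
    private final T raw;

    private Tainted(T raw) {
        this.raw = raw;
    }

    static <T> Tainted<T> of(T raw) {
        return new Tainted<>(raw);
    }

    // The validator returns Optional.empty() to reject the value.
    <R> Optional<R> validate(Function<? super T, Optional<R>> validator) {
        if (raw == null) {
            return Optional.empty();
        }
        return validator.apply(raw);
    }

    @Override
    public String toString() {
        return "Tainted[...]"; // never echo the raw value, e.g. into logs
    }
}

final class TaintDemo {
    // Assumed rule for illustration: a page size is an integer from 1 to 100.
    static Optional<Integer> pageSize(String s) {
        if (!s.matches("[0-9]{1,3}")) {
            return Optional.empty();
        }
        int n = Integer.parseInt(s);
        if (n < 1 || n > 100) {
            return Optional.empty();
        }
        return Optional.of(n);
    }

    public static void main(String[] args) {
        Tainted<String> fromRequest = Tainted.of("25");
        System.out.println(fromRequest.validate(TaintDemo::pageSize).isPresent());
    }
}
```

The point of the design is that the trust decision becomes an explicit, reviewable line of code (the call to `validate`) instead of an assumption buried in a string concatenation.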
These types of issues are considered in threat modelling, which is another planned topic in this series. For now, it’s sufficient to note that you should be thinking in terms of one question: what data am I taking in, processing and sending out?
What should I do about it?
Now that we’ve established the environment can’t be trusted, the next logical question is what constitutes the environment?
This could be a long answer depending on your setup, but a decent starting list for web applications in particular might look like the following:
- web request data (parameters, headers, body, cookies)
- database data
- directory data (ldap)
- filesystem data
- web service data (any data in headers or body)
- external system data (any data you receive from another system – software you’re integrating with)
- network connection data (any data you receive while acting as the “server” – generally socket-based communication)
- user input (command line input)
- system environment variables
- third party software (libraries that you call that provide you data)
This list is surely incomplete, but the idea is there. Any data you receive from any of these users or systems is generally untrusted, possibly with certain well-defined organization- or application-specific exceptions. When you start to view your applications in this way, you start to build better protections around them. You build better defences, and better logging/auditing so that you can detect when something actually does break (it will, I promise). Thinking in this way goes a long way towards helping you build safer and more secure systems.
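As a small illustration using one of the less obvious items on the list, here is a sketch that treats a system environment variable as untrusted: it is checked against a whitelist, rejected values are logged, and a safe default is used instead. The variable name `APP_MODE`, the allowed values and the default are all assumptions I made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical configuration helper.
final class EnvConfig {
    private static final Logger LOG = Logger.getLogger(EnvConfig.class.getName());

    // Whitelist of acceptable values (assumed for illustration).
    private static final Set<String> ALLOWED_MODES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("dev", "test", "prod")));

    static final String DEFAULT_MODE = "prod";

    // Returns the raw value only if it is on the whitelist; otherwise the default.
    static String modeFrom(String raw) {
        if (raw != null && ALLOWED_MODES.contains(raw)) {
            return raw;
        }
        if (raw != null) {
            // Log the rejection for auditing, but not the raw value itself,
            // since that would put untrusted data into the log.
            LOG.warning("Rejected APP_MODE value of length " + raw.length());
        }
        return DEFAULT_MODE;
    }

    public static void main(String[] args) {
        // System.getenv returns null when the variable is not set.
        String mode = modeFrom(System.getenv("APP_MODE"));
        System.out.println("mode=" + mode);
    }
}
```

The same shape (whitelist, safe default, audit log entry on rejection) applies to most of the other sources on the list; only the validation rule changes.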