I’ve already written a bit about frameworks, both about using others’ and about building your own. This post will look at using existing frameworks a bit more, specifically around interesting security features.
Most of the discussion in the security area as it relates to frameworks is about CVEs (Common Vulnerabilities and Exposures). That’s reasonable, since there is a readily available pool of data and it’s possible to build a point solution to address the issue with relative ease. This is evidenced by some great open source tools (such as OWASP Dependency Check and Retire.js) as well as several commercial tools.
However, there are many other issues with frameworks outside of CVEs. I’d like to look at one of those issues in this post and consider how it might affect your security posture, as well as what can be done to address it.
Frameworks vary in shape and size, but generally, they all represent the ideal of reusable code. If some code is going to be written more than once, we should generalize it and work from a common code base so everyone will benefit. This has huge obvious advantages for development, but also for security. It’s one place to focus our efforts, and one set of code to review and analyze.
However, having one codebase to analyze doesn’t mean that analysis actually happens (“given enough eyeballs, all bugs are shallow”, anyone?). I pulled some numbers from here about several popular frameworks and posted their estimated lines of code below.
Hibernate (1.1 MLOC)
Spring Core (1.25 MLOC)
Log4J (80 KLOC)
Lucene (500 KLOC)
Commons IO (30 KLOC)
Commons Lang (70 KLOC)
Struts2 (830 KLOC)
Spring Boot (130 KLOC)
It’s a safe bet that many Java projects are relying on millions of lines of code in their deployed application, not to mention the container/server and operating system they are running on!
So, we know frameworks have significant benefits from the development perspective, and we know they can have known vulnerabilities in them (CVEs), but what about other security issues?
The particular topic I want to look at is frameworks that have some sort of “magic”. There is lots of functionality that could be classified as “magic”. One common example is auto-binding of web forms, i.e., mapping request parameters onto a backing object of some type. This has bitten many frameworks repeatedly, but the functionality is desirable, so people keep building it. Another example might be dynamic finder methods in ActiveRecord. No matter what the magic is, it can usually be recognized by the response people give when they learn about it – usually some variant of surprise + (excitement | horror), depending on the person.
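To make the auto-binding risk concrete, here is a hypothetical, deliberately simplified sketch (not any particular framework’s actual implementation) of how reflective parameter binding works. Every request parameter whose name matches a field gets copied onto the backing object – including fields the developer never intended to expose, which is the classic mass-assignment problem:

```java
import java.lang.reflect.Field;
import java.util.Map;

public class AutoBindDemo {
    static class User {
        String name;
        boolean admin; // internal flag, never meant to be user-settable
    }

    // Naive "magic" binder: copies every request parameter onto a
    // matching field of the target object via reflection.
    static void bind(Object target, Map<String, String> params) throws Exception {
        for (Map.Entry<String, String> e : params.entrySet()) {
            Field f = target.getClass().getDeclaredField(e.getKey());
            f.setAccessible(true);
            if (f.getType() == boolean.class) {
                f.setBoolean(target, Boolean.parseBoolean(e.getValue()));
            } else {
                f.set(target, e.getValue());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        User u = new User();
        // The attacker simply adds "admin=true" to the form post:
        bind(u, Map.of("name", "alice", "admin", "true"));
        System.out.println(u.name + " admin=" + u.admin); // alice admin=true
    }
}
```

Real frameworks are more sophisticated than this sketch, but the failure mode is the same: binding is driven by attacker-controlled parameter names, so any field not explicitly excluded is reachable.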
I’d like to consider a few examples (of which there are many) that I would call “magic” in popular frameworks.
Struts2 offers a plugin architecture that allows any JAR bundled inside a WAR to have a special configuration file (struts-plugin.xml), and the framework will auto-discover and enable it. That means a JAR can self-configure and gains access to the same Struts2 capabilities that the hosting application has. This is an interesting feature that allows for modularity, but it will very likely catch both developers and security folks off-guard the first time they hear about it.
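As a hypothetical illustration (all names below are invented), a struts-plugin.xml is just a standard Struts2 configuration file, so a bundled JAR could register its own package and actions along these lines:

```xml
<!-- struts-plugin.xml inside some-library.jar (hypothetical example) -->
<struts>
  <package name="library-plugin" extends="struts-default" namespace="/plugin">
    <!-- This action becomes reachable in the host application
         without any change to the application's own struts.xml -->
    <action name="hidden" class="com.example.plugin.HiddenAction">
      <result>/hidden.jsp</result>
    </action>
  </package>
</struts>
```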
Note that the framework is pretty generous and gives plugins many extension points to override, including the ability to take over the core object creation functionality. This obviously produces many security concerns.
In the Servlet 3.0 spec, a new capability was introduced: web fragments (similar to the Struts2 convention plugin). The spec essentially allows any JAR bundled inside a WAR to have a special file (web-fragment.xml) that semantically gets copy-pasted into the web.xml. That means a JAR can now self-configure and expose endpoints (note: similar functionality is available via annotations). This is a significant change for many developers: simply by upgrading to the latest spec, your application gains this significant new functionality, and it is not necessarily heavily advertised. An example is below.
<web-fragment>
  <servlet>
    <servlet-name>welcome</servlet-name>
    <servlet-class>com.mysite.WelcomeServlet</servlet-class>
  </servlet>
  ...
  <servlet-mapping>
    <servlet-name>welcome</servlet-name>
    <url-pattern>/Welcome</url-pattern>
  </servlet-mapping>
</web-fragment>
Part of the significance of this issue is that it doesn’t really matter what frameworks you’re using. If you deploy WAR files, this could affect you.
The last example I want to cover is in Spring Boot. Spring Boot is a framework that is “opinionated”. It tries to provide some useful defaults that are helpful to get you going, then allow you to customize as needed. The framework emphasizes rapid development.
One of the very nice features it provides is the “actuator”. This feature brings in the great Metrics project and, by default, feeds data from the framework itself into the metrics. This is a fantastic feature if you’re in an environment such as microservices (one of the primary targets for Spring Boot), because there is rich machine-accessible data to drive automated decision making and reporting.
Note that this portion of the framework is off by default. However, many of the examples online, both in the official documentation and in other tutorials, configure the actuator for use. One reason many people may not be aware of it is that the only thing required to enable it is to pull in a dependency: add a Maven or Gradle dependency and Spring Boot auto-configures it.
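For example, with Maven, a declaration roughly like the following is all it takes (coordinates shown for illustration – check the Boot documentation for your version):

```xml
<!-- Pulling in this single dependency is enough for Spring Boot
     to auto-configure the actuator endpoints -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
```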
The framework provides a startling amount of information by default. One of the standard Spring applications (http://start.spring.io) is built with Boot, and exposes these endpoints to the world. Below I’ve listed most of the different endpoints along with some of the information they contain. It’s a significant amount of information. Consider this information being exposed in your own application.
– http://start.spring.io/trace – essentially the web server log with information about all clients connected (IPs/countries, etc.), cookies (may be useful for session stealing), referers, etc.
– http://start.spring.io/mappings – accessible urls – bad if you have any urls you don’t want exposed.
– http://start.spring.io/metrics – OS data – actually see the effect of DoS attempts as you go
– http://start.spring.io/info – dependency versions – helpful for checking against CVEs
– http://start.spring.io/env – tons of data, OS, filesystem – versions of installed software, more dependency info, info about the environment (cloud space id, instance running port, etc.)
– http://start.spring.io/autoconfig – what’s autoconfiged and why/why not
– http://start.spring.io/configprops – multipart file sizes, jmx info
– http://start.spring.io/beans – classes that are loaded
– http://start.spring.io/dump – thread dump
Important note: if you also add Spring Security, many of the exposed endpoints are automatically protected. However, not all Spring Boot projects will use Spring Security, and this protection does not extend to other security frameworks.
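If you do enable the actuator, it’s worth locking it down explicitly. As a rough sketch (the exact property names vary by Spring Boot version, so verify against the reference documentation for the version you run), a few entries in application.properties can reduce the exposure considerably:

```properties
# application.properties (property names may differ by Boot version)

# Move the management endpoints off the public-facing port
management.port=8081

# Or disable the sensitive endpoints outright, re-enabling
# only the ones you actually need
endpoints.enabled=false
endpoints.health.enabled=true
```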
Many frameworks have some amount of “magic”. Above I’ve listed a few examples I find interesting, but there are countless more. So, what should you do with this information? Here are a few concrete steps you can take:
1. Know frameworks – Understanding the features provided by the framework (both well-known and hidden) is an important prerequisite to using it. Using something you don’t understand almost always ends badly.
2. Assess frameworks – Evaluate frameworks from a security perspective. This could be done with both manual and automated review techniques, privately within an organization or in an open community. No matter the specifics, evaluating the framework is important. This is really a subpoint of knowing frameworks, but it is security-specific.
3. Manage frameworks – As an organization, use only “approved” third-party libraries, and only specific versions of those libraries. This step depends on 1 and 2. There is a wide range of technological solutions to this problem, but the idea is to ensure you know what you’re using and that you’ve decided (through a review of some defined, acceptable level of rigor) that what you’re using provides an acceptable level of safety.
Frameworks provide lots of functionality. It’s our job as secure software implementors to understand the tools we’re using and their safety properties so that we can use them properly and provide secure applications.
As part of wrapping up the past year-long series, I decided to put all of the posts into a single PDF. All of them will remain posted on the site, but you can grab all of the content in one convenient place.
Hope you enjoy!
Year Of Security for Java
This will serve as the conclusion to a year-long series on security topics for Java. Let’s first look at the original motivations from the series introduction.
There are several motivations for this series:
1. Get some old topics written down
2. Research some new technologies
3. Answer questions from Java friends
I can safely say that I’ve achieved all of these. I covered a pretty wide variety of topics along the way and noticed a few interesting trends:
– The posts that got the most reads were about technical controls, specifically those that were related to configuration settings or response headers. I’m not sure what this means, but it’s possible these were read more because they don’t specifically apply to Java and can be handled at the web server level or with a WAF, etc.
– The posts that got the most interaction (comments/emails/etc.) were process topics that are repeated constantly in the security echo chamber (audit, access control, threat modeling, security training). The interesting thing to me was that these topics were poorly understood by many commenters, and those “experts” who do understand them often have no hard data backing up their assumptions about the topics.
– As I worked through the year, the vast majority of topics I wrote about had less to do with Java specifically; they were security topics that applied across languages. This was actually to be expected. The unexpected thing to me was that many of the topics were referenced in the context of security, but are really just a specific use case of basic software engineering best practices. I knew good development usually spurs much better security, but writing on this many topics really drove the point home for me.
It’s my sincere hope that some, if not all, of these topics are helpful to you. I (mostly) enjoyed writing them and had some great discussions with folks along the way.
Below I’ve added the full list of links for all the posts from the series. Hope you enjoyed it!
What is it and why should I care?
Information security is a quickly growing field that is changing rapidly in many ways. We are tasked with securing all sorts of technologies and those technologies are moving quickly.
The implication here is that even to maintain the status quo requires significant work. However, we don’t want to just maintain the status quo – we actually want to improve. How, then, should we proceed?
What should I do about it?
We have to work hard to consistently improve. Technology is a field that requires a lot of effort to stay current, but that effort pays dividends with experience. I recently had a conversation with a colleague where we discussed how quickly things were moving, but how similar the technologies were, particularly when you look across platforms or deployment models. There are certainly differences about mobile from the web, but there’s a lot that’s the same. If you account for web and desktop, there are even more similarities. Similar ideas have been bandied about when discussing the cloud and mainframes.
I certainly understand that there are nuances to most technologies that make them unique or valuable in some way (else they don’t gain traction). However, it remains that exposure to different situations (experience) gives you a significant upper hand in this space. Below are a few thoughts about what I’ve seen to be successful with this approach.
1. Learn the fundamentals
This is the core of every good security person I know. Their quality is often a direct reflection of their understanding of core principles. This is logical when you consider that we make similar technological decisions repeatedly. A good understanding of what we do and why we do it is essential to being a good security person over the long haul.
2. Work with lots of things
Try to get exposure to different technologies, platforms, toolsets, development methodologies, risk analysis techniques, etc. The more you see, the more you can build a mental framework around which to base your decision making. You see the components for what they are and how they fit together – a powerful piece of information.
3. Build a nice toolset
A natural extension to having a good grasp of the fundamentals and getting exposure to different things is that you build a solid toolset. You may be a specialist (that’s great), but work with others and try to understand what they do. That knowledge lets you further understand your role in the process and gives you a way to add even greater value.
4. Look for novel solutions
Even though it’s rare, there are good new ideas that come along. Many of the best ones in technology were generated in the ’50s and ’60s, but good new things still come up all the time. Be on the watch for good ideas that can fundamentally improve how we secure systems. As one friend put it, look for things that make it cheaper for us to secure things than it is for the bad guys to break things.
In conclusion, security (along with technology) is quickly evolving in the particulars, but pretty steady in the fundamentals. In order to improve the overall security of our systems, we need to stay ahead of that curve. We can do that by having a solid understanding of the basics, getting exposure to different tried-and-true techniques and solutions, and then finding those new solutions that move us forward. Following these steps we can make sure that we are moving the field forward and that we never stop improving the security of our systems.
What is it and why should I care?
As I mentioned last week, this series is coming to a close. I also said that I have two concepts that I find myself sharing more than any others. The first, which I shared last week, was “Think”.
This week I’ll briefly cover the second topic … “Document Everything”.
Again, this is a simple thing, but it is usually not done, especially by programmers. There are exceptions to the rule, but programmers are stereotypically bad at documentation. We generally either a) don’t do it or b) do it once and never update it. The example I usually go to is this: think about some of the better-documented open source projects you’ve used (Spring, Grails, Bootstrap), and then think of some others that have poor documentation (won’t name them :>). Consider which ones you generally enjoy working with more and how much of a role documentation plays in that. Also, think about how long it took you to come up to speed on those frameworks and how long it might take you to catch up on version changes, etc. Documentation can make all the difference in the usability and desirability of a product.
What should I do about it?
Documentation has a reputation for requiring huge amounts of time and effort. Depending on what you are doing, that might be true. However, in practice, you can ease those burdens quite a bit if you follow a few simple steps:
1. Document as you go.
If you are writing comments for code, add them before coding or inline while you’re writing the code. Don’t commit the code without comments. If you’re writing end-user or other more formal documentation, write as you do the design, or while working on the component you need to document. The process of writing will help you clarify issues in your mind as well as root out areas of uncertainty that you might not have considered before.
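For Java code, “document as you go” can be as lightweight as writing the Javadoc contract alongside the method itself. Here is a small, hypothetical utility (the class and method are invented for illustration) where the documentation states the contract the code must satisfy:

```java
public class SlugUtil {

    /**
     * Converts a title to a URL-friendly slug.
     *
     * @param title the raw title; must not be null
     * @return the lower-cased title with runs of non-alphanumeric
     *         characters collapsed into single hyphens, and any
     *         leading or trailing hyphens removed
     */
    public static String slugify(String title) {
        return title.toLowerCase()
                    .replaceAll("[^a-z0-9]+", "-")
                    .replaceAll("(^-|-$)", "");
    }

    public static void main(String[] args) {
        System.out.println(slugify("Year of Security for Java!")); // year-of-security-for-java
    }
}
```

The point is not the slug logic; it’s that the Javadoc was written with the code, states the contract precisely, and is short enough that keeping it current is cheap.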
2. Write as simply and clearly as possible.
Don’t add fluff. Write down the steps that are necessary and stop. Over time, you may expand your documentation, but don’t start with lots of documentation that is unnecessary, especially since it’s more to maintain over time.
3. Get a good proofreader.
A good proofreader is invaluable for documentation. Your editor will help you with some dumb mistakes, but your proofreader will catch others. Much more than that, your proofreader will likely think a little bit differently from you and help you find areas of your documentation where you might have implicit assumptions or unclear statements and help point you to areas that need work. Try to team up with someone else who is writing and trade proofreading tasks – reading for others will also make you a better writer.
4. Update your documentation on a regular basis.
Stale documentation is often worse than no documentation, because it’s misleading to those who are trying to use it. You may be actively giving out false information. It’s a good idea to consider what level of freshness your documentation needs, and then set calendar reminders to review and update the documentation on a periodic basis.
As long as you follow these basic steps, you’ll have a good start on producing solid documentation, and you will improve over time.
Documentation is a powerful tool. The reason I started this blog was actually because I needed to document programming concepts that took me some time to figure out and I wanted to remember them and not have to re-learn them. Over time, folks started finding the site and letting me know that they found them helpful as well and requested I write on different topics. Some of these requests came from co-workers and others came from strangers on the Internet. That was the genesis of this series, in fact – writing down what is essentially an FAQ from folks over the years about various topics related to Java security.
In conclusion, there are two mantras I repeat to folks constantly – think and document everything. Thinking means you should be coming up with the right (or at least better considered) solutions and documenting them means you are preserving both the solution as well as your discrete thoughts for posterity. By using these in tandem, you should be able to perform at a higher level, and share your work in a more meaningful way.
What is it and why should I care?
With the current series coming to a close (wow, finally :>), I’m going to do a bit of wrap-up.
While all the posts in the series hopefully have something to offer, I’ve saved my 2 most oft-repeated pieces of advice for last. Actually, neither is specific to Java, security, or even technology. However, they are the two concepts I find myself sharing the most with others. I’m sure you will relate.
The first concept and focus of this post is … “Think”.
Sounds fairly simple and clear, but not done nearly enough. It’s not that people don’t assimilate information and spit out an answer – they do. The problem is that a) they often don’t consider all the variables and b) they don’t second guess the existing solution or consider if a better solution is even possible. I’m not suggesting that you have to re-think every decision you make, but if you’re making an impactful (time, money, people, etc.) decision, it’s worth your effort to consider the implications and make the best decision possible.
What should I do about it?
There’s a whole host of decisions that this can apply to. As a for instance, here are a few examples using other posts in this series:
– You need to build an application securely, so you create a threat model. Certainly feel free to reuse, but think about why those threat actors and attack vectors are there in the first place. Do they even apply to your application? Are there others that do apply that you’re missing?
– You are storing user passwords, and need to make sure you’re doing it securely. Well, don’t just grab the library someone used on another project (without a good reason). Think about the actual requirements you are trying to fulfill, and evaluate solutions based on that.
– You are tasked with architecting a reusable authentication and access control solution for your organization. Consider the requirements that you know of right now as well as those that are coming down the road (as best you can). Consider the needs of various stakeholders. Another organization’s solution may or may not be right for you.
– You perform a code review on an application that is using a new-fangled data storage and query solution. You’ve never seen it before, but it looks a lot like NoSQL. Consider the security issues related to NoSQL (authentication, access control, injection, data management, etc.) and extrapolate that these issues likely apply to the new solution. Also look for differences, and consider what issues might crop up because of the deltas.
I could make a really long list here, but I think the basic idea is clear. In order to be effective, you must give thought to why you are doing what you are doing. There’s a lot of security advice floating around (I gave a lot this year), some good and some bad. However, even the good advice must be tailored to your specific needs. A solution might apply to 99.5% of people out there, but your organization may not be one of them. That’s ok, but solving the problem for you means a) you don’t blindly accept the advice as law, and b) you know enough about your environment to decide what the correct solution is.
There are many problems that exist (and most that get solved) in security and elsewhere that are generally resolved by considering the problem and choosing a well-known solution from your toolbox. That’s logical because many of us encounter similar problems and come up with similar solutions, contributing to the communal mind (think OWASP documentation and projects).
However, there are some issues and problems that require a bit more work to solve. You start hearing things like “think outside the box”, be a “creative thinker”, and a whole host of other catchy phrases. The point is that the answer hasn’t been easy enough or popular enough to be solved (or solved properly/completely) yet, so you must put some real thought into developing a solution.
I relatively recently found an old talk on creativity by John Cleese (brilliant). One of the ideas he puts forth is that in order to be creative, one must have time set aside on a regular basis to dedicate to being creative. You have to give yourself the window of time. You can’t expect to be in meetings all day, and suddenly catch an epiphany in the break room. It just doesn’t work that way for most people.
Unfortunately, after working as a developer and security person for the last 10 years, I’ve seen very few people set aside any sort of dedicated time to “think”. The only common exception I’ve seen is people who are going back to school after work and therefore have a reason outside of work to study and think hard about certain things. Apart from that, though, I’ve not seen much in the way of thinking hard thoughts. Development is bad (the exception would be design work), but security is worse. To an extent, the industry is currently so bad at security that we’re still dealing with low-hanging fruit. When we don’t have to try that hard, often we won’t.
I have a theory I’ve built over time that places people in 4 rough buckets with regards to their workplace contributions and abilities. I’m specifically talking about dev/security folks, but the same is probably true elsewhere. I’ve tagged them with musical monikers for clarity (or confusion :>).
1. Tone-deaf – These are people that don’t try or are in the wrong job. It would be better both for them and you if they were not on your team (and if you didn’t have to listen to them).
2. Knows Basic Chords – These are people who can learn a task and repeat it properly. In order to move to a new task, they need to be trained on that task, but they will then perform it well. They can be solid contributors, but are not your “idea guys/gals”.
3. Knows Scales – These folks can learn concepts, and apply them in various situations. They’ll often put out ideas that start with “what if we tried …?”. They have the ability to learn new things on their own and apply previous lessons learned to new situations. Folks like this often thrive in multi-disciplinary environments. They can take the abilities they have as a software security person or a plumber or a pilot and apply them to solving problems in other fields. This is the highest level most people will achieve.
4. Virtuoso – This is what a “thought leader” should mean (heavily abused term in security given those that are called that). These folks are rare … very rare. They come up with entirely new concepts. They develop ideas from scratch. In practice, and in retrospect, it’s usually a combination of a couple things that I’ve seen when working with these folks. They are a) knowledgeable about a lot of things (places to draw ideas), b) their solutions are elegantly simple (frustratingly so), and c) their solutions are further out (bigger) than everyone else’s. If you find one or more of these people in your organization, do what you can to keep them happy and learn from them.
Note 1: These buckets have nothing to do with formal education. There’s no implication that a higher level of education correlates to a higher level of capacity. However, more education might mean a broader and/or deeper set of specific skills.
Note 2: These buckets are not meant to imply that people can’t move from one bucket to another – they can and do. They often move buckets when they become more or less interested in the work they’re doing.
Note 3: There’s immense value to be found with people in each bucket – it’s just a matter of applying their capabilities properly.
In conclusion, one of the things I find myself recommending most often is “think”. Don’t just do something, think about why you are doing it and if that’s what you should be doing. This concept is universal and is not restricted to security or even technology in general. Part of thinking is being creative and doing the hard work of problem solving. There are ways to work towards bringing out your creativity and you can exercise them and get better at them. Lastly, people generally fall into a few buckets when it comes to their “thinking” tendencies at work. Recognizing these traits and using people to their strengths can help you get the most out of your people.
What is it and why should I care?
Today’s topic is about two of the areas that are weakest in application security – data collection and sharing. We do a pretty terrible job as an industry in both areas, though there have been some marked improvements in the last couple of years that bring hope.
While there is no confusion around what data collection and sharing means in general, there is a lot of disagreement about both topics in specific areas. Let’s briefly define both here for clarity:
Data Collection: The gathering and storage (collection) of specific points of interest (data).
Data Sharing: The distribution (sharing) of collected points of interest (data) with interested parties.
Both of these definitions are intentionally broad. IMHO, the issues brought up about what constitutes data vs. information (collection) and who gets the data (sharing) are fruitless when you consider how very little data we’re talking about to start with. If we get to the point of broad data collection and sharing (a different, better problem), we can then address the needs of the community as far as standardizing what to share and with whom to share it.
The lack of data collection and sharing in any industry essentially means that you are unaware of what others are seeing and doing. That can be particularly challenging in security as we all share a common resource (the network) and we all are using a relatively small subset of tooling to perform very similar tasks. In many cases, data that comes from one organization could be helpfully utilized by another. This applies to our industry much more than in some other industries with wider variances on tools and processes. The reverse is also true: sharing positively benefits our industry more quickly and to a greater degree specifically because there is so much commonality.
What should I do about it?
My basic hope is that you look for ways to share your data. We all benefit from it. We are theoretically a science and engineering based field, but have a rough track record of sharing actual data to support our hypotheses. However, we do have some shining examples, and that should give us both hope and motivation to get better. On the collection side, we can look at folks like Etsy. They decided that data collection would be a central part of their DNA and invested engineering resources into building tools for data collection and monitoring – a great success story. For data sharing, there are a few popular ones, like the Veracode State of Software Security Report, the Verizon Data Breach Investigations Report (DBIR) and the WhiteHat Website Security Statistics Report. (Full Disclosure: I currently work for WhiteHat) All of these are great examples of organizations sharing the data they see for the benefit of the community. One other great example is that of Security Ninja sharing three years’ worth of data that he’s collected. He also makes a poignant point in that article – “If you have the data don’t hide it”. Note: He’s speaking of sharing with internal teams – an extremely valuable (and nearly as uncommon) form of sharing in addition to sharing publicly.
As I mentioned above, we have some bright spots to give us hope that we can do better. Now let’s touch on a few points you should consider before sharing, to make sure you’re doing your due diligence.
Stay Legal and Compliant
Certain organizations can’t share certain data. That’s a legal and regulatory reality. In general, security practitioners have been fairly conservative with what data is shared because it tends to be sensitive; however, sharing is becoming more commonplace, and that seems to be helping us all get a handle on the reality in practice. Just because you may have some restrictions on what or how you can share doesn’t mean those restrictions are completely limiting. A good example is the FS-ISAC, which shares data within the financial services industry, allowing similar organizations with similar concerns to share their data in a controlled environment. In short, make sure you are allowed to share something before you share it.
Anonymize the Data
The data you share may or may not have privacy-related issues. If it does, you have to make sure you anonymize the data. No one’s private information should ever be shared publicly, especially when the data that is desirable to share is not affected by the private information at all. Make sure you take care of your users and customers and don’t share anything you shouldn’t. There are lots of tools that can help with this process if you need them.
Try to Share Raw Data
As much as possible (making inherent inferences is difficult to avoid, even pre-collection), you should try to stick to sharing raw data (anonymized of course). That way, others can analyze your data and evaluate (support or contradict) your conclusions. For instance, my recent post about password storage referenced a great spreadsheet considering the cost tradeoffs to attackers and defenders for password protection schemes. This raw (even generated) data gives others a way to make their own analysis and makes us all more aware of the actual data.
In conclusion, we discussed a weak spot for the application security industry: data collection and sharing. While we have historically been pretty bad at this, there are some bright spots and we’re starting to see both collection and sharing happening on a broader scale, which is hopeful. Following a few simple due diligence tasks, we can make sure that we’re sharing safely and can help the industry as a whole move forward.
What is it and why should I care?
You will get hacked. That is not meant to be a sensationalist line, but rather a functional reality in the environment we currently occupy. There are a few reasons I feel safe in stating that assumption:
– Many have already been openly hacked, including those that are well known for being “more secure” and even those that are security vendors. Also, I know of many companies that have been hacked and have not gone public, and I’m sure that’s the case for many others.
– Information security is a relatively young field, with weak solutions compared to other “security” fields.
– The pace of change in the technologies we protect far outstrips the pace of our solutions. New technologies are coming online faster than we can secure them, and their security is no better (and often worse) than what we already have.
I should at least define what I mean by “hacked” in order to clarify my original point. I mean that there is some successful attack against your systems. That could be a data breach through a web service, a website defacement, an insider stealing data, a backup disk or laptop getting lost, hardcoded keys being found on your hardware, or any other of a number of possible successful attacks you’ve heard about recently. There are lots of ways you can lose, and as the old adage goes, it only takes an attacker one successful attempt, and you have to successfully defend against all of them.
If that sounds bleak, it should. However, I’m not suggesting we throw up our hands and quit, but rather that we plan for the eventuality that an attack will succeed. Banks have been robbed for as long as they have existed. They don’t shut down and go out of business because they know they will be broken into; instead, they put security controls in place, they monitor, they work with law enforcement, and they use lots of other techniques both to prevent an attack in the first place and, when necessary, to recover from a successful one.
What should I do about it?
If we work from the assumption that the successful attack will eventually occur, it changes our mindset a bit. We now don’t just have to work towards preventative security, but we also have to think about recovery, or incident management. There are many components to a successful program to manage security when you plan ahead for successful attacks, and here are just a few for consideration:
Disaster Recovery Planning
While disaster recovery planning is traditionally associated with natural disasters, there are significant benefits to performing this exercise for continuity (DR is also known as business continuity planning) in the wake of an attack. Tasks of interest here include ensuring you have a solid backup plan for both your systems (servers, network, etc.) and your data. For larger organizations that can afford it, this could mean having additional hardware at an off-site location, generally a significant distance away (far enough that the same hurricane, tornado, or flood won’t affect it), that can be brought online if the primary site goes down. If you are “in the cloud”, you should build your applications with the capability to fail over to different “regions”, effectively accomplishing the same goal on rented hardware.
Devops
Devops is all the rage these days, and has numerous benefits. In the context we’re discussing today, one benefit worth noting is that updates and fixes can be pushed to production quickly and safely. You have processes in place that allow code to be built, tested, reviewed, and pushed extremely quickly, and you also have the capability to roll back updates when needed. These are critical capabilities if you want to stay functional while under attack. If you found out you had an injection flaw that could be (and was being) actively exploited, would you rather shut the site down until further notice, leave it up and vulnerable, or fix it quickly, push out the update, and keep the rest of your site running?
Application Layer Intrusion Detection and Prevention
I’ve already written about this a couple of times before, but will provide a brief recap here for the relevant portions. This is an area where intrusion detection and prevention really shines. If we know we’re being attacked, and we are keeping track of our users and the “bad” things they are doing, we can take some actions to protect ourselves. One of the side effects of knowing people are attacking you and tracking those users is that you start to think about how you’d like to be able to respond. For example, with AppSensor, you can respond by disabling the user account, notifying an administrator of nefarious activity, or you can block resources within the application. The capability to block resources is interesting for our context. Let’s say you have a management console that allows you to do this, and you find out there’s a vulnerability within your application, but it only resides on a certain page. With this capability, you can just block access to that page/function, and leave the rest of the site up. That will give you time to get the fix deployed, and then you can re-enable the feature.
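The block-a-single-resource idea above can be sketched in a few lines. This is a hypothetical illustration, not the actual AppSensor API – the `ResourceGate` class and its methods are invented for this sketch – of a gate that a request filter could consult before dispatching to a page:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-resource blocking: a management console would
// call block()/unblock(), and a request filter would consult isBlocked()
// before dispatching to the affected page, leaving the rest of the site up.
public class ResourceGate {
    private final Set<String> blockedPaths = ConcurrentHashMap.newKeySet();

    public void block(String path)   { blockedPaths.add(path); }
    public void unblock(String path) { blockedPaths.remove(path); }

    public boolean isBlocked(String path) {
        return blockedPaths.contains(path);
    }

    public static void main(String[] args) {
        ResourceGate gate = new ResourceGate();
        gate.block("/admin/report");          // vulnerability found on this page only
        System.out.println(gate.isBlocked("/admin/report")); // true: page is gated off
        System.out.println(gate.isBlocked("/home"));         // false: rest of site stays up
        gate.unblock("/admin/report");        // fix deployed, feature re-enabled
        System.out.println(gate.isBlocked("/admin/report")); // false
    }
}
```

A real implementation would wire `isBlocked()` into a servlet filter and return an error page for gated paths; the point here is just how small the core mechanism is.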
Incident Management Team
This is more of a resource issue, but can greatly improve your ability to respond. Depending on the size of your organization, this could be part of a person’s role, or the responsibility of a large group. By focusing effort here and building out a team to handle incidents, you can have focus and move towards better planning, root cause analysis, security controls, monitoring, etc.
Bug Bounty Program
Let me be clear that I would not suggest this for organizations that have either a) policy or regulatory concerns and consequences, or b) immature security groups.
With those caveats, anecdotal evidence from organizations that have tried this suggests it can be very effective. You intentionally open your applications up to attack, and reward those who find issues and disclose them “responsibly”, as defined by you. You are essentially getting a lot of cheap testing (though it’s in prod). There are a lot of concerns to consider here, but it does guarantee a steady stream of attack activity against your application to learn from.
Normal Security Program
The above are a few ideas that I haven’t necessarily discussed in this series but that come in handy when thinking about dealing with successful attacks. They all still fit into your normal security program and planning process, and none of them comes at the exclusion of anything in your existing program. Devops does not win out over code review; they are complementary, and both should be used. In fact, in most cases you’ll find that doing one is either a requirement for or a benefit to the other. Continue doing the things you’re already doing, and simply consider these additional program components.
In conclusion, at some point, your applications and systems will get successfully attacked – it is only a matter of time. It is wise to recognize that and deal with the situation in advance. With solid planning, you can put in place controls that include prevention, monitoring and detection, response, and recovery. These controls should be added to your overall security program in order to further strengthen your overall capabilities.
What is it and why should I care?
Encryption (specifically, symmetric encryption here) is a critical component of many applications, and storing the encryption key can be tricky to get right. Encryption falls into that area of secure programming that you don’t come into contact with casually, so you might not be practiced in secure key storage. You might stumble into SQLi because it’s common to use a database, or XSS because you’re programming for the web, but you usually only encounter encryption when you set out to use it explicitly. There are a couple of extremely common uses of encryption in everyday programming: encrypting application data (credit cards, SSNs, etc.) and encrypting system account credentials (a username/password pair for connecting to a DB or web service). Both cases require the keys to be accessible to the application at runtime in order to function properly, so we have to figure out how to solve that problem correctly.
In addition to being less common than other security issues (SQLi, XSS), there is generally less information available about how to accomplish this task properly. To be fair, that’s not necessarily because of its less common nature, but because there is no one right answer for how you should protect your encryption key – there are various options, and your specific circumstances will at least partially dictate the solution. It’s more nuanced than “use prepared statements and no dynamic SQL!”. While there’s no single right answer, there are some common good and bad approaches, which I’ll attempt to address.
What should I do about it?
Before looking at solutions, I want to briefly address 2 base assumptions I’ve made which are worth considering before digging in deeper:
1. Store as little sensitive data as needed
You should only store the sensitive data that you must. If you can throw it away, then by all means throw it away. There are lots of factors that go into this decision, but from a security perspective, you can’t lose data you don’t have.
2. Use well-vetted implementations of strong encryption algorithms in the proper configuration
There are 3 subpoints here:
– Strong encryption algorithms: Check NIST for the most up-to-date list of acceptable encryption algorithms. A simple example is that you should no longer be using DES for encryption, but AES is currently ok (as of late 2012)
– Proper configuration: Again, check NIST. You should be using the appropriate mode and padding for your encryption algorithm in order for it to be more resistant to cryptanalysis. A simple example here is that you should not be using ECB as a mode – it is not safe.
– Well-vetted implementations: You should be using algorithms and configurations that are approved, but the implementation should be well-known and reasonably vetted. The level of rigor for the vetting will differ depending upon your environment, but in general, java programmers are usually safe with either the built-in implementations from Oracle, or using BouncyCastle, a popular crypto library.
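As a concrete illustration of those three subpoints, here is a minimal sketch using the JDK’s built-in provider (a well-vetted implementation), AES (an approved algorithm), and GCM (an authenticated mode, chosen here as one example of a proper configuration). Key generation and storage are deliberately out of scope for this snippet:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Sketch: AES in GCM mode via the JDK's built-in provider, with a fresh
// random IV per message. Illustrative only; not production key management.
public class AesGcmDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[12];                 // 96-bit IV, standard for GCM
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal("4111-1111-1111-1111".getBytes("UTF-8"));

        // decrypting requires the same key and IV; GCM also verifies integrity
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] plaintext = cipher.doFinal(ciphertext);

        System.out.println(new String(plaintext, "UTF-8"));
    }
}
```

Note what is absent: no hand-rolled crypto, no ECB, and no key baked into the code – where that key lives is exactly the problem the rest of this post addresses.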
Now that we’ve considered our assumptions, let’s look at a few things you should not do when considering how to store your encryption key:
– Don’t store your key with the data it is protecting.
You shouldn’t store your key together with your data, because if the attacker gets one, they likely have the other – and then the attacker has nothing but time between them and your data. Two common ways this shows up are the key stored in the database alongside the data, or the key stored in a flat file with the username and password it’s protecting. This is a bad practice.
– Don’t store your key on the same host as the data it is protecting.
This is very similar to the previous point, but also covers the case where the key is in one file and the username/password in another. If both are on the same server, it’s more likely the attacker will be able to get to both. Likewise, if the DB is local to the app server along with the key file, the likelihood increases that both could be grabbed at the same time for an offline attack.
– Don’t rely on security by obscurity.
Encryption key storage is an area where I see obscurity used more often than in other places in applications. People try to do clever things like break the key up across several files or obfuscate the key in some way, and then reverse it out in the application. I’m not saying these tactics couldn’t increase the security marginally – they might depending on the attack vector. However, they also usually complicate processes like key rotation and data re-keying. My vote is to stick to simpler solid practices. If you do go this route, make sure you have a complete solution that takes into account your full key lifecycle.
We’ve looked at a few bad techniques, now let’s consider some good practices.
– Separate key and protected data
I’ve already covered this above with the bad practices, but making sure the encryption key and the data it is protecting are separate is certainly a good practice. Make sure it’s separate across tiers as well.
– Lock down your permissions
Make sure that your account credentials are locked down appropriately, including DB, filesystem, web service calls, etc. Ensure that you are using strong authentication and access control as well as least privilege. For instance, there’s usually no reason for the encryption key to be writable by an application – read-only makes sense in this case.
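On POSIX systems, the read-only point can be enforced directly on a key file. Here is a minimal sketch using the standard `java.nio.file` API; the temp file stands in for a real key file, and a POSIX filesystem is assumed:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch: restrict a key file to owner read-only, so the application
// account can read the key but neither it nor group/other can modify it.
public class KeyFilePermissions {
    public static void main(String[] args) throws Exception {
        Path keyFile = Files.createTempFile("app", ".key"); // stand-in for the real key file
        Set<PosixFilePermission> ownerReadOnly =
                PosixFilePermissions.fromString("r--------");
        Files.setPosixFilePermissions(keyFile, ownerReadOnly);

        // confirm the effective permissions
        System.out.println(PosixFilePermissions.toString(
                Files.getPosixFilePermissions(keyFile)));
        Files.deleteIfExists(keyFile);                      // cleanup for the demo
    }
}
```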
– Consider various storage options.
There are a multitude of ways you could store your encryption key depending on your needs: a keystore (e.g., a Java keystore), an HSM (hardware security module), a flat file, a DB, etc. You might also have the key manually entered by operations personnel at application startup. That generally puts key storage out of band, though it does add runtime requirements you must live with; still, it may be the right option for you. Again, not all of these will work for everyone, but it’s good to know they’re available so you can see which solution fits your environment.
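As one illustration of the keystore option, here is a minimal sketch that stores an AES key in a password-protected JCEKS keystore and reads it back. The in-memory round trip and the alias/password are purely for illustration; a real keystore would live in a locked-down file outside the web root:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import java.util.Arrays;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Sketch: a password-protected JCEKS keystore holding a secret key.
// (The older JKS format cannot store secret keys, hence JCEKS here.)
public class KeystoreDemo {
    public static void main(String[] args) throws Exception {
        char[] storePass = "change-me".toCharArray();       // hypothetical password
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();

        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, storePass);                           // initialize an empty store
        ks.setEntry("data-key", new KeyStore.SecretKeyEntry(key),
                new KeyStore.PasswordProtection(storePass));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, storePass);                           // would be a file on disk

        // later, at application startup: load the store and recover the key
        KeyStore loaded = KeyStore.getInstance("JCEKS");
        loaded.load(new ByteArrayInputStream(out.toByteArray()), storePass);
        SecretKey recovered = (SecretKey) loaded.getKey("data-key", storePass);

        System.out.println(Arrays.equals(key.getEncoded(), recovered.getEncoded()));
    }
}
```

Of course, this pushes the problem back one level – the keystore password itself now needs protecting – which is why options like HSMs and startup-time manual entry exist.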
– Plan for key rotation and re-keying data
Whether it’s driven by internal or external requirements, a data breach, or just good practice, you should be doing key rotation and data re-keying. A key should have a general lifecycle that takes it from generation to destruction. By following this process, you shorten the window for which the key is useful, and you have a recourse if you are breached. The hope during a breach is that either the data or the key is captured but not both. When planning for this step, be sure to consider issues like disaster recovery and backed up data that may need to be retrieved. Those processes should play into the key lifecycle as well.
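The re-keying step itself boils down to decrypt-with-old, encrypt-with-new. This minimal illustration uses AES-GCM and omits the key-version bookkeeping a real migration would need so that partially migrated data stays readable:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Sketch of re-keying one record: decrypt under the retiring key, then
// re-encrypt under the new key with a fresh IV. A real rotation would loop
// over all records and track which key version each one is under.
public class RekeyDemo {
    static byte[] crypt(int mode, SecretKey key, byte[] iv, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(mode, key, new GCMParameterSpec(128, iv));
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        SecretKey oldKey = kg.generateKey();
        SecretKey newKey = kg.generateKey();
        SecureRandom rnd = new SecureRandom();

        byte[] oldIv = new byte[12]; rnd.nextBytes(oldIv);
        byte[] record = crypt(Cipher.ENCRYPT_MODE, oldKey, oldIv,
                "123-45-6789".getBytes("UTF-8"));

        // rotation step: plaintext exists only transiently in memory
        byte[] newIv = new byte[12]; rnd.nextBytes(newIv);
        byte[] plain = crypt(Cipher.DECRYPT_MODE, oldKey, oldIv, record);
        byte[] rekeyed = crypt(Cipher.ENCRYPT_MODE, newKey, newIv, plain);

        // the old key can now be destroyed; the new key recovers the data
        System.out.println(new String(
                crypt(Cipher.DECRYPT_MODE, newKey, newIv, rekeyed), "UTF-8"));
    }
}
```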
– Know your policies, standards and regulations
Depending on your industry or even just your organization, you will have both internal and external requirements that you must account for. For instance, if you process and store credit cards, you have to live with PCI DSS. These standards levy certain requirements on you, and those may impact your key management process. Be aware of what they are and make sure you abide by them when you build out your plan.
In conclusion, encryption key storage is not a simple, clear-cut security issue. It is nuanced, and the appropriate solution depends on your policy and regulatory environment. However, there are some well known good and bad practices you should consider before coming up with a solution. With attention to these details, you can come up with a solution that fits your needs and provides solid security as well.
What is it and why should I care?
Note 1: I’ve actually wanted to finish this post for quite a while, but every time I tried, I would do some more research and find more rabbit holes to enter. At this point, I’m going to cut my losses and post what I have now. Unfortunately, any solution on this topic inherently has weaknesses – it’s shades of gray even among the good solutions.
Note 2: This post is about user passwords, not system account passwords. That is a different issue with different requirements, and therefore different solutions.
Password breaches have been unusually common over the last year or so, hitting many large and popular companies. What’s been particularly disheartening is the weak protections applied to passwords in most of these cases. Several have been protected with a simple md5 or sha-1 hash. We’ve been able to hash our password locally and compare it to the publicly available list and see our hash right there in the list. That’s not a pleasant feeling for anyone, much less those of us in security. But … what would we tell them to do better?
I’ve been looking at this issue for some time, and talking with various folks about what solutions they generally recommend. I’ve heard a handful of different ideas, some better than others, but all with weaknesses. At the end of the day, the best solutions recognize they are not perfect, and compensate for that. That is the course of action I would recommend.
What should I do about it?
There are a lot of different ideas about how to store passwords securely (just check the references :>). That said, most fall under a basic collection of ideas (hash; salt + hash; salt + hash + key stretching; adaptive hash). While there are various opinions about what is best at this point, there is reasonable agreement on what’s no longer acceptable. With that, let’s look at a few bad ideas:
Plaintext Password Only (BAD)
This is a poor choice for obvious reasons. If your password data is breached, the attacker has no work to do in order to gain access to the records.
Encrypted Password Only (BAD)
This is a poor choice for a couple reasons. If you have a breach, an attacker has only to find the decryption key to gain access to the records. This may be a difficult task for external threat actors, but does nothing to mitigate internal attackers. Additionally, this is a poor privacy practice because the password is reversible, and the system can always see the password for the user, opening up another avenue for internal attackers to exploit.
Hashed Password Only (BAD)
This is a poor choice because it’s not enough protection. Rainbow tables have made cracking hashed passwords trivial, and effectively equivalent to plaintext from the attacker’s perspective.
Salted and Hashed Password Only (BAD)
This is also a poor choice because it’s not enough protection. Though salting is a good practice, today’s (late 2012) hardware makes a simple salted hash weak enough that it’s no longer considered viable. Brute forcing a salted, hashed password is a practical option for even a poorly funded attacker today.
So, we’ve seen some bad options … what are the possible solutions? Here are the available proposals that I’m currently aware of that _may_ be considered reasonable depending upon your environment.
Salted and Hashed and Iterated Password
In this option, you essentially perform a salted hash, then hash that, then hash that, then hash that … repeatedly for some large number of iterations (50,000X, 100,000X, 500,000X). The goal here is to slow down your password verification process. By doing this, you slow down your code when the user is logging in, but the theory is that login is a fairly rare request for applications, so you’ll only run it, say, once per user per day, but the attacker has to run it for every try. You make brute force ineffective again. While this theory is nice, verify whether the assumption is accurate for your environment. In this helpful spreadsheet, jOHN Steven points out that the cost of performing login on a sufficiently large site can certainly cost money in additional hardware, while it may not slow down an attacker as much as you think. This may or may not be a concern for your environment, but it’s certainly a consideration.
Adaptive Hash Functions
Externally this option is very similar to the salted/hashed/iterated password, but the internal operations vary, and most implementations call the “iteration” bit a work factor. The currently popular implementations of these concepts are PBKDF2, bcrypt and scrypt. These options carry the same consideration as above with regard to hardware cost and attacker slowdown.
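As a sketch of this option, the JDK ships a PBKDF2 implementation (`PBKDF2WithHmacSHA1`). The iteration count below is illustrative, not a recommendation – tune the work factor to your own hardware and login traffic:

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Sketch: PBKDF2 with a random per-user salt and a tunable work factor.
// Verification recomputes the hash with the stored salt and iteration count.
public class Pbkdf2Demo {
    static byte[] hash(char[] password, byte[] salt, int iterations) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                .generateSecret(spec).getEncoded();
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        int iterations = 100_000;   // the work factor; revisit as hardware improves

        byte[] stored = hash("correct horse".toCharArray(), salt, iterations);

        // verification (use a constant-time comparison in production,
        // not Arrays.equals, to avoid timing side channels)
        System.out.println(Arrays.equals(stored,
                hash("correct horse".toCharArray(), salt, iterations)));
        System.out.println(Arrays.equals(stored,
                hash("wrong guess".toCharArray(), salt, iterations)));
    }
}
```

The stored record would be the tuple (salt, iteration count, hash), so the work factor can be raised over time as users log in.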
Encrypted Adaptive Hash Functions
This is one of the options proposed in the threat model for secure password storage that jOHN put together (a fantastic piece of work). It is an idea which has some interesting strengths, but has not been heavily discussed yet in the industry (at least not openly). It does solve some of the problems that other solutions have, but not all. Additionally, there is a measure of extra effort and complexity added as part of the solution.
So, which solution should you choose? The answer is – it depends (though obviously pick a reasonable one). In my mind, however, as important as or even more important than the specific encryption/hash solution you choose (assuming it’s a reasonable one) are the additional related tasks you should undertake.
Understand Your Threat Model
You honestly need to consider your threat model – who you are protecting against – and let the results inform the steps you take. If you’re protecting against a low-grade attacker, you have different concerns than if you’re facing a “hacking” collective, a disgruntled internal employee, or a nation-state. Each of these scenarios has different requirements and will inform your protection scheme.
Know That You Will Lose
This is critical. Go into your planning with the assumed certainty of a breach (though I hope that never happens to you, I promise). What will you do when you see your name on CNN under the heading “120M User Accounts Stolen”? What will your story be? What will your process be?
You need to have a plan to deal with a compromise of your data. You should have a way to 1) protect your users whose accounts have been breached and 2) roll out the necessary updates to your system. All of this is a lot simpler if you’ve planned (and practiced) it beforehand. See the “workflow under attack” section of the threat model for more info.
Consider Multi-Factor Authentication
While multi-factor authentication doesn’t excuse you from appropriately handling all of the steps discussed above, it can significantly lower the likelihood of your customers being compromised. If it is possible in your situation, it’s an option to consider seriously.
In conclusion, while password storage is clearly no easy feat, and no solution is perfect, hopefully you see there are much better ways to handle it than the current common scenario. There are technical solutions for the specific storage mechanism, but that is just one part of the puzzle. You also need to take into account how you will deal with issues like your specific threat environment, planning for what to do when you do lose your password data, and additional protections like multi-factor authentication. There’s a lot to account for, but if you take each step into account, it’s possible to significantly improve the password storage in your applications.
References:
– Secure Password Storage Threat Model (jOHN Steven)
– jOHN Steven’s PSM code / docs