Risk: The False Prophet of Cybersecurity Decision-making
Model errors are poisoning your risk metric, leading to worse decision-making
I believe we have a crisis within cybersecurity. Liabilities, in the form of impact from events AND spending on security, are going exponential. Given the severity of this issue, it's surprising that very little is being said about the fundamentals of our decision-making frameworks. Instead, all of the investment goes towards tools and techniques to plug holes in the ship. If more investment is leading to increasing costs, does this not indicate that our current approach is failing?
I've asked myself this question for the past few years. Through the ideas of those smarter than me, I have come to the following conclusion: our model of Risk, as defined by the formula Impact × Probability, contains enough error that its results are, at best, little better than random.
Courtesy of the Real World Risk Institute (RWRI) and Nassim Taleb, I was able to shed light on the mathematical errors in our decision-making, and I hope to help you, the reader, gain a brief understanding of why you should change the way you make cybersecurity resource-allocation decisions.
Cybersecurity is a fat-tailed domain
First, we must acknowledge that our field is dominated by fat tails. Fat-tailed events have a very large impact but a very low probability of occurring. One of the key points to understand about fat tails is that they can take a very long time, or many iterations, to occur. You may never see a fat-tailed risk materialize, but that does not mean it is not lying in wait.
Given this, it seems easy to enumerate the possibility space and then assign a probability value to each possibility. However, we are presented with a few problems:
We cannot observe the probability distribution of any particular event, only the outcomes of the random generating function: We only see what the function (in this case, the interaction of an attacker with our network) outputs during the discrete window of measurement. Observation gives us little to no information about the probability distribution across all possibilities.
The possibility space is constantly evolving: In a digital network, changes are constantly being made, and each change alters the possibility space for a compromise. Every addition to the possibility space also shifts the probabilities of every existing possibility.
Observing the possibility space changes the probability distribution: Simply observing the possibility space within a network changes the probabilities assigned to each possibility, because our attention is now on those possibilities, which kicks off further processes of monitoring and mitigation.
With these issues, the probability of any particular event becomes incalculable, with the exceptions of 0% and 100%.
Nasty Effects of Model Error
This is one of the most important things I took away from RWRI: the effects of model error are exponential. Let that sink in for a second. Errors in your model have exponential effects on its output. This means relatively small errors in your model can lead to completely unexpected and wildly incorrect outputs. When it comes to estimating risk, this point is incredibly important.
Risk, in its fundamental form, is a formulaic model consisting of Impact and Probability. Impact and Probability are themselves the outputs of other formulaic models: one deriving the financial impact to the business, the other the likelihood of that particular event.
As shown above, since probability is incalculable, we are certain to have high model error in our calculation of probability. We therefore introduce a significant amount of error into our Risk calculation, and because the effect is exponential, it creates wildly inaccurate results.
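To make this concrete, here is a minimal Python sketch of my own (the dollar figures and probabilities are hypothetical, not taken from any real assessment) showing how an order-of-magnitude error in the probability estimate passes straight through to the risk figure:

```python
# A minimal sketch of how error in a probability estimate flows straight
# through the Risk = Probability x Impact model. All numbers are hypothetical.

def risk(probability: float, impact: float) -> float:
    """The standard formulaic risk model."""
    return probability * impact

impact = 1_000_000        # $1M impact, assumed known with high fidelity
true_probability = 0.01   # the "real" chance of the event (unknowable in practice)

# An analyst can easily estimate 10% where the truth is 1% -- a 10x error.
estimated_probability = 0.10

print(risk(true_probability, impact))       # 10,000  -> the "true" risk
print(risk(estimated_probability, impact))  # 100,000 -> off by a full order of magnitude
```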
You may be asking yourself: isn't Impact subject to the same problem? It is, but there is an important distinction. Impact has a few key features that make it much less susceptible to error than probability:
Impact has a firm ceiling: It can be no higher than the dollar amount the company can absorb without going bankrupt.
It can be calculated from known quantities within the business for which you have high-fidelity data: Financial statements have far less error than predictions of what any given attacker in the world will do.
It is less susceptible to errors of orders of magnitude: It is much harder to mistake a $10 million impact for a $100 million impact than to mistake a 1% probability for a 10% probability.
Example
Consider the following risk calculation:
Probability = 10%
Impact = $1,000,000
Risk = 10% × $1,000,000 = $100,000
Now suppose we have a total model error of 0.3. Since model error is additive, this yields:
0.3 = Error of Impact + Error of Probability
In the best case, the model errors of impact and probability are both 0.15, which yields a linear effect on the result. In the worst, and more realistic, case, the distribution of error is asymmetric, yielding a non-linear effect on our outcome.
This means that errors in our model have non-linear effects on our predicted outcome. Needless to say, this makes our risk equation significantly noisier than one would initially predict.
This is inevitable in the real world due to the information asymmetry between impact and probability. We have access to the information necessary to make a decent judgement about impact, but we do not have an equal level of information about probability; again, it is much harder to mistake a $10 million impact for a $100 million impact than a 1% chance for a 10% chance. The error in the risk model is therefore skewed towards probability, creating exponential effects on the predicted outcome.
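A hedged Monte Carlo sketch of this asymmetry follows. The assumptions are mine, not the article's: impact estimates land within ±30% of the true value, while probability estimates scatter across a full order of magnitude in either direction. Even so simplified, the spread of the resulting risk estimates is dominated almost entirely by the probability term:

```python
# Monte Carlo sketch of the information asymmetry between impact and
# probability. All figures and error ranges are illustrative assumptions.
import random

random.seed(42)

TRUE_PROBABILITY = 0.01
TRUE_IMPACT = 10_000_000  # $10M
TRUE_RISK = TRUE_PROBABILITY * TRUE_IMPACT

def estimate_risk() -> float:
    # Impact: high-information, bounded error (within +/-30%).
    impact = TRUE_IMPACT * random.uniform(0.7, 1.3)
    # Probability: low-information, error spanning an order of magnitude
    # (10x too low up to 10x too high), sampled on a log scale.
    probability = TRUE_PROBABILITY * 10 ** random.uniform(-1, 1)
    return probability * impact

estimates = sorted(estimate_risk() for _ in range(10_000))
print(f"true risk:       ${TRUE_RISK:,.0f}")
print(f"median estimate: ${estimates[5_000]:,.0f}")
print(f"5th percentile:  ${estimates[500]:,.0f}")
print(f"95th percentile: ${estimates[9_500]:,.0f}")
```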
Watering Down of Risks
If you can take only one thing from this article, it's this: multiplying the low-information variable, probability, by the higher-information variable, impact, exponentially increases how wrong your risk equation can be. The simplest solution is to remove the low-information variable from the equation and focus our efforts based on impact. If we are able to get high-information probability signals, like proof-of-concept code for a 0-day, they are valuable in risk judgements, but as a general rule we should be extremely selective, given probability's dangerous effects.
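One possible way to operationalize an impact-first approach is sketched below. This is my own construction rather than a method prescribed above; the scenarios, dollar figures, and the SOLVENCY_CEILING threshold are all hypothetical:

```python
# Rank scenarios by estimated impact alone -- no probability term to poison
# the ordering -- and flag anything the business could not survive.

# (scenario, estimated impact in $) -- hypothetical examples
scenarios = [
    ("ransomware on core ERP", 50_000_000),
    ("phishing of one employee", 200_000),
    ("cloud misconfiguration exposing customer DB", 25_000_000),
    ("lost unencrypted laptop", 500_000),
]

SOLVENCY_CEILING = 40_000_000  # the loss the business cannot absorb

for name, impact in sorted(scenarios, key=lambda s: s[1], reverse=True):
    flag = "  <-- existential" if impact >= SOLVENCY_CEILING else ""
    print(f"${impact:>12,}  {name}{flag}")
```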