# Security & Organisation - Mathematical basics

Disclaimer: For this course, it's assumed that you know the basics of statistics. The following represents my own way of making sense of the central concepts, and may contain errors or omissions (I'm not a mathematician). The same caution applies to the Wikipedia links provided. For professional coverage, consult a book on probability theory and statistics. Feedback is welcome!

## Very basic concepts without too much math

• A probability represents the likelihood of an event taking place as a value between 0 (certainly not) and 1 (certainly). This concerns a single event. For example, the probability of throwing a 6 with a dice is 1/6, or the probability that an earthquake hits an area in the next 10 years could be 10% (0.1). The latter is typically interpreted as the probability of at least one earthquake occurring.
• A frequency or rate represents how often a certain event is expected to happen within a time interval (e.g. per year). This concerns multiple events of a certain type. Frequencies can for example be two times per year (2) or once every 10,000 years (1/10,000).
• A probability distribution represents the probability of an event happening at a certain time. For example, it could be very likely that an event happens within a few days and less likely that it happens later. We often use the exponential distribution, which represents the probability distribution of the time until the next event if a previous event has no influence on the probability.
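The relation between a frequency and a probability in the exponential distribution can be sketched in a few lines of Python (the earthquake rate of 0.01 per year is an illustrative assumption, not a real figure):

```python
import math

def prob_within(rate, t):
    """Probability that at least one event occurs within time t,
    for events occurring randomly at the given average rate.
    This is the CDF of the exponential distribution: 1 - e^(-rate*t)."""
    return 1.0 - math.exp(-rate * t)

# Example: earthquakes at an assumed rate of 0.01 per year.
p_10_years = prob_within(0.01, 10)  # roughly 0.095, close to (but not exactly) 0.1
```

Note that the result is slightly below rate × time, because the latter would double-count the cases with more than one event.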

## More extensive coverage

### Probabilities

• A probability P(A) represents the likelihood of a random event A as a value between 0 and 1. For example, the probability of a six showing when rolling a dice is 1/6. A probability can be seen as the relative number of occurrences of the event compared to the number of experiments, if you run the associated experiment often enough (throw a dice a lot of times and about 1/6 of the outcomes will be 6; frequentist interpretation). The unit of a probability is 1.
• A random variable can have different outcomes in an experiment. For example, the number shown by a dice can take any of the values 1 to 6 when throwing it. An event can be any combination of outcomes, for example the number being higher than 3, or the number being exactly 6. If X is the random variable representing the outcome of throwing a dice, the associated probabilities are denoted P(X>3) and P(X=6).
• The conditional probability of event B given event A, denoted P(B|A), represents the likelihood that B occurs given that A has occurred. Two events are independent if the occurrence of one doesn't influence the probability of the other. For example, if you throw two dice, the event that one shows a six is independent of the other showing a six. On the other hand, the event of one showing a six is not independent from the event of the same dice showing a number higher than 3. The a priori probability P(X=6) is 1/6, and the conditional probability P(X=6|X>3) is 1/3.
• The probability of the simultaneous occurrence of two independent events can be calculated by multiplying the probabilities. For example, the probability of two dice showing a six is 1/6 × 1/6 = 1/36. This calculation is used in AND-nodes in fault and attack trees.
• If the events are not independent, this does not hold. The probability of the dice showing a number higher than 3 AND the dice showing a six is still 1/6 (and not 1/3 × 1/6). In case of dependent events, one has to calculate P(X>3) × P(X=6|X>3) instead: 1/2 × 1/3 = 1/6. This calculation is used in consequence trees and Bayesian belief networks.
• The probability of either event A or event B occurring (or both) is trickier. The intuition may be to add the probabilities, but this is not correct. If we want to know the probability of at least one of the dice showing a six, this is not 1/6 + 1/6, because we would count the event in which both show a six twice. So we have to subtract the associated probability: 1/6 + 1/6 - 1/36 = 11/36. Alternatively, we can calculate the probability of neither showing a six and take it from there. This probability is 5/6 × 5/6 = 25/36. The probability that this is not the case (thus at least one showing a six) is 1 - 25/36 = 11/36. Again, this assumes that the events are independent. This calculation is used in OR-nodes in fault and attack trees.
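The AND- and OR-node calculations above can be checked with exact fractions in Python (a small sketch of the dice example, not a general fault-tree tool):

```python
from fractions import Fraction

p6 = Fraction(1, 6)  # probability that a single dice shows a six

# AND-node (independent events): multiply the probabilities.
both_six = p6 * p6  # 1/36

# OR-node (independent events): add, then subtract the double-counted overlap.
at_least_one_six = p6 + p6 - p6 * p6  # 11/36

# Equivalent complement form: 1 - P(neither shows a six).
at_least_one_alt = 1 - (1 - p6) * (1 - p6)  # also 11/36
```

Using `Fraction` instead of floats keeps the results exact, which makes it easy to verify that both OR-node formulations give the same answer.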

### Time and probability distributions

• Time can be a random variable, when we are interested in when something unpredictable can happen. In the following, "event" is used in the common sense way (failure, incident, etc.) rather than the probability theory interpretation (set of outcomes). An "event" in the probability theory interpretation is the occurrence of a common sense event within a certain time range (and time is the random variable)!
• A probability distribution is a function that assigns probabilities to possible outcomes. For example, an age probability distribution would map an age to the probability that a random person is of that age. In this case, assigning individual probabilities to each age is possible (probability mass function), because there is a limited number of possible outcomes (the domain is discrete; 0, 1, 2, ... up to 120 or so).
• This is not possible if the domain is continuous, such as time. There are infinitely many moments in each time interval. If something happens at a random time, there are infinitely many possible outcomes, which makes the probability that the event occurs exactly at time t zero. The solution is to use a cumulative distribution function, representing the probability that the event occurs before time t, and the derivative, the probability density function. Integrating the probability density function over the time interval [t1,t2] gives the probability that the event occurs within that time interval.
• A Poisson process models the random distribution of events of a certain type over time, with a certain average frequency. The events are distributed such that it doesn't matter for the next event when the previous one took place (stochastically independent). For example, the fact that a flood just took place wouldn't influence the likelihood of the next one taking place within any particular time interval, if it is modelled as a Poisson process.
• The exponential distribution is the probability distribution of the time of the next event in a Poisson process. Because the events are stochastically independent, it doesn't matter at which point in time you start counting: at any point in time, the probabilities of the next event happening within a certain time interval are the same.
• The Poisson distribution is a probability mass function giving the probability of each possible number of events in a time interval, given the expected number of events in that interval (rather than being a distribution of the time of occurrence of a single event, as in the exponential distribution). Both the exponential distribution and the Poisson distribution model a Poisson process, but they model different random variables (time of occurrence of the next event versus number of occurrences in a fixed time interval).
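The two views of the same Poisson process can be connected in a short sketch (the rate of 2 events per year is an illustrative assumption): the probability that the next event occurs within time t (exponential CDF) equals one minus the probability of zero events in that interval (Poisson pmf).

```python
import math

rate = 2.0  # average number of events per year (illustrative)
t = 1.0     # one-year window

# Exponential view: probability that the *next* event occurs within time t.
p_next_within_t = 1.0 - math.exp(-rate * t)

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected number is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Poisson view: probability of zero events in the interval.
p_zero_events = poisson_pmf(0, rate * t)

# The two views agree: P(next event within t) = 1 - P(0 events in t).
```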

### Frequencies (under construction)

• The expected value represents the expected average value of a random variable when repeating the experiment often enough. For example, the expected value of the number shown by a dice is 3.5 (=(1+2+3+4+5+6)/6).
• A frequency or rate represents how often a certain event happens within a time interval. The unit of a frequency is for example y⁻¹ ((events) per year). In case of randomly distributed events, we typically deal with the expected frequency. Frequencies or rates are not probabilities, and they can exceed 1 (multiple events per year). If people relate a frequency to a probability, they typically refer to the probability of at least one event in a time frame.
• The failure rate λ represents the expected frequency of failure of a system (expected number of failures per unit of time). The mean time between failures (MTBF) is the reciprocal of the failure rate (1/λ).
• For a Poisson process, the failure rate is constant over time. If this is not the case (the failure rate varies over time), one has to work with density functions instead. The hazard rate h(t) is a density function for failure events: its integral over a time frame represents the expected number of events in that time frame. The bathtub curve represents a variable failure rate, and is therefore an example of a hazard rate represented as a function of time. For the exponential distribution (Poisson process; constant failure rate), h(t) = λ.
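For the constant-rate case, the relations between failure rate, MTBF and expected number of failures reduce to simple arithmetic (the rate of 0.5 failures per year is an illustrative assumption):

```python
rate = 0.5       # failure rate λ: expected failures per year (illustrative)
mtbf = 1 / rate  # mean time between failures: 2 years

# With a constant failure rate, the expected number of failures in a
# time frame is simply rate * duration (the integral of h(t) = λ).
expected_failures_10y = rate * 10  # 5.0 expected failures in 10 years
```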

### Impact and risk

• The impact is a metric for the damage or harm caused by an event. Money may be used as a unit, but other dimensions such as lives lost are also prominent. It depends on the situation whether just one dimension or multiple dimensions are used.
• The risk is a combination of the likelihood and impact of an event. For example, the annual loss expectancy is calculated as the expected frequency per year times the expected impact per event.
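The annual loss expectancy calculation is a single multiplication; as a sketch (both figures are illustrative assumptions):

```python
# Annual loss expectancy (ALE) =
#   expected frequency per year * expected impact per event.
frequency_per_year = 0.2   # e.g. one incident every 5 years (illustrative)
impact_per_event = 50_000  # expected damage per incident, in money units (illustrative)

ale = frequency_per_year * impact_per_event  # 10000.0 per year
```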

### Odds and relative risk

• The odds of an event is the probability of the event happening divided by the probability of it not happening. For example, the odds of throwing a six with a dice is (1/6)/(5/6) = 1/5. This is in contrast to the notion of probability or risk (in case of unwanted events), which would be 1/6.
• The odds ratio is the ratio of the odds in two different situations. For example, compare a regular dice against one with 8 faces (an octahedron). The odds of throwing a six is 1/5 with the former and 1/7 with the latter. This means that the odds ratio is (1/5)/(1/7) = 7/5.
• The relative risk, by contrast, is the ratio of the associated event probabilities. In the above example, the RR is (1/6)/(1/8) = 8/6 = 4/3. For relative risk, one can say that one has a 4/3 times higher chance of throwing a six with a regular dice than with one with 8 faces. This cannot be said if the odds ratio is used for comparison (the chance is not 7/5 times higher). The odds ratio is higher than the relative risk, but with very small probabilities, the difference becomes negligible.
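The dice comparison above can be reproduced exactly with fractions (a sketch of this specific example, not a general epidemiological tool):

```python
from fractions import Fraction

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

p_regular = Fraction(1, 6)  # six on a regular dice
p_octa = Fraction(1, 8)     # six on an 8-faced dice

odds_ratio = odds(p_regular) / odds(p_octa)  # (1/5)/(1/7) = 7/5
relative_risk = p_regular / p_octa           # (1/6)/(1/8) = 4/3

# The odds ratio (1.4) exceeds the relative risk (~1.33),
# as noted above; for small probabilities the gap shrinks.
```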