In my last post, I concluded that the topic of (performance) Anomaly Detection is solved for now, and that I will switch over to (performance) Anomaly Prediction.
For the prediction, I distinguish three time windows:
- The past, which will be the input for the model to decide whether performance problems are imminent.
- The immediate future, which will be ignored.
- The more distant future, which is the time when the performance problems will manifest.
This visualization will hopefully clarify things:
At time t0 (now), all we have are the performance metrics and measurements from the past. However, the goal is not to predict whether performance problems will start right now. The alarm should be raised well before the problems occur, so that there is time to proactively fix issues, if possible. In the example in the picture, I chose to take only the last 30 minutes as input for the model, because the older the performance data is, the less relevant it is for future performance. The prediction lead time is 60 minutes, to give the admins enough time to react; trying to predict too far into the future is bound to fail. Note that other timeframes could be chosen, and they need to be evaluated for their practicability.
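To make the windowing concrete, here is a minimal sketch of how such (input, target) pairs could be cut out of a metric series. Everything in it is illustrative: the function name make_windows, the one-minute sampling rate, and the strategy of labeling each sample with the problem state exactly 60 minutes ahead are my assumptions for the sketch, not a fixed design.

```python
import numpy as np

def make_windows(metrics, labels, lookback=30, lead_time=60):
    """Slice a minute-resolution metric series into (input, target) pairs.

    metrics   : 2D array of shape (n_minutes, n_metrics)
    labels    : 1D array, 1 if a performance problem is active at that minute
    lookback  : how many past minutes the model sees (the input window)
    lead_time : how many minutes ahead the target lies
                (the immediate future in between is ignored)
    """
    X, y = [], []
    for t in range(lookback, len(metrics) - lead_time):
        X.append(metrics[t - lookback:t])  # the last 30 minutes of metrics
        y.append(labels[t + lead_time])    # the problem state 60 minutes ahead
    return np.array(X), np.array(y)

# Hypothetical usage: 24 hours of data, 5 metrics, one binary problem label
metrics = np.random.rand(1440, 5)
labels = np.random.randint(0, 2, 1440)
X, y = make_windows(metrics, labels)
print(X.shape, y.shape)  # (1350, 30, 5) (1350,)
```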
This leads to another topic: hyperparameters.
Now is the time to start training an artificial neural network. Lots of decisions have to be made in order to get the best results. There is a whole list of hyperparameters I had to set, initially just by guessing; later I could optimize them by comparing the prediction performance of various models. This is the fun part, where actual data gets processed, in my case with the Keras software package.
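As an illustration (not my final setup), a small Keras model with such hyperparameters pulled out as variables could look like the sketch below. The LSTM architecture and all concrete values here are placeholder guesses, chosen only to show where the hyperparameters enter.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical hyperparameter choices -- placeholders, not tuned values
lookback = 30       # input window in minutes
n_metrics = 5       # number of performance metrics per minute
lstm_units = 64     # size of the recurrent layer
dropout_rate = 0.2  # regularization strength
learning_rate = 1e-3
batch_size = 128
epochs = 20

model = keras.Sequential([
    layers.Input(shape=(lookback, n_metrics)),
    layers.LSTM(lstm_units),
    layers.Dropout(dropout_rate),
    layers.Dense(1, activation="sigmoid"),  # probability of a future problem
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
# model.fit(X, y, batch_size=batch_size, epochs=epochs, validation_split=0.2)
```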
The actual results after tuning all these hyperparameters will follow in part 6.