04. Exercise: Prometheus Monitoring
Exercise: Prometheus Monitoring
Monitoring is an essential component of DevOps best practices. In this exercise, you will set up Prometheus Monitoring.
Instructions
- Install and configure Prometheus as show here.
- Setup monitoring on to allow Prometheus to monitor itself as shown here.
- Consider a normal distribution, "six sigma" and the 68-95-99.7 rule. Computer systems events are often normally distributed, meaning that all events within three standard deviations from the mean occur with 99.7 of the occurrences.
- Design a process that alerts senior engineers when events are greater than three standard deviations from the mean and write up how the alerts should work, i.e.
- Who should get a page when an event is greater than three standard deviants from the mean?
- Should there be a backup person who gets alerted if the first person doesn’t respond within five minutes?
- Should an alert wake up a team member at one standard deviation? What about two?