04. Exercise: Prometheus Monitoring

Exercise: Prometheus Monitoring

Monitoring is an essential component of DevOps best practices. In this exercise, you will set up Prometheus Monitoring.

Instructions

  • Install and configure Prometheus as show here.
  • Setup monitoring on to allow Prometheus to monitor itself as shown here.
  • Consider a normal distribution, "six sigma" and the 68-95-99.7 rule. Computer systems events are often normally distributed, meaning that all events within three standard deviations from the mean occur with 99.7 of the occurrences.
  • Design a process that alerts senior engineers when events are greater than three standard deviations from the mean and write up how the alerts should work, i.e.
    • Who should get a page when an event is greater than three standard deviants from the mean?
    • Should there be a backup person who gets alerted if the first person doesn’t respond within five minutes?
    • Should an alert wake up a team member at one standard deviation? What about two?