
A few months ago our friends Guillaume Lefevre and Etienne Coutaud made an appearance at the 2017 KubeCon to present their work on a monitoring solution based on Prometheus and Grafana. (The video can be found here) A bit over a month later, we faced a problem in production, the platform started to behave abnormally. The graphs showed bursts in metrics which was unusual and some people were wondering if a deployment could have changed something in the configuration and triggered the problem. Even if…
Read more