Prometheus host metrics and graphing just solved a mysterious 'machine out of memory without OOM notification' problem we had earlier today. I feel like I just got a win.

(In retrospect it's not much of a mystery; we'd just forgotten that we were enforcing strict overcommit limits on this class of machines.)
