SREcon19 Americas Talk Resources

At SREcon19 Americas, I gave a talk called "Operating within Normal Parameters: Monitoring Kubernetes". I also reprised this talk at the Cloud Native PDX meetup in October 2019 and the Portland DevOps meetup in May 2020. Here are some links and resources related to the talk, for your reference.

Operating within Normal Parameters: Monitoring Kubernetes

Talk slides (pdf download)
Talk video, hosted on YouTube
Try it yourself: sample code on GitHub
Prometheus documentation
kube-state-metrics documentation
metrics-server documentation
Grafana documentation (for dashboards and visualization)

Additional Prometheus metrics sources

Related readings

I'm including these documents to add some context around what's currently happening (as of 2019Q1) in Kubernetes SIG Instrumentation and the wider ecosystem.

Note that GitHub links are pinned to the most recent commit at the time of writing so that they will not break; if you want the latest version, switch the branch to "master".

SIG Instrumentation Meeting Minutes (note: you must join the Google Group to be able to access these)
Kubernetes 1.14 metrics overhaul (KEP-0031)
Core Metrics proposal
Kubelet Resource Metrics (formerly "Core Metrics") Endpoint proposal
Kubernetes monitoring architecture
Kubernetes instrumentation guidelines