---
title: "Setting Up Prometheus Monitoring in Kubernetes"
description: "A comprehensive guide to implementing Prometheus monitoring in your Kubernetes cluster"
pubDate: 2025-04-19
heroImage: "/blog/images/posts/prometheusk8.png"
category: "devops"
tags: []
readTime: "9 min read"
---
# Setting Up Prometheus Monitoring in Kubernetes
Effective monitoring is crucial for maintaining a healthy Kubernetes environment. Prometheus has become the de facto standard for metrics collection and alerting in cloud-native environments. This guide will walk you through setting up a complete Prometheus monitoring stack in your Kubernetes cluster.
## Why Prometheus?
Prometheus offers several advantages for Kubernetes monitoring:
- **Pull-based architecture**: Simplifies configuration and security
- **Powerful query language (PromQL)**: For flexible data analysis
- **Service discovery**: Automatically finds targets in dynamic environments
- **Rich ecosystem**: Wide range of exporters and integrations
- **CNCF graduated project**: Strong community and vendor support
## Components of the Monitoring Stack
We'll set up a complete monitoring stack consisting of:
- **Prometheus**: Core metrics collection and storage
- **Alertmanager**: Handles alerts and notifications
- **Grafana**: Visualization and dashboards
- **Node Exporter**: Collects host-level metrics
- **kube-state-metrics**: Collects Kubernetes state metrics
- **Prometheus Operator**: Simplifies Prometheus management in Kubernetes
## Prerequisites
- A running Kubernetes cluster (K3s, EKS, GKE, etc.)
- `kubectl` configured to access your cluster
- Helm 3 installed
## Installation Using Helm
The easiest way to deploy Prometheus is with the `kube-prometheus-stack` Helm chart, which bundles all the components listed above.
### 1. Add the Prometheus Community Helm Repository

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
### 2. Create a Namespace for Monitoring

```bash
kubectl create namespace monitoring
```
### 3. Configure Values

Create a `values.yaml` file with your custom configuration:
```yaml
prometheus:
  prometheusSpec:
    retention: 15d
    resources:
      requests:
        memory: 256Mi
        cpu: 100m
      limits:
        memory: 2Gi
        cpu: 500m
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi

grafana:
  persistence:
    enabled: true
    storageClassName: standard
    size: 10Gi
  adminPassword: "prom-operator" # Change this!

nodeExporter:
  enabled: true

kubeStateMetrics:
  enabled: true
```
### 4. Install the Helm Chart

```bash
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values values.yaml
```
### 5. Verify the Installation

Check that all the pods are running:

```bash
kubectl get pods -n monitoring
```
## Accessing the UIs
By default, the components don't have external access. You can use port-forwarding to access them:
### Prometheus UI

```bash
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090
```

Then access Prometheus at `http://localhost:9090`.
### Grafana

```bash
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
```

Then access Grafana at `http://localhost:3000` (default credentials: `admin` / `prom-operator`).
### Alertmanager

```bash
kubectl port-forward -n monitoring svc/prometheus-alertmanager 9093:9093
```

Then access Alertmanager at `http://localhost:9093`.
## For Production: Exposing Services
For production environments, you'll want to set up proper ingress. Here's an example using a basic Ingress resource:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: monitoring
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
    - host: prometheus.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-operated
                port:
                  number: 9090
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-grafana
                port:
                  number: 80
    - host: alertmanager.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-alertmanager
                port:
                  number: 9093
```
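The `ssl-redirect` annotation assumes TLS is already set up for these hosts. If you run cert-manager, the Ingress can request certificates automatically — a sketch of the additions, where the `letsencrypt-prod` issuer name and `monitoring-tls` secret name are placeholders for your own:

```yaml
# Hypothetical additions to the Ingress above; assumes cert-manager is
# installed and a ClusterIssuer named "letsencrypt-prod" exists.
metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - prometheus.example.com
        - grafana.example.com
        - alertmanager.example.com
      secretName: monitoring-tls # cert-manager stores the certificate here
```

You will also likely want to put authentication in front of Prometheus and Alertmanager, since neither ships with a login of its own.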
## Configuring Alerting
### 1. Set Up Alert Rules

Alert rules can be created using the `PrometheusRule` custom resource:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
    - name: node.rules
      rules:
        - alert: HighNodeCPU
          expr: instance:node_cpu_utilisation:rate1m > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High CPU usage on {{ $labels.instance }}"
            description: "CPU usage is above 80% for 5 minutes on node {{ $labels.instance }}"
```
### 2. Configure Alert Receivers

Configure Alertmanager to send notifications by creating a Secret with your configuration:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-prometheus-alertmanager
  namespace: monitoring
type: Opaque
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
      slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    route:
      group_by: ['job', 'alertname', 'namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'slack-notifications'
      routes:
        - receiver: 'slack-notifications'
          matchers:
            - severity =~ "warning|critical"
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - channel: '#alerts'
            send_resolved: true
            title: '{{ template "slack.default.title" . }}'
            text: '{{ template "slack.default.text" . }}'
```
## Custom Dashboards
Grafana comes pre-configured with several useful dashboards, but you can import more from Grafana.com.
Some recommended dashboard IDs to import:
- **1860**: Node Exporter Full
- **12740**: Kubernetes Monitoring
- **13332**: Prometheus Stats
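Besides importing by ID in the Grafana UI, the chart's Grafana deployment runs a sidecar that loads dashboards from labeled ConfigMaps. A minimal sketch, assuming the sidecar's default settings (it watches for the `grafana_dashboard` label); the ConfigMap name and dashboard JSON here are illustrative placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard           # placeholder name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"     # default label the Grafana sidecar watches
data:
  my-dashboard.json: |
    { "title": "My Dashboard", "panels": [] }
```

This keeps dashboards in version control alongside the rest of your manifests instead of living only inside Grafana's database.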
## Troubleshooting

### Common Issues
- **Insufficient Resources**: Prometheus can be resource-intensive. Adjust resource limits if pods are being OOMKilled.
- **Storage Issues**: Ensure your storage class supports the access modes you've configured.
- **ServiceMonitor not working**: Check that the label selectors match your services.
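To illustrate that last point, here is a minimal ServiceMonitor sketch — the `my-app` name, labels, and port name are placeholders. Two selectors have to line up: `spec.selector.matchLabels` must match your Service's labels, and the ServiceMonitor's own labels must match what Prometheus selects (with this chart's defaults, the Helm release label):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                # placeholder
  namespace: monitoring
  labels:
    release: prometheus       # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app             # must match the labels on your Service
  namespaceSelector:
    matchNames:
      - default               # namespace where the Service lives
  endpoints:
    - port: metrics           # must match a *named* port on the Service
      interval: 30s
```

If targets still don't appear, the Prometheus UI's Status → Targets and Status → Service Discovery pages show what was discovered and why scrapes fail.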
## Conclusion
You now have a fully functional Prometheus monitoring stack for your Kubernetes cluster. This setup provides comprehensive metrics collection, visualization, and alerting capabilities essential for maintaining a healthy and performant cluster.
In future articles, we'll explore advanced topics like custom exporters, recording rules for performance, and integrating with other observability tools like Loki for logs and Tempo for traces.