---
title: "Setting Up Prometheus Monitoring in Kubernetes"
description: "A comprehensive guide to implementing Prometheus monitoring in your Kubernetes cluster"
pubDate: 2025-04-19
heroImage: /blog/images/posts/prometheusk8.png
category: devops
tags:
  - kubernetes
  - monitoring
  - prometheus
  - grafana
  - observability
readTime: "9 min read"
---

# Setting Up Prometheus Monitoring in Kubernetes

Effective monitoring is crucial for maintaining a healthy Kubernetes environment. Prometheus has become the de facto standard for metrics collection and alerting in cloud-native environments. This guide will walk you through setting up a complete Prometheus monitoring stack in your Kubernetes cluster.

## Why Prometheus?

Prometheus offers several advantages for Kubernetes monitoring:

- **Pull-based architecture**: Simplifies configuration and security
- **Powerful query language (PromQL)**: Enables flexible data analysis (see the example queries after this list)
- **Service discovery**: Automatically finds targets in dynamic environments
- **Rich ecosystem**: Wide range of exporters and integrations
- **CNCF graduated project**: Strong community and vendor support
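
To give a feel for PromQL, here are a few queries you could run once the stack below is installed. They assume the standard metric names exposed by Node Exporter, kube-state-metrics, and the kubelet's cAdvisor endpoint:

```promql
# Per-node CPU utilisation (%), averaged over the last 5 minutes
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Pods reported in a phase other than Running
sum by (namespace, pod) (kube_pod_status_phase{phase!="Running"}) > 0

# Memory working set per namespace
sum by (namespace) (container_memory_working_set_bytes{container!=""})
```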

## Components of the Monitoring Stack

We'll set up a complete monitoring stack consisting of:

1. **Prometheus**: Core metrics collection and storage
2. **Alertmanager**: Handles alerts and notifications
3. **Grafana**: Visualization and dashboards
4. **Node Exporter**: Collects host-level metrics
5. **kube-state-metrics**: Collects Kubernetes state metrics
6. **Prometheus Operator**: Simplifies Prometheus management in Kubernetes

## Prerequisites

- A running Kubernetes cluster (K3s, EKS, GKE, etc.)
- `kubectl` configured to access your cluster
- Helm 3 installed
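
A quick sanity check before installing (these commands only confirm that cluster access and Helm are working):

```bash
kubectl get nodes
helm version
```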

## Installation Using Helm

The easiest way to deploy Prometheus is with the `kube-prometheus-stack` Helm chart, which includes all of the components mentioned above.

### 1. Add the Prometheus Community Helm Repository

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

### 2. Create a Namespace for Monitoring

```bash
kubectl create namespace monitoring
```

### 3. Configure Values

Create a `values.yaml` file with your custom configuration:

```yaml
prometheus:
  prometheusSpec:
    retention: 15d
    resources:
      requests:
        memory: 256Mi
        cpu: 100m
      limits:
        memory: 2Gi
        cpu: 500m
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi

grafana:
  persistence:
    enabled: true
    storageClassName: standard
    size: 10Gi
  adminPassword: "prom-operator"  # Change this!

nodeExporter:
  enabled: true

kubeStateMetrics:
  enabled: true
```

### 4. Install the Helm Chart

```bash
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values values.yaml
```

### 5. Verify the Installation

Check that all the pods are running:

```bash
kubectl get pods -n monitoring
```
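
It can also help to confirm that the chart created the services and CRDs the rest of this guide relies on; exact service names can vary slightly between chart versions:

```bash
# Services used later for port-forwarding and Ingress
kubectl get svc -n monitoring

# Prometheus Operator CRDs (ServiceMonitor, PrometheusRule, etc.)
kubectl get crd | grep monitoring.coreos.com
```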

## Accessing the UIs

By default, the components don't have external access. You can use port-forwarding to access them:

### Prometheus UI

```bash
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090
```

Then access Prometheus at http://localhost:9090

### Grafana

```bash
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
```

Then access Grafana at http://localhost:3000 (default credentials: admin / prom-operator)

### Alertmanager

```bash
kubectl port-forward -n monitoring svc/prometheus-alertmanager 9093:9093
```

Then access Alertmanager at http://localhost:9093

## For Production: Exposing Services

For production environments, you'll want to set up proper ingress. Here's an example using a basic Ingress resource:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: monitoring
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
  - host: prometheus.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-operated
            port:
              number: 9090
  - host: grafana.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-grafana
            port:
              number: 80
  - host: alertmanager.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-alertmanager
            port:
              number: 9093
```

## Configuring Alerting

### 1. Set Up Alert Rules

Alert rules can be created using the `PrometheusRule` custom resource:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
  - name: node.rules
    rules:
    - alert: HighNodeCPU
      expr: instance:node_cpu_utilisation:rate1m > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage on {{ $labels.instance }}"
        description: "CPU usage is above 80% for 5 minutes on node {{ $labels.instance }}"
```

### 2. Configure Alert Receivers

Configure Alertmanager to send notifications by creating a Secret with your configuration:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-prometheus-alertmanager
  namespace: monitoring
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
      slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'

    route:
      group_by: ['job', 'alertname', 'namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'slack-notifications'
      routes:
      - receiver: 'slack-notifications'
        matchers:
          - severity =~ "warning|critical"

    receivers:
    - name: 'slack-notifications'
      slack_configs:
      - channel: '#alerts'
        send_resolved: true
        title: '{{ template "slack.default.title" . }}'
        text: '{{ template "slack.default.text" . }}'
type: Opaque
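```

Apply it like any other manifest (again, the file name is just an example); the operator-managed Alertmanager mounts its configuration from this Secret, provided the Secret name matches what the operator expects for your release:

```bash
kubectl apply -f alertmanager-config.yaml
```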

## Custom Dashboards

Grafana comes pre-configured with several useful dashboards, but you can import more from Grafana.com.

Some recommended dashboard IDs to import (one way to provision them declaratively is sketched after this list):

- **1860**: Node Exporter Full
- **12740**: Kubernetes Monitoring
- **13332**: Prometheus Stats
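
You can import these through the Grafana UI (Dashboards > Import), or provision them from `values.yaml`. The sketch below uses the Grafana sub-chart's dashboard provisioning; treat the exact keys as an assumption to verify against the chart version you're running:

```yaml
grafana:
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: default
          orgId: 1
          folder: ""
          type: file
          disableDeletion: false
          options:
            path: /var/lib/grafana/dashboards/default
  dashboards:
    default:
      node-exporter-full:
        gnetId: 1860            # Grafana.com dashboard ID
        datasource: Prometheus
      kubernetes-monitoring:
        gnetId: 12740
        datasource: Prometheus
```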

## Troubleshooting

### Common Issues

1. **Insufficient Resources**: Prometheus can be resource-intensive. Adjust resource limits if pods are being OOMKilled.
2. **Storage Issues**: Ensure your storage class supports the access modes you've configured.
3. **ServiceMonitor not working**: Check that the label selectors match your services (see the example ServiceMonitor after this list).
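
For reference, a minimal `ServiceMonitor` might look like this. The `release: prometheus` label matches the selector kube-prometheus-stack configures by default for a release named `prometheus`; the app name, namespace, and port name are placeholders for your own Service:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                  # hypothetical application name
  namespace: monitoring
  labels:
    release: prometheus         # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app               # must match the labels on the target Service
  namespaceSelector:
    matchNames:
      - default                 # namespace where the Service lives
  endpoints:
    - port: metrics             # name of the Service port exposing /metrics
      interval: 30s
```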

## Conclusion

You now have a fully functional Prometheus monitoring stack for your Kubernetes cluster. This setup provides comprehensive metrics collection, visualization, and alerting capabilities essential for maintaining a healthy and performant cluster.

In future articles, we'll explore advanced topics like custom exporters, recording rules for performance, and integrating with other observability tools like Loki for logs and Tempo for traces.