---
title: Setting Up Prometheus Monitoring in Kubernetes
description: A comprehensive guide to implementing Prometheus monitoring in your Kubernetes cluster
pubDate: 2025-04-19
heroImage: /blog/images/posts/prometheusk8.png
category: devops
tags:
  - kubernetes
  - monitoring
  - prometheus
  - grafana
  - observability
readTime: 9 min read
---

# Setting Up Prometheus Monitoring in Kubernetes

Effective monitoring is crucial for maintaining a healthy Kubernetes environment. Prometheus has become the de facto standard for metrics collection and alerting in cloud-native environments. This guide will walk you through setting up a complete Prometheus monitoring stack in your Kubernetes cluster.

## Why Prometheus?

Prometheus offers several advantages for Kubernetes monitoring:

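- **Pull-based architecture**: Prometheus scrapes targets over HTTP, so applications only need to expose a metrics endpoint and never need to know where the monitoring server lives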
- **Powerful query language (PromQL)**: For flexible data analysis
- **Service discovery**: Automatically finds targets in dynamic environments
- **Rich ecosystem**: Wide range of exporters and integrations
- **CNCF graduated project**: Strong community and vendor support

## Components of the Monitoring Stack

We'll set up a complete monitoring stack consisting of:

1. **Prometheus**: Core metrics collection and storage
2. **Alertmanager**: Handles alerts and notifications
3. **Grafana**: Visualization and dashboards
4. **Node Exporter**: Collects host-level metrics
5. **kube-state-metrics**: Collects Kubernetes state metrics
6. **Prometheus Operator**: Simplifies Prometheus management in Kubernetes

## Prerequisites

- A running Kubernetes cluster (K3s, EKS, GKE, etc.)
- kubectl configured to access your cluster
- Helm 3 installed

## Installation Using Helm

The easiest way to deploy Prometheus is using the kube-prometheus-stack Helm chart, which includes all the components mentioned above.

### 1. Add the Prometheus Community Helm Repository

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

### 2. Create a Namespace for Monitoring

```bash
kubectl create namespace monitoring
```

### 3. Configure Values

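Create a `values.yaml` file with your custom configuration. The example assumes a StorageClass named `standard`; replace it with one that exists in your cluster (K3s, for example, ships `local-path` by default):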

```yaml
prometheus:
  prometheusSpec:
    retention: 15d
    resources:
      requests:
        memory: 256Mi
        cpu: 100m
      limits:
        memory: 2Gi
        cpu: 500m
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi

grafana:
  persistence:
    enabled: true
    storageClassName: standard
    size: 10Gi
  adminPassword: "prom-operator"  # Change this!

nodeExporter:
  enabled: true

kubeStateMetrics:
  enabled: true
```
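The `storageClassName: standard` entries above are an assumption about the cluster. A small pre-flight check can catch a missing class before the install leaves PersistentVolumeClaims pending. This is a minimal sketch; the helper name `storage_class_exists` is hypothetical:

```bash
# Hypothetical helper: succeeds only if the named StorageClass exists.
storage_class_exists() {
  # kubectl exits non-zero when the resource is not found.
  kubectl get storageclass "$1" >/dev/null 2>&1
}
```

Running `storage_class_exists standard || kubectl get storageclass` lists the available classes when `standard` is missing, so you can pick a replacement.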

### 4. Install the Helm Chart

```bash
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values values.yaml
```
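If you expect to rerun the install after editing `values.yaml`, `helm upgrade --install` may be more convenient than `helm install`, because it installs the release when it is absent and upgrades it otherwise. The wrapper below is a sketch; the function name and its defaults are assumptions, not part of the chart:

```bash
# Hypothetical wrapper: install or upgrade the release in one command.
# $1 = release name (default: prometheus), $2 = namespace (default: monitoring)
install_stack() {
  helm upgrade --install "${1:-prometheus}" prometheus-community/kube-prometheus-stack \
    --namespace "${2:-monitoring}" \
    --create-namespace \
    --values values.yaml \
    --wait --timeout 10m
}
```

With `--wait`, Helm returns only once the release's resources report ready, or fails when the timeout expires.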

### 5. Verify the Installation

Check that all the pods are running:

```bash
kubectl get pods -n monitoring
```
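Instead of polling the pod list by hand, `kubectl wait` can block until the pods are ready. The helper below is a sketch with a hypothetical name; it assumes every pod in the namespace is long-running, since a pod that has already completed never becomes Ready and would cause a timeout:

```bash
# Hypothetical helper: wait until every pod in the namespace reports Ready.
# $1 = namespace (default: monitoring), $2 = timeout (default: 300s)
wait_for_stack() {
  kubectl wait pods --all \
    --namespace "${1:-monitoring}" \
    --for=condition=Ready \
    --timeout="${2:-300s}"
}
```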

## Accessing the UIs

By default, the chart creates ClusterIP services, so none of the components are reachable from outside the cluster. You can use port-forwarding to access them from your workstation:

### Prometheus UI

```bash
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090
```

Then access Prometheus at http://localhost:9090
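While the port-forward is running, the same data is available through the Prometheus HTTP API, which is handy for a quick check from the terminal. A minimal sketch follows; the helper name `prom_query` and the `PROM_URL` variable are assumptions for illustration:

```bash
# Hypothetical helper: run an instant PromQL query against the HTTP API.
# PROM_URL is an assumed variable; it defaults to the port-forward address.
prom_query() {
  curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, `prom_query 'up'` returns a JSON document with one sample per scrape target, where a value of `1` means the last scrape succeeded.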

### Grafana

```bash
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
```

Then access Grafana at http://localhost:3000 and log in as `admin` with the `adminPassword` you set in `values.yaml` (`prom-operator` in the example above).

### Alertmanager

```bash
kubectl port-forward -n monitoring svc/alertmanager-operated 9093:9093
```

Then access Alertmanager at http://localhost:9093
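The service names created by the chart depend on the Helm release name, so it is worth confirming them before port-forwarding. The sketch below lists each Service with its ports; the helper name is hypothetical:

```bash
# Hypothetical helper: print each Service in the namespace with its ports,
# so the names used for port-forwarding can be checked against the cluster.
list_stack_services() {
  kubectl get services --namespace "${1:-monitoring}" \
    -o custom-columns='NAME:.metadata.name,PORTS:.spec.ports[*].port'
}
```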

## For Production: Exposing Services

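For production environments, expose the UIs through an Ingress instead of port-forwarding. The example below assumes the ingress-nginx controller and uses placeholder hostnames. Prometheus and Alertmanager have no login enabled by default, so put authentication in front of them before exposing them publicly: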

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: monitoring
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx  # assumes the ingress-nginx controller's default class name
  rules:
  - host: prometheus.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-operated
            port:
              number: 9090
  - host: grafana.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-grafana
            port:
              number: 80
  - host: alertmanager.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: alertmanager-operated
            port:
              number: 9093
```

## Configuring Alerting

### 1. Set Up Alert Rules

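Alert rules can be created using the PrometheusRule custom resource. With the chart's default settings, Prometheus only loads rules that carry a `release` label matching the Helm release name, which is `prometheus` in this guide: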

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
  - name: node.rules
    rules:
    - alert: HighNodeCPU
      expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage on {{ $labels.instance }}"
        description: "CPU usage is above 80% for 5 minutes on node {{ $labels.instance }}"
```
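After applying the rule, you can confirm that Prometheus actually loaded it by asking the rules endpoint of the HTTP API. This is a rough sketch that assumes the Prometheus port-forward is running; the helper name and the `PROM_URL` variable are hypothetical:

```bash
# Hypothetical helper: succeed if Prometheus has loaded a rule with this name.
# Matching the compact JSON with grep is a shortcut; a JSON parser such as jq
# would be more robust.
rule_loaded() {
  curl -s "${PROM_URL:-http://localhost:9090}/api/v1/rules" \
    | grep -q "\"name\":\"$1\""
}
```

If `rule_loaded HighNodeCPU` fails a minute or so after applying the resource, the `release` label on the PrometheusRule is the first thing to check.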

### 2. Configure Alert Receivers

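Configure Alertmanager to send notifications by creating a Secret with your configuration. The operator reads it from a Secret named `alertmanager-<name of the Alertmanager resource>`, which depends on your release name, so check the exact name with `kubectl get secret -n monitoring` before applying. Helm rewrites this Secret on the next upgrade; for a lasting change, put the same configuration under `alertmanager.config` in `values.yaml`: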

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-prometheus-kube-prometheus-alertmanager
  namespace: monitoring
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
      slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'

    route:
      group_by: ['job', 'alertname', 'namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'slack-notifications'
      routes:
      - receiver: 'slack-notifications'
        matchers:
          - severity =~ "warning|critical"

    receivers:
    - name: 'slack-notifications'
      slack_configs:
      - channel: '#alerts'
        send_resolved: true
        title: '{{ template "slack.default.title" . }}'
        text: '{{ template "slack.default.text" . }}'
type: Opaque
```
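To check the notification path end to end without waiting for a real incident, you can post a synthetic alert to Alertmanager's v2 API while the Alertmanager port-forward is running. The sketch below is illustrative; the helper name and the `AM_URL` variable are assumptions:

```bash
# Hypothetical helper: post a synthetic alert to Alertmanager's v2 API.
# AM_URL is an assumed variable; it defaults to the port-forward address.
# $1 = alert name (default: TestAlert)
send_test_alert() {
  curl -s -X POST "${AM_URL:-http://localhost:9093}/api/v2/alerts" \
    -H 'Content-Type: application/json' \
    -d "[{\"labels\":{\"alertname\":\"${1:-TestAlert}\",\"severity\":\"warning\"}}]"
}
```

Because the alert carries `severity: warning`, it matches the route above and should reach the Slack channel once `group_wait` has elapsed.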

## Custom Dashboards

Grafana comes pre-configured with several useful dashboards, but you can import more from [Grafana.com](https://grafana.com/grafana/dashboards/).

Some recommended dashboard IDs to import:
- 1860: Node Exporter Full
- 12740: Kubernetes Monitoring
- 13332: Prometheus Stats

## Troubleshooting

### Common Issues

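1. **Insufficient Resources**: Prometheus memory use grows with the number of scraped series. If pods are being OOMKilled, raise the memory limit under `prometheusSpec.resources` or reduce what is scraped.
2. **Storage Issues**: If PersistentVolumeClaims stay `Pending`, check that the storage class exists and supports the access modes you've configured.
3. **ServiceMonitor not working**: Check that the ServiceMonitor's `selector` matches the labels on your Service. With the chart's defaults, the ServiceMonitor itself also needs the `release: prometheus` label to be picked up.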
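
The first and third issues above can be narrowed down from the command line. The helpers below are a sketch with hypothetical names; they assume the stack runs in the `monitoring` namespace unless another one is passed:

```bash
# Hypothetical helpers for the issues listed above.

# Print pods whose containers were last terminated with reason OOMKilled.
oom_killed_pods() {
  kubectl get pods --namespace "${1:-monitoring}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
    | grep OOMKilled
}

# Show which labels Prometheus requires on ServiceMonitors.
service_monitor_selector() {
  kubectl get prometheus --namespace "${1:-monitoring}" \
    -o jsonpath='{.items[*].spec.serviceMonitorSelector}'
}
```

Compare the output of `service_monitor_selector` with the labels on your own ServiceMonitor; a mismatch means Prometheus silently ignores it.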

## Conclusion

You now have a fully functional Prometheus monitoring stack for your Kubernetes cluster. This setup provides comprehensive metrics collection, visualization, and alerting capabilities essential for maintaining a healthy and performant cluster.

In future articles, we'll explore advanced topics like custom exporters, recording rules for performance, and integrating with other observability tools like Loki for logs and Tempo for traces.