Observability

Monitor the operator and your RADIUS infrastructure with Prometheus metrics.


Operator Metrics

The operator exposes Prometheus metrics on port 8080 at /metrics. These cover the operator's own reconciliation performance, not FreeRADIUS traffic (for that, see FreeRADIUS Server Metrics).

Available Metrics

| Metric | Type | Labels | Description |
|---|---|---|---|
| freeradius_operator_reconcile_total | Counter | namespace, name, kind, result | Total reconciliation attempts. result is success or error. |
| freeradius_operator_reconcile_duration_seconds | Histogram | | Time spent in each reconciliation loop. |
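
As a rough illustration of how these two metrics can be queried, the Prometheus rule file below records an error ratio and a p99 reconcile duration. This is a sketch, not something shipped with the operator: the group name, the recorded series names, and the 5m window are arbitrary choices, and the `_bucket` suffix is assumed from the standard Prometheus histogram exposition.

```yaml
groups:
  - name: freeradius-operator-reconcile   # hypothetical group name
    rules:
      # Fraction of reconciliations that ended in result="error", per resource.
      # If your scrape setup attaches its own namespace target label, the
      # metric's namespace label may show up as exported_namespace instead.
      - record: freeradius_operator:reconcile_error_ratio:rate5m
        expr: |
          sum by (namespace, name, kind) (rate(freeradius_operator_reconcile_total{result="error"}[5m]))
          /
          sum by (namespace, name, kind) (rate(freeradius_operator_reconcile_total[5m]))
      # Approximate 99th percentile of reconcile loop duration across all resources.
      - record: freeradius_operator:reconcile_duration_seconds:p99_5m
        expr: |
          histogram_quantile(0.99, sum by (le) (rate(freeradius_operator_reconcile_duration_seconds_bucket[5m])))
```

The ratio has no value for resources that saw no reconciliations in the window, so an alert built on it stays silent for idle resources.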

Scrape Configuration

If you're using the Prometheus Operator, create a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: freeradius-operator
  namespace: freeradius-system
spec:
  selector:
    matchLabels:
      app: freeradius-operator
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics

For plain Prometheus, add a scrape config:

scrape_configs:
  - job_name: freeradius-operator
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: freeradius-operator
        action: keep
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        regex: "8080"
        action: keep

FreeRADIUS Server Metrics

Every RadiusCluster ships with a built-in Prometheus exporter sidecar that queries the co-located freeradius process over the loopback interface using the RFC 5997 Status-Server protocol. The exporter is the same operator binary invoked with --mode=exporter, so there is no extra image to publish or pin beyond the operator image.

What's exposed

The sidecar listens on TCP port 9812 and exposes /metrics and /healthz. Every metric carries a cluster="<RadiusCluster name>" label.

| Metric | Type | Description |
|---|---|---|
| freeradius_up | Gauge | 1 if the last scrape of the status-server succeeded, 0 otherwise. |
| freeradius_scrape_duration_seconds | Gauge | Duration of the last scrape. Useful for latency SLOs. |
| freeradius_exporter_scrape_errors_total | Counter | Total number of scrape attempts that failed. |
| freeradius_access_requests_total | Counter | Total Access-Request packets received. |
| freeradius_access_accepts_total | Counter | Total Access-Accept packets sent. |
| freeradius_access_rejects_total | Counter | Total Access-Reject packets sent. |
| freeradius_access_challenges_total | Counter | Total Access-Challenge packets sent. |
| freeradius_auth_{duplicate,malformed,invalid,dropped,unknown_types}_requests_total | Counter | Per-listener error/abuse counters. |
| freeradius_acct_*_total | Counter | Mirror set for the accounting listener. |
| freeradius_proxy_access_*_total, freeradius_proxy_acct_*_total | Counter | Mirror sets for proxied traffic. |
| freeradius_queue_len_{internal,proxy,auth,acct,detail} | Gauge | Internal queue depths. Rising values indicate saturation. |

Enabling

Server metrics are on by default. The operator injects the exporter sidecar into every pod that runs an auth listener (standalone mode, and the auth role in split-mode). Acct- or CoA-only pods get no sidecar because the status-server only binds to the auth listener.

apiVersion: radius.operator.io/v1alpha1
kind: RadiusCluster
metadata:
  name: production
spec:
  image: freeradius/freeradius-server:3.2.3
  replicas: 3
  metrics:
    enabled: true                # default
    # port: 9812                 # default
    # image: my-registry/op:v1   # defaults to operator image
    resources:
      requests: { cpu: 25m, memory: 32Mi }
      limits:   { cpu: 100m, memory: 64Mi }
    serviceMonitor:
      enabled: true
      interval: 30s
      labels:
        release: kube-prometheus-stack
    prometheusRule:
      enabled: true
      labels:
        release: kube-prometheus-stack

Setting spec.metrics.enabled: false removes the sidecar and the metrics Service port.

ServiceMonitor and PrometheusRule

When serviceMonitor.enabled: true, the operator creates a monitoring.coreos.com/v1 ServiceMonitor selecting the cluster's auth Service on the metrics port. Set labels so the Prometheus Operator selects it (for kube-prometheus-stack this is typically release: kube-prometheus-stack).

When prometheusRule.enabled: true, the operator creates a PrometheusRule with four starter alerts:

| Alert | Expression | For | Severity |
|---|---|---|---|
| RadiusClusterDown | freeradius_up == 0 | 2m | critical |
| RadiusHighRejectRate | Reject rate > 50% of total | 5m | warning |
| RadiusAuthLatencyHigh | freeradius_scrape_duration_seconds > 0.5 | 10m | warning |
| RadiusQueueDepthGrowing | freeradius_queue_len_internal > 100 | 10m | warning |
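
For setups without the Prometheus Operator, the same four alerts can be approximated in a plain Prometheus rule file. The sketch below is not the rule the operator generates. Three expressions are copied from the table. The reject-rate expression is an assumption that reads "total" as Access-Request packets over a 5m rate window. The group name is hypothetical.

```yaml
groups:
  - name: freeradius-starter-alerts   # hypothetical group name
    rules:
      - alert: RadiusClusterDown
        expr: freeradius_up == 0
        for: 2m
        labels:
          severity: critical
      - alert: RadiusHighRejectRate
        # Assumed form: rejects as a share of Access-Requests, per cluster.
        expr: |
          sum by (cluster) (rate(freeradius_access_rejects_total[5m]))
          /
          sum by (cluster) (rate(freeradius_access_requests_total[5m])) > 0.5
        for: 5m
        labels:
          severity: warning
      - alert: RadiusAuthLatencyHigh
        expr: freeradius_scrape_duration_seconds > 0.5
        for: 10m
        labels:
          severity: warning
      - alert: RadiusQueueDepthGrowing
        expr: freeradius_queue_len_internal > 100
        for: 10m
        labels:
          severity: warning
```

Treat the thresholds as starting points. A cluster that legitimately rejects many requests (for example during a credential rollover) may need a higher reject-rate threshold.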

Both resources are best-effort: if the Prometheus Operator CRDs are not installed, the operator logs a single line and skips them. The sidecar and the /metrics endpoint still work — you can scrape with a plain Prometheus scrape config.
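
A plain scrape config for the exporter sidecar might look like the sketch below. It mirrors the operator scrape config shown earlier and rests on two assumptions: that RADIUS pods carry the app.kubernetes.io/managed-by=freeradius-operator label, and that the sidecar declares its default port 9812 as a container port. The job name is arbitrary.

```yaml
scrape_configs:
  - job_name: freeradius-exporter   # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods managed by the operator (label assumed to be set on the pods).
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_managed_by]
        regex: freeradius-operator
        action: keep
      # Keep only the exporter sidecar's port; adjust if spec.metrics.port is overridden.
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        regex: "9812"
        action: keep
```

Since every metric already carries a cluster label, no extra relabeling should be needed to tell clusters apart.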

Grafana dashboard

A starter Grafana dashboard is included at docs/dashboards/freeradius.json. Import it and select the Prometheus datasource that scrapes the cluster.

End-to-end example

See example/metrics/ for a complete working configuration.

Status Conditions

Beyond metrics, the operator writes structured conditions to each resource's status. These are queryable with kubectl and useful for alerting.

Check cluster health

# Quick overview
kubectl get radiuscluster -n radius

# Detailed conditions
kubectl get radiuscluster production -n radius -o jsonpath='{.status.conditions}' | jq .

Alert on degraded clusters

A Degraded condition means the operator detected a problem (usually a missing Secret) and is retrying. To alert on it, either expose the condition as a metric through kube-state-metrics and write a Prometheus rule against it (this requires kube-state-metrics to be configured with custom resource state metrics for the RadiusCluster CRD, yielding a metric such as kube_customresource_status_condition), or poll the Kubernetes API.
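
One way to obtain such a metric is a kube-state-metrics custom resource state configuration like the sketch below. It assumes the RadiusCluster conditions follow the usual type/status shape, that kube-state-metrics uses its default kube_customresource metric prefix, and that its ServiceAccount is allowed to list and watch radiusclusters. None of this is installed by the operator.

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: radius.operator.io
        version: v1alpha1
        kind: RadiusCluster
      # Attach the resource's name and namespace to every generated series.
      labelsFromPath:
        name: [metadata, name]
        namespace: [metadata, namespace]
      metrics:
        # Exposed as kube_customresource_status_condition with the default prefix.
        - name: status_condition
          help: "RadiusCluster status conditions (1 = True, 0 = False or Unknown)"
          each:
            type: Gauge
            gauge:
              path: [status, conditions]
              labelsFromPath:
                type: [type]
              valueFrom: [status]
```

With this in place, an alert expression along the lines of kube_customresource_status_condition{type="Degraded"} == 1, held for a few minutes, would fire on clusters the operator keeps reporting as degraded.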

Useful kubectl Commands

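# List all RADIUS resources by short name. Note: rc is also kubectl's built-in
# short name for replicationcontrollers; use the full resource name if it collides.
kubectl get rc,rcl,rp -n radius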

# Watch reconciliation in real time
kubectl get radiuscluster -n radius -w

# Check pod health
kubectl get pods -n radius -l app.kubernetes.io/managed-by=freeradius-operator

# View operator logs
kubectl logs -n freeradius-system deploy/freeradius-operator -f

# Check pod restart count from status
kubectl get radiuscluster production -n radius \
  -o jsonpath='{.status.podRestarts}'