Configure Metrics for Consul on Kubernetes

Consul on Kubernetes integrates with Prometheus and Grafana to provide metrics for Consul service mesh. The metrics available are:

Mesh service metrics
Mesh sidecar proxy metrics
Consul agent metrics
Ingress, terminating, and mesh gateway metrics

Selected sidecar proxy metrics are also displayed in the Consul UI topology visualization. This section documents how to enable each of these metrics.

Mesh Service and Sidecar Metrics with Metrics Merging

Prometheus annotations on a Pod instruct Prometheus to scrape metrics from that Pod. Because these annotations support only one scrape endpoint per Pod, Consul on Kubernetes supports metrics merging, which combines service metrics and sidecar proxy metrics into a single endpoint. If a service exposes no metrics of its own, Consul on Kubernetes can expose only the sidecar proxy metrics.
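For context, the following is a minimal sketch of a Prometheus scrape configuration that honors the prometheus.io/scrape, prometheus.io/path, and prometheus.io/port Pod annotations. It is not part of the Consul Helm chart values; it follows a relabeling pattern commonly used in community Prometheus configurations, and the job name is an arbitrary choice. Your Prometheus installation may already include an equivalent job.

```yaml
# Sketch of a Prometheus scrape job driven by Pod annotations.
# The job name "kubernetes-pods" is an assumption; any name works.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only Pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: 'true'
      # Use prometheus.io/path as the metrics path when it is set.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: '(.+)'
      # Rewrite the scrape address to use the port from prometheus.io/port.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
```

Because each Pod carries a single path and port annotation, a job like this can scrape only one endpoint per Pod, which is the limitation that metrics merging works around.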

Metrics for services in the mesh can be configured with the Helm values nested under connectInject.metrics.

Metrics and metrics merging can be enabled by default for all connect-injected Pods with the following Helm values:

connectInject:
  metrics:
    defaultEnabled: true # by default, this inherits from the value global.metrics.enabled
    defaultEnableMerging: true

These defaults can be overridden on a per-Pod basis using the annotations consul.hashicorp.com/enable-metrics and consul.hashicorp.com/enable-metrics-merging.
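As an illustration of a per-Pod override, the following sketch shows Pod metadata for a service that exposes no metrics of its own, so metrics stay enabled while merging is turned off. The scenario and the chosen values are assumptions; only the annotation names come from this page.

```yaml
# Pod template metadata (sketch): keep sidecar proxy metrics, skip merging.
metadata:
  annotations:
    consul.hashicorp.com/enable-metrics: "true"
    # Assumed scenario: the service has no metrics endpoint to merge.
    consul.hashicorp.com/enable-metrics-merging: "false"
```

Annotation values are strings in Kubernetes, so the booleans are quoted.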

In most cases, the default settings are sufficient. If you encounter issues with colliding ports or service metrics not being merged, you may need to change the defaults.

The Prometheus annotations specify which endpoint to scrape the metrics from. The annotations point to a listener on 0.0.0.0:20200 on the Envoy sidecar. To override the listener settings on a per-Pod basis, specify the consul.hashicorp.com/prometheus-scrape-port and consul.hashicorp.com/prometheus-scrape-path Consul annotations. To configure the listener and the corresponding Prometheus annotations for all Pods, use the following Helm values:

connectInject:
  metrics:
    defaultPrometheusScrapePort: 20200
    defaultPrometheusScrapePath: "/metrics"

The Helm values specified in the previous example result in the following Prometheus annotations being automatically added to the Pod for scraping:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "20200"
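A per-Pod override of the scrape listener might look like the following sketch. The port and path values are hypothetical and chosen only to differ from the defaults; the annotation names are the ones given above.

```yaml
# Pod template metadata (sketch): move the scrape listener for this Pod only.
metadata:
  annotations:
    # Hypothetical values, for example to avoid a port collision on 20200.
    consul.hashicorp.com/prometheus-scrape-port: "20300"
    consul.hashicorp.com/prometheus-scrape-path: "/custom/metrics"
```

By analogy with the Helm defaults, the Prometheus annotations added to this Pod would then be expected to carry the overridden port and path instead of 20200 and /metrics.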

When metrics and metrics merging are both enabled, metrics from the service and the sidecar proxy are combined and exposed through a local server on the Consul Dataplane sidecar for scraping. This endpoint is called the merged metrics endpoint and defaults to 127.0.0.1:20100/stats/prometheus. In this case, the listener on 0.0.0.0:20200 targets the merged metrics endpoint. The merged metrics port can be configured with the following Helm values, or overridden on a per-Pod basis with the consul.hashicorp.com/merged-metrics-port annotation:

connectInject:
  metrics:
    defaultMergedMetricsPort: 20100
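The per-Pod form of this setting can be sketched as follows. The port number is a hypothetical value, picked only to show a Pod whose application already uses 20100.

```yaml
# Pod template metadata (sketch): change the merged metrics port for one Pod.
metadata:
  annotations:
    # Hypothetical port; assumes 20100 is already taken by the application.
    consul.hashicorp.com/merged-metrics-port: "20101"
```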

The endpoint to scrape service metrics from can be configured only on a per-Pod basis with the Pod annotations consul.hashicorp.com/service-metrics-port and consul.hashicorp.com/service-metrics-path. If these are not configured, the service metrics port defaults to the port used to register the service with Consul (consul.hashicorp.com/connect-service-port), which in turn defaults to the first port on the first container of the Pod. The service metrics path defaults to /metrics.
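For a service that serves its metrics on a dedicated port and a non-default path, the annotations could be set as in this sketch. The port and path shown are hypothetical; without these annotations, the defaults described above apply.

```yaml
# Pod template metadata (sketch): tell metrics merging where the service
# exposes its own metrics.
metadata:
  annotations:
    # Hypothetical port and path for the application's metrics endpoint.
    consul.hashicorp.com/service-metrics-port: "9102"
    consul.hashicorp.com/service-metrics-path: "/stats/metrics"
```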

Consul Agent Metrics

To scrape metrics from the Consul server Pods with Prometheus, set the field global.metrics.enableAgentMetrics to true. You can also set how long the agents retain metrics with the field global.metrics.agentMetricsRetentionTime, which expects a duration and defaults to "1m". This value must be greater than "0m" for the Consul servers to emit any metrics. Because the Prometheus deployment does not currently support scraping TLS endpoints, agent metrics are unsupported when TLS is enabled.

global:
  metrics:
    enabled: true
    enableAgentMetrics: true
    agentMetricsRetentionTime: "1m"

Gateway Metrics

To scrape metrics from the Consul ingress, terminating, and mesh gateways with Prometheus, set the field global.metrics.enableGatewayMetrics to true. The gateways emit standard Envoy proxy metrics. Because mesh and ingress gateways can have public IPs, their metrics endpoints are exposed on the Pod IP of the respective gateway instance rather than on all interfaces (0.0.0.0), so that the metrics are not exposed to the public internet.

global:
  metrics:
    enabled: true
    enableGatewayMetrics: true

Metrics in the UI Topology Visualization

Consul's built-in UI has a topology visualization for services that are part of the Consul service mesh. The topology visualization can fetch basic metrics for each service from a metrics provider and display them alongside the service.

The diagram below illustrates how the UI displays service metrics for a sample application:

UI Topology View

The topology view is configured under ui.metrics. With this configuration, the Consul UI queries the provider specified by ui.metrics.provider at the Prometheus server URL given in ui.metrics.baseURL, and then displays sidecar proxy metrics for the service. The UI displays a specific set of sidecar proxy Prometheus metrics when both ui.enabled and ui.metrics.enabled are true. The value of ui.metrics.enabled defaults to "-", which means it inherits the value of global.metrics.enabled.

ui:
  enabled: true
  metrics:
    enabled: true # by default, this inherits from the value global.metrics.enabled
    provider: "prometheus"
    baseURL: http://prometheus-server

Deploying Prometheus (for demo and non-production use cases only)

The Helm chart contains demo manifests for deploying Prometheus. To install them, set prometheus.enabled to true in your Helm values. The manifests are based on the community manifest for Prometheus. This Prometheus deployment is designed for quick bootstrapping in trial and demo use cases, and is not recommended for production.

Prometheus is installed in the same namespace as Consul, and is installed and uninstalled along with the Consul installation.

Grafana can optionally be used with Prometheus to display metrics. The installation and configuration of Grafana must be managed separately from the Consul Helm chart. The Layer 7 Observability with Prometheus, Grafana, and Kubernetes tutorial provides an installation walkthrough using Helm.

prometheus:
  enabled: true
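Putting the pieces together, a demo values file that turns on every metrics feature described on this page might look like the following sketch. It only combines Helm values already shown above; whether this exact combination suits your cluster is an assumption, and it is intended for trial use only.

```yaml
# Demo-only Helm values (sketch) combining the settings from this page.
global:
  metrics:
    enabled: true
    # Agent metrics are unsupported when TLS is enabled.
    enableAgentMetrics: true
    agentMetricsRetentionTime: "1m"
    enableGatewayMetrics: true
connectInject:
  metrics:
    defaultEnabled: true
    defaultEnableMerging: true
ui:
  enabled: true
  metrics:
    enabled: true
    provider: "prometheus"
    baseURL: http://prometheus-server
# Demo Prometheus, installed in the same namespace as Consul.
prometheus:
  enabled: true
```

Several of these keys inherit from global.metrics.enabled by default, so the explicit values for connectInject.metrics.defaultEnabled and ui.metrics.enabled are shown here only for readability.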