Telemetry
The Vault server process collects various runtime metrics about the performance
of different libraries and subsystems. These metrics are aggregated on a
10-second interval and retained for one minute in memory. High-cardinality
metrics, like vault.secret.kv.count, report every 10 minutes or at an interval
configured in the telemetry stanza.
Telemetry from Vault must be streamed and stored in metrics aggregation software to monitor Vault and collect durable metrics.
Vault uses the go-metrics package to export telemetry and supports the
following aggregation agents for time-series monitoring:
Config prefix | Name | Company |
---|---|---|
circonus | Circonus | Circonus |
dogstatsd | DogStatsD | Datadog |
prometheus | Prometheus | Prometheus / Open source |
stackdriver | Cloud Operations | Google |
statsd | Statsd | Open source |
statsite | Statsite | Open source |
Start with key health metrics
The Well-Architected Framework documentation includes a best practices guide on key Vault metrics for common health checks. We recommend reviewing the key metric recommendations to identify metrics you may want to start monitoring immediately.
Working with raw telemetry data
You can view raw telemetry data for debugging purposes by interrupting the Vault
process with USR1 (on *nix) or BREAK (on Windows). When the Vault process
receives this signal, it dumps telemetry data for the last 10 seconds to stderr.
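On *nix, the signal can be sent from a script as well as from a shell. The following is a minimal sketch, assuming you already know the PID of the Vault process; the helper name `request_telemetry_dump` is hypothetical, and the dump still goes to the stderr of the Vault process, not to the caller.

```python
import os
import signal


def request_telemetry_dump(pid: int) -> None:
    """Send USR1 to the process with the given PID (*nix only)."""
    # signal.SIGUSR1 is not defined on Windows, where the text names BREAK.
    os.kill(pid, signal.SIGUSR1)
```

For example, `request_telemetry_dump(12345)` would signal a process whose PID is 12345 (a placeholder value). The caller needs permission to signal that process.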
Raw telemetry data is prefixed with the relevant metric type:
- [C] indicates the metric is a counter.
- [G] indicates the metric is a gauge.
- [S] indicates the metric is a summary.
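When scanning a captured dump, it can help to classify lines by these markers. The sketch below assumes only that each metric line contains one of the three markers; the sample lines, metric names, and the helpers `metric_type` and `count_by_type` are hypothetical and do not reproduce Vault's exact output layout.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Marker-to-type mapping taken from the list above.
METRIC_TYPES = {"C": "counter", "G": "gauge", "S": "summary"}

# Matches the first [C], [G], or [S] marker anywhere in a line.
_MARKER = re.compile(r"\[([CGS])\]")


def metric_type(line: str) -> Optional[str]:
    """Return the metric type named by the line's marker, or None if absent."""
    match = _MARKER.search(line)
    return METRIC_TYPES[match.group(1)] if match else None


def count_by_type(lines: Iterable[str]) -> Counter:
    """Tally how many lines carry each metric type, skipping unmarked lines."""
    return Counter(t for t in map(metric_type, lines) if t is not None)


# Hypothetical lines for illustration only.
sample = [
    "[G] 'example.gauge': 12.000",
    "[C] 'example.counter': Count: 1 Sum: 1.000",
    "[S] 'example.summary': Count: 4 Sum: 2.500",
    "not a metric line",
]
print(count_by_type(sample))
```

Searching for the marker anywhere in the line, instead of only at the start, keeps the sketch usable if other bracketed fields precede the marker.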
Example raw telemetry dump