Metrics
The HCP Terraform Agent emits numerous metrics describing the agent's performance.
Metric naming conventions
All metrics emitted by the HCP Terraform Agent follow some general naming conventions which convey useful context about the measurements.
- All metrics are prefixed by `tfc-agent.` to distinguish them from any other metrics in your environment.
- When a metric requires a unit in order to be understood, an un-abbreviated unit will be the last component of the metric name. For example, `.bytes` or `.milliseconds`.
- All timing metrics (other than runtime metrics) are measured in milliseconds.
- All data size metrics are measured in bytes.
- Underscores are used to separate words in metric names when submitted to the collector over the OTLP protocol.
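As an illustration of the prefix and unit conventions above, the following minimal sketch splits a metric name into its unprefixed form and its unit. The function name `split_metric_name` and the `KNOWN_UNITS` set are hypothetical; the set is only collected from the tables on this page and is not an official list.

```python
# Unit words that appear as the final component in the tables on this page.
# This set is an assumption made for illustration, not an official list.
KNOWN_UNITS = {
    "bytes",
    "milliseconds",
    "nanoseconds",
    "percent",
    "count",
    "bytes_per_second",
}

PREFIX = "tfc-agent."


def split_metric_name(name):
    """Return (name without the tfc-agent. prefix, unit or None)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not an agent metric: {name!r}")
    short = name[len(PREFIX):]
    # By convention the unit, when present, is the last dotted component.
    last = short.rsplit(".", 1)[-1]
    unit = last if last in KNOWN_UNITS else None
    return short, unit
```

Metrics that need no unit to be understood, such as `tfc-agent.core.status.busy`, yield `None` for the unit.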
Metric data types
HCP Terraform agents emit three types of metric data:
- Gauges submit an absolute number value at the end of a measurement period.
- Counters accumulate number values during a measurement period. The sum of the numbers is recorded at the end of a measurement period and the value is reset to zero for the next period.
- Timers measure the time taken to complete a task.
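The difference between the three types can be sketched in a few lines. This is a simplified model of the semantics described above, not the agent's actual implementation; the class and method names are hypothetical.

```python
import time
from contextlib import contextmanager


class Gauge:
    """Reports the most recent absolute value at the end of a period."""

    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value

    def flush(self):
        # A gauge keeps its value; only the latest one is reported.
        return self.value


class Counter:
    """Accumulates values during a period and resets after reporting."""

    def __init__(self):
        self.total = 0

    def add(self, n=1):
        self.total += n

    def flush(self):
        # Report the sum, then start the next period from zero.
        total, self.total = self.total, 0
        return total


class Timer:
    """Measures how long a task takes, in milliseconds."""

    def __init__(self):
        self.last_ms = None

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.last_ms = (time.perf_counter() - start) * 1000.0
```

For example, a counter that receives `add(2)` and `add(3)` within one period reports 5 on `flush()` and 0 on the next `flush()`, whereas a gauge set to 1 and then 7 reports 7.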
Core metrics
The following metrics are generated by the HCP Terraform Agent core program,
and are related to generic operations performed regularly by all agents. All
metrics in this section are prefixed by `tfc-agent.core.`.
Metric name | Type | Description |
---|---|---|
status.busy | Gauge | Number of agents in busy status. |
status.idle | Gauge | Number of agents in idle status. |
register.milliseconds | Timer | Time to register the agent with HCP Terraform. |
fetch_job.milliseconds | Timer | Time to complete a job dequeue request. |
update_status.milliseconds | Timer | Time to send a status update from the agent to HCP Terraform. |
Runtime metrics
The HCP Terraform Agent produces a number of metrics which are generated by
the application runtime, and are primarily useful in debugging the HCP Terraform Agent.
These metrics are emitted periodically throughout the entire agent process
lifecycle. It is important to note that these metrics do not represent a
complete picture of resource utilization by the agent. The agent may fork child
processes (such as the Terraform binary or other programs) which maintain their
own distinct runtimes and consume resources independently of the agent. To
monitor resource utilization comprehensively, consider monitoring VM or
container metrics. All metrics in this section are prefixed by
`tfc-agent.core.runtime.`.
Metric name | Type | Description |
---|---|---|
go.mem.heap_alloc.bytes | Gauge | Memory allocated to heap objects by the Go runtime. |
go.mem.heap_idle.bytes | Gauge | Memory allocated to the heap that is unused. |
go.mem.heap_inuse.bytes | Gauge | Memory allocated to the heap that is in use. |
go.mem.heap_sys.bytes | Gauge | Memory obtained from the OS for the heap. |
go.mem.heap_released.bytes | Gauge | Memory returned to the OS during GC. |
go.mem.heap_objects.count | Gauge | Number of allocated heap objects. |
go.mem.lookups.count | Gauge | Number of pointer lookups. |
go.mem.malloc.count | Gauge | Cumulative count of heap objects allocated. |
go.mem.free.count | Gauge | Cumulative count of heap objects freed. |
go.gc.count | Gauge | Number of completed GC cycles. |
go.goroutines.count | Gauge | Number of running goroutines. |
go.gc.pause_total.nanoseconds | Timer | Cumulative time spent in stop-the-world pauses. |
uptime.milliseconds | Timer | Cumulative time since the agent started. |
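Because the GC pause metric is reported in nanoseconds while uptime is reported in milliseconds, the two need a unit conversion before they can be compared. The sketch below estimates the fraction of agent uptime spent in stop-the-world pauses. The function name is hypothetical, and it assumes both values were sampled at roughly the same moment.

```python
def gc_pause_fraction(pause_total_ns, uptime_ms):
    """Fraction of uptime spent in stop-the-world GC pauses.

    pause_total_ns: value of go.gc.pause_total.nanoseconds
    uptime_ms:      value of uptime.milliseconds
    """
    if uptime_ms <= 0:
        raise ValueError("uptime must be positive")
    # Convert uptime to nanoseconds so both values share a unit.
    uptime_ns = uptime_ms * 1_000_000
    return pause_total_ns / uptime_ns
```

For instance, 50,000,000 ns of cumulative pause time over 10,000 ms of uptime gives a fraction of 0.005, or 0.5%.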
Resource utilization metrics
HCP Terraform Agents emit metrics about overall system resource utilization any
time the agent's status is `busy` (i.e., when the agent is handling a plan,
apply, policy evaluation, etc.). These metrics are derived from kernel-level
information and should closely match what you would get from familiar
tools such as `top` and `free`. All metrics in this section are prefixed by
`tfc-agent.core.profiler.`.
Metric name | Type | Description |
---|---|---|
cpu.busy.percent | Gauge | Percentage of CPU time spent in a busy state. |
memory.used.percent | Gauge | Percentage of memory used. |
memory.used.bytes | Gauge | Used memory size in bytes. |
io.read.bytes_per_second | Gauge | Read throughput of the agent and descendant processes in bytes per second. |
io.write.bytes_per_second | Gauge | Write throughput of the agent and descendant processes in bytes per second. |
Terraform component metrics
The following metrics are emitted by the `terraform` component, which is
responsible for handling Terraform operations like plans and applies. All
metrics in this section are prefixed by `tfc-agent.core.terraform.`.
Metric name | Type | Description |
---|---|---|
configure_terraform_cli.milliseconds | Timer | Time spent configuring the Terraform CLI utility prior to execution. |
handle_signal.milliseconds | Timer | Time spent handling a signal from HCP Terraform. |
execute.milliseconds | Timer | Time spent handling a Terraform operation. |
output_stream.upload_chunk.bytes | Gauge | Size of a chunk of Terraform output uploaded. |
output_stream.upload_chunk.milliseconds | Timer | Time spent uploading a single chunk of Terraform output. |
output_stream.upload_full.bytes | Gauge | Size of a full Terraform log uploaded. |
output_stream.upload_full.milliseconds | Timer | Time spent uploading the full Terraform log. |
output_stream.close.milliseconds | Timer | Time spent finalizing a Terraform output stream. |
override_sensitive_variables.milliseconds | Timer | Time spent generating a configuration file to override variable sensitivity. |
override_sensitive_variables.count | Gauge | Number of variables in a run for which HCP Terraform modified the sensitivity flag. |
persist_filesystem.milliseconds | Timer | Time spent packing and uploading a filesystem image. |
persist_filesystem.pack.bytes | Gauge | Size of a packed up filesystem image. |
persist_filesystem.pack.milliseconds | Timer | Time spent packing the contents of a filesystem. |
persist_filesystem.upload.milliseconds | Timer | Time spent uploading a packed up filesystem image. |
plan_json.generate.bytes | Gauge | Size of a generated JSON-formatted plan. |
plan_json.generate.milliseconds | Timer | Time spent generating a JSON plan. |
plan_json.upload.milliseconds | Timer | Time spent uploading a JSON plan. |
provider_schemas_json.generate.bytes | Gauge | Size of a generated JSON-formatted provider schemas file. |
provider_schemas_json.generate.milliseconds | Timer | Time spent generating the provider schemas document. |
provider_schemas_json.upload.milliseconds | Timer | Time spent uploading a provider schemas document. |
restore_filesystem.milliseconds | Timer | Time spent downloading and unpacking a filesystem image. |
restore_filesystem.download.bytes | Gauge | Size of a downloaded filesystem image. |
restore_filesystem.download.milliseconds | Timer | Time spent downloading a filesystem image. |
restore_filesystem.unpack.milliseconds | Timer | Time spent unpacking the contents of a filesystem image. |
run_meta.additions | Gauge | Number of resources added or proposed to be added in a Terraform operation. |
run_meta.changes | Gauge | Number of resources changed or proposed to change in a Terraform operation. |
run_meta.destructions | Gauge | Number of resources destroyed or proposed to be destroyed in a Terraform operation. |
setup_backend.milliseconds | Timer | Time spent configuring Terraform CLI for HCP Terraform. |
setup_ssh_key.milliseconds | Timer | Time spent configuring an SSH key for Terraform to use while downloading modules. |
setup_ssh_key.check_git_version.milliseconds | Timer | Time spent ensuring the local version of "git" is adequate for leveraging SSH auth. |
setup_terraform_binary.milliseconds | Timer | Time spent downloading and unpacking a Terraform Community release. |
setup_terraform_binary.download.bytes | Gauge | Size of a downloaded Terraform Community version. |
setup_terraform_binary.download.milliseconds | Timer | Time spent downloading a Terraform Community release. |
setup_terraform_binary.unpack.bytes | Gauge | Size of an unpacked Terraform Community release. |
setup_terraform_binary.unpack.milliseconds | Timer | Time spent unpacking a Terraform Community release. |
setup_terraform_config.milliseconds | Timer | Time spent downloading and unpacking a Terraform configuration. |
setup_terraform_config.download.bytes | Gauge | Size of a downloaded Terraform configuration. |
setup_terraform_config.download.milliseconds | Timer | Time spent downloading a Terraform configuration. |
setup_terraform_config.unpack.milliseconds | Timer | Time spent unpacking a downloaded Terraform configuration. |
setup_terraform_config.verify.milliseconds | Timer | Time spent verifying a Terraform configuration. |
setup_terraform_variables.milliseconds | Timer | Time spent configuring Terraform variables provided by HCP Terraform. |
setup_terraform_variables.write_file.bytes | Gauge | Size of the tfvars file generated from input variables provided by HCP Terraform. |
terraform_init.milliseconds | Timer | Time spent running `terraform init`. |
terraform_apply.milliseconds | Timer | Time spent running `terraform apply`. |
terraform_plan.milliseconds | Timer | Time spent running `terraform plan`. |
terraform_version.milliseconds | Timer | Time spent running `terraform version`. |
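Several operations in the table above report both a size in bytes and a duration in milliseconds, for example `setup_terraform_binary.download.bytes` and `setup_terraform_binary.download.milliseconds`. One possible use is to derive an approximate transfer rate from such a pair. The helper below is a hypothetical sketch and assumes that the two values come from the same operation in the same run.

```python
def throughput_bytes_per_second(size_bytes, elapsed_ms):
    """Approximate transfer rate from a .bytes / .milliseconds metric pair."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed time must be positive")
    # Timers are in milliseconds, so convert to seconds first.
    elapsed_seconds = elapsed_ms / 1000.0
    return size_bytes / elapsed_seconds
```

For instance, a 50,000,000-byte download that took 2,000 ms corresponds to 25,000,000 bytes per second.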
Policy component metrics
The following metrics are emitted by the `policy` component, which is
responsible for handling OPA policy enforcement operations. All metrics in this
section are prefixed by `tfc-agent.core.policy.`.
Metric name | Type | Description |
---|---|---|
execute.milliseconds | Timer | Time spent handling a policy operation. |
policy_set.download.bytes | Gauge | Size of a downloaded policy set. |
policy_set.download.milliseconds | Timer | Time spent downloading a policy set. |
policy_set.unpack.milliseconds | Timer | Time spent unpacking a policy set. |
generate_opa_input_file.milliseconds | Timer | Time spent generating the OPA input file. |
parse_opa_config.milliseconds | Timer | Time spent parsing the OPA configuration. |
plan_json_download.milliseconds | Timer | Time spent downloading the Terraform JSON plan document. |
plan_json_download.bytes | Gauge | Size of a downloaded Terraform JSON plan document. |
run_opa_eval.milliseconds | Timer | Time spent evaluating OPA policies. |
setup_policies.milliseconds | Timer | Time spent setting up individually managed policies. |
setup_policy_engines.milliseconds | Timer | Time spent setting up policy runtimes. |
setup_subjects.milliseconds | Timer | Time spent setting up subjects for policy enforcement. |
write_sentinel_config.milliseconds | Timer | Time spent generating the Sentinel config file. |
run_sentinel_apply.milliseconds | Timer | Time spent evaluating Sentinel policies. |
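The tables on this page list metric names without their section prefix, and some short names, such as `execute.milliseconds`, appear under more than one component. The following sketch joins the documented prefixes with a short name to produce the full dotted name. The dictionary keys and the function name are hypothetical labels chosen for illustration, and the sketch does not account for any renaming applied when metrics are submitted over OTLP.

```python
# Section prefixes as documented on this page, keyed by an illustrative label.
COMPONENT_PREFIXES = {
    "core": "tfc-agent.core.",
    "runtime": "tfc-agent.core.runtime.",
    "profiler": "tfc-agent.core.profiler.",
    "terraform": "tfc-agent.core.terraform.",
    "policy": "tfc-agent.core.policy.",
}


def full_metric_name(component, short_name):
    """Join a section prefix with a metric name from that section's table."""
    try:
        prefix = COMPONENT_PREFIXES[component]
    except KeyError:
        raise ValueError(f"unknown component: {component!r}") from None
    return prefix + short_name
```

With this mapping, `full_metric_name("terraform", "execute.milliseconds")` and `full_metric_name("policy", "execute.milliseconds")` produce two distinct names, so the two timers can be told apart in a dashboard or query.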