Health
Terraform Cloud can perform automatic health assessments in a workspace to assess whether its real infrastructure matches the requirements defined in its Terraform configuration. Health assessments include the following types of evaluations:
- Drift detection determines whether your real-world infrastructure matches your Terraform state file.
- Continuous validation determines whether custom conditions in the workspace’s configuration continue to pass after Terraform provisions the infrastructure.
When enabled, Terraform Cloud automatically runs a health assessment for your workspace about every 24 hours. Refer to Health Assessment Scheduling for details.
Permissions
Working with health assessments requires the following permissions:
- To view health status for a workspace, you need read access to that workspace.
- To change organization health settings, you must be an organization owner.
- To change a workspace’s health settings, you must be an administrator for that workspace.
Workspace Requirements
Workspaces require the following settings to receive health assessments:
- Terraform version 0.15.4+ for drift detection only
- Terraform version 1.3.0+ for drift detection and continuous validation
- Remote execution mode or Agent execution mode for Terraform runs
The latest Terraform run in the workspace must have been successful. If the most recent run ended in an errored, canceled, or discarded state, Terraform Cloud pauses health assessments until there is a successfully applied run.
The workspace must also have at least one run in which Terraform successfully applies a configuration. Terraform Cloud does not perform health assessments in workspaces with no real-world infrastructure.
Enable Health Assessments
You can enforce health assessments across all eligible workspaces in an organization from the organization settings. Enforcing health assessments at the organization level overrides workspace-level settings. You can only enable health assessments within a specific workspace when Terraform Cloud is not enforcing health assessments at the organization level.
To enable health assessments within a workspace:
- Verify that your workspace satisfies the requirements.
- Go to the workspace and click Settings > Health.
- Select Enable under Health Assessments.
- Click Save settings.
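If you manage workspaces as code instead of through the UI, the same setting can be expressed in configuration. The following is a minimal sketch, assuming the hashicorp/tfe provider and its assessments_enabled argument on the tfe_workspace resource. The organization name, workspace name, and pinned Terraform version are hypothetical placeholders.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical workspace; replace the names with your own.
resource "tfe_workspace" "example" {
  name         = "example-workspace"
  organization = "example-org"

  # Continuous validation requires Terraform 1.3.0 or later.
  terraform_version = "1.5.7"

  # Opt this workspace in to health assessments.
  assessments_enabled = true
}
```

As with the UI setting, the workspace-level value only matters when the organization is not enforcing health assessments.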
Health Assessment Scheduling
The timing of the first health assessment in the workspace depends on whether you enable health assessments during active Terraform runs:
- No active runs: The first health assessment starts a few minutes after you enable the feature.
- Active speculative plan: The first health assessment starts soon after that plan's completion.
- Other active runs: The first health assessment starts in about 24 hours.
After the first health assessment, Terraform Cloud starts a new health assessment if at least 24 hours have passed since the last assessment and there are no active runs in the workspace. Health assessments may take longer to complete when you enable health assessments in many workspaces at once or your workspace contains a complex configuration with many resources.
A health assessment never interrupts or interferes with runs. If you start a new run during a health assessment, Terraform Cloud cancels the current assessment and runs the next assessment in 24 hours. This behavior may prevent Terraform Cloud from performing health assessments in workspaces with frequent runs.
Terraform Cloud pauses health assessments if the latest run ended in an errored state. This behavior occurs for all run types, including plan-only runs and speculative plans. Once the workspace completes a successful run, Terraform Cloud restarts health assessments after 24 hours.
Terraform Enterprise administrators can modify their installation's assessment frequency and number of maximum concurrent assessments from the admin settings console.
Concurrency
If you enable health assessments on multiple workspaces, assessments may run concurrently. Health assessments do not affect your concurrency limit. Terraform Cloud also monitors and controls health assessment concurrency to avoid issues for large-scale deployments with thousands of workspaces. However, Terraform Cloud performs health assessments in batches, so health assessments may take longer to complete when you enable them in a large number of workspaces.
Notifications
Terraform Cloud sends notifications about health assessment results according to your workspace’s settings.
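As an illustration, a workspace notification for health assessment events might be declared as follows. This is a sketch only, assuming the hashicorp/tfe provider's tfe_notification_configuration resource and its assessment trigger names. The two input variables and the resource name are hypothetical.

```hcl
variable "workspace_id" {
  type        = string
  description = "ID of the workspace to watch (hypothetical input)."
}

variable "webhook_url" {
  type        = string
  sensitive   = true
  description = "Slack incoming webhook URL (hypothetical input)."
}

resource "tfe_notification_configuration" "health" {
  name             = "health-assessment-alerts"
  enabled          = true
  destination_type = "slack"
  url              = var.webhook_url
  workspace_id     = var.workspace_id

  # Notify on detected drift, failed checks, and assessments that
  # could not complete.
  triggers = [
    "assessment:drifted",
    "assessment:check_failure",
    "assessment:failed",
  ]
}
```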
Workspace Health Status
On the organization's Workspaces page, Terraform Cloud displays a Health warning status for workspaces with infrastructure drift or failed continuous validation checks.
On the right of a workspace’s overview page, Terraform Cloud displays a Health bar that summarizes the results of the last health assessment.
- The Drift summary shows the total number of resources in the configuration and the number of resources that have drifted.
- The Checks summary shows the number of passed, failed, and unknown statuses for objects with continuous validation checks.
Drift Detection
Infrastructure drift means that your real-world infrastructure no longer matches your Terraform state file. Drift occurs when a user modifies resources outside of the Terraform workflow. For example, a colleague may update resource configuration directly in the cloud provider console to resolve a production incident. This action changes the real resource attributes from those tracked in the state file.
View Workspace Drift
To view the drift detection results from the latest health assessment, go to the workspace and click Health > Drift. If there is drift, Terraform Cloud shows how the real infrastructure differs from the latest version of the workspace’s state file.
Resolve Drift
You can use one of the following approaches to correct workspace drift:
- Overwrite drift: Queue a new plan to realign your real-world infrastructure with your Terraform configuration.
- Update Terraform state and configuration: Queue a refresh-only plan to update your Terraform state to match your real-world infrastructure. We recommend also modifying your Terraform configuration to include any new or changed resources. Otherwise, Terraform will overwrite the updated state file during the next apply. Refer to our Manage Resource Drift tutorial for a detailed example.
Continuous Validation
Continuous validation regularly verifies whether your configuration’s custom assertions continue to pass, validating your infrastructure. For example, you can monitor whether your website returns an expected status code, or whether an API gateway certificate is valid. Identifying failed assertions helps you resolve the failure and prevent errors during your next Terraform operation.
Continuous validation evaluates preconditions, postconditions, and check blocks as part of an assessment, but we recommend using check blocks for post-apply monitoring. Use check blocks to create custom rules to validate your infrastructure's resources, data sources, and outputs.
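For comparison with check blocks, a postcondition attached to a resource can be sketched as follows. This example assumes the AWS provider. The resource name, the ami_id variable, and the instance type are hypothetical. Unlike a failed check block, a failed postcondition halts the Terraform operation that evaluates it.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical input)."
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    # Evaluated after Terraform plans or applies this resource.
    postcondition {
      condition     = self.public_dns != ""
      error_message = "The instance must be in a VPC with public DNS hostnames enabled."
    }
  }
}
```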
Preventing false positives
Health assessments create a speculative plan to access the current state of your infrastructure. Terraform evaluates any check blocks in your configuration as the last step of creating the speculative plan. If your configuration relies on data sources and the values queried by a data source change between the time of your last run and the assessment, the speculative plan will include those changes. Terraform Cloud will not modify your infrastructure as part of an assessment, but it can use those updated values to evaluate checks. This may lead to false positive results for alerts since your infrastructure did not yet change.
To ensure your checks evaluate the current state of your configuration instead of a possible future change, use nested data sources that query your actual resource configuration rather than a computed latest value. Refer to the AMI image scenario below for an example.
Example use cases
Review the provider documentation for check block examples with AWS, Azure, and GCP.
Monitoring the health of a provisioned website
The following example uses the HTTP Terraform provider and a scoped data source within a check block to assert the Terraform website returns a 200 status code, indicating it is healthy.
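A minimal sketch of such a check, assuming the hashicorp/http provider and Terraform 1.5 or later (the first release with check blocks). The block name health_check and the data source name terraform_io are illustrative.

```hcl
check "health_check" {
  # Scoped data source: only visible inside this check block.
  data "http" "terraform_io" {
    url = "https://www.terraform.io"
  }

  assert {
    condition     = data.http.terraform_io.status_code == 200
    error_message = "${data.http.terraform_io.url} returned an unhealthy status code"
  }
}
```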
Continuous validation alerts you if the website returns any status code besides 200 while Terraform evaluates this assertion. You can also find failures in your workspace's Continuous Validation Results page. You can configure continuous validation alerts in your workspace's notification settings.
Monitoring certificate expiration
Vault lets you secure, store, and tightly control access to tokens, passwords, certificates, encryption keys, and other sensitive data. The following example uses a check block to monitor for the expiration of a Vault certificate.
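A sketch of this pattern, assuming the hashicorp/vault provider and its renew_pending attribute on vault_pki_secret_backend_cert. The PKI mount (vault_mount.intermediate) and role (vault_pki_secret_backend_role.test) are assumed to be defined elsewhere in the configuration, and the common name is a placeholder.

```hcl
resource "vault_pki_secret_backend_cert" "app" {
  # Assumed to exist elsewhere in this configuration.
  backend     = vault_mount.intermediate.path
  name        = vault_pki_secret_backend_role.test.name
  common_name = "app.my.domain"
}

check "certificate_valid" {
  assert {
    # renew_pending becomes true once the certificate is close to expiry.
    condition     = !vault_pki_secret_backend_cert.app.renew_pending
    error_message = "Vault cert is ready to renew."
  }
}
```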
Asserting up-to-date AMIs for compute instances
HCP Packer stores metadata about your Packer images. The following example check fails when there is a newer AMI version available.
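A sketch of this scenario, assuming the hashicorp/aws provider and a version of the hashicorp/hcp provider that offers the hcp_packer_image data source. The bucket name, channel, region, instance type, and tag values are hypothetical. The nested aws_instance data source reads the AMI the running instance actually uses, so the check compares real infrastructure against the latest image instead of a planned change.

```hcl
data "hcp_packer_image" "hashiapp_image" {
  bucket_name    = "hashiapp"
  channel        = "latest"
  cloud_provider = "aws"
  region         = "us-west-2"
}

resource "aws_instance" "hashiapp" {
  ami           = data.hcp_packer_image.hashiapp_image.cloud_image_id
  instance_type = "t3.micro"

  tags = {
    Name = "hashiapp"
  }
}

check "ami_version_check" {
  # Look up the instance as it currently exists in AWS.
  data "aws_instance" "hashiapp_current" {
    instance_tags = {
      Name = "hashiapp"
    }
  }

  assert {
    condition     = data.aws_instance.hashiapp_current.ami == data.hcp_packer_image.hashiapp_image.cloud_image_id
    error_message = "Must use the latest available AMI, ${data.hcp_packer_image.hashiapp_image.cloud_image_id}."
  }
}
```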
View continuous validation results
To view the continuous validation results from the latest health assessment, go to the workspace and click Health > Continuous validation.
The page shows all of the resources, outputs, and data sources with custom assertions that Terraform Cloud evaluated. Next to each object, Terraform Cloud reports whether the assertion passed or failed. If one or more assertions fail, Terraform Cloud displays the error messages for each assertion.
The health assessment page displays each assertion by its named value. A check block's named value combines the prefix check with its configuration name. For example, a block declared as check "health_check" appears as check.health_check.
If your configuration contains multiple preconditions and postconditions within a single resource, output, or data source, Terraform Cloud will not show the results of individual conditions unless they fail. If all custom conditions on the object pass, Terraform Cloud reports that the entire check passed. The assessment results display the results of any preconditions and postconditions alongside the results of any assertions from check blocks, identified by the named values of their parent block.