Restart a workload based on health checks
The check_restart stanza instructs Nomad when to restart tasks with unhealthy service checks. When a health check in Consul has been unhealthy for the limit specified in a check_restart stanza, the task is restarted according to the task group's restart policy. Restarts are local to the node running the task.
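As a rough illustration of the restart policy that governs these local restarts, a group-level restart stanza might look like the fragment below. The group name and all values are assumptions chosen for illustration, not recommendations, and the fragment omits the enclosing job and the group's tasks.

```hcl
# Fragment only: belongs inside a job block. The group name is hypothetical.
group "example" {
  restart {
    attempts = 2      # assumed value: local restarts allowed within the interval
    interval = "30m"  # assumed value: window in which attempts are counted
    delay    = "15s"  # assumed value: wait before restarting the task
    mode     = "fail" # once attempts are exhausted, do not restart again locally
  }
}
```

With mode set to "fail", restarts triggered by check_restart count against attempts like any other restart, so a persistently unhealthy task eventually stops being restarted on that node.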
The limit field specifies the number of times a failing health check is seen before local restarts are attempted. Operators can also specify a grace duration to wait after a task restarts before checking its health.
Configure check_restart on services when it's likely that a restart would resolve the failure. For example, a restart might correct a transient connection issue in the service.
The following check_restart stanza waits for two consecutive health check failures with a grace period and considers both critical and warning statuses as failures.
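A minimal sketch of such a stanza, nested in a service check, is shown here. The service name, port label, check path, check timings, and the grace value are placeholders assumed for illustration; only the limit of two and the handling of warnings come from the description above.

```hcl
# Fragment only: belongs inside a task (or group) block of a job.
service {
  name = "example-web" # hypothetical service name
  port = "http"        # hypothetical port label

  check {
    type     = "http"
    path     = "/health" # hypothetical health endpoint
    interval = "10s"     # assumed check interval
    timeout  = "2s"      # assumed check timeout

    check_restart {
      limit           = 2     # restart after two consecutive failing checks
      grace           = "30s" # assumed value: wait after a restart before checking health
      ignore_warnings = false # count warning status as a failure, same as critical
    }
  }
}
```

Setting ignore_warnings to true instead would restart the task only on critical status.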
The following CLI example output shows health check failures triggering restarts until the task's restart limit is reached.