Application leader election
This topic describes how to build client-side leader elections for service instances using Consul's session mechanism for distributed locks together with the Consul key/value (KV) store.
This topic is not related to Consul's own leader election. For more information about the Raft leader election used internally by Consul, refer to the consensus protocol documentation.
Background
Some distributed applications, like HDFS or ActiveMQ, require setting up one instance as a leader to ensure application data is current and stable.
Consul's support for sessions and watches allows you to build a client-side leader election process where clients use a lock on a key in the KV datastore to ensure mutual exclusion and to gracefully handle failures.
All participating service instances should agree on a single key format. The instructions on this page use the key service/leader.
Requirements
- A running Consul server
- A path in the Consul KV datastore to acquire locks and to store information about the leader. The instructions on this page use the following key: service/leader.
- If ACLs are enabled, a token with the following permissions:
  - session:write permissions over the service session name
  - key:write permissions over the key
- The curl command
Expose the token using the CONSUL_HTTP_TOKEN environment variable.
Client-side leader election procedure
The workflow for building a client-side leader election process has the following steps:
For each client trying to acquire the lock:
- Create a session associated with the client node.
- Acquire the lock on the designated key in the KV store using the acquire parameter.
- Watch the KV key to verify if the lock was released. If no lock is present, try to acquire a lock.
For the client that acquires the lock:
- Periodically, renew the session to avoid expiration.
- Optionally, release the lock.
For other services:
- Watch the KV key to verify there is at least one process holding the lock.
- Use the values written under the KV path to identify the leader and update configurations accordingly.
Create a new session
Create a configuration for the session. The minimum viable configuration requires that you specify the session name. The following example demonstrates this configuration.
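A minimal sketch of such a configuration, written to a file so that a later request can send it as the request body. The file name session.json is an assumption chosen for illustration, and hashicups-db-0 stands in for the node's hostname.

```shell
# Write a minimal session definition to a file.
# Only Name is set; every other session parameter keeps its default value.
# The file name "session.json" is an illustrative assumption.
cat > session.json <<'EOF'
{
  "Name": "hashicups-db-0"
}
EOF
```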
Create a session using the /session Consul HTTP API endpoint. In the following example, the node's hostname is the session name.
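The request can be sketched as a small shell helper around curl. The helper name create_session and the CONSUL_HTTP_ADDR variable (defaulting to a local agent on port 8500) are assumptions for illustration; the token header only matters when ACLs are enabled.

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# create_session FILE
# Sends the session definition stored in FILE to the session create endpoint.
# (Hypothetical helper name, used only in this sketch.)
create_session() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request PUT \
    --data @"$1" \
    "${CONSUL_HTTP_ADDR}/v1/session/create"
}
```

For example, create_session session.json sends a session definition saved in a file named session.json. Keep the ID from the response, because the lock, renew, and release requests all need it.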
The command returns a JSON object containing the ID of the newly created session.
Verify session
Use the /v1/session/list endpoint to retrieve existing sessions.
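A sketch of the list request, under the same assumptions as the other examples on this page: list_sessions is a hypothetical helper name and CONSUL_HTTP_ADDR defaults to a local agent.

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# list_sessions
# Retrieves all sessions known to the datacenter.
# (Hypothetical helper name, used only in this sketch.)
list_sessions() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request GET \
    "${CONSUL_HTTP_ADDR}/v1/session/list"
}
```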
The command returns a JSON array containing all available sessions in the system.
You can verify from the output that the session is associated with the hashicups-db-0 node, which is the client agent where the API request was made.
With the exception of the Name, all parameters are set to their default values. The session is created without a TTL value, which means that it never expires and requires you to delete it explicitly.
Depending on your needs, you can create sessions specifying more parameters such as:
- TTL: If provided, the session is invalidated and deleted if it is not renewed before the TTL expires.
- ServiceChecks: Specifies a list of service checks to monitor. The session is invalidated if the checks return a critical state.
By setting these extra parameters, you can create a client-side leader election workflow that automatically releases the lock when a specified amount of time has passed since the last renewal, or when the service holding the lock fails.
For a full list of available parameters, refer to the /session/create endpoint documentation.
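As a sketch of a session definition that expires on its own, the following writes a payload with a TTL. The file name session-ttl.json and the 30s value are assumptions for illustration. ServiceChecks is left out because its entries depend on the check IDs registered for your own service.

```shell
# Write a session definition that must be renewed periodically.
# If the session is not renewed before the TTL expires, Consul invalidates it.
# The file name and the 30s value are illustrative assumptions.
cat > session-ttl.json <<'EOF'
{
  "Name": "hashicups-db-0",
  "TTL": "30s"
}
EOF
```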
Acquire the lock
Create the data object to associate with the lock request.
The data of the request should be a JSON object representing the local instance. This value is opaque to Consul, but it should contain whatever information clients require to communicate with your application. For example, it could be a JSON object that contains the node's name and the application's port.
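A minimal sketch of such a data object, using the node's hostname as the only field. The file name lock.json and the field name Node are assumptions; Consul does not interpret the content.

```shell
# Write the value that will be stored under the lock key.
# uname -n prints the node's hostname on POSIX systems.
# The file name "lock.json" and the "Node" field are illustrative assumptions.
printf '{"Node": "%s"}\n' "$(uname -n)" > lock.json
```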
Acquire a lock for a given key using the PUT method on a KV entry with the ?acquire=<session> query parameter.
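The acquire request can be sketched as follows. The helper name acquire_lock and the CONSUL_HTTP_ADDR default are assumptions; the key defaults to service/leader, the key used on this page.

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# acquire_lock SESSION_ID FILE [KEY]
# Tries to lock KEY for SESSION_ID and stores the content of FILE as its value.
# (Hypothetical helper name, used only in this sketch.)
acquire_lock() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request PUT \
    --data @"$2" \
    "${CONSUL_HTTP_ADDR}/v1/kv/${3:-service/leader}?acquire=$1"
}
```

The first argument is the session ID returned when the session was created, and the second is the file that holds the lock data.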
This request returns either true or false. If true, the lock was acquired and the local service instance is now the leader. If false, a different node acquired the lock.
This example uses the node's hostname as the key data. Other services can use this data to create configuration files.
Be aware that this locking system has no enforcement mechanism that requires clients to acquire a lock before they perform an operation. Any client can read, write, and delete a key without owning the corresponding lock.
Watch the KV key for locks
Existing locks need to be monitored by all nodes involved in the client-side leader elections, as well as by the other nodes that need to know the identity of the leader.
- Lock holders need to monitor the lock because the session might get invalidated by an operator.
- Other services that want to acquire the lock need to monitor it to check if the lock is released so they can try to acquire the lock.
- Other nodes need to monitor the lock to see if the value of the key changed and update their configuration accordingly.
Monitor the lock using the GET method on a KV entry with the blocking query enabled.
First, verify the latest index for the current value.
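A sketch of the read request, assuming the same conventions as the other examples: read_key is a hypothetical helper name, and the key defaults to service/leader.

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# read_key [KEY]
# Returns the current entry for KEY, including its index metadata.
# (Hypothetical helper name, used only in this sketch.)
read_key() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request GET \
    "${CONSUL_HTTP_ADDR}/v1/kv/${1:-service/leader}"
}
```

The Value field in the response is base64-encoded, so decode it before reading the leader data.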
The command outputs the key data, including the ModifyIndex for the object.
Using the value of the ModifyIndex, run a blocking query against the lock.
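The blocking query differs from a plain read only by the index query parameter. A sketch, with the hypothetical helper name blocking_read and the same CONSUL_HTTP_ADDR assumption as the other examples:

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# blocking_read INDEX [KEY]
# Waits until KEY changes past INDEX (or the query times out), then prints it.
# (Hypothetical helper name, used only in this sketch.)
blocking_read() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request GET \
    "${CONSUL_HTTP_ADDR}/v1/kv/${2:-service/leader}?index=$1"
}
```

Pass the ModifyIndex from the previous read as the first argument.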
The command blocks until the data at the KV path changes or the query times out, and then prints the key data to the console.
For automation purposes, add logic to the blocking query mechanism to trigger a command every time a change is returned.
A better approach is to use the CLI command consul watch.
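A sketch of a key watch that runs a handler on every change. The helper name watch_leader and the handler path are assumptions; the handler is any executable of your own, and it is expected to receive the key data as JSON on standard input.

```shell
# watch_leader HANDLER [KEY]
# Runs HANDLER each time the watched KEY changes.
# (Hypothetical helper name; HANDLER is a script you provide.)
watch_leader() {
  consul watch -type=key -key="${2:-service/leader}" "$1"
}
```

For example, watch_leader ./on-leader-change.sh would invoke a hypothetical on-leader-change.sh script whenever the service/leader key is updated.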
From the output, notice that once the lock is acquired, the Session parameter contains the ID of the session that holds the lock.
Renew a session
If a session is created with a TTL value set, you need to renew the session before the TTL expires.
Use the /v1/session/renew endpoint to renew existing sessions.
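A sketch of the renew request, with the hypothetical helper name renew_session and the same CONSUL_HTTP_ADDR assumption as the other examples. The session ID is appended to the endpoint path.

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# renew_session SESSION_ID
# Resets the TTL timer of the given session.
# (Hypothetical helper name, used only in this sketch.)
renew_session() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request PUT \
    "${CONSUL_HTTP_ADDR}/v1/session/renew/$1"
}
```

A lock holder would typically call this on a timer that fires well before the TTL elapses.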
If the command succeeds, it prints the session information in JSON format.
Release a lock
A lock associated with a session with no TTL value set might never be released, even when the service holding it fails.
In such cases, you need to manually release the lock.
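The release request mirrors the acquire request, with the release query parameter in place of acquire. A sketch, with the hypothetical helper name release_lock and the same CONSUL_HTTP_ADDR assumption as the other examples:

```shell
# Address of the Consul agent; 8500 is Consul's default HTTP port.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# release_lock SESSION_ID [KEY]
# Releases the lock that SESSION_ID holds on KEY.
# (Hypothetical helper name, used only in this sketch.)
release_lock() {
  curl --silent \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN:-}" \
    --request PUT \
    "${CONSUL_HTTP_ADDR}/v1/kv/${2:-service/leader}?release=$1"
}
```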
The command prints true on success.
After a lock is released, the key data does not show a value for Session in the results.
Other clients can use this as a way to coordinate their lock requests.