PKI secrets engine - considerations
To successfully deploy this secrets engine, there are a number of important considerations to be aware of, as well as some preparatory steps that should be undertaken. You should read all of these before using this secrets engine or generating the CA to use with this secrets engine.
Table of contents
- Be Careful with Root CAs
- One CA Certificate, One Secrets Engine
- Use a CA Hierarchy
- Keep certificate lifetimes short, for CRL's sake
- You must configure issuing/CRL/OCSP information in advance
- Automate Leaf Certificate Renewal
- Safe Minimums
- Token Lifetimes and Revocation
- Safe Usage of Roles
- Telemetry
- Auditing
- Role-Based Access
- Replicated DataSets
- Cluster Scalability
- Issuer Storage Migration Issues
Be careful with root CAs
Vault storage is secure, but not as secure as a piece of paper in a bank vault. It is, after all, networked software. If your root CA is hosted outside of Vault, don't put it in Vault as well; instead, issue a shorter-lived intermediate CA certificate and put this into Vault. This aligns with industry best practices.
Since 0.4, the secrets engine supports generating self-signed root CAs and creating and signing CSRs for intermediate CAs. In each instance, for security reasons, the private key can only be exported at generation time, and the ability to do so is part of the command path (so it can be put into ACL policies).
If you plan on using intermediate CAs with Vault, it is suggested that you let Vault create CSRs and do not export the private key, then sign those with your root CA (which may be a second mount of the `pki` secrets engine).
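A minimal sketch of that flow as a sequence of HTTP API requests, assuming a root mounted at `pki` and an intermediate mounted at `pki_int`. Both mount names and the common name are hypothetical. The function only builds the request plan; sending the requests is left to your HTTP client.

```python
def intermediate_signing_plan(root_mount="pki", int_mount="pki_int",
                              common_name="example.com Intermediate CA"):
    """Return the ordered (method, path, payload) steps for an intermediate CA.

    Mount names and the common name are illustrative assumptions.
    """
    return [
        # 1. Vault generates the key and CSR; "internal" means the private
        #    key is never exported.
        ("POST", f"/v1/{int_mount}/intermediate/generate/internal",
         {"common_name": common_name}),
        # 2. The root CA (here a second pki mount) signs the CSR from step 1.
        ("POST", f"/v1/{root_mount}/root/sign-intermediate",
         {"csr": "<csr from step 1>", "common_name": common_name}),
        # 3. The signed certificate is imported into the intermediate mount.
        ("POST", f"/v1/{int_mount}/intermediate/set-signed",
         {"certificate": "<certificate from step 2>"}),
    ]
```

If the root CA lives outside Vault, step 2 would be replaced by whatever signing process that CA uses.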
Managed keys
Since 1.10, Vault Enterprise can access private key material in a managed key. In this case, Vault never sees the private key, and the external KMS or HSM performs certificate signing operations. Managed keys are configured by selecting the `kms` type when generating a root or intermediate.
One CA certificate, one secrets engine
Since Vault 1.11.0, the PKI Secrets Engine supports multiple issuers in a single mount. However, in order to simplify the configuration, it is strongly recommended that operators limit a mount to a single issuer. If you want to issue certificates from multiple disparate CAs, mount the PKI secrets engine at multiple mount points with separate CA certificates in each.
A common pattern is to have one mount act as your root CA and to use this CA only to sign intermediate CA CSRs from other PKI secrets engines.
To keep old CAs active, there are two approaches to achieving rotation:
- Use multiple secrets engines. This allows a fresh start, preserving the old issuer and CRL. Vault ACL policy can be updated to deny new issuance under the old mount point and roles can be re-evaluated before being imported into the new mount point.
- Use multiple issuers in the same mount point. The usage of the old issuer can be restricted to CRL signing, and existing roles and ACL policy can be kept as-is. This allows cross-signing within the same mount, and consumers of the mount won't have to update their configuration. Once the transitional period for this rotation has completed and all previously issued certificates have expired, it is encouraged to fully remove the old issuer and any unnecessary cross-signed issuers from the mount point.
Another suggested use case for multiple issuers in the same mount is splitting issuance by TTL. For short-lived certificates, an intermediate stored in Vault will often outperform an HSM-backed intermediate. For longer-lived certificates, however, it is often important to have the intermediate key material secured throughout the lifetime of the end-entity certificate. This means that two intermediates in the same mount, one backed by the HSM and one backed by Vault, can satisfy both use cases. Operators can create roles that set a maximum TTL for each issuer, and consumers of the mount can decide which to use.
Always configure a default issuer
For backwards compatibility, the default issuer is used to service PKI endpoints without an explicit issuer (either via path selection or role-based selection). When certificates are revoked and their issuer is no longer part of this PKI mount, Vault places them on the default issuer's CRL. This means maintaining a default issuer is important for both backwards compatibility for issuing certificates and for ensuring revoked certificates land on a CRL.
Key types matter
Certain key types have impacts on performance. Signing certificates from an RSA key will be slower than issuing from an ECDSA or Ed25519 key. Key generation (using `/issue/:role` endpoints) using RSA keys will also be slow: RSA key generation involves finding suitable random primes, whereas Ed25519 keys can be random data. As the number of bits goes up (RSA 2048 -> 4096 or ECDSA P-256 -> P-521), signature times also increase.
This matters in both directions: not only is issuance more expensive, but validation of the corresponding signature (in, say, TLS handshakes) will also be more expensive. Careful consideration of both issuer and issued key types can have a meaningful impact on the performance of not only Vault, but also the systems using these certificates.
Use a CA hierarchy
It is generally recommended to use a hierarchical CA setup, with a root certificate which issues one or more intermediates (based on usage), which in turn issue the leaf certificates.
This allows stronger storage or policy guarantees around protection of the root CA, while letting Vault manage the intermediate CAs and issuance of leaves. Different intermediates might be issued for different usage, such as VPN signing, Email signing, or testing versus production TLS services. This helps to keep CRLs limited to specific purposes: for example, VPN services don't care about the revoked set of email signing certificates if they're using separate certificates and different intermediates, and thus don't need both CRL contents. Additionally, this allows higher risk intermediates (such as those issuing longer-lived email signing certificates) to have HSM-backing without impacting the performance of easier-to-rotate intermediates and certificates (such as TLS intermediates).
Vault supports the use of both the `allowed_domains` parameter on Roles and the `permitted_dns_domains` parameter to set the Name Constraints extension on root and intermediate generation. This allows for several layers of separation of concerns between TLS-based services.
Cross-Signed intermediates
When cross-signing intermediates from two separate roots, two separate intermediate issuers will exist within the Vault PKI mount. In order to correctly serve the cross-signed chain on issuance requests, the `manual_chain` override is required on either or both intermediates. This can be constructed in the following order:
- this issuer (`self`)
- this root
- the other copy of this intermediate
- the other root
All requests to this issuer for signing will now present the full cross-signed chain.
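The ordering above can be sketched as a small helper that builds the `manual_chain` value for an issuer update. The issuer names used here are hypothetical; any issuer reference (name or ID) could be passed instead.

```python
def cross_signed_manual_chain(this_root, other_intermediate, other_root):
    """Build a manual_chain value in the order described above.

    Arguments are issuer references (names or IDs); "self" refers to the
    issuer being updated.
    """
    return {"manual_chain": ["self", this_root, other_intermediate, other_root]}

# Hypothetical issuer names for an intermediate cross-signed by two roots.
payload = cross_signed_manual_chain("root-a", "int-signed-by-root-b", "root-b")
```

The resulting payload would be written to the intermediate's issuer endpoint; the same helper can be called with the arguments swapped to configure the other copy of the intermediate.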
Keep certificate lifetimes short, for CRL's sake
This secrets engine aligns with Vault's philosophy of short-lived secrets. As such it is not expected that CRLs will grow large; the only place a private key is ever returned is to the requesting client (this secrets engine does not store generated private keys, except for CA certificates). In most cases, if the key is lost, the certificate can simply be ignored, as it will expire shortly.
If a certificate must truly be revoked, the normal Vault revocation function can be used; alternatively, a root token can be used to revoke the certificate using the certificate's serial number. Any revocation action will cause the CRL to be regenerated. When the CRL is regenerated, any expired certificates are removed from the CRL (and any revoked, expired certificates are removed from secrets engine storage). This is an expensive operation! Due to the structure of the CRL standard, Vault must read all revoked certificates into memory in order to rebuild the CRL and clients must fetch the regenerated CRL.
This secrets engine does not support multiple CRL endpoints with sliding date windows; often such mechanisms will have the transition point a few days apart, but this gets into the expected realm of the actual certificate validity periods issued from this secrets engine. A good rule of thumb for this secrets engine would be to simply not issue certificates with a validity period greater than your maximum comfortable CRL lifetime. Alternately, you can control CRL caching behavior on the client to ensure that checks happen more often.
Often multiple endpoints are used in case a single CRL endpoint is down so that clients don't have to figure out what to do with a lack of response. Run Vault in HA mode, and the CRL endpoint should be available even if a particular node is down.
Note: Since Vault 1.11.0, with multiple issuers in the same mount point, different issuers may have different CRLs (depending on subject and key material). This means that Vault may need to regenerate multiple CRLs. This is again a rationale for keeping TTLs short and avoiding revocation if possible.
NotAfter behavior on leaf certificates
In Vault 1.11.0, the PKI Secrets Engine introduced a new `leaf_not_after_behavior` parameter on issuers. This allows modification of the issuance behavior: should Vault `err`, preventing issuance of a longer-lived leaf cert than issuer, silently `truncate` to that of the issuer's `NotAfter` value, or `permit` longer expirations.
It is strongly suggested to use `err` or `truncate` for intermediates; `permit` is only useful for root certificates, as an intermediate's `NotAfter` expiration is checked when validating presented chains.
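The three behaviors can be modelled with a short function. This is an illustration of the semantics described above, not Vault's implementation; the function name and the example dates are made up for the sketch.

```python
from datetime import datetime, timezone

def effective_not_after(requested, issuer_not_after, behavior="err"):
    """Model how the three leaf_not_after_behavior values treat a leaf expiry.

    This illustrates the documented semantics; it is not Vault's code.
    """
    if requested <= issuer_not_after:
        return requested  # leaf expires no later than the issuer: always fine
    if behavior == "err":
        raise ValueError("leaf would outlive its issuer")
    if behavior == "truncate":
        return issuer_not_after  # silently clamp to the issuer's NotAfter
    if behavior == "permit":
        return requested  # allow the longer expiration
    raise ValueError(f"unknown behavior: {behavior!r}")

# Example dates: the leaf is requested to outlive the issuer by five months.
issuer_expiry = datetime(2026, 1, 1, tzinfo=timezone.utc)
leaf_request = datetime(2026, 6, 1, tzinfo=timezone.utc)
```

With these inputs, `truncate` returns the issuer's expiry, `permit` returns the requested date, and `err` raises.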
Combined with cascading expirations, with longer-lived roots (perhaps in the range of 2-10 years), shorter-lived intermediates (perhaps in the range of 6 months to 2 years), and short-lived leaf certificates (in the range of 30 to 90 days), and with the rotation strategies discussed in other sections, this should keep the CRLs adequately small.
Cluster performance and quantity of leaf certificates
As mentioned above, keeping TTLs short (or using `no_store=true`, preventing revocation) and avoiding leases is important for a healthy cluster. However, it is important to note this is a scale problem: 10-1000 long-lived, stored certificates are probably fine, but 50k-100k become a problem and 500k+ stored, unexpired certificates can negatively impact even large Vault clusters, even with short TTLs!
However, once these certificates are expired, a tidy operation will clean up CRLs and Vault cluster storage.
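As a rough way to see why short TTLs alone do not guarantee a small store, the number of unexpired stored certificates can be estimated as issuance rate times TTL. The sketch below assumes a constant issuance rate, `no_store=false`, and that expired certificates are tidied promptly; the figures are illustrative only.

```python
def steady_state_stored_certs(issued_per_hour, ttl_hours):
    """Estimate unexpired stored certificates at a constant issuance rate."""
    return issued_per_hour * ttl_hours

# 200 certificates per hour with a 30-day TTL:
print(steady_state_stored_certs(200, 30 * 24))  # 144000
```

Even a modest issuance rate can therefore reach the range where stored certificates start to matter.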
Note that organizational risk assessments for certificate compromise might mean certain certificate types should always be issued with `no_store=false`; even short-lived broad wildcard certificates (say, `*.example.com`) might be important enough to have precise control over revocation. However, an internal service with a well-scoped certificate (say, `service.example.com`) might be of low enough risk to issue a 90-day TTL with `no_store=true`, preventing the need for revocation in the unlikely case of compromise.
Having a shorter TTL decreases the likelihood of needing to revoke a cert (but cannot prevent it entirely) and decreases the impact of any such compromise.
You must configure issuing/CRL/OCSP information in advance
This secrets engine serves CRLs from a predictable location, but it is not possible for the secrets engine to know where it is running. Therefore, you must configure desired URLs for the issuing certificate, CRL distribution points, and OCSP servers manually using the `config/urls` endpoint. More than one of each is supported by passing the multiple URLs as a comma-separated string parameter.
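A minimal sketch of building such a payload for the `config/urls` endpoint. The hostnames and the `pki` mount path are placeholders, and the helper name is made up for this example.

```python
def urls_config(issuing, crl, ocsp):
    """Build a config/urls payload with comma-separated URL lists."""
    return {
        "issuing_certificates": ",".join(issuing),
        "crl_distribution_points": ",".join(crl),
        "ocsp_servers": ",".join(ocsp),
    }

# Hostnames and the "pki" mount path are placeholders.
payload = urls_config(
    issuing=["https://vault-a.example.com:8200/v1/pki/ca"],
    crl=["https://vault-a.example.com:8200/v1/pki/crl",
         "https://vault-b.example.com:8200/v1/pki/crl"],
    ocsp=["https://vault-a.example.com:8200/v1/pki/ocsp"],
)
```

Listing one CRL URL per cluster, as in this example, is one way to address the Performance Replication case where each cluster has its own CRL.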
Note: when using Vault Enterprise's Performance Replication features with a PKI Secrets Engine mount, each cluster will have its own CRL; this means each cluster's unique CRL address should be included in the AIA information field separately, or the CRLs should be consolidated and served outside of Vault.
Automate leaf certificate renewal
As much as possible, for managing certificates for services at scale, it is best to automate renewal of certificates. Vault Agent has support for automatically renewing requested certificates based on the `validTo` field. Other solutions might involve using cert-manager in Kubernetes or OpenShift, backed by the Vault CA.
Safe minimums
Since its inception, this secrets engine has enforced SHA256 for signature hashes rather than SHA1. As of 0.5.1, a minimum of 2048 bits for RSA keys is also enforced. Software that can handle SHA256 signatures should also be able to handle 2048-bit keys, and 1024-bit keys are considered unsafe and are disallowed in the Internet PKI.
Token lifetimes and revocation
When a token expires, it revokes all leases associated with it. This means that long-lived CA certs need correspondingly long-lived tokens, something that is easy to forget. Starting with 0.6, root and intermediate CA certs no longer have associated leases, to prevent unintended revocation when not using a token with a long enough lifetime. To revoke these certificates, use the `pki/revoke` endpoint.
Safe usage of roles
The Vault PKI Secrets Engine supports many options to limit issuance via Roles. Careful consideration of construction is necessary to ensure that more permissions are not given than necessary. Additionally, roles should generally do one thing; multiple roles should be preferable over having too permissive roles that allow arbitrary issuance (e.g., `allow_any_name` should generally be used sparingly, if at all).
- `allow_any_name` should generally be set to `false`; this is the default.
- `allow_localhost` should generally be set to `false` for production services, unless listening on `localhost` is expected.
- Unless necessary, `allow_wildcard_certificates` should generally be set to `false`. This is not the default due to backwards compatibility concerns.
  - This is especially necessary when `allow_subdomains` or `allow_glob_domains` are enabled.
- `enforce_hostnames` should generally be enabled for TLS services; this is the default.
- `allow_ip_sans` should generally be set to `false` (but defaults to `true`), unless IP address certificates are explicitly required.
- When using short TTLs (< 30 days) or with high issuance volume, it is generally recommended to set `no_store` to `true` (defaults to `false`). This prevents revocation but allows higher throughput as Vault no longer needs to store every issued certificate. This is discussed more in the Replicated Datasets section below.
- Do not use roles with root certificates (`issuer_ref`). Root certificates should generally only issue intermediates (see the section on CA hierarchy above), which doesn't rely on roles.
- Limit `key_usage` and `ext_key_usage`; don't attempt to allow all usages for all purposes. Generally the default values are useful for client and server TLS authentication.
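The recommendations above can be collected into a role payload. This is a sketch under assumptions: the helper name, the domain, and the TTL values are illustrative, and `allow_bare_domains` is enabled so that the role can issue for exactly the listed domain.

```python
def conservative_role(allowed_domains, max_ttl="720h", no_store=False):
    """Build a role payload following the recommendations above."""
    return {
        "allowed_domains": ",".join(allowed_domains),
        "allow_bare_domains": True,            # issue for the listed names only
        "allow_any_name": False,               # the default
        "allow_localhost": False,              # production service
        "allow_wildcard_certificates": False,  # not the default
        "enforce_hostnames": True,             # the default
        "allow_ip_sans": False,                # defaults to true
        "no_store": no_store,
        "max_ttl": max_ttl,
    }

# A short-lived, well-scoped role where storing certificates is skipped.
role = conservative_role(["service.example.com"], max_ttl="72h", no_store=True)
```

Keeping one such role per service, rather than one broad role, follows the "roles should generally do one thing" guidance.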
Telemetry
Beyond Vault's default telemetry around request processing, PKI exposes count and duration metrics for the issue, sign, sign-verbatim, and revoke calls. The metrics keys take the form `mount-path,operation,[failure]` with labels for namespace and role name.
Note that these metrics are per-node and thus would need to be aggregated across nodes and clusters.
Auditing
Because Vault HMACs audit string keys by default, it is necessary to tune PKI secrets mounts to get an accurate view of issuance that is occurring under this mount.
Some suggested keys to un-HMAC for requests are as follows:
- `csr` - the requested CSR to sign,
- `certificate` - the requested self-signed certificate to re-sign or when importing issuers,
- Various issuance-related overriding parameters, such as:
  - `issuer_ref` - the issuer requested to sign this certificate,
  - `common_name` - the requested common name,
  - `alt_names` - alternative requested DNS-type SANs for this certificate,
  - `other_sans` - other (non-DNS, non-Email, non-IP, non-URI) requested SANs for this certificate,
  - `ip_sans` - requested IP-type SANs for this certificate,
  - `uri_sans` - requested URI-type SANs for this certificate,
  - `ttl` - requested time-to-live of this certificate,
  - `not_after` - requested expiration date of this certificate,
  - `serial_number` - the subject's requested serial number,
  - `key_type` - the requested key type,
  - `private_key_format` - the requested key format, which is also used for the public certificate format,
- Various role- or issuer-related generation parameters, such as:
  - `managed_key_name` - when creating an issuer, the requested managed key name,
  - `managed_key_id` - when creating an issuer, the requested managed key identifier,
  - `ou` - the subject's organizational unit,
  - `organization` - the subject's organization,
  - `country` - the subject's country code,
  - `locality` - the subject's locality,
  - `province` - the subject's province,
  - `street_address` - the subject's street address,
  - `postal_code` - the subject's postal code,
  - `permitted_dns_domains` - permitted DNS domains,
  - `policy_identifiers` - the requested policy identifiers when creating a role, and
  - `ext_key_usage_oids` - the extended key usage OIDs for the requested certificate.
Some suggested keys to un-HMAC for responses are as follows:
- `certificate` - the certificate that was issued,
- `issuing_ca` - the certificate of the CA which issued the requested certificate,
- `serial_number` - the serial number of the certificate that was issued,
- `error` - to show errors associated with the request, and
- `ca_chain` - optional due to noise; the full CA chain of the issuer of the requested certificate.
Note: This list of parameters to un-HMAC is provided as a suggestion and may not be exhaustive.
The following keys are suggested NOT to un-HMAC, due to their sensitive nature:
- `private_key` - this response parameter contains the private keys generated by Vault during issuance, and
- `pem_bundle` - this request parameter is only used on the issuer-import paths and may contain sensitive private key material.
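A minimal sketch of assembling the mount-tune payload for these keys, with a guard that refuses to un-HMAC the sensitive ones. The helper name and the particular subset of keys are assumptions for illustration; the payload would be sent to the mount's tune endpoint under `/sys/mounts`.

```python
SENSITIVE_KEYS = {"private_key", "pem_bundle"}

def audit_tune_payload(request_keys, response_keys):
    """Build a mount-tune payload, refusing to un-HMAC sensitive keys."""
    leaked = SENSITIVE_KEYS & (set(request_keys) | set(response_keys))
    if leaked:
        raise ValueError(f"refusing to un-HMAC: {sorted(leaked)}")
    return {
        "audit_non_hmac_request_keys": list(request_keys),
        "audit_non_hmac_response_keys": list(response_keys),
    }

# An illustrative subset of the suggested keys.
payload = audit_tune_payload(
    request_keys=["csr", "issuer_ref", "common_name", "alt_names", "ttl"],
    response_keys=["certificate", "issuing_ca", "serial_number", "error"],
)
```

Building the lists in code makes it harder to add `private_key` or `pem_bundle` by accident when the key set grows.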
Role-Based access
Vault supports path-based ACL Policies for limiting access to various paths within Vault.
The following is a condensed example reference for applying ACLs to the PKI Secrets Engine. It is only a suggestion; other personas and policy approaches may also be valid.
We suggest the following personas:
- Operator; a privileged user who manages the health of the PKI subsystem; manages issuers and key material.
- Agent; a semi-privileged user that manages roles and handles revocation on behalf of an operator; may also handle delegated issuance. This may also be called an administrator or role manager.
- Advanced; potentially a power-user or service that has access to additional issuance APIs.
- Requester; a low-level user or service that simply requests certificates.
- Unauthed; any arbitrary user or service that lacks a Vault token.
For these personas, we suggest the following ACLs, in condensed, tabular form:
Path | Operations | Operator | Agent | Advanced | Requester | Unauthed |
---|---|---|---|---|---|---|
/ca(/pem)? | Read | Yes | Yes | Yes | Yes | Yes |
/ca_chain | Read | Yes | Yes | Yes | Yes | Yes |
/crl(/pem)? | Read | Yes | Yes | Yes | Yes | Yes |
/cert/:serial(/raw(/pem)?)? | Read | Yes | Yes | Yes | Yes | Yes |
/issuers | List | Yes | Yes | Yes | Yes | Yes |
/issuer/:issuer_ref/(json¦der¦pem) | Read | Yes | Yes | Yes | Yes | Yes |
/issuer/:issuer_ref/crl(/der¦/pem)? | Read | Yes | Yes | Yes | Yes | Yes |
/certs | List | Yes | Yes | Yes | Yes | |
/roles | List | Yes | Yes | Yes | Yes | |
/roles/:role | Read | Yes | Yes | Yes | Yes | |
/(issue¦sign)/:role | Write | Yes | Yes | Yes | Yes | |
/issuer/:issuer_ref/(issue¦sign)/:role | Write | Yes | Yes | Yes | | |
/config/ca | Read | Yes | Yes | | | |
/config/crl | Read | Yes | Yes | | | |
/config/issuers | Read | Yes | Yes | | | |
/crl/rotate | Read | Yes | Yes | | | |
/roles/:role | Write | Yes | Yes | | | |
/issuer/:issuer_ref | Read | Yes | Yes | | | |
/sign-verbatim(/:role)? | Write | Yes | Yes | | | |
/issuer/:issuer_ref/sign-verbatim(/:role)? | Write | Yes | Yes | | | |
/revoke | Write | Yes | Yes | | | |
/tidy | Write | Yes | Yes | | | |
/tidy-status | Read | Yes | Yes | | | |
/config/ca | Write | Yes | | | | |
/config/crl | Write | Yes | | | | |
/config/issuers | Write | Yes | | | | |
/config/keys | Read, Write | Yes | | | | |
/config/urls | Read, Write | Yes | | | | |
/issuer/:issuer_ref | Write | Yes | | | | |
/issuer/:issuer_ref/sign-intermediate | Write | Yes | | | | |
/issuer/:issuer_ref/sign-self-issued | Write | Yes | | | | |
/issuers/generate/+/+ | Write | Yes | | | | |
/issuers/import/+ | Write | Yes | | | | |
/intermediate/generate/+ | Write | Yes | | | | |
/intermediate/cross-sign | Write | Yes | | | | |
/intermediate/set-signed | Write | Yes | | | | |
/keys | List | Yes | | | | |
/key/:key_ref | Read, Write | Yes | | | | |
/keys/generate/+ | Write | Yes | | | | |
/keys/import | Write | Yes | | | | |
/root/generate/+ | Write | Yes | | | | |
/root/sign-intermediate | Write | Yes | | | | |
/root/sign-self-issued | Write | Yes | | | | |
/root/rotate/+ | Write | Yes | | | | |
/root/replace | Write | Yes | | | | |
Note: With managed keys, operators might need access to read the mount point's tunable data (Read on `/sys/mounts`) and may need access to use or manage managed keys.
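As an illustration of turning table rows into policy, the sketch below renders the authenticated Requester rows as Vault policy HCL. It rests on several assumptions: the mount is at `pki`, `:role` is expressed with the `+` single-segment wildcard, and "Write" is mapped to the `create` and `update` capabilities. The helper and constant names are made up for this example.

```python
CAPABILITIES = {"Read": ["read"], "List": ["list"], "Write": ["create", "update"]}

# Requester rows from the table above that need a token, with :role written
# as the "+" single-segment wildcard.
REQUESTER_RULES = [
    ("certs", "List"),
    ("roles", "List"),
    ("roles/+", "Read"),
    ("issue/+", "Write"),
    ("sign/+", "Write"),
]

def render_policy(mount, rules):
    """Render (path, operation) pairs as Vault policy HCL."""
    blocks = []
    for path, operation in rules:
        caps = ", ".join(f'"{c}"' for c in CAPABILITIES[operation])
        blocks.append(f'path "{mount}/{path}" {{\n  capabilities = [{caps}]\n}}')
    return "\n\n".join(blocks)

print(render_policy("pki", REQUESTER_RULES))
```

Paths that the table also grants to the Unauthed persona are omitted here, since they do not require a token.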
Replicated DataSets
When operating with Performance Secondary clusters, certain data-sets are maintained across all clusters, while others for performance and scalability reasons are kept within a given cluster.
The following table breaks down by data type what data sets will cross the cluster boundaries. For data-types that do not cross a cluster boundary, read requests for that data will need to be sent to the appropriate cluster that the data was generated on.
Data Set | Replicated Across Clusters |
---|---|
Issuers & Keys | Yes |
Roles | Yes |
CRL Config | Yes |
URL Config | Yes |
Issuer Config | Yes |
Key Config | Yes |
CRL | No |
Revoked Certificates | No |
Leaf/Issued Certificates | No |
The main effect is that, within the PKI secrets engine, leaf certificates issued with `no_store` set to `false` are stored local to the cluster that issued them. This allows the active nodes of both the primary and Performance Secondary clusters to issue certificates for greater scalability. As a result, these certificates and any revocations are visible only on the issuing cluster. This additionally means each cluster has its own set of CRLs, distinct from other clusters. These CRLs should either be unified into a single CRL for distribution from a single URI, or server operators should know to fetch all CRLs from all clusters.
Cluster scalability
Most non-introspection operations in the PKI secrets engine require a write to storage, and so are forwarded to the cluster's active node for execution. This table outlines which operations can be executed on performance standby nodes and thus scale horizontally across all nodes within a cluster.
Path | Operations |
---|---|
ca[/pem] | Read |
cert/serial-number | Read |
cert/ca_chain | Read |
config/crl | Read |
certs | List |
ca_chain | Read |
crl[/pem] | Read |
issue | Update * |
revoke/serial-number | Read |
sign | Update * |
sign-verbatim | Update * |
* Only if the corresponding role has `no_store` set to true and `generate_lease` set to false. If `generate_lease` is true the lease creation will be forwarded to the active node; if `no_store` is false the entire request will be forwarded to the active node.
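The footnote's rules can be written as a small decision function. This only restates the behavior described above for issue, sign, and sign-verbatim requests that arrive at a performance standby; the function name is made up for the sketch.

```python
def issuance_handling(no_store, generate_lease):
    """Describe where an issue/sign request on a standby node is processed."""
    if not no_store:
        return "entire request forwarded to the active node"
    if generate_lease:
        return "lease creation forwarded to the active node"
    return "handled on the performance standby"
```

Only roles with `no_store=true` and `generate_lease=false` therefore scale issuance horizontally across standby nodes.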
Issuer storage migration issues
When Vault migrates to the new multi-issuer storage layout on releases prior to 1.11.6, 1.12.2, and 1.13, and storage write errors occur during the mount initialization and storage migration process, the default issuer may not have the correct `ca_chain` value and may only have the self-reference. These write errors most commonly manifest in logs as a message like `failed to persist issuer ... chain to disk: <cause>` and indicate that Vault was not stable at the time of migration. Note that this only occurs when more than one issuer exists within the mount (such as an intermediate with root).
To fix this manually (until a new version of Vault automatically rebuilds the issuer chain), a rebuild of the chains can be performed:
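The original commands are not reproduced here. The following is a hedged sketch of one way to perform the rebuild over the HTTP API, assuming the mount is at `pki`, that the issuer endpoint accepts JSON merge-patch updates, and that an empty `manual_chain` restores automatic chain building. The Vault address and token are placeholders.

```python
import json
import urllib.request

def chain_rebuild_requests(vault_addr, token, mount="pki"):
    """Build the two PATCH requests that force a ca_chain refresh.

    The requests are only constructed here; pass each one to
    urllib.request.urlopen, in order, to actually send it.
    """
    url = f"{vault_addr}/v1/{mount}/issuer/default"
    headers = {
        "X-Vault-Token": token,
        "Content-Type": "application/merge-patch+json",
    }
    requests = []
    # First pin the chain to the issuer itself, then clear the override so
    # that automatic chain building runs again.
    for manual_chain in ("self", ""):
        body = json.dumps({"manual_chain": manual_chain}).encode()
        requests.append(urllib.request.Request(
            url, data=body, headers=headers, method="PATCH"))
    return requests
```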
This temporarily sets the manual chain on the default issuer to a self-chain only, before reverting it back to automatic chain building. This triggers a refresh of the `ca_chain` field on the issuer, and can be verified with:
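The original verification command is not reproduced here. As a sketch, the check amounts to reading the default issuer and confirming that `ca_chain` contains more than the self-reference. The helper below assumes the parsed JSON response carries the chain under `data.ca_chain`; the sample response is fabricated for illustration.

```python
def chain_was_rebuilt(issuer_response):
    """Return True if the issuer's ca_chain holds more than the self-reference.

    `issuer_response` is the parsed JSON body from reading the issuer.
    """
    ca_chain = issuer_response.get("data", {}).get("ca_chain", [])
    return len(ca_chain) > 1

# Fabricated response shape; real values are PEM-encoded certificates.
sample = {"data": {"ca_chain": ["<intermediate PEM>", "<root PEM>"]}}
```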
Tutorial
Refer to the Build Your Own Certificate Authority (CA) guide for a step-by-step tutorial.
Have a look at the PKI Secrets Engine with Managed Keys for more about how to use externally managed keys with PKI.
API
The PKI secrets engine has a full HTTP API. Please see the PKI secrets engine API for more details.