Manage AWS DynamoDB scale
AWS DynamoDB is a fully managed, serverless, key-value NoSQL database that supports a wide variety of use cases. AWS provides many configuration options to manage your DynamoDB tables' capacity and performance.
In this tutorial, you will learn how to configure these options with Terraform. First you will provision a DynamoDB table. Next you will use Terraform to configure features that help you manage DynamoDB scale and capacity. Then you will use Terraform to load data into your table, and query the data with the AWS CLI.
Prerequisites
You can complete this tutorial using the same workflow with either Terraform Community Edition or HCP Terraform. HCP Terraform is a platform that you can use to manage and execute your Terraform projects. It includes features like remote state and execution, structured plan output, workspace resource summaries, and more.
Select the Terraform Community Edition tab to complete this tutorial using Terraform Community Edition.
This tutorial assumes that you are familiar with the Terraform and HCP Terraform workflows. If you are new to Terraform, complete the [Get Started tutorials](/collections/terraform/aws-get-started) first. If you are new to HCP Terraform, complete the HCP Terraform Get Started tutorials first.
In order to complete this tutorial, you will need the following:
- Terraform v1.2+ installed locally.
- An AWS account.
- An HCP Terraform account with HCP Terraform locally authenticated.
- An HCP Terraform variable set configured with your AWS credentials.
- The AWS CLI, configured with the same credentials you use for HCP Terraform.
Note
Some of the infrastructure in this tutorial does not qualify for the AWS free tier. Destroy the infrastructure at the end of the tutorial to avoid unnecessary charges. We are not responsible for any charges that you incur.
Clone the example repository
Clone the example repository for this tutorial, which contains Terraform configuration for a DynamoDB table.
Change into the repository directory.
Review the configuration
The example configuration defines a DynamoDB table configured to store environmental data such as temperature, pressure, and humidity. Devices such as weather stations or environmental sensors often report this kind of data. The example data, found in data/example_environments.csv, consists of data for the following table attributes:
- userId: A unique identifier for the owner of the device.
- deviceId: A unique identifier for the device.
- eventId: A unique identifier for a given event.
- geoLocation: The geographical location of the device.
- epochS: The time, in epoch seconds, when the device recorded the event.
- expiry: The time, in epoch seconds, past which an event no longer needs to be stored in your DynamoDB table.
- tempC: The temperature, in degrees Celsius.
- humidityPct: The humidity, represented as a percentage.
- pressurePa: The atmospheric pressure, in pascals.
Open main.tf to review the initial configuration, which defines a DynamoDB table with a randomly generated name.
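The initial configuration might look like the following minimal sketch. The resource label environment, the random_pet settings, and the attribute types are assumptions made for illustration and may differ from the example repository. The key names and billing mode follow the description in this tutorial.

```hcl
# Hypothetical sketch of main.tf. Resource labels and random_pet
# arguments are assumptions, not copied from the example repository.
resource "random_pet" "table_name" {
  prefix    = "environment"
  separator = "_"
  length    = 4
}

resource "aws_dynamodb_table" "environment" {
  name         = random_pet.table_name.id
  billing_mode = "PAY_PER_REQUEST"

  # Composite primary key: hash (partition) key plus range (sort) key.
  hash_key  = "deviceId"
  range_key = "epochS"

  # Only attributes used in keys or indexes get an attribute block.
  attribute {
    name = "deviceId"
    type = "S" # string (assumed)
  }

  attribute {
    name = "epochS"
    type = "N" # number (assumed)
  }
}
```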
Note
AWS uses different terminology for some DynamoDB concepts in their documentation and the API. The Terraform AWS provider uses the API term and so does this tutorial. We include the documentation name in parentheses. For example: "hash (partition) key".
All DynamoDB tables require a name and a primary key. The random_pet.table_name resource provides a unique name for the table. This configuration uses a composite primary key consisting of a hash (partition) key, deviceId, and a range (sort) key, epochS.
The billing_mode argument configures the table to use the pay-per-request billing (capacity) mode, which allows AWS to set read and write capacity for you. The pay-per-request mode can be more cost-effective than provisioned capacity for some unpredictable workloads.
The example configuration sets attributes for the hash key, deviceId, and range key, epochS. Only configure attribute blocks for table attributes that you use in the keys or other indexes. You can store other attributes in your DynamoDB table, but the AWS API will error if you define them in your Terraform configuration. For example, the table will store values for tempC, but there is no attribute block for it in your configuration because your table does not use that attribute in an index. If you were to include an attribute block for tempC, AWS would return an error.
The configuration in outputs.tf defines two outputs for the table's name and Amazon Resource Name (ARN).
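As a rough illustration, the two outputs could be declared as follows. The output names and the aws_dynamodb_table.environment resource address are assumptions; the name and arn attributes are exported by the aws_dynamodb_table resource.

```hcl
# Hypothetical sketch of outputs.tf. Output names and the resource
# label "environment" are assumptions.
output "environment_table_name" {
  description = "Name of the environment DynamoDB table"
  value       = aws_dynamodb_table.environment.name
}

output "environment_table_arn" {
  description = "ARN of the environment DynamoDB table"
  value       = aws_dynamodb_table.environment.arn
}
```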
Provision a minimal DynamoDB table
Set the TF_CLOUD_ORGANIZATION environment variable to your HCP Terraform organization name. This will configure your HCP Terraform integration.
Initialize your configuration. Terraform will automatically create the learn-terraform-aws-dynamodb-scale workspace in your HCP Terraform organization.
Note
This tutorial assumes that you are using a tutorial-specific HCP Terraform organization with a global variable set of your AWS credentials. Review the Create a Credential Variable Set tutorial for detailed guidance. If you are using a scoped variable set, assign it to your new workspace now.
Apply the configuration. Respond to the confirmation prompt with a yes to create your DynamoDB table.
In this tutorial, you will configure several options to manage your table's capacity and performance.
- Configure provisioned capacity
- Configure autoscaling
- Add secondary indexes
- Configure global tables
- Manage TTL
- Change the table class
You can configure most of these options independently from the others, except as noted below.
Configure provisioned capacity
The example configuration uses the pay-per-request billing (capacity) mode. Provisioned capacity can be more cost-effective, especially for predictable workloads. When you configure provisioned capacity, you must specify the read and write capacity for your table. For example, you may determine that your application requires a read capacity of 5, and a write capacity of 2.
Add the following variable definitions to variables.tf to configure your table's read and write capacity.
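A minimal sketch of these variable definitions, using the example capacities of 5 and 2 mentioned above. The variable names env_table_read_capacity and env_table_write_capacity are assumptions chosen for illustration.

```hcl
# Hypothetical variable names; defaults follow the example values
# of 5 read and 2 write capacity units.
variable "env_table_read_capacity" {
  description = "Read capacity units for the environment table"
  type        = number
  default     = 5
}

variable "env_table_write_capacity" {
  description = "Write capacity units for the environment table"
  type        = number
  default     = 2
}
```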
Next, update your table to use provisioned billing (capacity) mode, and configure the read and write capacity with those variables.
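The updated table resource might look like this sketch. It assumes the table resource is labeled environment and that variables.tf defines variables named env_table_read_capacity and env_table_write_capacity; both names are illustrative.

```hcl
resource "aws_dynamodb_table" "environment" {
  name = random_pet.table_name.id

  # Switch from PAY_PER_REQUEST to provisioned capacity.
  billing_mode   = "PROVISIONED"
  read_capacity  = var.env_table_read_capacity  # assumed variable name
  write_capacity = var.env_table_write_capacity # assumed variable name

  hash_key  = "deviceId"
  range_key = "epochS"

  attribute {
    name = "deviceId"
    type = "S"
  }

  attribute {
    name = "epochS"
    type = "N"
  }
}
```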
Apply this change to update your table's billing mode. Respond to the confirmation prompt with a yes.
Changing the billing (capacity) mode or updating provisioned read and write capacity does not require recreating the table, so Terraform updated it in place.
AWS lets you change a table's billing mode once every 24 hours. If you tried to change it again within this time window, AWS would report an error.
Configure autoscaling
When you configure provisioned capacity for your table, you can also configure Application Autoscaling to have AWS manage the table's read and write capacity. When you do so, AWS will update your table's read and write capacity according to the autoscaling rules you define. Autoscaling will increase your table's capacity when more traffic is incoming, and reduce it after traffic subsides.
Before you enable autoscaling, instruct Terraform to ignore changes to your table's read and write capacity attributes. Otherwise, whenever you run terraform apply in the future, Terraform will report that these values have changed outside of Terraform's control, and revert them to the original values.
In main.tf, add a lifecycle block to your table's configuration so Terraform will ignore changes to read_capacity and write_capacity.
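The lifecycle block could be sketched as below. This is a fragment to merge into the existing table resource, and the resource label environment is an assumption.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Autoscaling changes these values outside of Terraform, so tell
  # Terraform not to revert them on the next apply.
  lifecycle {
    ignore_changes = [read_capacity, write_capacity]
  }
}
```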
Also in main.tf, add autoscaling targets and policies for your table's read and write capacity.
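One way to express the autoscaling targets and policies is sketched below, using the aws_appautoscaling_target and aws_appautoscaling_policy resources from the AWS provider. The resource labels, the table label environment, and the minimum and maximum capacity values are assumptions for illustration; the target_value of 70.0 follows this tutorial.

```hcl
# Read capacity: scaling bounds (values are illustrative assumptions).
resource "aws_appautoscaling_target" "environment_table_read_target" {
  min_capacity       = 5
  max_capacity       = 20
  resource_id        = "table/${aws_dynamodb_table.environment.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

# Read capacity: track 70 percent utilization.
resource "aws_appautoscaling_policy" "environment_table_read_policy" {
  name               = "DynamoDBReadCapacityUtilization:${aws_appautoscaling_target.environment_table_read_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.environment_table_read_target.resource_id
  scalable_dimension = aws_appautoscaling_target.environment_table_read_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.environment_table_read_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value = 70.0
  }
}

# Write capacity: scaling bounds (values are illustrative assumptions).
resource "aws_appautoscaling_target" "environment_table_write_target" {
  min_capacity       = 2
  max_capacity       = 10
  resource_id        = "table/${aws_dynamodb_table.environment.name}"
  scalable_dimension = "dynamodb:table:WriteCapacityUnits"
  service_namespace  = "dynamodb"
}

# Write capacity: track 70 percent utilization.
resource "aws_appautoscaling_policy" "environment_table_write_policy" {
  name               = "DynamoDBWriteCapacityUtilization:${aws_appautoscaling_target.environment_table_write_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.environment_table_write_target.resource_id
  scalable_dimension = aws_appautoscaling_target.environment_table_write_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.environment_table_write_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
    target_value = 70.0
  }
}
```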
The min_capacity and max_capacity attributes of the aws_appautoscaling_target resources configure the lower and upper bounds for your table's read and write capacities. Setting the target_value to 70.0 for each capacity type means that when your table is using less than 70 percent of its current read or write capacity, autoscaling will reduce its capacity, and when your table uses more than 70 percent of its current capacity, autoscaling will increase capacity.
Apply this change to configure autoscaling for your table. Terraform will ignore changes to read and write capacity parameters going forward. Respond to the confirmation prompt with a yes.
Adding or removing autoscaling from your DynamoDB tables does not require replacing your table, so Terraform updated it in place.
Add secondary indexes
Secondary indexes can speed up queries that do not involve the table's primary key. There are two types of secondary indexes: local and global. Local secondary indexes use the same hash (partition) key as the table but a different range (sort) key, and support strongly consistent reads. Global secondary indexes can use a different hash (partition) key, scale independently from the table, and support only eventually consistent reads. AWS recommends using global secondary indexes unless you specifically need a local secondary index, for example when you need strong consistency.
Add a local secondary index
Your application may often query a range of data based upon the eventId table attribute, and need strongly consistent reads for these queries. Add a local secondary index to meet this need.
In the table resource in main.tf, add the attribute that the local secondary index must query on, and the local secondary index itself.
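A sketch of the additions, written as a fragment to merge into the existing table resource. The index name by_eventId, the string attribute type, the projection type, and the resource label environment are assumptions for illustration.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Attribute used as the range key of the local secondary index.
  attribute {
    name = "eventId"
    type = "S" # assumed to be a string
  }

  # The index shares the table's hash key (deviceId) and sorts by eventId.
  local_secondary_index {
    name            = "by_eventId" # hypothetical index name
    range_key       = "eventId"
    projection_type = "ALL"        # assumed; project every attribute
  }
}
```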
The new table attribute, eventId, defines the required attribute for the range key that your local secondary index uses.
Warning
Terraform destroys and recreates your table whenever you add, remove, or update a local secondary index, which deletes all the items in the table. This is because AWS requires that you define local secondary indexes when you create your table.
Apply this change to replace your table with one that includes this local secondary index. Respond to the confirmation prompt with a yes.
Local secondary indexes use the base table's read and write capacity, so you must account for access to the secondary index when you configure provisioned or autoscaling capacity for your table.
Add a global secondary index
For use cases that do not require strong read consistency, AWS recommends that you use global secondary indexes instead of local secondary indexes. Global secondary indexes also allow your application to query your table by a different primary key from your main table.
Add a geoLocation attribute to your table, and a global secondary index to allow users of your table to query events by geoLocation.
Like keys for local secondary indexes, you must include keys for global secondary indexes as attributes in the table definition.
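The attribute and index might be declared as in this fragment, which merges into the existing table resource. The index name by_geoLocation, the choice of epochS as the index's range key, the projection type, and the capacity variable names are assumptions for illustration.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Attribute used as the hash key of the global secondary index.
  attribute {
    name = "geoLocation"
    type = "S" # assumed to be a string
  }

  global_secondary_index {
    name            = "by_geoLocation" # hypothetical index name
    hash_key        = "geoLocation"
    range_key       = "epochS"         # assumed sort order within a location
    projection_type = "ALL"            # assumed; project every attribute

    # A provisioned-mode index needs its own capacity settings.
    read_capacity  = var.env_table_read_capacity  # assumed variable name
    write_capacity = var.env_table_write_capacity # assumed variable name
  }
}
```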
AWS recommends autoscaling global secondary indexes whenever you use autoscaling on the table itself in order to ensure consistent performance across your table and GSIs. Add autoscaling configuration for the global secondary index to main.tf.
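Autoscaling for an index uses the same Application Autoscaling resources as the table, with an index-scoped resource_id and scalable dimension. The sketch below shows the read side only; the write side would mirror it with dynamodb:index:WriteCapacityUnits and DynamoDBWriteCapacityUtilization. The resource labels, index name by_geoLocation, and capacity bounds are assumptions.

```hcl
# Read capacity bounds for the index (values are illustrative assumptions).
resource "aws_appautoscaling_target" "environment_gsi_read_target" {
  min_capacity       = 5
  max_capacity       = 20
  resource_id        = "table/${aws_dynamodb_table.environment.name}/index/by_geoLocation"
  scalable_dimension = "dynamodb:index:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

# Track 70 percent read utilization on the index.
resource "aws_appautoscaling_policy" "environment_gsi_read_policy" {
  name               = "DynamoDBReadCapacityUtilization:${aws_appautoscaling_target.environment_gsi_read_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.environment_gsi_read_target.resource_id
  scalable_dimension = aws_appautoscaling_target.environment_gsi_read_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.environment_gsi_read_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value = 70.0
  }
}
```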
Apply this change to add the global secondary index and autoscaling configuration to your table. Respond to the confirmation prompt with a yes.
It may take a few minutes for AWS to provision your global secondary index.
Note
Since the global secondary index scales automatically, whenever you apply changes to this configuration, Terraform will report that the read and write capacity of the index have changed. A limitation in the DynamoDB API prevents Terraform's AWS provider from ignoring changes to these attributes.
AWS allows you to create, update, and remove global secondary indexes independently from the main table, and the indexes can have their own primary key and read and write capacities.
Configure global tables
Global tables replicate your DynamoDB table across AWS regions. You must enable streaming on your DynamoDB table before you can use global tables, and your table must either be in the PAY_PER_REQUEST billing mode, or have autoscaling configured. The DynamoDB stream publishes updates whenever you add, delete, or modify the items in your table. Your global tables will consume updates from your stream to stay in sync with your main table.
For example, your table is provisioned in the us-east-1 region, but you may want to replicate your data to other regions to improve performance for customers in those locations. Configure global table replicas in the us-west-1 and ap-northeast-1 regions.
Add a variable to configure the regions you want AWS to replicate your DynamoDB table across.
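A minimal sketch of this variable definition follows. The description text is illustrative; the variable name and its empty-list default follow this tutorial.

```hcl
variable "replica_regions" {
  description = "Regions to replicate the DynamoDB table into"
  type        = list(string)
  default     = [] # no replicas unless regions are provided
}
```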
Next, create a new file called terraform.tfvars and set a value for the replica_regions variable.
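Using the two example regions from this section, the file's contents would look something like this:

```hcl
# terraform.tfvars
replica_regions = ["us-west-1", "ap-northeast-1"]
```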
Now configure global tables by enabling streaming and adding replica blocks to your table resource in main.tf.
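The streaming arguments and a dynamic replica block could be sketched as follows, as a fragment to merge into the existing table resource. The resource label environment is an assumption. The NEW_AND_OLD_IMAGES stream view type is the one DynamoDB global table replicas rely on.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Replicas consume the table's stream to stay in sync.
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  # Generate one replica block per region in var.replica_regions.
  dynamic "replica" {
    for_each = var.replica_regions
    content {
      region_name = replica.value
    }
  }
}
```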
Terraform's dynamic blocks repeat the named block once for each item assigned to the for_each meta-argument. With the value for replica_regions that you set in terraform.tfvars, ["us-west-1", "ap-northeast-1"], the configuration above is equivalent to:
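Written out by hand, the expanded form would contain one static replica block per region. This fragment is a sketch that assumes the table resource is labeled environment.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  replica {
    region_name = "us-west-1"
  }

  replica {
    region_name = "ap-northeast-1"
  }
}
```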
Apply this change to configure global tables for your DynamoDB table. Respond to the confirmation prompt with a yes. It may take several minutes for Terraform to provision your global table replicas.
Since AWS requires that you enable autoscaling before you enable global tables, you could not use Terraform to create a new DynamoDB table with both features enabled during a single apply action. Terraform would attempt to create your table and global tables before it created the autoscaling resources, and the AWS API would report an error.
You can use dynamic blocks like the one in the example configuration above to work around this limitation. To provision a new table and then enable global tables using this configuration, you would:
1. Apply the configuration with replica_regions set to the default value, [], to create the table and autoscaling configuration, but not configure any global table replicas.
2. Update the replica_regions variable's value with the list of regions you want to replicate your table into.
3. Apply the configuration again to enable global table replicas.
Alternatively, you could configure your table to use the PAY_PER_REQUEST billing (capacity) mode instead of using autoscaling. In that case, Terraform could create the table and enable global table replicas in the same apply step.
Refer to this issue for more information about this limitation.
Note
If you are using autoscaling, you must also enable autoscaling on any global secondary indexes for the table before you configure global tables.
Manage TTL
DynamoDB allows you to expire items with a Time To Live (TTL) attribute. When TTL is configured, AWS will automatically remove items whose TTL attribute contains a timestamp that has passed. This can help you manage your table's performance by removing stale items.
For example, your application may store data from your environmental sensors every 5 minutes, as well as hourly average data. Once the hourly average has been calculated, you may no longer need the data from every 5 minutes. Use TTL to remove that data automatically.
Add a ttl block to your DynamoDB table to enable TTL.
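The ttl block might look like this fragment, merged into the existing table resource. The attribute name expiry matches the example data described earlier; the resource label environment is an assumption.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Expire items once the epoch-seconds value in "expiry" has passed.
  ttl {
    attribute_name = "expiry"
    enabled        = true
  }
}
```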
Inside the ttl block, attribute_name specifies the DynamoDB table attribute which will contain the TTL as epoch time in seconds. After this time DynamoDB will delete the associated item. Do not include a separate attribute block for the TTL attribute in your Terraform configuration unless you will also use it in an index.
Apply this change to configure TTL for your DynamoDB table. Respond to the confirmation prompt with a yes.
Change the table class
In addition to the default Standard table class, DynamoDB supports Standard-Infrequent Access tables for rarely accessed data. The Standard-Infrequent Access table class offers less expensive data storage, but more expensive data access and throughput.
Update your table's configuration to set the table class.
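The table class is a single argument on the table resource. The fragment below is a sketch that assumes the resource is labeled environment; STANDARD and STANDARD_INFREQUENT_ACCESS are the values the AWS provider accepts for table_class.

```hcl
resource "aws_dynamodb_table" "environment" {
  # ... existing table arguments stay unchanged ...

  # Cheaper storage, more expensive reads and writes.
  table_class = "STANDARD_INFREQUENT_ACCESS"
}
```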
Apply this change to configure your DynamoDB table's class. Respond to the confirmation prompt with a yes.
Changing the table class does not require replacing the table. However, if you modify a table's class, the AWS API will not allow you to make any other changes to the table configuration in that same request.
Populate your table
The Terraform AWS provider includes an aws_dynamodb_table_item resource that you can use to manage items in your DynamoDB tables. While Terraform is not the appropriate tool to manage table items in most cases, you may wish to use this resource to populate static data or example items for a test environment.
First, add a variable definition to variables.tf to control whether or not Terraform will load example data.
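A sketch of this variable definition is shown below. The default of true is an assumption made so that the next apply loads the example data without further input; set it to false for environments that should not contain the example items.

```hcl
variable "load_example_data" {
  description = "Whether Terraform loads example items into the table"
  type        = bool
  default     = true # assumed default
}
```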
Add the following configuration to main.tf to load example data from the data/example_environments.csv file into your table if the load_example_data variable is set to true.
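One possible shape for this configuration is sketched below. It assumes the CSV file has a header row whose column names match the table attributes listed earlier, that eventId is unique per row, and that the table resource is labeled environment. The local value name and the DynamoDB type chosen for each column are illustrative assumptions.

```hcl
locals {
  # csvdecode returns a list of objects with string values,
  # one object per CSV row, keyed by the header names.
  example_data = csvdecode(file("${path.module}/data/example_environments.csv"))
}

resource "aws_dynamodb_table_item" "example" {
  # Ternary: create one item per row when enabled, otherwise none.
  for_each = var.load_example_data ? { for row in local.example_data : row.eventId => row } : {}

  table_name = aws_dynamodb_table.environment.name
  hash_key   = aws_dynamodb_table.environment.hash_key
  range_key  = aws_dynamodb_table.environment.range_key

  # Items use DynamoDB JSON; numbers are passed as strings under "N".
  item = jsonencode(merge(
    {
      userId      = { S = each.value.userId }
      deviceId    = { S = each.value.deviceId }
      eventId     = { S = each.value.eventId }
      geoLocation = { S = each.value.geoLocation }
      epochS      = { N = each.value.epochS }
      tempC       = { N = each.value.tempC }
      humidityPct = { N = each.value.humidityPct }
      pressurePa  = { N = each.value.pressurePa }
    },
    # Only set the TTL attribute when the row provides a value.
    { for k, v in { expiry = each.value.expiry } : k => { N = v } if v != "" }
  ))
}
```

Keying for_each by eventId gives each item a stable resource address, so reordering rows in the CSV file does not cause Terraform to recreate items.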
The aws_dynamodb_table_item.example block you added to main.tf uses a ternary operator to conditionally load the data based on the value of the load_example_data variable.
You can use this pattern to load example data in some environments, but not others. For example, you would set load_example_data to true when you provision your table in a test environment, and false when you provision your table in production. This way, your integration tests will work with a known set of data in the test environment, but that data will not exist in production.
Apply this change to load the example data into your DynamoDB table. Respond to the confirmation prompt with a yes.
Since global tables are enabled, AWS will automatically propagate this data to your replicas.
Query your data
Use the AWS CLI to query your DynamoDB table.
Note
If you are following this tutorial in HCP Terraform, first ensure that the AWS CLI is configured with the same credentials as your HCP Terraform workspace.
The TTL attribute, expiry, is set to a time in the past for 10 of the items in your table, so AWS will automatically delete them. AWS usually removes expired items very quickly. However, occasionally this process can take 48 hours or more. Because of this, the data you get back from the following queries may be different from the examples below.
Query your table for events for a specific device ID, b6c772c6-d621-46ff-86c6-7c662de62375.
This query will return either 10 results, or 8 results if AWS has already removed expired items.
Query events by device ID, within a time range, using the us-west-1 global table replica.
This query will return either 5 results, or 4 results if AWS has already removed expired items.
Query events by geoLocation.
This query will return either 20 results, or 18 results if AWS has already removed expired items.
Observe the effects of TTL
Your configuration manages table items with the aws_dynamodb_table_item.example block. Terraform will verify that the items exist whenever you plan or apply changes. Since the TTL is set for several of the items, AWS will remove them automatically. After it does so, Terraform will notice that they are missing and attempt to recreate them whenever you apply your configuration. Depending on your use case, you may want to avoid this behavior by not setting the TTL attribute for table items managed by Terraform.
Apply your configuration now.
If AWS has removed the items with the TTL attribute, expiry, set, Terraform will prompt you to add the 10 expired table items. Otherwise, it will just plan the changes to your global secondary index's read and write capacity, as described above. Respond to the confirmation prompt with a yes, and Terraform will add the expired items. AWS will once again remove them because of the TTL configured in the expiry attribute.
Clean up infrastructure
Remove your table and related resources. Respond to the confirmation prompt with a yes.
Note
Destroying your DynamoDB table automatically destroys any items in the table, whether they were created by Terraform or not. Terraform will explicitly destroy the items it manages, and AWS will automatically remove any others when Terraform destroys the table.
If you used HCP Terraform for this tutorial, after destroying your resources, delete the learn-terraform-aws-dynamodb-scale workspace from your HCP Terraform organization.
Next steps
In this tutorial, you provisioned a DynamoDB table and then configured features related to managing your table's performance and capacity. You also provisioned items in your table, and ran several queries for the example data. Review the following resources to learn more about managing AWS DynamoDB with Terraform.
- Read the Terraform documentation for the AWS DynamoDB table, table item, and related resources.
- The Terraform registry includes a module to manage DynamoDB resources, which you may find easier to use than managing resources directly.
- Read the AWS DynamoDB developer documentation to learn more about how DynamoDB works and how to use it.
- Read the documentation for Terraform's dynamic blocks.