Structure of Terraform S3 State Backend Bucket
This guide explains the structure of a Terraform S3 state backend bucket, including the use of workspaces, key prefixes, and buckets. It details how the backend.tf.json file is used to configure the S3 backend for storing Terraform state, and how DynamoDB is used for state locking and consistency checking. The document provides examples and best practices for managing and accessing the Terraform state backend.
Understand the anatomy of a Terraform S3 state backend bucket and how workspaces, key prefixes and buckets are used.
From HashiCorp:
Stores the state as a given key in a given bucket on Amazon S3. This backend also supports state locking and consistency checking via Dynamo DB, which can be enabled by setting the dynamodb_table field to an existing DynamoDB table name. A single DynamoDB table can be used to lock multiple remote state files. Terraform generates key names that include the values of the bucket and key variables.
The backend.tf.json File
This file is programmatically generated by Atmos using all the capabilities of Stacks to deep merge. Every component defines a backend.tf.json, which is what distinguishes it as a root module (as opposed to a Terraform child module). The backend tells Terraform where to access the last known deployed state of infrastructure for the given component. Since the state is stored in S3, it is easily accessed in a distributed manner by anyone running Terraform.
An identical backend.tf.json file is used by all environments (stacks). Environments are selected using the terraform workspace command, which happens automatically when using atmos together with the --stack argument.
For reference, this is the anatomy of the backend configuration: (note this is just a JSON representation of HCL)
{
  "terraform": {
    "backend": {
      "s3": {
        "acl": "bucket-owner-full-control",
        "bucket": "acme-ue2-root-tfstate",
        "dynamodb_table": "acme-ue2-root-tfstate-lock",
        "encrypt": true,
        "key": "terraform.tfstate",
        "profile": "acme-gbl-root-terraform",
        "region": "us-east-2",
        "workspace_key_prefix": "vpc"
      }
    }
  }
}
Either profile or role_arn can be used here.
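To illustrate how little of this configuration varies between components, the sketch below builds the same structure in Python and serializes it to JSON. The function name `render_backend` and its parameters are hypothetical, and this is not how Atmos generates the file; it only shows that the component name ends up in workspace_key_prefix while the remaining settings are shared.

```python
import json


def render_backend(component, bucket, dynamodb_table, region, profile):
    """Return a backend.tf.json-style dict for one component (illustrative only)."""
    return {
        "terraform": {
            "backend": {
                "s3": {
                    "acl": "bucket-owner-full-control",
                    "bucket": bucket,
                    "dynamodb_table": dynamodb_table,
                    "encrypt": True,
                    "key": "terraform.tfstate",
                    "profile": profile,  # role_arn could be used instead
                    "region": region,
                    # The component name becomes the first key segment in the bucket.
                    "workspace_key_prefix": component,
                }
            }
        }
    }


# Values taken from the example configuration above.
config = render_backend(
    component="vpc",
    bucket="acme-ue2-root-tfstate",
    dynamodb_table="acme-ue2-root-tfstate-lock",
    region="us-east-2",
    profile="acme-gbl-root-terraform",
)
print(json.dumps(config, indent=2))
```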
S3 Backend
The S3 bucket is created in the cold start using the tfstate-backend component provisioned in the root account.
The state format is s3://{bucket_name}/{component}/{stack}/terraform.tfstate
- The bucket name format is {namespace}-{optional tenant}-{environment}-{stage}-tfstate
- We deploy this bucket in the root account, so here are some example bucket names:
  acme-ue2-root-tfstate (without tenant)
  acme-mgmt-ue2-root-tfstate (with tenant: mgmt)
- The component name provided is used as the Terraform state's workspace_key_prefix in each component's backend.tf.json. Therefore, this will be the first S3 key after the bucket name.
- The stack is where the component is provisioned and the name of the workspace created.
- Finally, terraform.tfstate is the key provided in each component's backend.tf.json.
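The naming rules above can be sketched as two small Python helpers. The function names `bucket_name` and `state_uri` are hypothetical and exist only for illustration; they mirror the formats described in this section and do not reproduce any Atmos or Terraform code.

```python
def bucket_name(namespace, environment, stage, tenant=None):
    """Build {namespace}-{optional tenant}-{environment}-{stage}-tfstate."""
    parts = [namespace, tenant, environment, stage, "tfstate"]
    # Skip the tenant when it is not set.
    return "-".join(p for p in parts if p)


def state_uri(bucket, component, stack, key="terraform.tfstate"):
    """Build s3://{bucket_name}/{component}/{stack}/{key}."""
    return f"s3://{bucket}/{component}/{stack}/{key}"


bucket = bucket_name("acme", "ue2", "root")
print(bucket)
print(bucket_name("acme", "ue2", "root", tenant="mgmt"))
print(state_uri(bucket, component="vpc", stack="ue2-prod"))
```

With the example values from this section, the helpers produce acme-ue2-root-tfstate, acme-mgmt-ue2-root-tfstate, and s3://acme-ue2-root-tfstate/vpc/ue2-prod/terraform.tfstate.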
The terraform commands run by atmos for the backend s3://acme-ue2-root-tfstate/vpc/ue2-prod/terraform.tfstate
atmos terraform deploy vpc --stack ue2-prod
| atmos will create the input variables from the YAML and run the following commands
| -- terraform init
| -- terraform workspace select ue2-prod
| -- terraform plan
| -- terraform apply
To better visualize what's going on, we recommend running the commands below to explore your own state bucket. Make sure to use the correct profile for your organization (acme-gbl-root-admin is just a placeholder).
Find the bucket. It should contain tfstate in its name. In the example below, we can see the vpc component is deployed to use2-auto, use2-corp, use2-dev, use2-qa, use2-sbx01, and use2-staging. As you can see, the workspace is constructed as {environment}-{stage}. This is defined in the atmos.yaml config with the stacks.name_pattern setting (see Atmos for all settings).
$ aws --profile acme-gbl-root-admin \
    s3 ls --recursive s3://{bucket_name}
...
2021-11-01 19:53:48 120926 vpc/use2-auto/terraform.tfstate # workspace key prefix: vpc, workspace name is `use2-auto`
2021-11-01 19:49:12 123604 vpc/use2-corp/terraform.tfstate
2021-11-01 19:50:18 123486 vpc/use2-dev/terraform.tfstate
2021-11-01 19:48:39 123354 vpc/use2-qa/terraform.tfstate
2021-11-01 19:49:46 123735 vpc/use2-sbx01/terraform.tfstate
2021-11-01 19:50:50 124014 vpc/use2-staging/terraform.tfstate
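When a bucket holds many components, it can help to group the listing by component. The sketch below parses text in the `aws s3 ls --recursive` format shown above and maps each workspace key prefix to its workspaces. The function name `workspaces_by_component` is hypothetical, and the parsing assumes object keys contain no spaces.

```python
from collections import defaultdict


def workspaces_by_component(listing):
    """Map each workspace key prefix to the workspaces that have a state file."""
    result = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Expected columns: date, time, size, key (anything after is ignored).
        if len(fields) < 4:
            continue
        parts = fields[3].rsplit("/", 2)
        if len(parts) == 3 and parts[2] == "terraform.tfstate":
            component, workspace, _ = parts
            result[component].append(workspace)
    return dict(result)


sample = """\
2021-11-01 19:53:48 120926 vpc/use2-auto/terraform.tfstate
2021-11-01 19:49:12 123604 vpc/use2-corp/terraform.tfstate
"""
print(workspaces_by_component(sample))
```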
See every stack where the vpc component has state:
aws --profile acme-gbl-root-admin \
    s3 ls s3://{bucket_name}/vpc/
If a component is mistakenly deployed somewhere and then destroyed, a leftover terraform.tfstate file with a small file size will remain in the bucket. So while this is a good way to search for backends, it's not the best way to determine where a component is deployed. Also, the S3 bucket has versioning enabled, ensuring we can always (manually) revert to a previous state if need be.
DynamoDB Locking
Find the table. It should contain tfstate-lock in its name.
aws --profile acme-gbl-root-admin \
    dynamodb list-tables
Get a LockID
aws --profile acme-gbl-root-admin \
    dynamodb get-item \
    --table-name {table_name} \
    --key '{"LockID": {"S": "{bucket_name}/{component}/{stack}/terraform.tfstate-md5"}}'
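The `--key` argument is easy to mistype by hand, so a small helper can build it from the same pieces as the state path. This is a sketch only: the function name `digest_lock_key` is hypothetical, and it simply reproduces the LockID format used in the command above, where the `-md5` suffix identifies the item used for consistency checking.

```python
import json


def digest_lock_key(bucket, component, stack, key="terraform.tfstate"):
    """Return the JSON string for the --key argument of `dynamodb get-item`."""
    lock_id = f"{bucket}/{component}/{stack}/{key}-md5"
    return json.dumps({"LockID": {"S": lock_id}})


print(digest_lock_key("acme-ue2-root-tfstate", "vpc", "ue2-prod"))
```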
References
- Terraform S3 backend configuration documentation: https://www.terraform.io/docs/language/settings/backends/s3.html