Proposed: Use More Flexible Resource Labels
Date: 19 Apr 2022
The content in this ADR may be out of date and in need of an update. For questions, please reach out to Cloud Posse.
- No pushback from the team. Overall, we know we need to support arbitrary label fields and don't like how we use environment to represent region. Note that this suggestion is also consistent with our filesystem organization: `<namespace>/<tenant>/<stage>/<environment>`. (Internal discussion reference.)
- Decision is to adopt: `<namespace>-<tenant>-<stage>-<environment>`
Status
PROPOSAL
Problem
Currently, we use a fixed set of labels, dictated by the terraform-null-label component, for labeling everything provisioned by IaC. This set of labels is also treated specially by `atmos`, and its use includes labeling IAM roles and both `atmos` and Spacelift “stacks”.
- The choice of label names has proven to be confusing and unpopular.
- The set of labels is fixed. When we added “tenant” as a possible label, it was a major undertaking to upgrade `terraform-null-label` to handle it.
- Because the label names are fixed, and `atmos` does not have access to the outputs of `terraform-null-label` (because `atmos` is written in `go` and not Terraform), adding or changing label names requires code changes to both `terraform-null-label` and `atmos`.
- The use of `name` as a label name is a particular problem, as it conflicts with AWS' usage of the tag key “Name” as the UI display name of a resource.
- We have come to rely on `atmos` as a tool, and it needs to parse labels to determine the Atmos “stack” name, the Terraform backend configuration, the Terraform workspace name, the EKS cluster name, and possibly other resources. However, `atmos` is written in `go` and cannot use `terraform-null-label`, which is a Terraform module, to generate these items. Nevertheless, we want some of them to be available in Terraform so that components can access configuration data generated by other Terraform components.
- We have some components, such as Kubernetes deployments, that have additional configuration labels/variants, such as `color` for blue/green deployments or `ip` for IPv4/IPv6 variants. We would like to be able to flexibly use or not use these additional labels to distinguish deployments where applicable, without requiring them for other components (e.g. `cloudtrail_bucket`) where they are not needed. Currently we do this by manually altering the component names to include the variant labels, but this practice is not DRY and eliminates many of the advantages `atmos` gives us through importing configurations, since all configurations are, in the end, tied to a component name. The proper Atmos model is to have a single component name with variable Terraform workspaces selected by variable labels.
Context
Early on, Cloud Posse decided that consistent labeling was important and implemented a mechanism for it in the form of `terraform-null-label`. (`terraform-null-label`, or `null-label` for short, was first released in 2017.) At the time it was first released, Terraform itself was in the early stages of development and lacked many essential features, so the capabilities of the module were limited. In particular, there was no way to iterate over lists or maps. This imposed a practical requirement that inputs to `null-label` be known in advance (hardcoded).
The original set of labels was:
- `namespace`
- `stage`
- `name`
Over time, we added
- `environment`
- `tenant`
...to get to the current set of 5 labels. (`null-label` also accepts a list of `attributes` and a map of `tags`, which are outside the scope of this ADR.)
Unfortunately, except for `tenant`, there are issues with all of these label names.
- `namespace` collides with Kubernetes' use of “namespace” as a mechanism for isolating groups of resources within a single cluster, and we have had problems due to the `$NAMESPACE` shell variable being set to indicate our version of “namespace” while being interpreted by some tools as Kubernetes' version.
- `environment` is not bad, but a lot of people use it in a way we do not. We use it as a region code (an abbreviation for a particular AWS Region), while most people use it to indicate a functional role or AWS account, such as “production” or “staging”.
- `stage` is a bit confusing and, in the end, more generic than we allow. We use it the way many people use “environment”, but because we typically have a 1-to-1 mapping of `stage` to AWS account, our code frequently assumes that “stage” is the same as “account”. This breaks, however, in multi-tenant environments where tenants have multiple accounts, such as `tenant-dev`, `tenant-stage`, and `tenant-production`.
- `name` is a problem in that AWS reserves it for the tag key whose value is displayed in the web UI. For all our other labels, we add a tag with the (capitalized) label name as the tag key and the (normalized) label value as the tag value. We make an exception for “Name”, setting its value to the `id` (the fully formed identifier combining all the labels), not the value of the `name` label, which confuses everyone.
- Atmos separately has (in `atmos.yaml`) configuration for `helm_aws_profile_pattern`, EKS `cluster_name_pattern`, and Stack `name_pattern`, along with separate configuration for Component name (directory) and Terraform workspace name. Currently these are either completely hard-coded (Component name) or configured using a template based on the special label names listed above, which works completely separately from `null-label` and must be kept in sync.
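The tagging rule described for `name` can be sketched in a few lines of Terraform. This is a simplified illustration, not the actual `null-label` implementation: the label values are hypothetical and assumed to be already normalized, and the locals `label_tags` and `tags` are names chosen for this example.

```hcl
locals {
  # Hypothetical, already-normalized label values (illustration only).
  labels = {
    namespace   = "cplive"
    tenant      = "plat"
    environment = "ue1"
    stage       = "dev"
    name        = "eks"
  }
  label_order = ["namespace", "tenant", "environment", "stage", "name"]

  # The fully formed identifier: label values joined in order.
  id = join("-", [for k in local.label_order : local.labels[k]])

  # One tag per label, keyed by the capitalized label name.
  label_tags = { for k, v in local.labels : title(k) => v }

  # The exception: the "Name" tag carries the full id, not the `name` label.
  tags = merge(local.label_tags, { Name = local.id })
}

output "tags" {
  # Name = "cplive-plat-ue1-dev-eks", even though the `name` label is "eks"
  value = local.tags
}
```

The `merge` call is where the surprise comes from: the `Name` entry derived from the `name` label is silently replaced by the `id`.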
Now (April 2022), Terraform version 1.1 has several features that enable us to use an arbitrary set of label names. On the drawing board (but for no earlier than Terraform version 1.3) is an additional feature we would like: allowing input objects to have optional attributes. This suggests we can create a new `null-label` version with 1.1 features and enhance it again after optional attributes have been released. See https://github.com/hashicorp/terraform/pull/31154
Considered Options
Option 1:
Null Label
Going forward, I suggest Cloud Posse use different label names in its engagements:
- `company` instead of `namespace`, to provide a global prefix that makes the final ID unique despite our reuse of all the other label values.
- `region_code` or `reg` instead of `environment`, to indicate the abbreviated AWS Region.
- `tenant` can remain, or be changed to `ou` for organizational unit.
- `env` instead of `stage`, to indicate the function of the environment, such as “development”, “sandbox”, or “production”. In environments where `env` always equals `account`, we would specify only one and have the other be a generated label (see below). Which one to specify should be based on a survey of clients' preferences.
- `account` instead of `stage`, to indicate the name of the AWS account. `account` would never be specified directly; it would generally be either `env` or `tenant-env`.
- `component_name` instead of `name` (to avoid overloading `name`, which is used by AWS, and `component`, which has special meaning to `atmos`).
- Possibly an additional label component, such as `net` or `ip`, that can be used to allow us to create IPv4 and IPv6 versions of components like EKS clusters or ALBs in the same account and region and yet still distinguish them. This label component would ideally have an optional attribute that removes the delimiter before it, so if `name` is `eks` and `ip` is `6`, we can get a name like `{namespace}-{tenant}-{environment}-{stage}-eks6-cluster` instead of `{namespace}-{tenant}-{environment}-{stage}-eks-6-cluster`.
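One way the "no delimiter before this label" behavior could work is sketched below. The option name `no_delimiter_before` and all label values are hypothetical and chosen only for illustration; nothing here is an existing `null-label` input.

```hcl
locals {
  # Hypothetical label values; `ip` is the proposed variant label.
  label_values = {
    namespace   = "cplive"
    tenant      = "plat"
    environment = "ue1"
    stage       = "dev"
    name        = "eks"
    ip          = "6"
  }
  label_order = ["namespace", "tenant", "environment", "stage", "name", "ip"]
  attributes  = ["cluster"]

  # Hypothetical option: labels listed here are appended with no delimiter before them.
  no_delimiter_before = ["ip"]

  # Prefix every value with the delimiter unless the label opts out.
  parts = [
    for k in local.label_order :
    contains(local.no_delimiter_before, k) ? local.label_values[k] : "-${local.label_values[k]}"
  ]

  # Concatenate, strip the leading delimiter, then append attributes as usual.
  id = join("-", concat([trimprefix(join("", local.parts), "-")], local.attributes))
}

output "id" {
  # "cplive-plat-ue1-dev-eks6-cluster" rather than "cplive-plat-ue1-dev-eks-6-cluster"
  value = local.id
}
```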
To facilitate this, I suggest an overhaul of `terraform-null-label`. We can use the existing `label_order` input to take an arbitrary list of label names. We can deprecate the existing hard-coded label names in favor of a new input, a `map(string)` where the keys are label names and the values are label values. (This is exactly like the `tags` input, except that the tags are not altered, while labels are.) The new input could be called either:
- `label_input`, which allows us to have an output named `labels` that has the normalized label values and a separate output named `label_input` that preserves the input untransformed; or
- `labels`, where either we do not care about the output `labels` being different from the input, or we are satisfied that `module.this.labels` is normalized while `module.this.context.labels` gets you back exactly what was input, as is currently the case with the special label names (e.g. `module.this.stage` vs `module.this.context.stage`).
Additionally, we deprecate the existing `descriptor_format` input and `descriptors` output in favor of a `label_generator` input which adds labels to the `labels` output. This would allow us to have an `account` output that by default is the same as the `env` or `stage` output (and for that matter, allow us to preserve the `namespace`, `environment`, and `name` outputs even though we have stopped using them as inputs), and also handle the case where `account` is a composite of 2 labels like `tenant-dev`.
Future Possibilities
Once Terraform supports optional object members, I would propose `label_generator` be a `map(object)` that has:
- key is the name of the label to generate
- `labels = list(string)`: list of labels to construct the label from, in order
- `delimiter = optional(string)`: the delimiter to use when joining the labels; defaults to the label `delimiter`
- `value_case = optional(string)`: the case formatting of the label values, one of `lower`, `title`, `upper`, or `none` (no transformation); defaults to `label_value_case`
- `regex_remove_chars = optional(string)`: regex specifying characters to remove from the value; defaults to the top-level `regex_replace_chars` (which I would deprecate and replace with `regex_remove_chars`, since we do not provide the capability to replace the characters and no one has asked for that)
- `length_limit = optional(number)`: the limit on the length of the value, or 0 for unlimited; defaults to 0
- `truncation_mode = optional(string)`: one of "beginning", "middle", or "end"; where to place the hash that substitutes for the extra characters in the label. Allows you to decide to truncate `foo-bar-baz` as `foo-bar-<hash>` (the only mode we allow today), `<hash>-bar-baz`, or `foo-<hash>-baz`. I would also add `id_truncation_mode` to the top level and default `truncation_mode` to whatever `id_truncation_mode` is set to. Unfortunately, `id_truncation_mode` would need to default to `end` for backward compatibility, but I think `middle` is the better default.
locals {
  # Create a default format map so it can be reused, optionally with changes applied.
  # This is in part to deal with the Terraform requirement that all values of a map
  # must have the exact same type.
  default_format = {
    delimiter          = "-"
    value_case         = "lower"
    regex_remove_chars = "/[^a-zA-Z0-9-]/"
    length_limit       = 64
    truncation_mode    = "middle"
  }
}
# Advanced example, more like what we would probably use
module "this" {
  source = "cloudposse/label/null"

  label_order  = ["org", "ou", "reg", "env", "component"]
  label_format = local.default_format

  label_generator = {
    # This is how we would generate the "id" output if it were not hardcoded for backward compatibility
    id = merge(local.default_format, {
      labels = ["org", "ou", "reg", "env", "component"]
    })
    # Generate an output named "account" of the form "${ou}_${env}"
    account = merge(local.default_format, {
      # Specify the value inputs and the order
      labels = ["ou", "env"]
      # Change the delimiter to "_" instead of "-"
      delimiter = "_"
      # By default, we remove underscores, so we need to alter the list of characters to remove
      regex_remove_chars = "/[^a-zA-Z0-9-_]/"
    })
  }

  # In practice, the "label_values" input would be generated by Atmos
  # For example, in stacks/orgs/cplive/_defaults.yaml
  #   vars:
  #     label_values:
  #       org: cplive
  label_values = merge({ component = var.component_name }, {
    org = "cplive",
    ou  = "plat",
    reg = "ue1"
  })
}

locals {
  id           = module.this.id
  org          = module.this.labels["org"]
  account_name = module.this.labels["account"]
}
# Simpler example
module "this" {
  source = "cloudposse/label/null"

  label_order  = ["org", "ou", "reg", "env", "component"]
  label_format = local.default_format

  label_generator = {
    account = {
      labels             = ["ou", "env"]
      delimiter          = "_"
      regex_remove_chars = "/[^a-zA-Z0-9-_]/"
    }
  }

  label_values = {
    org = "cplive",
    ou  = "plat",
    reg = "ue1"
  }
}

# Simplest example
module "this" {
  source = "cloudposse/label/null"

  label_order = ["org", "ou", "reg", "env", "component"]
  format      = local.default_format

  values = {
    org = "cplive",
    ou  = "plat",
    reg = "ue1"
  }
}
# In stacks/orgs/cplive/_defaults.yaml using current labels
# (Compare to https://github.com/cloudposse/infra-live/blob/8754dc3d1e938c31387bc704ef361fc476fe28e5/stacks/orgs/cplive/_defaults.yaml#L9-L28)
vars:
  label_values:
    namespace: cplive
  label_order:
    - namespace
    - tenant
    - environment
    - stage
    - name
    - attributes
  label_format: &default_label_format
    delimiter: "-"
    value_case: "lower"
    regex_remove_chars: "/[^a-zA-Z0-9-]/"
    length_limit: 64
    truncation_mode: "middle"
  label_generator:
    account_name:
      <<: *default_label_format
      labels:
        - tenant
        - stage
    stack:
      <<: *default_label_format
      labels:
        - tenant
        - environment
        - stage
# In stacks/orgs/cplive/core/_defaults.yaml
vars:
  label_values:
    tenant: cplive
# et cetera
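The three proposed `truncation_mode` values can be illustrated with a minimal Terraform sketch. It is a simplification under stated assumptions: it cuts by character count and ignores delimiter boundaries, it uses a prefix of the MD5 of the full ID as the hash, and `hash_length`, `keep`, `head`, and `tail` are names invented for this example rather than `null-label` inputs.

```hcl
locals {
  # Hypothetical inputs for illustration.
  full_id         = "cplive-plat-ue1-production-eks-cluster"
  length_limit    = 24
  hash_length     = 5
  truncation_mode = "middle" # one of "beginning", "middle", "end"

  # Short hash that stands in for the removed characters.
  hash = substr(md5(local.full_id), 0, local.hash_length)

  # Number of original characters that survive truncation.
  keep = local.length_limit - local.hash_length
  head = ceil(local.keep / 2)
  tail = local.keep - local.head

  truncated = {
    # Keep the start of the ID, hash at the end (the only mode available today).
    end = "${substr(local.full_id, 0, local.keep)}${local.hash}"
    # Hash at the beginning, keep the end of the ID.
    beginning = "${local.hash}${substr(local.full_id, length(local.full_id) - local.keep, local.keep)}"
    # Keep both ends, hash in the middle.
    middle = "${substr(local.full_id, 0, local.head)}${local.hash}${substr(local.full_id, length(local.full_id) - local.tail, local.tail)}"
  }

  # A limit of 0 means unlimited; short IDs pass through unchanged.
  id = (local.length_limit == 0 || length(local.full_id) <= local.length_limit) ? local.full_id : local.truncated[local.truncation_mode]
}

output "id" {
  value = local.id
}
```

In every mode the result is exactly `length_limit` characters long, which is the property the `length_limit` input is meant to guarantee.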
For now (April 2022), with no ETA on that feature, I would limit `label_generator` to `map(list(string))`:
- key is the name of the label to generate
- `labels = list(string)`: list of labels to construct the label from, in order
The generated label will be the normalized values of the labels named in the list, in that order, joined by the same `delimiter` used for the `id`.
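A minimal sketch of this interim `map(list(string))` form, using only features available in Terraform 1.1, might look like the following. The label values and the normalization rule (lower-case, then strip anything outside `a-z`, `0-9`, and `-`) are assumptions for illustration, and `generated` and `all_labels` are names invented here.

```hcl
locals {
  delimiter = "-"

  # Hypothetical raw label values, as they might arrive from a stack's vars.
  label_values = {
    org = "CPLive"
    ou  = "plat"
    reg = "ue1"
    env = "dev"
  }

  # Normalize each value: lower-case it and remove disallowed characters.
  labels = {
    for k, v in local.label_values :
    k => replace(lower(v), "/[^a-z0-9-]/", "")
  }

  # Interim generator form: generated label name => ordered list of source labels.
  label_generator = {
    account = ["ou", "env"]
    stack   = ["ou", "reg", "env"]
  }

  # Join the normalized source values in order, skipping any that are missing or empty.
  generated = {
    for name, parts in local.label_generator :
    name => join(local.delimiter, compact([for p in parts : lookup(local.labels, p, "")]))
  }

  # Generated labels are exposed alongside the input labels.
  all_labels = merge(local.labels, local.generated)
}

output "labels" {
  # account = "plat-dev", stack = "plat-ue1-dev", org = "cplive"
  value = local.all_labels
}
```

Because every generated label shares the single top-level delimiter, a composite such as `plat_dev` is not expressible in this form; that is what the richer `map(object)` version would add.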
Likewise, we would deprecate the named outputs (and `descriptors`) in favor of a `labels` output which is a map of label names to normalized label outputs. So instead of `module.this.stage` we would reference `module.this.labels["stage"]`.
Atmos Changes
We need to update atmos to support a flexible set of labels.
Atmos option 1
Instead of specifying a template for each configuration value, such as `cluster_name_pattern`, Atmos could configure a `labels` output to use as `cluster_name_pattern` (e.g. `cluster`), and then both `atmos` and `terraform` will have access to exactly the same information in the same way (e.g. `module.this.labels["cluster"]`).
Atmos option 2
Right now, there are the top-level `namespace`, `stage`, `name`, `tenant`, and `environment` labels.
We could now put these under a new section in the stacks or in `atmos.yaml`:
terraform:
backend:
backend_pattern: "{foo}-{bar}-{baz}"
labels:
- foo
- bar
- baz
For compatibility with `null-label`, `atmos` should populate the labels based on the fully merged `vars` section of the stack configuration, supporting both the old variables as it does now and the new `label_input` (or whatever we call it) map.
Option 2:
module "this" {
  label   = "camelcase(id)-lowercase(name)-uppercase(company)" # camelcaseHyphenFoobarFormat(....)
  context = var.context
}
Option 3:
We predefine a named set of formats and allow additional custom formats to be defined.
# Simpler example
module "this" {
  source = "cloudposse/label/null"

  label_order  = ["org", "ou", "reg", "env", "component"]
  label_format = "kebab"

  label_generator = {
    account = {
      labels = ["ou", "env"]
      format = "snake"
    }
  }

  label_values = {
    org = "cplive",
    ou  = "plat",
    reg = "ue1"
  }
}
Decision
DECIDED: