Module: rds-cloudwatch-sns-alarms
Terraform module that configures important RDS alerts using CloudWatch and sends them to an SNS topic.
Create a set of sane RDS CloudWatch alerts for monitoring the health of an RDS instance.
Usage
area | metric | comparison operator | threshold | rationale |
---|---|---|---|---|
Storage | BurstBalance | < | 20 % | 20 % of credits allow you to burst for a few minutes which gives you enough time to a) fix the inefficiency, b) add capacity or c) switch to io1 storage type. |
Storage | DiskQueueDepth | > | 64 | This number is calculated from our experience with RDS workloads. |
Storage | FreeStorageSpace | < | 2 GB | 2 GB usually provides enough time to a) fix why so much space is consumed or b) add capacity. You can also modify this value to 10% of your database capacity. |
CPU | CPUUtilization | > | 80 % | Queuing theory tells us the latency increases exponentially with utilization. In practice, we see higher latency when utilization exceeds 80 % and unacceptably high latency with utilization above 90 %. |
CPU | CPUCreditBalance | < | 20 | One credit equals 1 minute of 100 % usage of a vCPU. 20 credits should give you enough time to a) fix the inefficiency, b) add capacity or c) move away from the t2 instance type. |
Memory | FreeableMemory | < | 64 MB | This number is calculated from our experience with RDS workloads. |
Memory | SwapUsage | > | 256 MB | Sometimes you cannot entirely avoid swapping. But once the database accesses paged memory, it will slow down. |
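To make one row of the table concrete, the CPUUtilization threshold could be written as a raw CloudWatch alarm roughly as follows. This is a minimal sketch, not the module's actual implementation: the two variables, the alarm name, the period, and the number of evaluation periods are assumptions chosen for illustration.

```hcl
# Hypothetical inputs for this sketch; they are not the module's own variables.
variable "example_db_instance_id" {
  type        = string
  description = "Identifier of the RDS instance to watch"
}

variable "example_sns_topic_arn" {
  type        = string
  description = "ARN of an existing SNS topic that receives alarm notifications"
}

resource "aws_cloudwatch_metric_alarm" "example_cpu_utilization_too_high" {
  alarm_name          = "example-cpu-utilization-too-high"
  alarm_description   = "Average CPU utilization above 80 % over the evaluation window"
  namespace           = "AWS/RDS"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80
  period              = 600 # seconds; assumed value
  evaluation_periods  = 1   # assumed value

  # Scope the metric to a single DB instance.
  dimensions = {
    DBInstanceIdentifier = var.example_db_instance_id
  }

  # Notify the topic when the alarm fires and when it recovers.
  alarm_actions = [var.example_sns_topic_arn]
  ok_actions    = [var.example_sns_topic_arn]
}
```

The other rows follow the same shape, with `LessThanThreshold` for the metrics compared with `<`.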
Examples
See the examples/ directory for working examples.
resource "aws_db_instance" "default" {
  allocated_storage    = 10
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "5.7"
  instance_class       = "db.t2.micro"
  identifier_prefix    = "rds-server-example"
  name                 = "mydb"
  username             = "foo"
  password             = "foobarbaz" # example only; do not hard-code real credentials
  parameter_group_name = "default.mysql5.7"
  apply_immediately    = true
  skip_final_snapshot  = true
}

# Creates the CloudWatch alarms and the SNS topic for the instance above.
module "rds_alarms" {
  source         = "git::https://github.com/cloudposse/terraform-aws-rds-cloudwatch-sns-alarms.git?ref=tags/0.1.5"
  db_instance_id = aws_db_instance.default.id
}
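The defaults can be overridden per instance. The following sketch is an alternative to the `rds_alarms` block above and assumes the release you pin exposes the threshold variables documented in the Variables section; the module name `rds_alarms_tuned` and the chosen values are illustrative only.

```hcl
# Alternative to the "rds_alarms" block above, with selected thresholds overridden.
module "rds_alarms_tuned" {
  # Same source and ref as the example above; adjust the ref to the release you use.
  source         = "git::https://github.com/cloudposse/terraform-aws-rds-cloudwatch-sns-alarms.git?ref=tags/0.1.5"
  db_instance_id = aws_db_instance.default.id

  # Alert earlier on CPU than the default of 80 %.
  cpu_utilization_threshold = 70

  # 10 % of the allocated storage, converted from GiB to bytes.
  free_storage_space_threshold = floor(aws_db_instance.default.allocated_storage * 0.1 * 1024 * 1024 * 1024)

  # 128 MB expressed in bytes, using the same decimal convention as the defaults.
  freeable_memory_threshold = 128000000
}
```

Deriving the storage threshold from `allocated_storage` keeps the alarm proportional if the volume is resized later.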
Variables
Required Variables
db_instance_id (string) required
The instance ID of the RDS database instance that you want to monitor.
Optional Variables
burst_balance_threshold (number) optional
The minimum percent of General Purpose SSD (gp2) burst-bucket I/O credits available.
Default value: 20
cpu_credit_balance_threshold (number) optional
The minimum number of CPU credits (t2 instances only) available.
Default value: 20
cpu_utilization_threshold (number) optional
The maximum percentage of CPU utilization.
Default value: 80
disk_queue_depth_threshold (number) optional
The maximum number of outstanding IOs (read/write requests) waiting to access the disk.
Default value: 64
free_storage_space_threshold (number) optional
The minimum amount of available storage space, in bytes.
Default value: 2000000000
freeable_memory_threshold (number) optional
The minimum amount of available random access memory, in bytes.
Default value: 64000000
swap_usage_threshold (number) optional
The maximum amount of swap space used on the DB instance, in bytes.
Default value: 256000000
Context Variables
The following variables are defined in the context.tf file of this module and are part of the terraform-null-label pattern.
Outputs
sns_topic_arn
The ARN of the SNS topic
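One way to consume this output is to attach a subscriber to the topic. The sketch below assumes the module is instantiated as `rds_alarms`, as in the Examples section; the resource name and the email address are hypothetical.

```hcl
# Hypothetical subscriber; replace the endpoint with a real address.
resource "aws_sns_topic_subscription" "rds_alarms_email" {
  topic_arn = module.rds_alarms.sns_topic_arn
  protocol  = "email"
  endpoint  = "oncall@example.com"
}
```

Email subscriptions stay pending until the recipient confirms them, so alarms are not delivered until that confirmation happens.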
Dependencies
Requirements
terraform, version: >= 0.13.0
aws, version: >= 2.0
Providers
aws, version: >= 2.0
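In a root configuration, the constraints above could be declared roughly like this. It is a sketch that mirrors the listed versions; the `hashicorp/aws` provider source is an assumption about which registry provider you use.

```hcl
terraform {
  # Matches the Terraform requirement listed above.
  required_version = ">= 0.13.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws" # assumed registry source
      version = ">= 2.0"
    }
  }
}
```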
Modules
Name | Version | Source | Description |
---|---|---|---|
label | 0.25.0 | cloudposse/label/null | n/a |
subscription_label | 0.25.0 | cloudposse/label/null | n/a |
this | 0.25.0 | cloudposse/label/null | n/a |
topic_label | 0.25.0 | cloudposse/label/null | n/a |
Resources
The following resources are used by this module:
aws_cloudwatch_metric_alarm.burst_balance_too_low (resource)
aws_cloudwatch_metric_alarm.cpu_credit_balance_too_low (resource)
aws_cloudwatch_metric_alarm.cpu_utilization_too_high (resource)
aws_cloudwatch_metric_alarm.disk_queue_depth_too_high (resource)
aws_cloudwatch_metric_alarm.free_storage_space_too_low (resource)
aws_cloudwatch_metric_alarm.freeable_memory_too_low (resource)
aws_cloudwatch_metric_alarm.swap_usage_too_high (resource)
aws_db_event_subscription.default (resource)
aws_sns_topic.default (resource)
aws_sns_topic_policy.default (resource)
Data Sources
The following data sources are used by this module:
aws_caller_identity.default (data source)
aws_iam_policy_document.sns_topic_policy (data source)