
Module: rds-cloudwatch-sns-alarms

Terraform module that configures important RDS alerts using CloudWatch and sends them to an SNS topic.

Create a set of sane RDS CloudWatch alerts for monitoring the health of an RDS instance.

Usage

| Area    | Metric           | Comparison operator | Threshold | Rationale |
|---------|------------------|---------------------|-----------|-----------|
| Storage | BurstBalance     | <                   | 20 %      | 20 % of credits allow you to burst for a few minutes, which gives you enough time to a) fix the inefficiency, b) add capacity, or c) switch to the io1 storage type. |
| Storage | DiskQueueDepth   | >                   | 64        | This number is calculated from our experience with RDS workloads. |
| Storage | FreeStorageSpace | <                   | 2 GB      | 2 GB usually provides enough time to a) fix why so much space is consumed or b) add capacity. You can also change this value to 10 % of your database capacity. |
| CPU     | CPUUtilization   | >                   | 80 %      | Queuing theory tells us that latency increases exponentially with utilization. In practice, we see higher latency when utilization exceeds 80 % and unacceptably high latency when utilization exceeds 90 %. |
| CPU     | CPUCreditBalance | <                   | 20        | One credit equals 1 minute of 100 % usage of a vCPU. 20 credits should give you enough time to a) fix the inefficiency, b) add capacity, or c) move off the t2 instance type. |
| Memory  | FreeableMemory   | <                   | 64 MB     | This number is calculated from our experience with RDS workloads. |
| Memory  | SwapUsage        | >                   | 256 MB    | Sometimes you cannot entirely avoid swapping. But once the database accesses paged memory, it will slow down. |
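
To make one row of the table concrete, here is a minimal sketch of what the CPUUtilization alarm could look like as a plain `aws_cloudwatch_metric_alarm` resource. It is an illustration, not the module's actual source: the resource names, the `my-db-instance` identifier, the SNS topic, and the `period`, `statistic` and `evaluation_periods` values are all assumptions chosen for the example.

```hcl
# Hypothetical SNS topic, used only so this sketch is self-contained.
resource "aws_sns_topic" "rds_alarms_example" {
  name = "rds-alarms-example"
}

# Fires when average CPU utilization is above 80 % (the threshold from the table).
resource "aws_cloudwatch_metric_alarm" "cpu_utilization_too_high_example" {
  alarm_name          = "rds-cpu-utilization-too-high-example"
  alarm_description   = "Average RDS CPU utilization is above 80 %"
  namespace           = "AWS/RDS"
  metric_name         = "CPUUtilization"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80 # CPUUtilization is reported in percent

  # Assumed evaluation settings: one 10-minute average per evaluation.
  statistic          = "Average"
  period             = 600
  evaluation_periods = 1

  dimensions = {
    DBInstanceIdentifier = "my-db-instance" # placeholder instance identifier
  }

  # Notify the example topic when the alarm fires and when it recovers.
  alarm_actions = ["${aws_sns_topic.rds_alarms_example.arn}"]
  ok_actions    = ["${aws_sns_topic.rds_alarms_example.arn}"]
}
```

The other rows follow the same pattern with a different `metric_name`, `comparison_operator` and `threshold`. Note that CloudWatch reports FreeStorageSpace, FreeableMemory and SwapUsage in bytes, so the GB and MB thresholds in the table have to be converted before they are used in an alarm.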

Examples

See the examples/ directory for working examples.

# Example RDS instance to monitor. The credentials are placeholders;
# do not hard-code real passwords in configuration.
resource "aws_db_instance" "default" {
  allocated_storage    = 10
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "5.7"
  instance_class       = "db.t2.micro"
  identifier_prefix    = "rds-server-example"
  name                 = "mydb"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mysql5.7"
  apply_immediately    = true
  skip_final_snapshot  = true
}

# Create the CloudWatch alarms for the instance defined above.
module "rds_alarms" {
  source         = "git::https://github.com/cloudposse/terraform-aws-rds-cloudwatch-sns-alarms.git?ref=tags/0.1.5"
  db_instance_id = "${aws_db_instance.default.id}"
}
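
The table suggests using 10 % of the database capacity as the FreeStorageSpace threshold instead of a fixed 2 GB. The following is a rough sketch of that calculation as a standalone alarm next to the example instance above. It is not part of the module; the resource name and the `period`, `statistic` and `evaluation_periods` values are assumptions, and no notification action is attached.

```hcl
# Sketch only: a standalone alarm that is independent of the rds_alarms module.
resource "aws_cloudwatch_metric_alarm" "free_storage_space_too_low_example" {
  alarm_name          = "rds-free-storage-space-too-low-example"
  alarm_description   = "Free storage space is below 10 % of allocated storage"
  namespace           = "AWS/RDS"
  metric_name         = "FreeStorageSpace"
  comparison_operator = "LessThanThreshold"

  # allocated_storage is given in GiB, while FreeStorageSpace is reported in
  # bytes, so convert to bytes first and then take one tenth of the result.
  threshold = "${aws_db_instance.default.allocated_storage * 1024 * 1024 * 1024 / 10}"

  # Assumed evaluation settings: one 10-minute average per evaluation.
  statistic          = "Average"
  period             = 600
  evaluation_periods = 1

  dimensions = {
    DBInstanceIdentifier = "${aws_db_instance.default.identifier}"
  }
}
```

For the 10 GiB instance in the example this works out to a threshold of 1 GiB. Because the value is derived from `allocated_storage`, it follows the instance when storage is added later.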