Monitor Everything

With so many moving pieces, it's crucial to monitor what's happening under the hood. This means gathering telemetry, in the form of metrics and logs, from your services and the underlying infrastructure. That data must be shipped somewhere so you can build dashboards and raise alerts that escalate to the appropriate personnel. Depending on your business needs, you may also need to monitor for security and compliance against technical benchmarks and standards such as PCI DSS, CIS, ISO 27001, and others.

1 Set up Telemetry

Choose between Datadog and AWS-managed Prometheus and Grafana with Loki for gathering your telemetry. Datadog offers the most mature implementation, while AWS-managed Grafana and Prometheus are lower-cost alternatives whose trade-offs make them a good fit for many organizations.

Datadog is our most comprehensive observability solution, offering a monitoring-as-code approach using YAML configuration fully managed with Terraform. This includes Datadog monitors, custom RBAC roles, synthetic tests, child organizations, and other resources.

We show how to define reusable Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for consistent implementation. This helps reduce alert fatigue by focusing on critical, business-specific metrics and leveraging Datadog's advanced capabilities. We then integrate this with OpsGenie for incident management.
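To make the SLI/SLO relationship concrete, here is a minimal Python sketch of the arithmetic behind an availability SLO. It is an illustration only, not the reference architecture's implementation: the `Slo` class, the function names, and the `checkout-availability` example are all hypothetical, and it assumes a simple request-based SLI (good events divided by total events).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical SLO definition: a name plus a target ratio of good events."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of events must be good


def sli(good_events: int, total_events: int) -> float:
    """Request-based SLI: the fraction of events that were good."""
    if total_events == 0:
        return 1.0  # assumption: no traffic counts as fully compliant
    return good_events / total_events


def error_budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    allowed_bad = (1.0 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target (or zero traffic) leaves no budget to spend.
        return 0.0 if actual_bad else 1.0
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    slo = Slo(name="checkout-availability", target=0.999)
    good, total = 99_950, 100_000
    print(f"SLI: {sli(good, total):.4%}")
    print(f"Budget remaining: {error_budget_remaining(slo, good, total):.1%}")
```

Alerting on how quickly the error budget is being spent, rather than on every individual failure, is one common way that SLOs help reduce alert fatigue.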

Get Started

2 Manage Incidents

With monitoring in place and alerts being emitted, it’s crucial to define what qualifies as an incident and to escalate it to the appropriate people for action. We support OpsGenie, which integrates natively with Datadog.
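As a rough illustration of what "defining what qualifies as an incident" can look like, the Python sketch below treats only high-priority alerts as incidents and maps each affected service to an on-call team. Everything in it is an assumption made for the example: the priority labels, the team names, and the `route` function are hypothetical and do not reflect how OpsGenie or Datadog are actually configured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert as it might arrive from a monitoring system."""
    service: str
    priority: str  # "P1" (most urgent) through "P5" (informational)


# Assumption: only P1 and P2 alerts are treated as incidents.
INCIDENT_PRIORITIES = {"P1", "P2"}

# Assumption: hypothetical mapping of services to on-call teams.
ESCALATION = {
    "checkout": "payments-oncall",
    "auth": "identity-oncall",
}
DEFAULT_TEAM = "platform-oncall"


def route(alert: Alert) -> Optional[str]:
    """Return the team to page, or None if the alert is not an incident."""
    if alert.priority not in INCIDENT_PRIORITIES:
        return None
    return ESCALATION.get(alert.service, DEFAULT_TEAM)


if __name__ == "__main__":
    print(route(Alert(service="checkout", priority="P1")))
    print(route(Alert(service="search", priority="P2")))
    print(route(Alert(service="checkout", priority="P4")))
```

Keeping the incident threshold and the escalation mapping as explicit data makes the policy easy to review and change alongside the rest of the monitoring configuration.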

3 Monitor for Security & Compliance

Monitoring for security and compliance is essential for organizations subject to industry regulations like HIPAA or for e-commerce companies aiming for PCI compliance. Our reference architecture includes comprehensive support for AWS's suite of security-oriented services, including: