Training Curriculum Proposal
Session Title: SRE Foundations with Datadog Observability – Practical Workshop
Audience: ~60 participants (SREs, DevOps Engineers, Infra/Platform Teams)
Duration: 1 Day (8 hours)
Delivery Mode: Virtual / Onsite (as per discussion)
Trainer: Abhishek (17+ years of experience | DevOps, Cloud, SRE, Observability Coach)
Training Modules to be Covered
Module 1: SRE Foundations & Core Principles
• Introduction to SRE: Origin, Purpose, Culture
• SRE vs DevOps: Overlap and Differentiators
• Key Terminology: SLIs, SLOs, SLAs, Error Budgets
• The Role of SRE in Modern Infrastructure
• Incident Response & Toil Reduction Techniques
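As a small worked example of the SLO and error budget terminology in this module, the sketch below converts an availability target into allowed downtime. The function names and the 30-day window are illustrative assumptions, not part of the workshop material.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves roughly 43.2 minutes of budget.
    print(round(error_budget_minutes(0.999), 1))
    # After 10.8 minutes of downtime, about 75% of the budget remains.
    print(round(budget_remaining(0.999, 10.8), 2))
```

The remaining-budget fraction is the kind of number an error budget policy would typically key off, for example slowing releases once it approaches zero.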
Module 2: SRE Practices & Service Management
• Reliability Engineering Principles in Action
• Managing Availability, Latency, and Performance
• Release Engineering and Change Management
• Error Budget Policies and Escalation Management
• Real-World SRE Operating Models and War Room Practices
Module 3: Monitoring, Alerting & Observability
• Monitoring vs Observability – Conceptual Difference
• Key Metrics: RED, USE, and the Four Golden Signals
• Alerting Strategies and Anti-patterns
• Instrumentation Basics: Logs, Metrics, Traces
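To make the RED method (Rate, Errors, Duration) concrete, the sketch below summarizes a window of request records. The Request type, the field names, and the choice of a nearest-rank 95th percentile are assumptions made for illustration; in practice a monitoring backend would usually compute these.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    duration_ms: float  # time taken to serve the request
    status: int         # HTTP status code


def red_summary(requests: List[Request], window_seconds: float) -> Dict[str, float]:
    """Summarize Rate, Errors, and Duration for one service over a window."""
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    total = len(requests)
    # Errors: count server-side failures (5xx) only.
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_ms for r in requests)
    # Duration: nearest-rank p95, computed with integer arithmetic.
    rank = max(1, (95 * total + 99) // 100)
    return {
        "rate_per_s": total / window_seconds,
        "error_ratio": errors / total,
        "p95_ms": durations[rank - 1],
    }


if __name__ == "__main__":
    sample = [Request(duration_ms=10.0 * i, status=200) for i in range(1, 20)]
    sample.append(Request(duration_ms=200.0, status=500))
    print(red_summary(sample, window_seconds=10.0))
```

Treating only 5xx responses as errors is one common convention; a team may reasonably count timeouts or selected 4xx codes as well.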
Hands-On – Practical Datadog Session (High-Level)
Objective: Walk through and demonstrate how to configure observability in Datadog, focusing on industry-aligned best practices.
Hands-on Topics (Live Walkthrough or Guided Screenshots)
• Overview of the Datadog Platform (Dashboards, Metrics, APM, Logs)
• Configuring Infrastructure Monitoring:
o Installing Datadog Agent (Linux/Windows VM)
o Setting up system metrics (CPU, Memory, Disk)
o Best practices for standard metric collection
• Log Management:
o Enabling Log Collection
o Parsing and tagging logs
o Setting retention and log pipelines
• Tracing:
o Introduction to APM & Distributed Tracing
o Configuring tracing for sample apps
o Visualizing trace flows and latency
• Dashboards and Alerting:
o Creating SLO dashboards
o Real-time alerting with threshold logic
o Recommended templates for infrastructure observability
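The threshold logic behind a basic alert can be sketched in a few lines. This is a plain-Python illustration of the idea only, not Datadog's monitor API; the function name, the state labels, and the choice to average over the evaluation window are assumptions.

```python
from statistics import mean
from typing import Sequence


def evaluate_threshold(samples: Sequence[float], warning: float,
                       critical: float) -> str:
    """Classify a metric window as OK, WARN, ALERT, or NO DATA.

    The window average is compared against the thresholds so that a
    single spike does not trigger an alert on its own.
    """
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical")
    if not samples:
        return "NO DATA"
    avg = mean(samples)
    if avg >= critical:
        return "ALERT"
    if avg >= warning:
        return "WARN"
    return "OK"


if __name__ == "__main__":
    # CPU utilization (%) sampled over one evaluation window.
    print(evaluate_threshold([70, 75, 80], warning=80, critical=90))
    print(evaluate_threshold([85, 88, 90], warning=80, critical=90))
    print(evaluate_threshold([95, 92, 97], warning=80, critical=90))
```

Evaluating an aggregate over a window, rather than each raw sample, is one way to avoid the flapping-alert anti-pattern covered in the alerting module.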
Training Deliverables
• PDF Copy of Slides
• Datadog Hands-On Guide (with screenshots)
• Sample SLO template
• Cheat Sheet: RED/USE Metrics and Alert Patterns
• Recording Access (on request)
• Post-session Q&A or Clinic (optional)
Costing Proposal (Indicative)
Item | Details | Rate
Trainer Fee | 1-day live delivery (up to 60 participants) | TBD
Material Customization & Preparation | Hands-on guide, templates, Datadog lab | Included
Post-session Support (1 hour follow-up) | Q&A Clinic | Included
Total Cost (All Inclusive) | Virtual Delivery | TBD
Prerequisites for Participants
• Basic understanding of system infrastructure (Linux, networking, cloud)
• Datadog access (or trainer-provided demo account/viewer access)
• Zoom/MS Teams access with screen sharing capability