Introducing Playbooks: Self-Healing Queues

Queue lag spikes at 3am shouldn't page your on-call. OwlMQ Playbooks let you define automated remediation steps that trigger when conditions are met.

Arjun Shah

Product Lead

December 28, 2024

8 min read

Playbooks, Reliability, On-call

The worst part of being on-call isn't the 3am page. It's spending 45 minutes doing the exact same thing you did last time: check lag, scale consumers, reroute traffic, resolve the alert. Toil that could be automated.

What are Playbooks?

Playbooks are declarative YAML files that define automated responses to queue anomalies. When a trigger condition is met (lag exceeds a threshold, error rate spikes, or a consumer heartbeat times out), OwlMQ executes the playbook automatically.

They're not scripts. They're not Kubernetes operators. They're a purpose-built DSL for queue remediation that understands OwlMQ's primitives natively.

A real example

Here's the playbook we run in production for our payment queue:

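name: payment-queue-healing
triggers:
  # Fire when lag exceeds 10,000 messages in a 60-second window
  - type: lag_exceeded
    threshold: 10000
    window: 60s
actions:
  # Triple the consumer count
  - step: scale_consumers
    scale_by: 3x
  # Then report what happened and what was done
  - step: alert
    channels: [slack, pagerduty]
    severity: high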

When lag exceeds 10,000 messages in a 60-second window, OwlMQ automatically triples the consumer count and fires an alert. No human needed. The alert tells us what happened and what was done about it, instead of asking us to figure it out.
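
To make the trigger semantics concrete, here is a minimal Python sketch of how a lag_exceeded condition could be evaluated. It is not OwlMQ's implementation. The names LagTrigger, observe, and scaled_count are hypothetical, and the sketch assumes the trigger fires only once lag has stayed above the threshold for the whole window; the real engine may define "exceeds in a window" differently.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LagTrigger:
    """Hypothetical model of a lag_exceeded trigger (not OwlMQ's code)."""
    threshold: int      # lag in messages, as in `threshold: 10000`
    window_s: float     # window length in seconds, as in `window: 60s`
    samples: deque = field(default_factory=deque)  # (timestamp, lag) pairs

    def observe(self, ts: float, lag: int) -> bool:
        """Record one lag sample and report whether the trigger fires.

        Assumption: fire only when the retained samples span the full
        window and every one of them is above the threshold.
        """
        self.samples.append((ts, lag))
        # Drop samples that have fallen out of the window.
        while self.samples[0][0] < ts - self.window_s:
            self.samples.popleft()
        covers_window = ts - self.samples[0][0] >= self.window_s
        all_above = all(value > self.threshold for _, value in self.samples)
        return covers_window and all_above


def scaled_count(current: int, factor: int = 3) -> int:
    """Consumer count after a `scale_by: 3x` style action."""
    return current * factor


# One sample every 10 seconds, lag steadily above the threshold.
trigger = LagTrigger(threshold=10000, window_s=60)
fired = [trigger.observe(ts, 12000) for ts in range(0, 70, 10)]
```

Under this assumption a brief spike does not fire the playbook; only the sample that completes a full window of sustained lag does.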

Dry-run mode

Every playbook can be tested with --dry-run before deploying. OwlMQ simulates the trigger conditions and shows exactly which actions would execute, in what order, and with what parameters. We've seen teams use dry-run in CI to catch configuration mistakes before they matter.
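
The idea behind a dry run can be sketched in a few lines of Python: walk the actions in order and report them instead of executing them. This is only an illustration of the concept. The PLAYBOOK dict mirrors the YAML example by hand, and plan, run, and the handlers mapping are hypothetical names, not part of OwlMQ's CLI or API.

```python
# Hand-written mirror of the payment-queue-healing playbook (assumption:
# OwlMQ's real loader and data model are not shown in this post).
PLAYBOOK = {
    "name": "payment-queue-healing",
    "triggers": [{"type": "lag_exceeded", "threshold": 10000, "window": "60s"}],
    "actions": [
        {"step": "scale_consumers", "scale_by": "3x"},
        {"step": "alert", "channels": ["slack", "pagerduty"], "severity": "high"},
    ],
}


def plan(playbook):
    """Describe each action in execution order without running anything."""
    lines = []
    for position, action in enumerate(playbook["actions"], start=1):
        params = {key: value for key, value in action.items() if key != "step"}
        lines.append(f"{position}. {action['step']} {params}")
    return lines


def run(playbook, handlers, dry_run=False):
    """Run each action through its handler, or only return the plan."""
    if dry_run:
        return plan(playbook)
    for action in playbook["actions"]:
        handlers[action["step"]](action)
    return plan(playbook)


# Stand-in handlers that just record which steps were actually executed.
executed = []
handlers = {
    "scale_consumers": lambda action: executed.append("scale_consumers"),
    "alert": lambda action: executed.append("alert"),
}

preview = run(PLAYBOOK, handlers, dry_run=True)
for line in preview:
    print(line)
```

Because the dry-run path never touches the handlers, it is cheap to run on every commit, which is what makes it a good fit for CI.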

Rollback and safety

Playbooks include a rollback block that executes if any step fails. OwlMQ takes a snapshot of queue state before execution and can restore it automatically. This makes playbooks safe to run aggressively: if something goes wrong, the system restores itself.
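
The snapshot-and-restore pattern described above can be sketched in Python as follows. This is a simplified model, not OwlMQ's implementation: the queue state is a plain dict, run_with_rollback is a hypothetical helper, and the order shown (restore the snapshot first, then run the rollback steps) is an assumption.

```python
import copy


def run_with_rollback(state, steps, rollback_steps=()):
    """Apply steps to state; on any failure restore the snapshot.

    Returns True if every step succeeded, False if a rollback happened.
    """
    snapshot = copy.deepcopy(state)  # captured before any step runs
    try:
        for step in steps:
            step(state)
    except Exception:
        # Put the state back exactly as it was before execution.
        state.clear()
        state.update(snapshot)
        # Assumption: user-defined rollback steps run after the restore.
        for step in rollback_steps:
            step(state)
        return False
    return True


def scale(state):
    """Stand-in for a scale_consumers step."""
    state["consumers"] *= 3


def broken_alert(state):
    """Stand-in for a step that fails partway through a playbook."""
    raise RuntimeError("alert channel unreachable")


queue_state = {"consumers": 4}
succeeded = run_with_rollback(queue_state, [scale, broken_alert])
```

Here the scaling step runs, the second step fails, and the state ends up back at four consumers, as if the playbook had never started.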

Playbooks are available on Pro and Enterprise plans. We're rolling them out in beta now. If you want early access, reach out on Discord.