“The shift was quiet. They'd been using PagerDuty for weeks, mostly out of obligation. Then one feature clicked into place, and suddenly the friction of noisy, low-signal pages that train them to under-respond felt absurd. They couldn't go back.”
When payment processing latency is above threshold, I want to be paged only for things that require human intervention right now, so I can diagnose and resolve incidents fast enough to minimize user impact.
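The user story hinges on separating transient blips from sustained breaches: a spike that recovers on its own is noise, a breach that persists is worth waking someone for. A minimal sketch of that distinction, assuming hypothetical `(timestamp, latency_ms)` samples; the function name and thresholds are illustrative, not PagerDuty's or Datadog's API:

```python
from typing import List, Tuple


def should_page(
    samples: List[Tuple[float, float]],  # (unix_timestamp, latency_ms), time-ordered
    threshold_ms: float,
    sustain_seconds: float,
) -> bool:
    """Page only if latency stays above threshold for a sustained window.

    Any dip back below the threshold resets the window, so a single
    self-recovering spike never fires a page.
    """
    breach_start = None
    for ts, latency in samples:
        if latency > threshold_ms:
            if breach_start is None:
                breach_start = ts  # breach window opens
            if ts - breach_start >= sustain_seconds:
                return True
        else:
            breach_start = None  # recovered; reset the window
    return False
```

With a 60-second sustain requirement, a 30-second spike stays silent while a two-minute breach pages; the sustain window is the knob this persona tunes to keep pages actionable.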
A software engineer or site reliability engineer who is on a rotating on-call schedule and whose relationship with PagerDuty is defined by the moments it wakes them up. They've been paged at 3am. They've resolved incidents from their phone in bed. They've also been paged for something that wasn't an incident — a flaky alert, a threshold set too low, a monitoring rule that was never updated after the system changed. Every false positive erodes their trust in the alert and their willingness to respond with full urgency next time. They manage this tension carefully.
To make PagerDuty the system of record for actionable paging: being paged only for things that require human intervention right now. Not aspirationally, but operationally. The kind of intention that shows up as a daily habit, not a quarterly goal.
The tangible result: pages arrive only for things that require human intervention right now, without manual triage, and without the anxiety of noisy, low-signal alerts that train them to under-respond. PagerDuty has earned a place in the daily workflow rather than being tolerated in it.
It's 2:47am. PagerDuty fires. Payment processing latency is above threshold. They're awake, phone in hand. They open the incident. Linked to a Datadog alert. They open Datadog. The latency spike started 12 minutes ago and is ongoing. They check the deployment log — a deploy happened 40 minutes ago. They roll back. Latency normalizes in 3 minutes. Total time: 19 minutes. They write the incident summary, flag the deploy for post-mortem, and go back to sleep. This is the best version of this scenario. They know this.
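The 2:47am diagnosis (spike started 12 minutes ago, deploy 40 minutes ago, roll back) is a correlation the responder performs in their head. A hedged sketch of that check, with hypothetical timestamps; a real setup would pull deploy times from a deployment log or CI API rather than a list:

```python
from typing import List, Optional


def likely_culprit_deploy(
    spike_start: float,            # unix timestamp the latency spike began
    deploy_times: List[float],     # unix timestamps of recent deploys
    window_seconds: float = 3600,  # only consider deploys within the last hour
) -> Optional[float]:
    """Return the most recent deploy that preceded the spike within the window.

    Checking the deployment log first is the cheapest hypothesis: in the
    scenario above, a deploy 28 minutes before the spike is the thing to
    roll back before digging deeper into dashboards.
    """
    candidates = [
        t for t in deploy_times
        if t <= spike_start and spike_start - t <= window_seconds
    ]
    return max(candidates) if candidates else None
```

If no deploy falls inside the window the function returns `None`, and the responder moves on to the next hypothesis instead of rolling back blindly.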
Is on an on-call rotation that cycles every 1–2 weeks. Has PagerDuty mobile app with escalating alert tones. Has been on-call for 1–5 years. Manages their own alert rules — or inherits ones they didn't write. Reviews alert noise monthly — or plans to. Has written at least one runbook. Knows which runbooks are out of date. Has escalated an incident to a senior engineer at least twice. Has been that senior engineer at least once. Has strong opinions about alert thresholds that they will share at any retrospective.
They've stopped comparing alternatives. PagerDuty is open before their first meeting. Actionable-only paging runs on a cadence they didn't have to enforce. The strongest signal: they've started onboarding teammates into their setup unprompted.
It's not one thing; it's the accumulation. Noisy, low-signal pages they've reported, worked around, and accepted. Then a competitor demo shows the same workflow without the friction, and the sunk-cost argument collapses. Their worldview (every page is a hypothesis: "this is real, and you need to act now") makes them unwilling to compromise once a better option is visible.
Pairs with `sentry-primary-user` for the error-detection-to-incident-response chain. Contrast with `datadog-primary-user` for the monitoring-as-prevention vs. incident-response-when-it-fails distinction. Use with `gitlab-primary-user` for DevOps teams where the deployment pipeline is the most common incident source.