Missed runs
Detect when scheduled processes fail to report completion.
Monitor Type
Ensure scheduled jobs and background workers report in on time.
Detect when scheduled processes fail to report completion.
Alert when jobs finish outside expected timing windows.
Track consistency of recurring tasks over time.
Validate hourly and daily jobs complete as expected.
Catch stalled ETL and asynchronous processing workflows quickly.
Confirm background sync processes continue sending heartbeats.
Start with services and workflows that create direct customer or revenue impact.
Use warning and critical layers so on-call responders get signal without alert fatigue.
Simulate failures and verify acknowledgement, assignment, and recovery behavior end-to-end.
| Signal | Recommended baseline | Escalate when |
|---|---|---|
| Expected heartbeat interval | Within schedule window | Missed 1 full interval |
| Completion delay | < 2 minutes | > 5 minutes over baseline |
| Consecutive misses | 0 | >= 2 consecutive misses |
Use it for jobs that should report completion on a predictable schedule.
Start with warning at one missed run and critical at two consecutive misses.
Yes. They catch hidden background failures where user-facing endpoints still appear healthy.
Pair this monitor to increase coverage and improve incident triage confidence.
Use together to reduce blind spots and catch degradation before customer impact.
Notify the right responders instantly across channels your team already uses.
Deliver rapid alerts with fallback channels for critical incidents.
Route monitor events directly into team collaboration channels.
Trigger downstream workflows in PagerDuty, Opsgenie, and internal tools.
Run checks from multiple regions to isolate local routing issues from global outages.
Pause checks during planned maintenance to keep alert noise low and signal clear.
Keep stakeholders informed when incidents remain open for longer durations.
Coordinate internal and customer updates with status page friendly incident workflows.
"We moved from delayed outage discovery to immediate, actionable alerts with clear ownership."
Create your monitor, define escalation policy, and start getting reliable signal in minutes.