Monitoring Accuracy & Detection Tracking
Learn how StatusPage.me measures detection speed, confirms outages across multiple locations, and surfaces accuracy metrics so you can trust incident timelines.
Key Concepts
- Detection tracking: Records `detected_at`, `confirmed_at`, and `detection_delay_ms` for each incident
- Confirmation burst: When the primary region detects a failure, a burst check is immediately sent to all secondary regions, which have a 20-second window to respond
- Quorum-based confirmation: A majority of monitoring locations must agree before an incident is declared — prevents single-location blips from triggering false alerts
- Adaptive post-incident monitoring: Optional 15-second checks for 5 minutes after recovery (paid plans)
How Detection Works: Step by Step
- The primary monitoring region runs a scheduled check and detects a failure
- `detected_at` is recorded immediately
- A confirmation burst is sent to all secondary regions with a 20-second deadline
- Secondary regions run their checks and report back
- When the number of failing regions reaches quorum, `confirmed_at` is set and the incident is created
- `detection_delay_ms` = `confirmed_at` − `detected_at`
If secondary regions report the service as healthy within the 20-second window, the primary failure is marked as a false positive and no incident is created.
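The steps above can be sketched in a few lines of Python. This is an illustrative model only, not StatusPage.me's implementation: the names `quorum_size`, `RegionReport`, and `confirm` are hypothetical, the majority rule is inferred from the plan table (3 of 4, 4 of 6, 5 of 9), and counting the primary region's failure toward quorum is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

CONFIRMATION_WINDOW = timedelta(seconds=20)  # 20-second window from the article


def quorum_size(locations: int) -> int:
    """Smallest majority of the locations (assumed rule, e.g. 3 of 4, 5 of 9)."""
    return locations // 2 + 1


@dataclass
class RegionReport:
    region: str
    failed: bool
    reported_at: datetime


def confirm(detected_at: datetime, reports: list[RegionReport], locations: int) -> Optional[dict]:
    """Return incident fields once quorum is reached, else None (false positive)."""
    needed = quorum_size(locations)
    failing = 1  # assumption: the primary region's failed check counts toward quorum
    if failing >= needed:
        # Single-location case: the first failed check creates the incident.
        return {"detected_at": detected_at, "confirmed_at": detected_at, "detection_delay_ms": 0}

    deadline = detected_at + CONFIRMATION_WINDOW
    for report in sorted(reports, key=lambda r: r.reported_at):
        if report.reported_at > deadline:
            break  # reports after the deadline are ignored
        if report.failed:
            failing += 1
        if failing >= needed:
            delay = report.reported_at - detected_at
            return {
                "detected_at": detected_at,
                "confirmed_at": report.reported_at,
                "detection_delay_ms": delay // timedelta(milliseconds=1),
            }
    return None
```

In this model, a four-location monitor whose primary fails at `detected_at` is confirmed as soon as the second failing secondary reports in, and `detection_delay_ms` is the gap between those two timestamps.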
Why Not Instant Notifications?
Many uptime monitoring vendors advertise “instant alerts.” In most cases, instant means they notify on the first failed check from a single location — before any confirmation step.
That approach has a cost: CDN hiccups, ISP blips, and transient network failures in one region all trigger pages. Engineers wake up, join a call, and start a war room, only to find that the service has already recovered. This is the false-positive problem.
StatusPage.me takes a different approach. When the primary monitoring region detects a failure, it immediately fires a confirmation burst to all secondary regions, which have a 20-second window to execute their own checks and report back. An alert only fires when a quorum of locations agrees the service is down.
The result: when you receive a StatusPage.me incident alert, the outage has been independently confirmed from multiple geographic locations. The tradeoff is added confirmation time on multi-location plans, up to roughly 23 seconds after the first failed check. This is a deliberate choice to make every alert actionable.
If you need warnings before a monitor goes down entirely, see Down Prediction below.
Worst-Case Detection Time by Plan
Worst-case detection = one full check interval (to catch the next scheduled check) + up to ~23 seconds for confirmation burst and processing.
| Plan | Check interval | Locations | Quorum required | Worst-case detection |
|---|---|---|---|---|
| Free | 3 min (180s) | 1 | 1 of 1 — no secondary confirmation | ~3 minutes |
| Starter | 1 min (60s) | 4 | 3 of 4 | ~83 seconds |
| Team | 30 sec | 6 | 4 of 6 | ~53 seconds |
| Business | 30 sec | 9 | 5 of 9 | ~53 seconds |
Free plan note: With a single monitoring location, there are no secondary regions to confirm against. The first failed check from the primary location immediately creates an incident. This means faster detection but no false-positive filtering from additional regions.
Multi-location plans: No single region can trigger an incident on its own. The primary failure starts the 20-second confirmation window; secondaries must confirm within that window before an outage is declared.
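The worst-case figures in the table can be reproduced with simple arithmetic. The sketch below assumes the formula stated above (one check interval plus up to about 23 seconds of confirmation and processing) and that single-location monitors skip the confirmation step; the function name and `PLANS` mapping are illustrative, not part of any StatusPage.me API.

```python
CONFIRMATION_OVERHEAD_S = 23  # "up to ~23 seconds" for burst and processing


def worst_case_detection_s(check_interval_s: int, locations: int) -> int:
    """Worst-case seconds from failure to incident, per the article's formula."""
    if locations == 1:
        # No secondary regions: the first failed check creates the incident.
        return check_interval_s
    return check_interval_s + CONFIRMATION_OVERHEAD_S


# (check interval in seconds, monitoring locations) taken from the plan table
PLANS = {"Free": (180, 1), "Starter": (60, 4), "Team": (30, 6), "Business": (30, 9)}

for name, (interval_s, locations) in PLANS.items():
    print(f"{name}: ~{worst_case_detection_s(interval_s, locations)} s")
```

For the Starter plan this gives 60 + 23 = 83 seconds, and for Team and Business 30 + 23 = 53 seconds.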
Metrics & Dashboard
- Metrics endpoint: `/metrics/monitoring` (IP-restricted)
- Key signals:
  - `adaptive_mode_count`: monitors currently in the adaptive window
  - `avg_detection_delay_ms`: average detection delay over 24h
  - `detection_delay_percentiles`: p50 / p90 / p99 delays
  - `monitors_by_interval`: breakdown of intervals in use
Use these to spot slow detections and tune intervals or locations.
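To make the average and percentile signals concrete, here is a minimal sketch that computes them from a list of per-incident `detection_delay_ms` values. The nearest-rank percentile method and the function names are assumptions for illustration; this article does not specify how StatusPage.me calculates its percentiles or what the endpoint's response looks like.

```python
from statistics import mean


def percentile(values: list[int], p: int) -> int:
    """Nearest-rank percentile of `values` for an integer p in 1..100."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) using integers
    return ordered[max(rank, 1) - 1]


def summarize(delays_ms: list[int]) -> dict:
    """Average plus p50 / p90 / p99 of the given detection delays."""
    return {
        "avg_detection_delay_ms": mean(delays_ms),
        "p50": percentile(delays_ms, 50),
        "p90": percentile(delays_ms, 90),
        "p99": percentile(delays_ms, 99),
    }
```

A p99 far above the p50 suggests that a few incidents took much longer to confirm than the typical one, which is a cue to review the locations involved.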
Improving Accuracy
| Tuning | Why it helps |
|---|---|
| Add monitoring regions | Enables quorum confirmation; reduces false positives |
| Shorten base interval (plan limits apply) | Reduces time to first detection |
| Enable adaptive mode (paid) | Faster re-detection immediately after recovery |
| Use a dedicated primary region | Anchors the confirmation burst to a reliable location |
Related settings live in Monitors → Edit.
Plan Notes
- Adaptive monitoring requires Starter, Team, Business, or Enterprise
- Detection tracking is available on all plans
- Shorter base intervals (30 seconds) require Team, Business, or Enterprise
- Multi-region quorum confirmation requires at least 2 monitoring locations (Starter and above)
Troubleshooting
- High detection_delay_ms: Consider shorter intervals or more locations
- No detection data: Incidents created before this feature was introduced will not have tracking fields
- Adaptive not available: Check your plan or see the adaptive monitoring article below
Early Warnings Before Downtime: Down Prediction
Quorum-confirmed incident alerts fire after a failure is established. Down Prediction fires before the monitor goes down.
Down Prediction watches for two patterns after every check:
- Gradual degradation — a consistent upward slope in response times (linear regression over recent checks), where the current response time is already ≥ 1.5× the 30-day baseline
- Sudden spikes — a single-check jump to ≥ 3× the baseline, persisted for at least 30 seconds to filter transient blips
When either pattern is detected, a “Possible downtime approaching” warning fires through all configured notification channels — Slack, Discord, email, webhooks, and others. This is a predictive warning, not a confirmed incident.
Down Prediction is the right answer if you want to get ahead of degradation rather than react to confirmed outages. It is available on Team, Business, and Enterprise plans.
→ Down Prediction Alerts documentation
See Also
- Setting Your Primary Monitoring Region — control which region anchors your quorum
- Adaptive Post-Incident Monitoring
- Monitoring Locations
- Monitoring Types
- Setting Up Monitor Alerts
- Uptime & SLA Reports