StatusPage.me Help Center

StatusPage.me Dec 18, 2025 Monitoring

Monitoring Accuracy & Detection Tracking

Learn how StatusPage.me measures detection speed, confirms outages across multiple locations, and surfaces accuracy metrics so you can trust incident timelines.


Key Concepts

  • Detection tracking: Records detected_at, confirmed_at, and detection_delay_ms for each incident
  • Confirmation burst: When the primary region detects a failure, a burst check is immediately sent to all secondary regions, which have a 20-second window to respond
  • Quorum-based confirmation: A majority of monitoring locations must agree before an incident is declared — prevents single-location blips from triggering false alerts
  • Adaptive post-incident monitoring: Optional 15-second checks for 5 minutes after recovery (paid plans)
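The adaptive post-incident window in the last bullet can be sketched as a small interval-selection function. This is only an illustration under stated assumptions: the function name, its parameters, and the use of Unix-second timestamps are hypothetical and do not describe StatusPage.me internals.

```python
from typing import Optional

ADAPTIVE_INTERVAL_S = 15      # from the article: 15-second checks
ADAPTIVE_WINDOW_S = 5 * 60    # from the article: for 5 minutes after recovery


def effective_interval_s(
    base_interval_s: int,
    now_s: int,
    recovered_at_s: Optional[int] = None,
    adaptive_enabled: bool = True,
) -> int:
    """Return the check interval to use right now (hypothetical helper).

    Inside the adaptive window the monitor is checked every 15 seconds;
    otherwise the plan's base interval applies.
    """
    in_window = (
        adaptive_enabled
        and recovered_at_s is not None
        and 0 <= now_s - recovered_at_s < ADAPTIVE_WINDOW_S
    )
    if in_window:
        # Never slow down a monitor whose base interval is already shorter.
        return min(base_interval_s, ADAPTIVE_INTERVAL_S)
    return base_interval_s
```

For a monitor on a 60-second base interval, a check scheduled 100 seconds after recovery would use the 15-second interval, and one scheduled 300 seconds after recovery would fall back to 60 seconds.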

How Detection Works: Step by Step

  1. The primary monitoring region runs a scheduled check and detects a failure
  2. detected_at is recorded immediately
  3. A confirmation burst is sent to all secondary regions with a 20-second deadline
  4. Secondary regions run their checks and report back
  5. When the number of failing regions reaches quorum, confirmed_at is set and the incident is created
  6. detection_delay_ms = confirmed_at - detected_at

If secondary regions report the service as healthy within the 20-second window, the primary failure is marked as a false positive and no incident is created.
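The steps above can be sketched as a small function. This is a minimal illustration, not the actual implementation: the names Confirmation and confirm_outage, the millisecond timestamps, and the assumption that the primary region counts toward quorum are all hypothetical. Only the 20-second window and the three tracking fields come from the article.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Confirmation:
    confirmed: bool
    detected_at_ms: int
    confirmed_at_ms: Optional[int] = None
    detection_delay_ms: Optional[int] = None


def confirm_outage(
    detected_at_ms: int,
    secondary_reports: List[Tuple[int, bool]],
    quorum: int,
    window_ms: int = 20_000,
) -> Confirmation:
    """Decide whether a primary failure becomes an incident.

    secondary_reports holds (reported_at_ms, is_failing) pairs.
    Assumption: the primary's own failure counts as one failing region.
    """
    failing = 1  # the primary region has already failed
    if failing >= quorum:
        # Single-location case: nothing to wait for.
        return Confirmation(True, detected_at_ms, detected_at_ms, 0)

    for reported_at_ms, is_failing in sorted(secondary_reports):
        if reported_at_ms - detected_at_ms > window_ms:
            break  # report arrived after the confirmation window closed
        if is_failing:
            failing += 1
            if failing >= quorum:
                delay = reported_at_ms - detected_at_ms
                return Confirmation(True, detected_at_ms, reported_at_ms, delay)

    # Quorum not reached in time: treat the primary failure as a false positive.
    return Confirmation(False, detected_at_ms)
```

With a quorum of 3 and a primary failure at t = 1000 ms, two failing secondary reports at 3000 ms and 4500 ms confirm the incident at 4500 ms, giving a detection_delay_ms of 3500.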


Why Not Instant Notifications?

Many uptime monitoring vendors advertise “instant alerts.” In most cases, instant means they notify on the first failed check from a single location — before any confirmation step.

That approach has a cost: CDN hiccups, ISP blips, and transient network failures in one region all trigger pages. Engineers wake up, join a call, and start a war room, only to find that the service has already recovered. This is the false-positive problem.

StatusPage.me takes a different approach. When the primary monitoring region detects a failure, it immediately fires a confirmation burst to all secondary regions, which have a 20-second window to execute their own checks and report back. An alert only fires when a quorum of locations agrees the service is down.

The result: when you receive a StatusPage.me incident alert, the outage has been independently confirmed from multiple geographic locations. The tradeoff is a confirmation step that adds up to roughly 23 seconds after the first failed check. This is a deliberate choice to make every alert actionable.

If you need warnings before a monitor goes down entirely, see Down Prediction below.


Worst-Case Detection Time by Plan

Worst-case detection = one full check interval (to catch the next scheduled check) + up to ~23 seconds for confirmation burst and processing.

Plan     | Check interval | Locations | Quorum required                    | Worst-case detection
Free     | 3 min (180s)   | 1         | 1 of 1 (no secondary confirmation) | ~3 minutes
Starter  | 1 min (60s)    | 4         | 3 of 4                             | ~83 seconds
Team     | 30 sec         | 6         | 4 of 6                             | ~53 seconds
Business | 30 sec         | 9         | 5 of 9                             | ~53 seconds

Free plan note: With a single monitoring location, there are no secondary regions to confirm against. The first failed check from the primary location immediately creates an incident. This avoids the confirmation delay but provides no false-positive filtering from additional regions.

Multi-location plans: No single region can trigger an incident on its own. The primary failure starts the 20-second confirmation window; secondaries must confirm within that window before an outage is declared.
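The worst-case figures follow directly from the formula at the top of this section. The sketch below reproduces them, assuming quorum is a simple majority (floor(n / 2) + 1). That assumption matches every row of the plan table but is not stated as the exact rule. The function names are hypothetical.

```python
CONFIRMATION_OVERHEAD_S = 23  # "up to ~23 seconds" for burst and processing


def majority_quorum(locations: int) -> int:
    """Smallest count that is a strict majority (assumed quorum rule)."""
    return locations // 2 + 1


def worst_case_detection_s(interval_s: int, locations: int) -> int:
    """One full check interval, plus confirmation time if there are secondaries."""
    if locations < 2:
        return interval_s  # single location: no confirmation burst
    return interval_s + CONFIRMATION_OVERHEAD_S


# (check interval in seconds, monitoring locations) per plan, from the table
plans = {"Free": (180, 1), "Starter": (60, 4), "Team": (30, 6), "Business": (30, 9)}

for name, (interval_s, locations) in plans.items():
    quorum = majority_quorum(locations)
    worst = worst_case_detection_s(interval_s, locations)
    print(f"{name}: quorum {quorum} of {locations}, worst case ~{worst}s")
```

For Starter this gives 60 + 23 = 83 seconds, and for Team and Business 30 + 23 = 53 seconds.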


Metrics & Dashboard

  • Metrics endpoint: /metrics/monitoring (IP-restricted)
  • Key signals:
    • adaptive_mode_count — monitors currently in the adaptive window
    • avg_detection_delay_ms — average detection delay over 24h
    • detection_delay_percentiles — p50 / p90 / p99 delays
    • monitors_by_interval — breakdown of intervals in use

Use these to spot slow detections and tune intervals or locations.
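To make the delay signals concrete, here is a rough sketch of how an average and p50 / p90 / p99 values could be derived from a list of detection_delay_ms samples. The response format of /metrics/monitoring is not documented here, so this does not parse the endpoint. The helper names and the nearest-rank percentile method are assumptions chosen for illustration.

```python
from typing import Dict, List


def percentile(sorted_values: List[int], p: int) -> int:
    """Nearest-rank percentile of an ascending list, for integer p in 1..100."""
    if not sorted_values:
        raise ValueError("no samples")
    n = len(sorted_values)
    rank = (p * n + 99) // 100  # integer ceil(p * n / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize_delays(delays_ms: List[int]) -> Dict[str, object]:
    """Summarize detection delays using the signal names from the list above."""
    ordered = sorted(delays_ms)
    return {
        "avg_detection_delay_ms": sum(ordered) / len(ordered),
        "detection_delay_percentiles": {
            "p50": percentile(ordered, 50),
            "p90": percentile(ordered, 90),
            "p99": percentile(ordered, 99),
        },
    }
```

A p99 that sits far above p50 usually points to a few slow confirmations rather than a uniformly slow monitor, which is a hint to look at individual regions before shortening intervals.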


Improving Accuracy

Tuning                                    | Why it helps
Add monitoring regions                    | Enables quorum confirmation; reduces false positives
Shorten base interval (plan limits apply) | Reduces time to first detection
Enable adaptive mode (paid)               | Faster re-detection immediately after recovery
Use a dedicated primary region            | Anchors the confirmation burst to a reliable location

Related settings live in Monitors → Edit.


Plan Notes

  • Adaptive monitoring requires Starter, Team, Business, or Enterprise
  • Detection tracking is available on all plans
  • Shorter base intervals (30 seconds) require Team, Business, or Enterprise
  • Multi-region quorum confirmation requires at least 2 monitoring locations (Starter and above)

Troubleshooting

  • High detection_delay_ms: Consider shorter intervals or more locations
  • No detection data: Incidents created before this feature was introduced will not have tracking fields
  • Adaptive not available: Check your plan or see the adaptive monitoring article below

Early Warnings Before Downtime: Down Prediction

Quorum-confirmed incident alerts fire after a failure is established. Down Prediction fires before the monitor goes down.

Down Prediction watches for two patterns after every check:

  • Gradual degradation — a consistent upward slope in response times (linear regression over recent checks), where the current response time is already ≥ 1.5× the 30-day baseline
  • Sudden spikes — a single-check jump to ≥ 3× the baseline, persisted for at least 30 seconds to filter transient blips

When either pattern is detected, a “Possible downtime approaching” warning fires through all configured notification channels — Slack, Discord, email, webhooks, and others. This is a predictive warning, not a confirmed incident.
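The two patterns can be illustrated with a short sketch. The thresholds (1.5×, 3×, 30 seconds) come from the description above. Everything else is an assumption: the function names, the input shapes, the least-squares slope over sample index, and treating any positive slope as "upward". The production logic may differ.

```python
from typing import List, Tuple


def slope(values: List[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def gradual_degradation(recent_ms: List[float], baseline_ms: float) -> bool:
    """Upward trend, and the latest response time is already >= 1.5x baseline."""
    return (
        bool(recent_ms)
        and slope(recent_ms) > 0
        and recent_ms[-1] >= 1.5 * baseline_ms
    )


def sudden_spike(
    samples: List[Tuple[int, float]], baseline_ms: float, persist_s: int = 30
) -> bool:
    """samples: (timestamp_s, response_ms), oldest first.

    True when the trailing run of samples at >= 3x baseline spans persist_s.
    """
    spike_start = None
    for ts, ms in samples:
        if ms >= 3 * baseline_ms:
            if spike_start is None:
                spike_start = ts
        else:
            spike_start = None  # run broken; a later spike starts over
    return spike_start is not None and samples[-1][0] - spike_start >= persist_s
```

With a 200 ms baseline, response times of 200, 220, 250, 280, 320 ms would count as gradual degradation, since the trend is upward and 320 ms exceeds 1.5 × 200 ms. A single 700 ms reading would not count as a sudden spike until it had persisted for 30 seconds.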

Down Prediction is the right answer if you want to get ahead of degradation rather than react to confirmed outages. Available on Team, Business, and Enterprise plans.

Down Prediction Alerts documentation


See Also
