Uptime Monitoring 101: What to Track and How

Uptime monitoring catches a failure mode error tracking tools miss entirely: your service not responding at all. Here's what to actually monitor and how often.

Error monitoring tools like Sentry catch exceptions thrown inside a running application. They don't catch a service that isn't responding at all — a crashed server, an expired SSL certificate, a DNS misconfiguration, or a network partition. That distinct failure mode is what uptime monitoring exists to catch, usually before your users tell you about it.

What to actually monitor

Core user-facing endpoints — Your homepage, login flow, and any revenue-critical page: the things that, if down, directly cost you money or trust.
API health endpoints — A dedicated `/health` or `/status` route that checks real dependencies (database connection, critical third-party APIs), not just "the server process is running."
SSL certificate expiry — A surprisingly common, entirely preventable cause of outages. Most monitoring tools can alert weeks before expiry rather than after the certificate has already lapsed.
DNS resolution — Catches misconfigurations introduced by a DNS provider change or an expired domain, which HTTP-only uptime checks can miss when the checker caches DNS results.
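
As a rough illustration of two of the checks above, the Python sketch below aggregates dependency probes into a single health response and computes how many days remain before a certificate expires. The function names, the probe names in the usage comment, and the policy of returning 503 on any failed probe are assumptions made for illustration, not something a particular monitoring tool prescribes. The only library call relied on is the standard library's `ssl.cert_time_to_seconds`.

```python
import ssl
import time
from typing import Callable, Dict, Optional, Tuple


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> Tuple[int, dict]:
    """Run each dependency probe and return (HTTP status, JSON-able body).

    A probe signals failure by raising. Any failure makes the whole
    endpoint report 503, so an external monitor sees the service as down.
    """
    results = {}
    for name, probe in checks.items():
        try:
            probe()
            results[name] = "ok"
        except Exception as exc:  # one broken probe must not crash the route
            results[name] = f"fail: {exc}"
    healthy = all(value == "ok" for value in results.values())
    body = {"status": "ok" if healthy else "degraded", "checks": results}
    return (200 if healthy else 503), body


def days_until_cert_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time.

    `not_after` uses the format found in certificate dicts,
    for example 'Jun  1 12:00:00 2030 GMT'.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


# Hypothetical usage inside a /health route handler:
#   status, body = run_health_checks({"database": ping_db, "payments_api": ping_payments})
```

An external monitor would poll the route that returns this status, and a separate scheduled job could alert once `days_until_cert_expiry` drops below a chosen threshold, such as 21 days.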

How often should checks run?

Revenue-critical endpoints — Every 30 seconds to 1 minute, frequent enough that an outage is caught within minutes, not the better part of an hour.
Standard production services — Every 1–5 minutes is the typical default across most tools and is sufficient for most teams.
Low-stakes internal tools — Every 10–15 minutes is usually fine. More frequent checks here mostly generate noise without meaningful benefit.
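
The trade-off behind these intervals is simple arithmetic, sketched below in Python. The tier names and the specific values are hypothetical picks from within the ranges above, and the calculation ignores request timeouts and any confirmation steps a tool adds before alerting.

```python
# Hypothetical tiers; each value is one choice from the ranges given above.
CHECK_INTERVAL_SECONDS = {
    "revenue_critical": 30,
    "standard_production": 60,
    "internal_tool": 600,
}


def worst_case_detection_seconds(interval_seconds: int) -> int:
    """An outage that starts just after a passing check is first seen one full interval later."""
    return interval_seconds


def checks_per_day(interval_seconds: int) -> int:
    """How many checks a single endpoint generates per day at this interval."""
    return 86_400 // interval_seconds


for tier, interval in CHECK_INTERVAL_SECONDS.items():
    print(tier, worst_case_detection_seconds(interval), checks_per_day(interval))
```

Shorter intervals shrink the detection gap but multiply the number of checks, which is why the tightest interval is worth reserving for the endpoints where minutes matter.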

Avoid alert fatigue

A single transient blip shouldn't page anyone at 3 a.m. Most tools let you require 2–3 consecutive failed checks before alerting. Use that, or you'll start ignoring real alerts along with the noise.
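
For readers building their own checker, the consecutive-failure rule can be sketched as a small piece of state in Python. The class name `FailureGate` and its fields are hypothetical, and real tools may implement confirmation differently (for example by re-checking from a second location).

```python
from dataclasses import dataclass


@dataclass
class FailureGate:
    """Fire an alert only after `threshold` failed checks in a row."""

    threshold: int = 3
    consecutive_failures: int = 0
    alerting: bool = False

    def record(self, check_passed: bool) -> bool:
        """Record one check result; return True only when an alert should fire now."""
        if check_passed:
            # Any success resets the streak and clears the active alert.
            self.consecutive_failures = 0
            self.alerting = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerting:
            self.alerting = True  # alert once per outage, not on every failed check
            return True
        return False


gate = FailureGate(threshold=3)
history = [True, False, True, False, False, False, False, True]
fired = [gate.record(result) for result in history]
```

In this run the isolated failure is swallowed and the alert fires once, on the third failure in a row. The cost is detection time: with a 1-minute interval and a threshold of 3, the page arrives roughly three minutes after the outage begins in the worst case.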

Do you need a public status page?

A status page shows your service's current and historical uptime publicly, often with incident updates during outages. It reduces support load during incidents (customers check the page instead of all emailing support at once) and is increasingly expected by B2B customers as a baseline trust signal. Better Uptime and Pingdom both offer this built in.
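
The historical uptime figure on such a page is typically a percentage derived from check results. The Python sketch below shows one simple way to compute it, assuming evenly spaced checks recorded as pass/fail booleans. The function name is hypothetical, and hosted status pages may weight or window the data differently.

```python
from typing import Sequence


def uptime_percent(check_results: Sequence[bool]) -> float:
    """Share of passing checks, as a percentage, over the recorded window."""
    if not check_results:
        raise ValueError("no checks recorded")
    return 100.0 * sum(check_results) / len(check_results)


# 1,000 checks with a single failure
window = [True] * 999 + [False]
print(uptime_percent(window))
```

Because each check stands in for a slice of time, the check interval sets the resolution of this number: an outage shorter than the interval can go unrecorded entirely.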

Next step

Use the RadarTrek Uptime Monitoring screener to compare alerting depth, check coverage, and price across tools before picking one.

