A smoke detector doesn't wait for you to smell fire — it sniffs the air constantly and screams the moment something's wrong, while you can still do something about it. Software needs the same early warning.
Health Endpoint Monitoring gives every service a little built-in detector: a special URL that, when pinged, answers one question honestly — am I healthy enough to be handling requests right now? Something on the outside checks it regularly and reacts the instant the answer turns to no.
The problem
Services fail in quiet, sneaky ways. The process is still running, so the operating system thinks all is well — but it's lost its database connection, its disk is full, or a downstream dependency is timing out. From the outside it looks alive while it's actually serving errors.
Without an honest signal of fitness, your traffic router keeps cheerfully sending real users to a sick instance, and you find out it's broken from angry customers rather than from your tools. You need a way to ask each instance, directly and often, whether it should still be in the line of duty — and to ask it the way an outside user would, not from inside the box where everything looks fine.
- Blind balancer: Routes purely on whether the process is reachable. With no honest fitness signal, it keeps a broken instance in rotation.
- Zombie instance: The process is running so the OS reports it 'up', but it's lost its database, so it answers every request with errors.
- Real users: A share of users get steered straight to the dead process and hit errors. You learn it's broken from complaints, not your tools.
How it works
Each service publishes a dedicated endpoint — say /health — whose only job is to assess and report fitness. A naïve version just returns 200 OK to prove the process responds. A good one does a quick check of the things the service depends on: can it reach the database, is the cache responding, is there free disk, are critical credentials still valid? It rolls that up into a clear healthy/unhealthy verdict.
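The roll-up described above can be sketched as follows. This is a minimal illustration using only Python's standard library, not a definitive implementation: the names `evaluate`, `make_handler`, and `serve`, the JSON body shape, and the choice of 503 for the unhealthy verdict are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

Check = Callable[[], bool]  # hypothetical shape: a check returns True when its dependency is fine


def evaluate(checks: Dict[str, Check]) -> Tuple[int, dict]:
    """Run every dependency check and roll the results up into one verdict."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:  # a check that crashes counts as a failure
            results[name] = f"error: {exc}"
    healthy = all(value == "ok" for value in results.values())
    # Assumption: 200 means fit to serve, 503 means take me out of rotation.
    status = 200 if healthy else 503
    body = {"status": "healthy" if healthy else "unhealthy", "checks": results}
    return status, body


def make_handler(checks: Dict[str, Check]):
    """Build a request handler that serves the verdict at /health."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            status, body = evaluate(checks)
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return HealthHandler


def serve(checks: Dict[str, Check], port: int = 8080) -> None:
    """Blocking helper: expose /health on localhost (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), make_handler(checks)).serve_forever()
```

A service would pass in its own checks, for example `serve({"database": can_reach_db, "disk": has_free_disk})`, where both functions are placeholders for whatever the service really depends on. Returning the per-check results alongside the verdict is a design choice that makes a failing probe easier to diagnose.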
A monitoring agent then probes these endpoints on a schedule, ideally from outside the deployment so the check travels the same network path real requests do. When an instance reports unhealthy, the system reacts: a load balancer drops it from rotation, an alert fires, or orchestration restarts it. The diagram below shows a monitor polling several instances and steering traffic away from the one that's gone red.
- Monitor: Pings every instance's health endpoint on a schedule, from outside the deployment, the way a real user would.
- Load balancer: Uses health results to keep sending traffic to healthy instances and drops failing ones from rotation.
- Instance: Exposes a /health endpoint that checks its real dependencies and reports an honest fit-to-serve verdict.
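The monitor and load balancer roles can be sketched together in a few lines. This is a rough illustration under stated assumptions: `probe` and `Rotation` are hypothetical names, the two-second timeout is arbitrary, and treating anything other than a prompt 200 response as unhealthy is one reasonable policy rather than the only one.

```python
import urllib.request
from typing import Callable, Dict, Iterable, List


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True only if the health endpoint answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        # Non-2xx responses, refused connections, timeouts, and malformed URLs
        # all count as "unhealthy" from the monitor's point of view.
        return False


class Rotation:
    """Tracks which instances should currently receive traffic."""

    def __init__(self, instances: Iterable[str],
                 probe_fn: Callable[[str], bool] = probe):
        # Assumption: an instance stays out of rotation until its first passing probe.
        self._healthy: Dict[str, bool] = {url: False for url in instances}
        self._probe_fn = probe_fn

    def poll_once(self) -> None:
        """Probe every instance and record the latest verdict."""
        for url in self._healthy:
            self._healthy[url] = self._probe_fn(url)

    def in_rotation(self) -> List[str]:
        """Instances that passed their most recent probe, in registration order."""
        return [url for url, ok in self._healthy.items() if ok]
```

A scheduler would call `poll_once()` at a fixed interval and route requests only to the URLs returned by `in_rotation()`. Injecting `probe_fn` keeps the rotation logic testable without a network.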
Make the check meaningful but never expensive. A health endpoint that runs a full query workload or fans out to every dependency on every probe can become a load source of its own — or report sick simply because it timed out under its own weight. Cache dependency checks for a few seconds and put a hard timeout on the whole thing, so it stays a fast, truthful pulse.
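One way to keep a check cheap is to wrap it so the result is cached briefly and the call is abandoned after a deadline. The sketch below assumes a hypothetical `CachedCheck` wrapper, a five-second cache lifetime, and a one-second timeout; those numbers are illustrative, and a real service would tune them.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


class CachedCheck:
    """Wrap a dependency check with a short-lived cache and a hard timeout."""

    def __init__(self, check: Callable[[], bool], ttl: float = 5.0,
                 timeout: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._check = check
        self._ttl = ttl            # seconds a result is reused before re-checking
        self._timeout = timeout    # longest we wait for the underlying check
        self._clock = clock
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._last: bool = False
        self._checked_at: Optional[float] = None

    def __call__(self) -> bool:
        now = self._clock()
        fresh = self._checked_at is not None and now - self._checked_at < self._ttl
        if fresh:
            return self._last  # serve the cached verdict, no dependency call
        future = self._pool.submit(self._check)
        try:
            result = bool(future.result(timeout=self._timeout))
        except Exception:
            # A timeout or a crash in the check is reported as unhealthy.
            # Note: Python cannot kill the worker thread, so a hung check keeps
            # running in the background; we only stop waiting for it.
            result = False
        self._last, self._checked_at = result, now
        return result
```

Because the wrapper is itself a callable returning a bool, it can stand in wherever the plain check was used, for example `CachedCheck(can_reach_db)` with `can_reach_db` being a placeholder for the service's own check. Failures are cached too, which keeps a struggling dependency from being hammered by every probe.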
When to use it
Health endpoint monitoring is nearly always worth it for anything running in production behind a load balancer or orchestrator — it's the signal those systems rely on to keep traffic flowing to healthy instances. It pairs naturally with a circuit breaker, which stops calling a dependency the health checks have flagged, and with retry logic that backs off until health is restored.
The main pitfalls are checks that lie — too shallow and they miss real failures, too deep and they trigger false alarms or add load. Tune what "healthy" means to match what the service genuinely needs to do its job, and you get an early-warning system that quietly keeps bad instances away from your users.