A user clicks "Export my report" and then stares at a spinner for forty seconds while the server crunches numbers and renders a PDF. Their request thread is held hostage the whole time, and if a hundred people click at once, the web server runs out of threads and everyone starts seeing timeouts — even people who just wanted to load the home page.
The Web-Queue-Worker style fixes this by getting slow work off the request path entirely. The web tier hands the heavy job to a queue and answers the user immediately; a separate worker does the actual grinding in the background.
The problem
When a web front end does slow work inline — sending a confirmation email, resizing an uploaded image, generating a report — it ties up the request thread for the entire duration of that work. The user waits, the connection stays open, and a thread that could be serving other requests is stuck doing one slow thing.
Under load this falls apart quickly. A burst of expensive requests exhausts the thread pool, latency climbs for everyone, and the whole site can grind to a halt because heavy background work and fast page loads are competing for the same resources.
How it works
The style splits the application into three cooperating pieces. The web front end handles incoming HTTP requests and is built to respond fast — when it hits something slow, it doesn't do the work itself, it drops a message describing the job onto a queue and returns right away. The queue buffers those background jobs, and a separate worker process pulls them off and does the heavy lifting asynchronously.
The web tier and the worker typically share data stores, such as a database or blob storage, so the worker can read the inputs and write back results that the front end can later show the user. The flow runs like this: a request arrives at the web tier, the slow job is parked on the queue, and the worker picks it up and processes it on its own time.
- Web front end: Handles HTTP fast; offloads slow work to the queue.
- Worker: Processes queued jobs in the background, off the request path.
The queue is the seam that makes everything else possible. It lets the web tier say "I've accepted this work" without saying "I've finished it," which is precisely what frees the request thread to move on to the next user.
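The difference between "accepted" and "finished" can be sketched in a few lines. This is a minimal in-process illustration, not a production design: Python's standard `queue.Queue` stands in for a managed message queue, a plain dict stands in for the shared data store, and the names `handle_export_request`, `render_report`, and `worker_loop` are hypothetical. Returning HTTP status 202 (Accepted) is one conventional way to signal that the work has been taken on but not completed.

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # assumption: stands in for a managed message queue
results = {}           # assumption: stands in for the shared data store


def handle_export_request(user_id):
    """Web tier: describe the job, enqueue it, and answer immediately."""
    job_id = str(uuid.uuid4())
    jobs.put({"job_id": job_id, "user_id": user_id})
    # 202 Accepted: the work has been accepted, not finished.
    return 202, {"job_id": job_id}


def render_report(user_id):
    """Hypothetical placeholder for the slow work (crunching, PDF rendering)."""
    return f"report for {user_id}"


def worker_loop():
    """Worker: pull jobs off the queue and do the slow part in the background."""
    while True:
        job = jobs.get()
        if job is None:            # sentinel used only to stop this sketch
            jobs.task_done()
            break
        results[job["job_id"]] = render_report(job["user_id"])
        jobs.task_done()


worker = threading.Thread(target=worker_loop, daemon=True)
worker.start()

status, body = handle_export_request("alice")   # returns without rendering anything

jobs.join()        # demo only: wait until the worker has drained the queue
jobs.put(None)     # ask the worker to stop
worker.join()
```

The handler never calls `render_report` itself; the only thing it shares with the worker is the message it put on the queue and the store the worker writes to.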
Why the queue matters
The queue isn't just a hand-off; it's a shock absorber. This is queue-based load leveling in action: when traffic spikes, jobs pile up in the queue instead of overwhelming the worker, and the worker drains them at a steady, sustainable rate. The web tier never has to slow down just because the worker is busy.
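A small deterministic simulation makes the shock-absorber behavior concrete. It is only a sketch under simplifying assumptions: time advances in discrete ticks, the worker completes a fixed number of jobs per tick, and the function name `simulate` and the numbers used are illustrative rather than taken from any real system.

```python
from collections import deque


def simulate(arrivals_per_tick, worker_rate):
    """Return the queue depth at the end of each tick.

    arrivals_per_tick: number of jobs the web tier enqueues in each tick
    worker_rate: number of jobs the worker completes per tick (steady)
    """
    backlog = deque()
    depths = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # The web tier enqueues every arriving job immediately.
        for _ in range(arrivals):
            backlog.append(next_id)
            next_id += 1
        # The worker drains at its own steady rate, regardless of the burst.
        for _ in range(min(worker_rate, len(backlog))):
            backlog.popleft()
        depths.append(len(backlog))
    return depths


# A burst of 10 jobs in the first tick, then quiet; the worker does 3 per tick.
depths = simulate([10, 0, 0, 0, 0], worker_rate=3)
print(depths)  # [7, 4, 1, 0, 0]
```

The burst is absorbed as backlog and worked off over the following ticks; at no point does the worker process more than its steady three jobs per tick.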
It also lets the two halves scale independently. If the backlog grows, you add more workers reading from the same queue — that's competing consumers — so processing throughput becomes a dial you turn separately from your web capacity.
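Competing consumers can be illustrated with several threads draining one queue. As before, this is a hedged in-memory sketch: `queue.Queue` stands in for a real message queue, `run_workers` is a hypothetical helper, and the delivery guarantees, retries, and failure handling that a managed queue would add are not modeled here.

```python
import queue
import threading


def run_workers(job_ids, worker_count):
    """Drain one shared queue with several workers; return (worker, job) pairs."""
    jobs = queue.Queue()
    for job_id in job_ids:
        jobs.put(job_id)

    processed = []
    lock = threading.Lock()   # protects the shared 'processed' list

    def consume(name):
        while True:
            try:
                # Each job is handed to exactly one of the competing workers.
                job_id = jobs.get_nowait()
            except queue.Empty:
                return        # all jobs were queued up front, so empty means done
            with lock:
                processed.append((name, job_id))
            jobs.task_done()

    threads = [
        threading.Thread(target=consume, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


processed = run_workers(range(20), worker_count=4)
```

Changing `worker_count` changes how the jobs are divided up, but not the producer side: the code that enqueues jobs does not know or care how many consumers exist.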
Living with asynchronous results
There's a catch you have to design for: because the web tier responds before the job is done, the result is asynchronous. The user gets an immediate "we're working on it," not the finished output, so you need a way to deliver the result later.
Common approaches are to have the worker write a status record the front end can poll, to email or notify the user when the job completes, or to push an update over a websocket. None of this is hard, but it is extra plumbing that an inline, synchronous design wouldn't need.
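The polling approach might look roughly like the following. Everything here is an assumption made for illustration: the dict `status_store` stands in for a table in the shared database, the state names (`queued`, `running`, `done`) and the `result_url` field are invented, and `poll_until_done` plays the part of a client repeatedly calling a status endpoint.

```python
import time

status_store = {}   # assumption: stands in for a table in the shared database


def create_job(job_id):
    """Web tier: record the job as queued when it is accepted."""
    status_store[job_id] = {"state": "queued", "result_url": None}


def mark_running(job_id):
    """Worker: note that processing has started."""
    status_store[job_id]["state"] = "running"


def mark_done(job_id, result_url):
    """Worker: record where the finished output can be found."""
    status_store[job_id] = {"state": "done", "result_url": result_url}


def get_status(job_id):
    """Web tier: what a status endpoint would return; None if the job is unknown."""
    record = status_store.get(job_id)
    return dict(record) if record is not None else None


def poll_until_done(job_id, interval=0.01, timeout=1.0):
    """Client side: poll the status record until the job finishes or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status is not None and status["state"] == "done":
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

A typical sequence is `create_job` in the request handler, `mark_running` and `mark_done` in the worker, and `get_status` behind whatever the front end polls.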
Because the web tier and the worker share the same data stores and often the same codebase, it's easy for them to drift into a single tangled monolith: business logic smeared across both, deployed together, impossible to change in isolation. Keep the boundary deliberate: the contract between them should be the queue message, not a shared pile of internal functions.
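One way to keep the queue message as the contract is to give it an explicit, versioned shape that both sides serialize and parse. The sketch below is only one possible form: the class `ExportReportJob`, its fields, and the `schema_version` convention are hypothetical, and JSON is assumed as the wire format.

```python
import json
from dataclasses import asdict, dataclass

SUPPORTED_VERSION = 1


@dataclass(frozen=True)
class ExportReportJob:
    """Hypothetical message shape shared by the web tier and the worker."""
    job_id: str
    user_id: str
    report_type: str
    schema_version: int = SUPPORTED_VERSION


def encode(job):
    """Web tier: serialize the job before putting it on the queue."""
    return json.dumps(asdict(job))


def decode(raw):
    """Worker: parse a message and reject versions it does not understand."""
    data = json.loads(raw)
    if data.get("schema_version") != SUPPORTED_VERSION:
        raise ValueError(f"unsupported schema_version: {data.get('schema_version')}")
    return ExportReportJob(**data)


message = encode(ExportReportJob(job_id="j1", user_id="alice", report_type="pdf"))
job = decode(message)
```

With a shape like this, the worker depends on the fields of the message rather than on functions inside the web tier, which makes it easier to change or deploy either side without touching the other.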
When to use it
Web-Queue-Worker is a natural starting style for simple cloud applications: a web app with some background processing, no microservices sprawl, just two clearly separated halves connected by a queue. It maps cleanly onto managed cloud services and is one of the most common serverless shapes — an HTTP function for the web tier, a queue, and a queue-triggered function for the worker.
Reach for it whenever you have user-facing requests that shouldn't wait on slow work. Outgrow it when the worker starts doing many unrelated kinds of jobs, or when independent teams need to own and deploy pieces separately — at that point you're looking at splitting into proper services rather than one web-plus-worker pair.