Walk into a busy hospital emergency room and you'll notice it isn't first-come-first-served. Someone arriving with chest pains is seen before someone who's been waiting an hour with a sprained ankle. A nurse triages everyone and the most urgent cases move to the front, because order of arrival matters less than how critical the case is.
A priority queue brings that triage logic to your system. Instead of processing requests in the strict order they arrived, it lets more important work be handled first.
The problem
A plain queue is fair in the simplest way: first in, first out. Everything waits its turn. That works until some work matters more than the rest. A flood of low-stakes background jobs, such as batch report generation, can pile up in the queue, and a time-sensitive request, like a paying customer's checkout confirmation, ends up stuck behind thousands of unrelated items.
Strict FIFO has no notion that some messages deserve to be handled sooner. During a busy spell, the requests that matter most are exactly the ones most likely to be buried, and your most valuable users feel the slowest service.
- FIFO queue: Processes strictly in arrival order. It has no notion that some messages deserve to be handled sooner than others.
- Worker: Pulls the oldest item next, so an urgent request must wait behind every routine job that arrived before it.
How it works
You attach a priority to each message and let consumers honor it. There are two common ways to build this. One is a single queue that natively understands priority, dequeuing the highest-priority message available rather than the oldest. The other — often simpler and more portable — is to use separate queues per priority level: a high-priority queue and a low-priority one, with consumers always draining the high queue first and only reaching for the low queue when the high one is empty.
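As a rough illustration of the first approach, here is a minimal in-memory sketch of a single queue that understands priority, built on Python's standard `heapq` module. The class name `PriorityQueue`, the message strings, and the convention that a lower number means more urgent are all assumptions made for this example; a real message broker would expose its own API for this.

```python
import heapq
import itertools


class PriorityQueue:
    """Single queue that dequeues the most urgent message, not the oldest."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that messages with the
        # same priority still come out in arrival order.
        self._counter = itertools.count()

    def put(self, message, priority):
        # Assumption of this sketch: lower number = more urgent.
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def get(self):
        _priority, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


queue = PriorityQueue()
queue.put("batch report 1", priority=2)
queue.put("batch report 2", priority=2)
queue.put("checkout confirmation", priority=0)  # arrives last, most urgent

order = [queue.get() for _ in range(len(queue))]
print(order)
# ['checkout confirmation', 'batch report 1', 'batch report 2']
```

The checkout confirmation arrives last but is dequeued first, while the two reports keep their relative arrival order because of the tie-breaking counter.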
You can also tune capacity per level: put more competing consumers on the urgent queue so it clears fast, and fewer on the routine one. Either way, urgent work no longer waits behind the routine backlog. The diagram below shows a high-priority path being served ahead of a low-priority one.
- High-priority queue: Urgent or premium work lands here. Consumers always drain it first, and you can put more workers on it.
- Low-priority queue: Routine work waits here. Consumers reach for it only when the high queue is empty, but it must still drain eventually.
- Workers: Honor priority: serve high-priority messages ahead of low, so important work isn't stuck behind the backlog.
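The separate-queues approach can be sketched in a few lines. This is a simplified, single-process illustration that uses `collections.deque` as a stand-in for two real queues; the names `high_queue`, `low_queue`, `next_message`, and the message strings are hypothetical and chosen only for the example.

```python
from collections import deque

# Stand-ins for two real queues, one per priority level.
high_queue = deque()
low_queue = deque()


def next_message():
    """Return the next message to process, or None if both queues are empty."""
    if high_queue:
        # Always drain the high-priority queue first.
        return high_queue.popleft()
    if low_queue:
        # Reach for routine work only when nothing urgent is waiting.
        return low_queue.popleft()
    return None


# Routine jobs arrive first, urgent ones arrive in between and after.
low_queue.extend(["report-1", "report-2"])
high_queue.append("checkout-42")
low_queue.append("report-3")
high_queue.append("alert-7")

processed = []
while (msg := next_message()) is not None:
    processed.append(msg)
print(processed)
# ['checkout-42', 'alert-7', 'report-1', 'report-2', 'report-3']
```

Each queue stays FIFO internally, so the design needs nothing special from the underlying queue technology. The priority logic lives entirely in the order in which the consumer checks the queues.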
Beware starvation. If high-priority work keeps arriving, a naive 'always serve high first' rule means low-priority messages may never run. Reserve some consumer capacity for the low queue, or age messages up in priority the longer they wait, so routine work still drains instead of rotting at the bottom forever.
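One way to reserve capacity for the low queue is to give it a guaranteed slot after a run of high-priority messages. The sketch below assumes a hypothetical `FairPicker` class and a `low_every` parameter, both invented for illustration. With `low_every=4`, at least one pick in every four goes to the low queue whenever it has work waiting. Aging messages up in priority is an alternative that is not shown here.

```python
from collections import deque


class FairPicker:
    """Serve high priority first, but guarantee the low queue a share of picks."""

    def __init__(self, low_every=4):
        self.high = deque()
        self.low = deque()
        self.low_every = low_every  # hypothetical tuning knob for this sketch
        self._high_streak = 0       # consecutive high-priority picks so far

    def next_message(self):
        # The reserved slot opens after (low_every - 1) high picks in a row.
        reserved_slot = self._high_streak >= self.low_every - 1
        if self.low and (reserved_slot or not self.high):
            self._high_streak = 0
            return self.low.popleft()
        if self.high:
            self._high_streak += 1
            return self.high.popleft()
        return None


picker = FairPicker(low_every=4)
picker.high.extend(["h1", "h2", "h3", "h4", "h5", "h6"])
picker.low.extend(["l1", "l2"])

served = []
while (msg := picker.next_message()) is not None:
    served.append(msg)
print(served)
# ['h1', 'h2', 'h3', 'l1', 'h4', 'h5', 'h6', 'l2']
```

Even with a steady backlog of high-priority work, the low-priority messages are served at a bounded interval instead of waiting indefinitely.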
When to use it
A priority queue makes sense whenever your workload has a genuine mix of urgencies: premium versus free tiers, interactive requests versus background batch jobs, or alerts that must be acted on immediately alongside routine processing. It pairs naturally with queue load leveling: load leveling smooths bursts by buffering work, and priority decides which buffered work to tackle first.
Don't bother if all your messages are equally important — a plain FIFO queue is simpler and has no starvation risk to manage. And keep the number of priority levels small; two or three tiers capture most of the value, while a dozen finely graded levels just add complexity without making the system meaningfully smarter about what to do next.