Classify Async Work Before Refactoring
Before adding parallelism, classify the work into one of five buckets. The bucket determines which Tokio primitive is correct. Skipping the classification step is how you end up calling spawn_blocking on a forever loop or holding the async scheduler hostage with sync HTTP.
The five buckets
1. Async I/O
Examples: socket reads/writes, IPC over tokio::net, reqwest HTTP calls, async IMAP/SMTP/Gmail API requests, async SQLite through sqlx.
Tool: ordinary async tasks. Do not offload. The whole point of async I/O is that it yields cheaply; offloading it adds overhead with no benefit.
2. Short, cheap CPU work
Examples: lightweight parsing, small routing decisions, tiny data transforms.
Tool: keep inline in async code. The overhead of moving the work elsewhere dwarfs the work itself.
3. Bounded blocking or CPU-heavy work
Examples: image decode, PDF/text extraction through sync libraries, HTML cleanup with heavy CPU cost, large MIME parsing through sync-only APIs, compression.
Tool: tokio::task::spawn_blocking. This is its intended use: bounded sync work that will finish.
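let decoded = tokio::task::spawn_blocking(move || decode_image(path)).await??;
The double ? is deliberate. Assuming decode_image returns a Result, the first ? propagates the JoinError from the blocking task (for example, if the closure panicked) and the second propagates the decode error itself.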
4. Long-lived blocking loops
Examples: persistent worker thread, long-running sync event loop, background process that mostly blocks outside Tokio.
Tool: std::thread::spawn, not spawn_blocking. The blocking pool has a finite size and is shared across all blocking work. A loop that never finishes will hold a slot in that pool forever, starving other blocking work.
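A minimal sketch of the dedicated-thread approach, using only the standard library. The Job type, the start_worker function, and the file names are hypothetical, invented for illustration. In a real Tokio program the channel between async code and the worker would likely be different, and that choice is left open here.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical job type, for illustration only.
enum Job {
    Index(String),
    Shutdown,
}

/// Spawns a dedicated OS thread that owns a long-lived blocking loop.
/// Returns a sender for submitting jobs and a handle that yields the
/// number of jobs processed once the loop exits.
fn start_worker() -> (mpsc::Sender<Job>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel::<Job>();
    let handle = thread::spawn(move || {
        let mut processed = 0usize;
        // recv() blocks this thread. That is fine here: the thread is not a
        // Tokio worker and does not occupy a slot in the blocking pool.
        while let Ok(job) = rx.recv() {
            match job {
                Job::Index(_doc) => processed += 1,
                Job::Shutdown => break,
            }
        }
        processed
    });
    (tx, handle)
}

fn main() {
    let (tx, handle) = start_worker();
    tx.send(Job::Index("a.eml".to_string())).unwrap();
    tx.send(Job::Index("b.eml".to_string())).unwrap();
    tx.send(Job::Shutdown).unwrap();
    let processed = handle.join().unwrap();
    println!("processed {processed} jobs");
}
```

The loop also ends if every sender is dropped, because recv() then returns an error, so the thread does not outlive its producers.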
5. Heavy pure CPU parallelism
Examples: big ranking jobs, large batch transforms, expensive indexing passes, embedding prep.
Tool: a dedicated CPU executor like rayon, or bounded concurrency with a semaphore. Tokio's blocking pool is sized for blocking I/O, not for CPU saturation: its default cap of 512 threads is far above any realistic core count. Spraying spawn_blocking across heavy CPU work oversubscribes the cores and crowds out other blocking work, degrading everything.
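To make "bounded CPU parallelism" concrete, here is a sketch that caps the work at one thread per available core using std::thread::scope. It is a standard-library stand-in for what a crate like rayon does with more sophistication, not a replacement for it. The score and score_all functions are hypothetical placeholders for a real ranking or indexing pass.

```rust
use std::thread;

/// Hypothetical CPU-heavy function standing in for a ranking pass.
fn score(x: u64) -> u64 {
    (0..1_000u64).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i))
}

/// Scores every item using at most one thread per available core.
/// Output order matches input order.
fn score_all(items: &[u64]) -> Vec<u64> {
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Ceiling division, clamped to 1 because chunks(0) would panic.
    let chunk_len = ((items.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // One scoped thread per chunk, so never more threads than `workers`.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|&x| score(x)).collect::<Vec<u64>>()))
            .collect();
        // Join in spawn order to preserve the input ordering.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect::<Vec<u64>>()
    })
}

fn main() {
    let items: Vec<u64> = (0..10_000).collect();
    let scores = score_all(&items);
    assert_eq!(scores.len(), items.len());
    println!("scored {} items", scores.len());
}
```

The point of the design is the cap: the thread count is tied to the core count, not to the number of items, so a large batch cannot oversubscribe the machine.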
Why classification is the first step
Without it, the common bugs are:
- Sync HTTP / std::thread::sleep / heavy CPU inside async tasks → the async scheduler stalls because the worker thread can't yield
- spawn_blocking for a forever loop → blocking pool slot held forever; other blocking work backs up
- Spawn-storm without bounded concurrency → throttling, queue blow-up, lock contention, worse latency for the interactive surfaces
- Arc<Mutex<_>> around shared mutable state because "we need to share it" → owner-task would have been simpler and faster
Each of these is a classification error. The work was misclassified, and the wrong primitive followed. The fix isn't a better primitive; it's restating which bucket the work actually belongs in.
The questions that make the classification
Before you reach for any concurrency primitive, answer:
- Is this work I/O-bound, CPU-bound, or both?
- Does it run to completion or run forever?
- Is there an async API available, or only sync?
- Is the work bounded in size, or does it scale with input?
The answers map cleanly to the buckets. I/O-bound, finite, async API → bucket 1. CPU-bound and tiny → bucket 2. CPU-bound or blocking, finite, sync API → bucket 3. Anything blocking and forever-running → bucket 4. Anything CPU-saturating → bucket 5.
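The mapping can be written down as a small decision function. This is only a sketch of one reading of the questions above: the Work struct, the Cost tiers, and the classify function are hypothetical names, and the code treats "forever-running" as "forever-running with no async API", following bucket 4's heading.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Bucket {
    AsyncIo,         // bucket 1: ordinary async task
    InlineCpu,       // bucket 2: keep inline
    BoundedBlocking, // bucket 3: spawn_blocking
    LongLivedLoop,   // bucket 4: std::thread::spawn
    HeavyCpu,        // bucket 5: dedicated CPU executor
}

/// Hypothetical cost tiers answering "is the work bounded in size?".
enum Cost {
    Tiny,
    Bounded,
    Saturating,
}

/// Hypothetical record of the answers to the four questions.
struct Work {
    cpu_bound: bool,
    runs_forever: bool,
    has_async_api: bool,
    cost: Cost,
}

fn classify(w: &Work) -> Bucket {
    // A blocking loop that never finishes must not sit in the blocking pool.
    if w.runs_forever && !w.has_async_api {
        return Bucket::LongLivedLoop;
    }
    // I/O with an async API yields cheaply, so it stays an ordinary task.
    if w.has_async_api && !w.cpu_bound {
        return Bucket::AsyncIo;
    }
    match (w.cpu_bound, &w.cost) {
        (true, Cost::Tiny) => Bucket::InlineCpu,
        (true, Cost::Saturating) => Bucket::HeavyCpu,
        // Finite sync work, CPU-heavy or blocking, goes to the blocking pool.
        _ => Bucket::BoundedBlocking,
    }
}

fn main() {
    let image_decode = Work {
        cpu_bound: true,
        runs_forever: false,
        has_async_api: false,
        cost: Cost::Bounded,
    };
    println!("{:?}", classify(&image_decode));
}
```

Real work rarely fits four booleans this neatly, but writing the answers down this way forces the decision to be made before a primitive is chosen.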
What this rules out
Reflexive answers like "just spawn it" or "wrap it in spawn_blocking." Those are guesses. The bucket determines the primitive. If you don't know the bucket, you don't yet have enough understanding of the work to parallelize it.
See also
- Tokio — the synthesis hub
- Concurrent Is Not Parallel — picking between join! and spawn inside bucket 1
- Bounded Concurrency First — what to add once you spawn anything