How Debuggers Actually Work

#resources #resources/programming #resources/programming/debugging

Synthesis note. Stitches together the layers between "I press F5 in my editor" and "the kernel pauses my process at a chosen line." Each section links out to its own atomic note.

The full layer cake

┌──────────────────────────────────────────────────────┐
│  YOU (or your TUI, IDE, agent)                       │
└────────────────────┬─────────────────────────────────┘
                     │ "set a breakpoint at line 42"
                     ▼
┌──────────────────────────────────────────────────────┐
│  Frontend: lazydap / VS Code / nvim-dap              │
└────────────────────┬─────────────────────────────────┘
                     │ DAP — JSON messages, just communication
                     ▼
┌──────────────────────────────────────────────────────┐
│  Adapter: codelldb / debugpy / dlv-dap / lldb-dap    │
│  ← speaks DAP on one end, native debugger on other   │
└────────────────────┬─────────────────────────────────┘
                     │ Library calls (C++ API, Python API, etc.)
                     ▼
┌──────────────────────────────────────────────────────┐
│  Real debugger: LLDB · GDB · sys.settrace · V8       │
│  ← THIS is what knows how to debug                   │
└────────────────────┬─────────────────────────────────┘
                     │ OS debugging API
                     ▼
┌──────────────────────────────────────────────────────┐
│  ptrace (Linux) · Mach exceptions (macOS) · Win32    │
│  ← syscalls into the kernel                          │
└────────────────────┬─────────────────────────────────┘
                     │ kernel handles process control,
                     │ memory access, signal delivery
                     ▼
┌──────────────────────────────────────────────────────┐
│  Your debuggee process — paused, inspected, resumed  │
└──────────────────────────────────────────────────────┘

Six layers. Each is a wrapper on the next (Plumbing and Porcelain). Each has its own failure mode and its own diagnostic tool when something breaks.

What each layer does, in one sentence

Frontend — translates user intent into DAP messages. Doesn't know what a register is.
DAP — JSON wire format. No code executes here; it's pure communication.
Adapter — bridge process. Speaks DAP on one side, drives a real debugger on the other.
Real debugger (LLDB, GDB, runtime hook) — knows about DWARF, breakpoints, variable scoping. Where debugging actually happens.
ptrace / Mach / Win32 — OS-level syscalls that let one process inspect another's memory and CPU state.
Kernel — does the actual pausing, signal delivery, memory mediation between processes.
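
To make the "pure communication" point concrete, here is a minimal sketch of DAP's framing: a `Content-Length` header, a blank line, then a UTF-8 JSON body. The helper names `encode_dap` and `decode_dap` are made up for this note and are not part of any adapter or of lazydap.

```python
import json

def encode_dap(message):
    """Frame one DAP message: Content-Length header, blank line, JSON body."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def decode_dap(buffer):
    """Return (message, remaining_bytes), or (None, buffer) if incomplete."""
    header_end = buffer.find(b"\r\n\r\n")
    if header_end == -1:
        return None, buffer                    # header not fully received yet
    length = None
    for line in buffer[:header_end].decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    body_start = header_end + 4                # skip the \r\n\r\n separator
    body_end = body_start + length
    if len(buffer) < body_end:
        return None, buffer                    # body not fully received yet
    message = json.loads(buffer[body_start:body_end].decode("utf-8"))
    return message, buffer[body_end:]

# A small request in the DAP shape: seq, type, command.
request = {"seq": 1, "type": "request", "command": "threads"}
framed = encode_dap(request)
```

Nothing here knows about registers or breakpoints; the frontend only needs to get bytes like `framed` to the adapter and parse what comes back.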

Concrete example: setting a breakpoint at line 42

You set a breakpoint at line 42 of main.c. DWARF says line 42 is at address 0x4011a8.

Frontend → DAP setBreakpoints { line: 42 } → adapter.
Adapter → "LLDB, set breakpoint at line 42 of main.c."
LLDB → consult DWARF → "line 42 = 0x4011a8."
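LLDB → ptrace(PTRACE_PEEKDATA, pid, 0x4011a8) → reads the machine word at that address; its low byte is the first instruction byte (say 0x55).
LLDB → ptrace(PTRACE_POKEDATA, pid, 0x4011a8, word) → writes the word back with that low byte replaced by 0xCC (INT3).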
LLDB remembers the original byte. See Software Breakpoints for the full mechanism.
Adapter → DAP "verified: true" → frontend → ● in gutter.
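
The `{ line: 42 }` in the first step is shorthand. A sketch of the fuller request and of reading the adapter's reply, assuming the standard `setBreakpoints` shape (a `source` plus a `breakpoints` array, answered with one entry per breakpoint carrying `verified`). The path `/src/main.c`, the helper names, and the hand-written response are illustrative only.

```python
def set_breakpoints_request(seq, path, lines):
    """Build a DAP setBreakpoints request for one source file."""
    return {
        "seq": seq,
        "type": "request",
        "command": "setBreakpoints",
        "arguments": {
            "source": {"path": path},
            "breakpoints": [{"line": n} for n in lines],
        },
    }

def verified_lines(response):
    """Lines the adapter reports as verified (the ones that get a filled dot)."""
    return [
        bp["line"]
        for bp in response["body"]["breakpoints"]
        if bp.get("verified") and "line" in bp
    ]

request = set_breakpoints_request(3, "/src/main.c", [42])

# Hand-written reply in the shape an adapter would send; not captured output.
response = {
    "seq": 7,
    "type": "response",
    "request_seq": 3,
    "success": True,
    "command": "setBreakpoints",
    "body": {"breakpoints": [{"id": 1, "verified": True, "line": 42}]},
}
```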

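When the program runs, the CPU executes INT3 at 0x4011a8 and raises a breakpoint exception; the kernel turns it into SIGTRAP and stops the process, the debugger learns of the stop through waitpid, and the frontend shows the pause. To continue: restore the original byte, rewind the instruction pointer by one (it now points just past the INT3), single-step, re-insert INT3, resume.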
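
The insert / hit / step-over cycle can be sketched on a toy model. This is not real ptrace: `ToyProcess` and `ToyDebugger` are invented stand-ins, memory is a `bytearray`, and `peek`/`poke` move whole 8-byte words only to mimic how PTRACE_PEEKDATA and PTRACE_POKEDATA behave on x86-64.

```python
WORD = 8       # bytes moved by one peek/poke in this toy model
INT3 = 0xCC    # x86 breakpoint opcode

class ToyProcess:
    """Stand-in for a traced process: flat memory plus a program counter."""
    def __init__(self, base, code):
        self.base = base
        self.mem = bytearray(code)
        self.pc = base

    def peek(self, addr):
        off = addr - self.base
        return int.from_bytes(self.mem[off:off + WORD], "little")

    def poke(self, addr, word):
        off = addr - self.base
        self.mem[off:off + WORD] = word.to_bytes(WORD, "little")

class ToyDebugger:
    def __init__(self, proc):
        self.proc = proc
        self.saved = {}                      # address -> original byte

    def _write_byte(self, addr, byte):
        word = self.proc.peek(addr)          # read the whole word
        self.proc.poke(addr, (word & ~0xFF) | byte)   # patch only the low byte

    def insert(self, addr):
        self.saved[addr] = self.proc.peek(addr) & 0xFF
        self._write_byte(addr, INT3)

    def on_trap(self):
        """After INT3 executed, pc is one past the breakpoint: rewind it."""
        addr = self.proc.pc - 1
        self.proc.pc = addr
        return addr

    def step_over(self, addr, instr_len):
        self._write_byte(addr, self.saved[addr])   # restore the original byte
        self.proc.pc += instr_len                  # stands in for a single-step
        self._write_byte(addr, INT3)               # re-arm the breakpoint

BASE = 0x4011A8
# 0x55 is `push %rbp`; the rest is filler so word-sized accesses stay in range.
proc = ToyProcess(BASE, bytes([0x55, 0x48, 0x89, 0xE5] + [0x90] * 12))
dbg = ToyDebugger(proc)

dbg.insert(BASE)
patched = proc.mem[0]             # low byte is now INT3
proc.pc = BASE + 1                # pretend the CPU just executed the INT3
hit = dbg.on_trap()
dbg.step_over(hit, instr_len=1)   # `push %rbp` is one byte long
```

The word-sized read-modify-write in `_write_byte` is why the neighbouring instruction bytes survive the patch.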

(Detailed walkthrough in ptrace and Software Breakpoints.)

Native vs managed

The above is for native code (C, Rust, Go). Managed runtimes (Python, JS, JVM) skip the ptrace layer entirely — the runtime cooperates with the debugger via built-in hooks (sys.settrace, V8 Inspector Protocol, JDWP). Same DAP on top, completely different machinery underneath.
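
For contrast, a minimal sketch of the managed side using Python's own `sys.settrace` hook. No memory is patched; the interpreter simply calls back on each line. `run_with_breakpoint` is a name invented for this note, and real adapters such as debugpy do considerably more than this.

```python
import sys

def run_with_breakpoint(func, target_line, on_hit):
    """Run func(); call on_hit(frame) just before target_line executes."""
    filename = func.__code__.co_filename

    def tracer(frame, event, arg):
        if frame.f_code.co_filename != filename:
            return None                  # do not trace frames from other files
        if event == "line" and frame.f_lineno == target_line:
            on_hit(frame)
        return tracer                    # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)                 # applies to frames created from now on
    try:
        return func()
    finally:
        sys.settrace(previous)           # put back whatever hook was installed

def demo():
    x = 1
    y = x + 41
    return y

hits = []
target = demo.__code__.co_firstlineno + 2    # the `y = x + 41` line
result = run_with_breakpoint(
    demo, target, lambda frame: hits.append(dict(frame.f_locals))
)
```

The "line" event fires before the line runs, so the snapshot taken at the breakpoint contains `x` but not yet `y`.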

Common failure modes per layer

| Problem | Likely layer | Diagnostic |
| --- | --- | --- |
| "Could not connect to debugger" | Frontend ↔ adapter (DAP transport) | Adapter stderr; check that the adapter spawned and is listening |
| "Breakpoint shows ◯ unverified" | Adapter ↔ debugger (DWARF mismatch) | Was the binary built with `-g`? Is the source path correct? |
| "Variable shows `<optimized out>`" | Debugger ↔ DWARF | Build with `-O0` or `-Og` |
| "Cannot attach: operation not permitted" | OS layer | Yama / capabilities; check `/proc/sys/kernel/yama/ptrace_scope` |
| "Adapter died unexpectedly" | Real debugger crashed | Adapter's logs; usually a debugger bug |
| "Step jumps around" | DWARF + optimisation | Build with `-O0` |

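For the "operation not permitted" row, a small sketch that reads the Yama setting and says what it means. The four values are the documented `ptrace_scope` levels; the function names are invented for this note, and the file is simply absent on systems without Yama.

```python
from pathlib import Path

YAMA_PATH = Path("/proc/sys/kernel/yama/ptrace_scope")

SCOPE_MEANING = {
    0: "classic: may attach to any process running under the same uid",
    1: "restricted: may attach only to descendants (or opted-in processes)",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "disabled: no process may attach",
}

def describe_ptrace_scope(raw):
    """Translate the file's contents (e.g. '1\\n') into a description."""
    value = int(raw.strip())
    return SCOPE_MEANING.get(value, f"unknown value {value}")

def read_ptrace_scope(path=YAMA_PATH):
    """Return a description, or None if the setting cannot be read."""
    try:
        return describe_ptrace_scope(path.read_text())
    except OSError:
        return None    # no Yama, not Linux, or not readable

if __name__ == "__main__":
    print(read_ptrace_scope())
```

On many desktop distributions the value is 1, which is why attaching to an already running process fails while launching the debuggee from the debugger works.
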
Why this matters for Lazydap

Lazydap sits at the top of this stack — frontend layer. It doesn't touch ptrace, DWARF, or LLDB directly. But understanding the layers below makes lazydap's design coherent: the DAP Adapter hides debugger quirks; the lazydap protocol layer hides DAP awkwardness; the CLI hides the protocol. Each layer is a porcelain on plumbing that handles one concern cleanly.

When something breaks during lazydap development, the layer cake tells you where to look first.

Other syntheses (sibling MOCs in the vault)

The other "How X actually works" synthesis notes — each is the entry point to its own cluster:

How Email Actually Works — SMTP / IMAP / MIME / threading / OAuth / internal model
How Daemons Work — daemon lifecycle, PID files, signals, auto-spawning
How Processes Talk to Each Other — IPC, Unix sockets, framing, JSON-RPC
The Elm Architecture (TEA) — Model / Update / View / Cmd / Reducers
Client-Agnostic Cores — headless core + many clients, the architectural shape
