ptrace

#resources #resources/programming #resources/programming/debugging #resources/programming/operating-systems

The Linux syscall that lets one process inspect and control another. Foundation for every native-code debugger on Linux: GDB, LLDB, delve, rr.

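Documented in man 2 ptrace. The call dates back to 1970s Unix but was never standardized by POSIX, so request names and semantics differ between systems. Hard to use directly; almost always wrapped by a debugger library.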

What it can do

Process A (the debugger) calls ptrace on process B (the debuggee) to:

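Attach (PTRACE_ATTACH, or the newer PTRACE_SEIZE): claim B for debugging. Requires permission: normally the same UID as B, or CAP_SYS_PTRACE. A debugger that launches B itself has the child call PTRACE_TRACEME instead.
Pause B at any time, by sending it a signal such as SIGSTOP or, after PTRACE_SEIZE, with PTRACE_INTERRUPT.
Read CPU registers (PTRACE_GETREGS): on x86-64, RIP (instruction pointer), RBP (base pointer), and the rest of the general-purpose set.
Read memory (PTRACE_PEEKDATA): one word at a time. Modern alternative: process_vm_readv for bulk reads.
Write memory (PTRACE_POKEDATA): also one word at a time. This is how breakpoints get installed. See Software Breakpoints.
Receive signal notifications: when B receives a signal, such as the SIGTRAP raised by an INT3 the debugger inserted, the kernel stops B and A learns of the stop through waitpid.
Resume (PTRACE_CONT) or single-step (PTRACE_SINGLESTEP).
Detach (PTRACE_DETACH) when done; B continues running without a tracer.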
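
The lifecycle in this list can be sketched in C. This is a minimal illustration, not how any particular debugger is written: the helper names trace_child_once and kill_and_reap are hypothetical, the exit code 42 is arbitrary, and the sketch uses PTRACE_TRACEME on a forked child rather than PTRACE_ATTACH so that it does not depend on permission to attach to an unrelated process. The register read is guarded because PTRACE_GETREGS and the RIP/RBP names are x86-64 specific.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: kill and reap a tracee that is still alive. */
static int kill_and_reap(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}

/* Hypothetical helper: trace a short-lived child through one stop,
 * one single-step, and a resume. Returns the child's exit code,
 * or -1 if any step fails. */
int trace_child_once(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Debuggee: ask to be traced by the parent, then stop so the
         * parent can inspect us before we go any further. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(42);
    }

    /* Debugger: every stop of the tracee is reported through waitpid. */
    int status = 0;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1; /* child already exited, e.g. PTRACE_TRACEME was denied */

#if defined(__x86_64__)
    /* Read the register file while the child is stopped. */
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0)
        printf("stopped at rip=0x%llx rbp=0x%llx\n",
               (unsigned long long)regs.rip,
               (unsigned long long)regs.rbp);
#endif

    /* Execute exactly one instruction; the kernel stops the child again. */
    if (ptrace(PTRACE_SINGLESTEP, child, NULL, NULL) == -1)
        return kill_and_reap(child);
    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;

    /* Resume without injecting a signal and let the child run to exit.
     * PTRACE_DETACH here would also resume it, minus the tracing. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return kill_and_reap(child);
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Called from a main function on a system that permits tracing, trace_child_once() is expected to return the child's exit code. A return of -1 usually means ptrace was refused by the environment (for example a restrictive sandbox).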

Why it's privileged

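ptrace is dangerous: it gives arbitrary read/write access to another process's memory. Linux restricts it through capabilities and the Yama security module (/proc/sys/kernel/yama/ptrace_scope). With ptrace_scope set to 0, you can attach to any of your own processes (same UID). With 1, the default on distributions such as Ubuntu, you can attach only to your descendants or to processes that opted in via prctl(PR_SET_PTRACER). Higher values restrict attaching further. Tracing another user's processes requires CAP_SYS_PTRACE, which root normally has.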
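
As a small sketch of how a tool might check this setting before trying to attach: the function names read_ptrace_scope and describe_ptrace_scope are hypothetical, and the descriptions paraphrase the kernel's Yama documentation rather than quote it.

```c
#include <stdio.h>

/* Hypothetical helper: return the Yama ptrace_scope value (0-3), or -1
 * if the file cannot be read (for example, a kernel without Yama). */
int read_ptrace_scope(void)
{
    FILE *f = fopen("/proc/sys/kernel/yama/ptrace_scope", "r");
    if (f == NULL)
        return -1;

    int scope = -1;
    if (fscanf(f, "%d", &scope) != 1)
        scope = -1;
    fclose(f);
    return scope;
}

/* Hypothetical helper: one-line summary of what each value permits. */
const char *describe_ptrace_scope(int scope)
{
    switch (scope) {
    case 0:
        return "classic: attach to any process with the same UID";
    case 1:
        return "restricted: attach only to descendants or opted-in processes";
    case 2:
        return "admin-only: attaching requires CAP_SYS_PTRACE";
    case 3:
        return "no attach: PTRACE_ATTACH is disabled entirely";
    default:
        return "unknown: Yama not present or value unreadable";
    }
}
```

A debugger front end could print describe_ptrace_scope(read_ptrace_scope()) when an attach fails with EPERM, which tends to be more helpful than the bare error.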

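Containers add further layers: Docker does not grant CAP_SYS_PTRACE by default, and its seccomp profile can filter the ptrace syscall, depending on the Docker and kernel versions. This is why some Docker setups need --cap-add=SYS_PTRACE or --security-opt seccomp=unconfined to debug inside containers.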

Equivalents on other OSes

macOS: Mach exception ports. Different API, same capabilities. LLDB on macOS uses these instead of ptrace (macOS has a vestigial ptrace but it's limited).
Windows: Windows Debug API (DebugActiveProcess, WaitForDebugEvent, ReadProcessMemory, WriteProcessMemory).
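BSDs: ptrace as well. Same model as Linux, but request names (PT_ATTACH, PT_CONTINUE, ...) and argument conventions differ, so code is not source-compatible.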

What ptrace doesn't do

It doesn't know about source code. ptrace operates on raw memory and CPU state. Mapping "line 42 of main.c" to a memory address is the job of DWARF Debug Symbols, parsed by the debugger library.
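It doesn't help much with managed runtimes. ptrace can still attach to the interpreter or VM process, but it sees the runtime's own native frames and memory, not Python, JS, or Java frames and variables. Debuggers for these languages use runtime-internal hooks instead. See Native vs Managed Debugging.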
It doesn't time-travel. Tools like rr add deterministic record/replay on top of ptrace.

Key insight

Most of "what a debugger does" is built from these three primitives:

Read memory at address X
Write memory at address X
Pause / resume the process

Everything else (breakpoints, watchpoints, stepping, variable inspection) is the debugger library composing these primitives intelligently using DWARF as a map.
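
The three primitives can be seen working together in a short C sketch. It is an illustration under stated assumptions, not a debugger: patch_child_memory, discard_child, and demo_value are hypothetical names, and the tracee is a forked child so that the variable's address is the same in both processes. The parent pauses the child, reads one word, overwrites it, and resumes; a software breakpoint follows the same read/patch/resume pattern, applied to an instruction byte instead of a data word.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo variable. volatile forces the child to reload it
 * from memory after the parent has patched it. */
static volatile long demo_value = 1;

/* Hypothetical helper: kill and reap a tracee that is still alive. */
static int discard_child(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}

/* Hypothetical helper: overwrite demo_value inside a traced child.
 * Stores the value previously in the child's memory in *old_value and
 * returns the child's exit code (the patched value), or -1 on failure. */
int patch_child_memory(long new_value, long *old_value)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(255);
        raise(SIGSTOP);                    /* primitive: pause */
        _exit((int)(demo_value & 0x7f));   /* report what we now see */
    }

    int status = 0;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1; /* child already exited */

    /* Primitive: read memory at address X. PEEKDATA returns the word
     * itself, so errno is the only way to tell -1 from an error. */
    errno = 0;
    long old = ptrace(PTRACE_PEEKDATA, child, (void *)&demo_value, NULL);
    if (old == -1 && errno != 0)
        return discard_child(child);
    *old_value = old;

    /* Primitive: write memory at address X. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&demo_value,
               (void *)new_value) == -1)
        return discard_child(child);

    /* Primitive: resume, then wait for the child to exit. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return discard_child(child);
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent's own copy of demo_value is untouched; only the child's address space changes. What a real debugger adds is knowing which address to patch, which is where DWARF comes in.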
