DWARF Debug Symbols
DWARF is the format that compiled binaries (ELF, Mach-O) use to embed the information a debugger needs to map between source code and machine code.
Without DWARF, a debugger sees only raw assembly and has no idea which CPU instruction corresponds to which line of source, or what name a particular stack-allocated variable has, or even what type a value is.
What DWARF stores
- Source line tables: the instruction at address 0x4011a8 corresponds to main.c line 42.
- Function address ranges: function main occupies addresses 0x4011a0–0x401234.
- Variable locations: local variable x is on the stack at offset -0x4 from the base pointer.
- Type information: x is a 4-byte signed integer; tokens is an array of struct Token; struct Token has fields kind, lexeme, pos at offsets 0, 8, 24.
- Inlined function info: this line of code came from that inlined call.
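As an illustration of the "variable locations" entry, here is a minimal Python sketch of how a location such as "offset -0x4 from the base pointer" can be encoded. It assumes an x86-64 target and the simplest case, a single DW_OP_breg operation followed by a signed LEB128 offset. Real compilers often describe locals relative to a frame base instead, and the helper names decode_sleb128 and describe_location are made up for this note.

```python
# Sketch only: decodes a one-operation DWARF location expression.
DW_OP_BREG0 = 0x70  # DW_OP_breg0..DW_OP_breg31 occupy opcodes 0x70..0x8f
X86_64_DWARF_REGS = {6: "rbp", 7: "rsp"}  # small subset of the x86-64 numbering


def decode_sleb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a signed LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:          # high bit clear: this was the last byte
            if byte & 0x40:          # sign bit of the last byte set: negative
                result -= 1 << shift
            return result, pos


def describe_location(expr: bytes) -> str:
    """Render a single DW_OP_bregN expression as 'register+offset'."""
    op = expr[0]
    if not DW_OP_BREG0 <= op <= DW_OP_BREG0 + 31:
        raise ValueError("only DW_OP_bregN is handled in this sketch")
    regnum = op - DW_OP_BREG0
    offset, _ = decode_sleb128(expr, 1)
    reg = X86_64_DWARF_REGS.get(regnum, f"r{regnum}")
    return f"{reg}{offset:+#x}"


# 0x76 is DW_OP_breg6 (rbp on x86-64); 0x7c is -4 in signed LEB128.
print(describe_location(b"\x76\x7c"))  # prints rbp-0x4
```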
How it gets there
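The compiler emits DWARF when invoked with -g (gcc, clang, rustc). The flag means "embed debug info." Without -g, the binary carries no DWARF, and a debugger can show only raw addresses plus whatever function names survive in the ordinary symbol table (none, if the binary has also been stripped).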
Optimisation levels affect the quality of the debug info:
- -O0 (no optimisation): DWARF is most accurate. Variables live where you expect; lines map cleanly.
- -O2 and above: the optimiser may eliminate variables, reorder code, and inline functions. DWARF tries to track this, but values may show as <optimised out> and stepping may jump around.
- -Og: enables only the optimisations that do not interfere with debugging. A compromise.
For lazydap users debugging C: compile with gcc -g -O0 for the cleanest debug experience.
How a debugger uses DWARF
When the user says "set a breakpoint at line 42":
- Debugger reads DWARF's line table: "line 42 → address 0x4011a8."
- Debugger uses ptrace to install a breakpoint at 0x4011a8.
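The line-table lookup in the first step can be sketched in Python. The table below is a hand-written stand-in for a decoded DWARF line table: only the 0x4011a8 / line 42 row comes from this note, and the other rows and both function names are invented for illustration. Real debuggers also consider statement flags and function prologues when picking an address.

```python
from bisect import bisect_right

# Hypothetical decoded rows: (address, file, line), sorted by address.
LINE_TABLE = [
    (0x4011A0, "main.c", 40),
    (0x4011A4, "main.c", 41),
    (0x4011A8, "main.c", 42),
    (0x4011B6, "main.c", 43),
]


def address_for_line(table, file, line):
    """Lowest address recorded for file:line, or None if there is no row."""
    matches = [addr for addr, f, l in table if f == file and l == line]
    return min(matches) if matches else None


def line_for_address(table, pc):
    """Source position covering pc: the last row whose address is <= pc."""
    addrs = [addr for addr, _, _ in table]
    i = bisect_right(addrs, pc) - 1
    if i < 0:
        return None
    _, f, l = table[i]
    return f, l


# Breakpoint request: line 42 -> address to patch.
print(hex(address_for_line(LINE_TABLE, "main.c", 42)))  # prints 0x4011a8
```

The reverse lookup, line_for_address, is what a debugger would use after a stop to report which source line the program counter is on.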
When the user wants to inspect variable x at the current pause:
- Debugger looks up x in DWARF: "x is at offset -0x4 from RBP."
- Debugger uses ptrace to read RBP from CPU registers.
- Debugger uses ptrace to read 4 bytes at RBP - 0x4.
- Debugger uses DWARF's type info to interpret those 4 bytes as int.
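The last two steps (read 4 bytes, interpret them as int) can be sketched without a live process. The snippet below substitutes a byte string for the memory that ptrace would return. The stack address, the RBP value, and the stored value -7 are all made up, and it assumes a little-endian target such as x86-64.

```python
import struct


def read_local_int(memory: bytes, mem_base: int, rbp: int, offset: int) -> int:
    """Read a 4-byte little-endian signed int at rbp + offset from a snapshot.

    memory   -- bytes captured from the debuggee (stand-in for ptrace reads)
    mem_base -- address of memory[0] in the debuggee's address space
    """
    start = rbp + offset - mem_base
    if start < 0 or start + 4 > len(memory):
        raise ValueError("address outside the captured snapshot")
    return struct.unpack_from("<i", memory, start)[0]


# Hypothetical 16-byte stack snapshot with x = -7 stored at RBP - 0x4.
STACK_BASE = 0x7FFD0000
stack = bytes(4) + struct.pack("<i", -7) + bytes(8)
x = read_local_int(stack, STACK_BASE, rbp=STACK_BASE + 8, offset=-0x4)
print(x)  # prints -7
```

The "<i" format plays the role of DWARF's type info here: the same four bytes read with "<I" would come back as an unsigned value instead.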
DWARF is the map; ptrace is the vehicle.
Where DWARF lives
- Linux ELF binaries: in .debug_* sections within the binary itself.
- macOS Mach-O: usually in a separate .dSYM bundle next to the binary (hello.dSYM/), created by dsymutil.
- Stripped binaries: DWARF removed; debugger sees only raw addresses.
- Split DWARF (.dwo): a build-time optimisation in which most of the DWARF lives in separate .dwo files, so the linker has less data to process.
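To check the ELF case by hand, the following Python sketch lists the .debug_* sections in a binary image. It assumes a little-endian 64-bit ELF file and ignores edge cases such as extended section numbering. The function name debug_sections is invented for this note.

```python
import struct

ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64-byte ELF64 file header
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")    # 64-byte ELF64 section header


def debug_sections(elf: bytes) -> list[str]:
    """Return the names of all sections that start with '.debug_'."""
    if len(elf) < ELF_HEADER.size:
        raise ValueError("too short to be an ELF64 image")
    fields = ELF_HEADER.unpack_from(elf, 0)
    ident = fields[0]
    # ident[4] == 2 means 64-bit; ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]

    def header(index):
        return SECTION_HEADER.unpack_from(elf, shoff + index * shentsize)

    # The section-name string table is itself a section; fields 4 and 5 are
    # its file offset and size.
    strtab = header(shstrndx)
    names = elf[strtab[4]:strtab[4] + strtab[5]]

    found = []
    for i in range(shnum):
        name_off = header(i)[0]              # field 0: offset of the name
        end = names.index(b"\0", name_off)   # names are NUL-terminated
        name = names[name_off:end].decode("ascii", "replace")
        if name.startswith(".debug_"):
            found.append(name)
    return found


# Usage (path is an example): debug_sections(open("hello", "rb").read())
```

An empty result for a binary built without -g, or after stripping, matches the "stripped binaries" case above.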
Other formats
DWARF is dominant on Unix-like systems. Windows uses PDB (Program Database) files alongside .exe / .dll. Same purpose, different format. Microsoft's debugger (WinDbg) reads PDB; LLDB and GDB read DWARF; some cross-platform tools read both.
DWARF version drift
DWARF is versioned (DWARF 2, 3, 4, 5). Newer versions add features (split DWARF, more compact encodings, better C++ support). Adapters and debuggers must support the version your compiler emits. Mismatch usually shows as missing variables or wrong line numbers.
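When a version mismatch is suspected, the version a compiler actually emitted can be read from the start of the .debug_info section: each unit header begins with a length field followed by a 2-byte version. The Python sketch below assumes the raw section bytes have already been extracted and that the target is little-endian unless told otherwise. The function name unit_version is made up for this note.

```python
import struct


def unit_version(debug_info: bytes, offset: int = 0, little_endian: bool = True) -> int:
    """Return the DWARF version of the unit whose header starts at offset."""
    endian = "<" if little_endian else ">"
    (length,) = struct.unpack_from(endian + "I", debug_info, offset)
    # An initial length of 0xffffffff marks 64-bit DWARF: the real length
    # follows as 8 bytes, so the version sits 12 bytes in instead of 4.
    skip = 12 if length == 0xFFFFFFFF else 4
    (version,) = struct.unpack_from(endian + "H", debug_info, offset + skip)
    return version


# Hand-built header bytes: unit length 0x30, then version 5.
print(unit_version(struct.pack("<IH", 0x30, 5) + bytes(8)))  # prints 5
```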
See also
- How Debuggers Actually Work — full stack with worked example
- ptrace — the vehicle
- Software Breakpoints — the typical use
- DWARF spec: https://dwarfstd.org/