DWARF Debug Symbols

DWARF is the debugging-information format embedded in compiled binaries (ELF, Mach-O). It gives a debugger what it needs to map between source code and machine code.

Without DWARF, a debugger sees only raw assembly. It cannot tell which CPU instruction corresponds to which line of source, what a stack-allocated variable is called, or what type a value has.

What DWARF stores

How it gets there

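The compiler emits DWARF when invoked with -g (gcc, clang, rustc). The flag means "embed debug info." Without -g, the binary carries no DWARF. The debugger can still resolve function names from the symbol table, unless the binary has also been stripped, but it has no line numbers, local variables, or types to show.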

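Optimisation levels affect the quality of the debug info. At higher levels the compiler may reorder or inline code and keep variables in registers or remove them entirely, so stepping and variable inspection become less reliable.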

For lazydap users debugging C: compile with gcc -g -O0 for the cleanest debugging experience.

How a debugger uses DWARF

When the user says "set a breakpoint at line 42":

  1. Debugger reads DWARF's line table: "line 42 → address 0x4011a8."
  2. Debugger uses ptrace to install a breakpoint at 0x4011a8.
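
The line-table lookup in step 1 can be sketched as follows. This is a minimal illustration, not how any particular debugger implements it: the LineRow type, the LINE_TABLE rows, and the breakpoint_address helper are all hypothetical, and a real table is decoded from the binary's .debug_line section rather than written by hand. The sketch assumes the debugger picks the lowest address that is marked as a statement start (is_stmt) for the requested line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineRow:
    address: int   # address of the first instruction covered by this row
    file: str
    line: int
    is_stmt: bool  # True if this address is a recommended breakpoint location


# Hypothetical, hand-written rows standing in for a decoded line table.
LINE_TABLE = [
    LineRow(0x401196, "main.c", 40, True),
    LineRow(0x4011A8, "main.c", 42, True),
    LineRow(0x4011B0, "main.c", 42, False),
    LineRow(0x4011B9, "main.c", 43, True),
]


def breakpoint_address(table, file, line):
    """Return the lowest statement-start address for file:line, or None."""
    candidates = [
        row.address
        for row in table
        if row.file == file and row.line == line and row.is_stmt
    ]
    return min(candidates) if candidates else None


print(hex(breakpoint_address(LINE_TABLE, "main.c", 42)))  # prints 0x4011a8
```

A line with no rows (a comment, a blank line, or code the optimiser removed) returns None here. Real debuggers typically fall forward to the next line that does have code.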

When the user wants to inspect variable x at the current pause:

  1. Debugger looks up x in DWARF: "x is at offset -0x4 from RBP."
  2. Debugger uses ptrace to read RBP from CPU registers.
  3. Debugger uses ptrace to read 4 bytes at RBP - 0x4.
  4. Debugger uses DWARF's type info to interpret those 4 bytes as int.
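
The four steps above can be simulated without a live process. In this sketch, everything is a stand-in chosen for illustration: the registers dict and the stack buffer replace what ptrace would return, and VARIABLE_INFO replaces what the debugger would decode from DWARF. It assumes a little-endian target and a 4-byte signed int.

```python
import struct

# Hypothetical snapshot of a paused process. A real debugger would get the
# register values and the memory contents from the tracee via ptrace.
STACK_BASE = 0x7FFC_0000_1000
stack = bytearray(64)
registers = {"rbp": STACK_BASE + 32}

# Pretend the program ran `int x = -7;` and stored x at RBP - 4.
struct.pack_into("<i", stack, 32 - 4, -7)

# Hand-written stand-in for what DWARF says about x: where it lives
# relative to RBP, how big it is, and how to interpret the bytes.
VARIABLE_INFO = {"x": {"frame_offset": -0x4, "byte_size": 4, "format": "<i"}}


def read_memory(address, size):
    """Read `size` bytes of simulated tracee memory at `address`."""
    start = address - STACK_BASE
    if start < 0 or start + size > len(stack):
        raise ValueError("address outside the simulated stack")
    return bytes(stack[start:start + size])


def read_variable(name):
    info = VARIABLE_INFO[name]                               # step 1: DWARF lookup
    address = registers["rbp"] + info["frame_offset"]        # step 2: read RBP
    raw = read_memory(address, info["byte_size"])            # step 3: read 4 bytes
    (value,) = struct.unpack(info["format"], raw)            # step 4: interpret as int
    return value


print(read_variable("x"))  # prints -7
```

Without the type information in step 4, the same four bytes could only be shown as raw hex. The debugger would not know whether they hold an int, a float, or part of a larger struct.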

DWARF is the map; ptrace is the vehicle.

Where DWARF lives

Other formats

DWARF is dominant on Unix-like systems. Windows uses PDB (Program Database) files alongside .exe / .dll. Same purpose, different format. Microsoft's debugger (WinDbg) reads PDB; LLDB and GDB read DWARF; some cross-platform tools read both.

DWARF version drift

DWARF is versioned (DWARF 2, 3, 4, 5). Newer versions add features (split DWARF, more compact encodings, better C++ support). Adapters and debuggers must support the version your compiler emits. A mismatch usually shows up as missing variables or wrong line numbers.
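
When a mismatch is suspected, it helps to check which version the compiler actually emitted. Each unit in the .debug_info section starts with a length field followed by a 2-byte version number, so the version can be read from the first few bytes. The sketch below assumes the raw bytes of .debug_info have already been extracted from the binary. The function name dwarf_unit_version is hypothetical, and only the first unit is inspected.

```python
import struct


def dwarf_unit_version(debug_info: bytes, little_endian: bool = True) -> int:
    """Return the version field of the first unit header in .debug_info."""
    order = "<" if little_endian else ">"
    (unit_length,) = struct.unpack_from(order + "I", debug_info, 0)
    # An initial length of 0xffffffff marks the 64-bit DWARF format, in which
    # the real length follows as an 8-byte field before the version.
    version_offset = 12 if unit_length == 0xFFFFFFFF else 4
    (version,) = struct.unpack_from(order + "H", debug_info, version_offset)
    return version


# Hand-built header bytes for illustration: 32-bit format, version 5.
sample = struct.pack("<IH", 0x2C, 5) + bytes(6)
print(dwarf_unit_version(sample))  # prints 5
```

A binary linked from objects built by different compilers can contain units with different versions, so a thorough check would walk every unit, not just the first.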

See also