DWARF5 parser implementation example
Here's a conceptual design for a simple DWARF5 parser:
┌──────────────────────────────────────────────────────────────┐
│ DWARF5 Parser                                                │
├──────────────────────────────────────────────────────────────┤
│ 1. Section Loader                                            │
│    - Extract .debug_info, .debug_abbrev, .debug_str, etc.    │
│    - From ELF/Mach-O container                               │
├──────────────────────────────────────────────────────────────┤
│ 2. Primitive Readers                                         │
│    - read_u8, read_u16, read_u32, read_u64                   │
│    - read_uleb128, read_sleb128 (variable-length ints)       │
│    - read_string (null-terminated)                           │
├──────────────────────────────────────────────────────────────┤
│ 3. Abbreviation Table Parser (.debug_abbrev)                 │
│    - Parse: code (ULEB128), tag (ULEB128), has_children      │
│    - Parse attribute specs: (name, form) pairs until (0,0)   │
│      (DW_FORM_implicit_const adds an SLEB128 value)          │
│    - Build: Map<abbrev_code → AbbrevDecl>                    │
├──────────────────────────────────────────────────────────────┤
│ 4. Compilation Unit Parser (.debug_info)                     │
│    - Parse header: unit_length, version(5), unit_type,       │
│      address_size, debug_abbrev_offset                       │
│    - Load abbreviation table for this CU                     │
├──────────────────────────────────────────────────────────────┤
│ 5. DIE Parser                                                │
│    - Read abbrev_code (ULEB128)                              │
│    - If 0: null entry (end of sibling chain)                 │
│    - Lookup AbbrevDecl → get tag + attribute forms           │
│    - Read each attribute value based on its DW_FORM          │
│    - Recurse into children if has_children=yes               │
└──────────────────────────────────────────────────────────────┘
Minimal navigation API:
#include <stddef.h>
#include <stdint.h>

// Read position within one section's bytes
typedef struct {
    const uint8_t *data;
    size_t pos, len;
} cursor_t;

// Core primitives
uint64_t read_uleb128(cursor_t *c);
int64_t  read_sleb128(cursor_t *c);

// Navigation (cu_header_t, abbrev_table_t and die_t are left undefined in this sketch)
cu_header_t    parse_cu_header(cursor_t *c);
abbrev_table_t load_abbrevs(const uint8_t *abbrev_section, uint64_t offset);
die_t          parse_die(cursor_t *c, abbrev_table_t *abbrevs);
void           skip_die_children(cursor_t *c, abbrev_table_t *abbrevs);
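To make the two LEB128 primitives and step 3 of the design concrete, here is a minimal sketch in C. It redefines cursor_t so the block stands alone. The names abbrev_decl_t, parse_abbrev_decl and MAX_ATTRS are hypothetical and not part of the API above, and the fixed-size attribute array is a simplification. Truncated input is handled only loosely.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t pos, len;
} cursor_t;

/* Unsigned LEB128: 7 payload bits per byte, low bits first,
 * high bit set on every byte except the last. */
uint64_t read_uleb128(cursor_t *c)
{
    uint64_t result = 0;
    unsigned shift = 0;
    while (c->pos < c->len) {
        uint8_t byte = c->data[c->pos++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)
            break;
    }
    return result;
}

/* Signed LEB128: same encoding, sign-extended from bit 6 of the last byte. */
int64_t read_sleb128(cursor_t *c)
{
    uint64_t result = 0;
    unsigned shift = 0;
    uint8_t byte = 0;
    while (c->pos < c->len) {
        byte = c->data[c->pos++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)
            break;
    }
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;
    return (int64_t)result;
}

enum { MAX_ATTRS = 32, DW_FORM_implicit_const = 0x21 };

/* Hypothetical in-memory form of one abbreviation declaration. */
typedef struct {
    uint64_t code, tag;
    uint8_t has_children;
    size_t n_attrs;
    struct {
        uint64_t name, form;
        int64_t implicit_const; /* only meaningful for DW_FORM_implicit_const */
    } attrs[MAX_ATTRS];
} abbrev_decl_t;

/* Parses one declaration at the cursor.
 * Returns 1 on success, 0 at the table terminator (code 0) or on bad input. */
int parse_abbrev_decl(cursor_t *c, abbrev_decl_t *out)
{
    out->code = read_uleb128(c);
    if (out->code == 0)
        return 0;
    out->tag = read_uleb128(c);
    if (c->pos >= c->len)
        return 0;
    out->has_children = c->data[c->pos++];
    out->n_attrs = 0;
    for (;;) {
        if (c->pos >= c->len)
            return 0; /* ran out of bytes before the (0,0) pair */
        uint64_t name = read_uleb128(c);
        uint64_t form = read_uleb128(c);
        int64_t value = 0;
        if (form == DW_FORM_implicit_const)
            value = read_sleb128(c); /* the constant lives in the abbrev, not the DIE */
        if (name == 0 && form == 0)
            return 1;
        if (out->n_attrs == MAX_ATTRS)
            return 0;
        out->attrs[out->n_attrs].name = name;
        out->attrs[out->n_attrs].form = form;
        out->attrs[out->n_attrs].implicit_const = value;
        out->n_attrs++;
    }
}
```

A loader along the lines of load_abbrevs would call parse_abbrev_decl repeatedly from the unit's debug_abbrev_offset until it returns 0, storing each result under its code.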
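The key insight: a DIE has no fixed size, so you cannot jump to the Nth DIE or find DIE boundaries without decoding. Each attribute's size depends on its DW_FORM, which comes from the abbreviation table, so a unit has to be walked sequentially. (A DIE at a known offset, such as the target of a DW_FORM_ref4 attribute, can still be decoded directly once the unit's abbreviation table is loaded.) If you need random access, record each DIE's offset in an index on the first pass.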
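The building block for both skipping and indexing is a routine that advances past one attribute value given its form. Below is a rough sketch in C. The function names skip_form and skip_die_attrs are hypothetical, the numeric form codes are the DWARF 5 encodings as I understand them (worth checking against the spec), and multi-byte lengths are assumed to be little-endian.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t pos, len;
} cursor_t;

static uint64_t read_uleb128(cursor_t *c)
{
    uint64_t result = 0;
    unsigned shift = 0;
    while (c->pos < c->len) {
        uint8_t byte = c->data[c->pos++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)
            break;
    }
    return result;
}

/* Reads an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(cursor_t *c, unsigned n)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < n && c->pos < c->len; i++)
        v |= (uint64_t)c->data[c->pos++] << (8 * i);
    return v;
}

/* Moves the cursor forward by n bytes, clamped to the end of the buffer. */
static void advance(cursor_t *c, uint64_t n)
{
    size_t left = c->len - c->pos;
    c->pos += n < left ? (size_t)n : left;
}

/* Skips one attribute value. Returns 0 on success, -1 for an unknown form.
 * off_size is 4 for 32-bit DWARF and 8 for 64-bit DWARF. */
int skip_form(cursor_t *c, uint64_t form, unsigned addr_size, unsigned off_size)
{
    switch (form) {
    case 0x19: case 0x21:             /* flag_present, implicit_const: no bytes in the DIE */
        return 0;
    case 0x0b: case 0x0c: case 0x11: case 0x25: case 0x29: /* data1, flag, ref1, strx1, addrx1 */
        advance(c, 1); return 0;
    case 0x05: case 0x12: case 0x26: case 0x2a:            /* data2, ref2, strx2, addrx2 */
        advance(c, 2); return 0;
    case 0x27: case 0x2b:                                  /* strx3, addrx3 */
        advance(c, 3); return 0;
    case 0x06: case 0x13: case 0x1c: case 0x28: case 0x2c: /* data4, ref4, ref_sup4, strx4, addrx4 */
        advance(c, 4); return 0;
    case 0x07: case 0x14: case 0x20: case 0x24:            /* data8, ref8, ref_sig8, ref_sup8 */
        advance(c, 8); return 0;
    case 0x1e:                                             /* data16 */
        advance(c, 16); return 0;
    case 0x01:                                             /* addr */
        advance(c, addr_size); return 0;
    case 0x0e: case 0x10: case 0x17: case 0x1d: case 0x1f: /* strp, ref_addr, sec_offset, strp_sup, line_strp */
        advance(c, off_size); return 0;
    case 0x0d: case 0x0f: case 0x15: case 0x1a: case 0x1b:
    case 0x22: case 0x23:  /* sdata, udata, ref_udata, strx, addrx, loclistx, rnglistx */
        (void)read_uleb128(c); return 0; /* SLEB128 has the same byte length rule */
    case 0x0a: advance(c, read_le(c, 1)); return 0;        /* block1 */
    case 0x03: advance(c, read_le(c, 2)); return 0;        /* block2 */
    case 0x04: advance(c, read_le(c, 4)); return 0;        /* block4 */
    case 0x09: case 0x18:                                  /* block, exprloc */
        advance(c, read_uleb128(c)); return 0;
    case 0x08:                                             /* string: inline, NUL-terminated */
        while (c->pos < c->len && c->data[c->pos] != 0)
            c->pos++;
        advance(c, 1);
        return 0;
    case 0x16:                                             /* indirect: the form is in the DIE */
        return skip_form(c, read_uleb128(c), addr_size, off_size);
    default:
        return -1;
    }
}

/* Skips every attribute value of one DIE, given the forms from its abbreviation. */
int skip_die_attrs(cursor_t *c, const uint64_t *forms, size_t n,
                   unsigned addr_size, unsigned off_size)
{
    for (size_t i = 0; i < n; i++)
        if (skip_form(c, forms[i], addr_size, off_size) != 0)
            return -1;
    return 0;
}
```

An index pass would record c->pos before reading each abbrev_code, then call skip_die_attrs with that abbreviation's forms, so later lookups can seek straight to a saved offset.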
On macOS, debug info is often kept in a separate .dSYM bundle rather than in the binary itself. Let me check and use dsymutil:
Go's linker needs explicit flags for macOS debug info:
The debug sections exist but with different naming in Go's Mach-O. Let me fix the section name lookup:
The Mach-O parsing found sections but must have returned early with nil. Let me debug:
The sections are compressed (named __zdebug_* instead of __debug_*). Let me add support:
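For illustration, here is a small C sketch of recognizing such a section (C to match the API sketch above; main.go itself is Go). It assumes the layout I understand Go's toolchain to use for __zdebug_* sections: the 4-byte magic "ZLIB", an 8-byte big-endian uncompressed size, then a zlib stream. The function name parse_zdebug_header is hypothetical, and the actual inflate step is left to a zlib library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Checks for the assumed "ZLIB" + 8-byte big-endian size header.
 * On success returns 1, stores the uncompressed size, and stores the
 * offset at which the zlib stream starts. Returns 0 if the header is
 * absent, in which case the section should be treated as uncompressed. */
int parse_zdebug_header(const uint8_t *buf, size_t len,
                        uint64_t *uncompressed_size, size_t *payload_offset)
{
    if (len < 12 || memcmp(buf, "ZLIB", 4) != 0)
        return 0;
    uint64_t n = 0;
    for (int i = 0; i < 8; i++)
        n = (n << 8) | buf[4 + i]; /* big-endian: most significant byte first */
    *uncompressed_size = n;
    *payload_offset = 12;
    return 1;
}
```

The bytes from payload_offset onward can then be handed to a zlib decompressor with an output buffer of uncompressed_size bytes.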
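The abbrev_offset is implausibly large, which suggests the header is being read with the wrong layout. After the version field, DWARF4 has debug_abbrev_offset followed by address_size, while DWARF5 has unit_type, then address_size, then debug_abbrev_offset. Let me check the version and parse accordingly: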
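A minimal C sketch of the version-dependent header parse follows. The names unit_header_t and parse_unit_header are hypothetical, the data is assumed to be little-endian, and the extra fields that DWARF5 skeleton, split and type units carry after debug_abbrev_offset are not handled.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t pos, len;
} cursor_t;

/* Hypothetical header record covering both layouts. */
typedef struct {
    uint64_t unit_length;
    unsigned offset_size;   /* 4 for 32-bit DWARF, 8 for 64-bit DWARF */
    uint16_t version;
    uint8_t unit_type;      /* 0 when version < 5 (field does not exist) */
    uint8_t address_size;
    uint64_t debug_abbrev_offset;
} unit_header_t;

/* Reads an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(cursor_t *c, unsigned n)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < n && c->pos < c->len; i++)
        v |= (uint64_t)c->data[c->pos++] << (8 * i);
    return v;
}

static int have(const cursor_t *c, size_t n)
{
    return c->len - c->pos >= n;
}

/* Returns 0 on success, -1 on truncated input or an unsupported version. */
int parse_unit_header(cursor_t *c, unit_header_t *h)
{
    if (!have(c, 4))
        return -1;
    h->unit_length = read_le(c, 4);
    h->offset_size = 4;
    if (h->unit_length == 0xffffffffu) { /* escape value: 64-bit DWARF */
        if (!have(c, 8))
            return -1;
        h->unit_length = read_le(c, 8);
        h->offset_size = 8;
    }
    if (!have(c, 2))
        return -1;
    h->version = (uint16_t)read_le(c, 2);

    if (h->version == 5) {
        /* DWARF5: unit_type, address_size, debug_abbrev_offset */
        if (!have(c, 2u + h->offset_size))
            return -1;
        h->unit_type = (uint8_t)read_le(c, 1);
        h->address_size = (uint8_t)read_le(c, 1);
        h->debug_abbrev_offset = read_le(c, h->offset_size);
    } else if (h->version >= 2 && h->version <= 4) {
        /* DWARF2-4: debug_abbrev_offset, address_size */
        if (!have(c, 1u + h->offset_size))
            return -1;
        h->unit_type = 0;
        h->debug_abbrev_offset = read_le(c, h->offset_size);
        h->address_size = (uint8_t)read_le(c, 1);
    } else {
        return -1;
    }
    return 0;
}
```

Reading a DWARF4 header with the DWARF5 layout takes the low byte of the abbrev offset as unit_type and pulls address_size plus the first DIE byte into the offset, which would explain the huge value seen here.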
It works. Let me update the todos and play the done sound:
Done! The parser in main.go parses DWARF debug info in both DWARF4 and DWARF5 formats and handles the compressed (__zdebug_*) sections in the macOS binary. It iterates over all DIEs and prints each one's tag, indented to show the tree structure.