Description
Specification for GCC Plugin and Shiva Runtime for DWARF-Driven Variable Access in Naked Function Splices
Overview
This specification defines a GCC compiler plugin and Shiva runtime patching enhancements to enable naked function splices to access local variables and parameters in a target executable by DWARF symbol name (e.g., char *str in function foo). The plugin parses the target executable’s DWARF information at compile time to determine variable locations (register or stack offset) and emits precise instructions (register-to-register or memory-to-register) using the exact register or stack offset from DW_AT_location. Custom relocations (R_X86_64_SHIVA_VAR_ACCESS) allow Shiva to patch the source operand (for reads) or destination operand (for writes) at runtime if the target binary’s DWARF information changes, preserving the compiler’s register allocation for the unpatched operand. The splice function, being naked, requires manual stack frame management (e.g., sub $N, %rsp) for its own local variables but accesses target variables via the target’s stack frame (using %rbp) or registers.
Goals
- Enable symbolic variable access in Shiva splices using DWARF DW_AT_location.
- Emit precise instructions at compile time using the exact register or stack offset from DWARF, avoiding provisional offsets.
- Generate relocations that patch only the source operand (for reads) or destination operand (for writes) at runtime. If the target function changes, its DWARF location values change with it, and the emitted instructions must be patched to match.
- Manually allocate stack space in the naked splice function for its own locals, using the target function’s %rbp for stack-based target variables.
- Ensure compatibility with __shiva_splice_fn_name_foo-style splices, adapting to their naked function structure.
Requirements
1. GCC Compiler Plugin
The plugin operates at the RTL pass (PASS_NAME_expand) to analyze variable accesses and emit instructions with custom relocations for a naked splice function.
1.1 DWARF Parsing
- Input: Target executable’s ELF file containing DWARF sections (.debug_info, .debug_loc or .debug_loclists, .debug_abbrev).
- Process:
  - Use libdwarf to parse DWARF sections.
  - Locate the DW_TAG_subprogram DIE for the target function (e.g., foo).
  - Find the variable’s DIE (e.g., DW_TAG_variable or DW_TAG_formal_parameter for char *str).
  - Extract DW_AT_location (single expression or location list) at the splice point’s program counter (PC).
  - Determine the variable’s location (in the x86-64 DWARF register numbering, %rcx is register 2 and %rbp is register 6):
    - Register-based: E.g., DW_OP_reg2 (%rcx).
    - Stack-based: E.g., DW_OP_breg6 -32 (-0x20(%rbp)).
- Output: Exact register (e.g., %rcx) or stack offset (e.g., -0x20(%rbp)) for instruction emission.
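The location step above can be illustrated without libdwarf. The following is a minimal sketch, assuming the DW_AT_location expression has already been extracted as raw bytes and consists of a single DW_OP_regN or DW_OP_bregN operation. The names var_loc, read_sleb128, and decode_location are hypothetical and not part of the plugin; expressions such as DW_OP_fbreg, which need the function's DW_AT_frame_base, are reported as unknown.

```c
#include <stddef.h>
#include <stdint.h>

/* Where a variable lives, as far as this sketch can tell. */
enum var_loc_kind { LOC_UNKNOWN, LOC_REGISTER, LOC_BASE_OFFSET };

struct var_loc {
    enum var_loc_kind kind;
    unsigned dwarf_reg;  /* DWARF register number (x86-64: 2 = %rcx, 6 = %rbp) */
    int64_t offset;      /* signed byte offset, only for LOC_BASE_OFFSET */
};

/* Decode a signed LEB128 value; returns bytes consumed, 0 on malformed input. */
static size_t read_sleb128(const uint8_t *p, size_t len, int64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    do {
        if (i == len || shift >= 64)
            return 0;
        byte = p[i++];
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);

    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;  /* sign-extend */
    *out = (int64_t)result;
    return i;
}

/* Decode a single-operation location expression. */
static struct var_loc decode_location(const uint8_t *expr, size_t len)
{
    struct var_loc loc = { LOC_UNKNOWN, 0, 0 };
    uint8_t op;

    if (len == 0)
        return loc;
    op = expr[0];

    if (op >= 0x50 && op <= 0x6f && len == 1) {       /* DW_OP_reg0..reg31 */
        loc.kind = LOC_REGISTER;
        loc.dwarf_reg = (unsigned)(op - 0x50);
    } else if (op >= 0x70 && op <= 0x8f) {            /* DW_OP_breg0..breg31 */
        int64_t off;
        size_t n = read_sleb128(expr + 1, len - 1, &off);
        if (n != 0 && n + 1 == len) {
            loc.kind = LOC_BASE_OFFSET;
            loc.dwarf_reg = (unsigned)(op - 0x70);
            loc.offset = off;
        }
    }
    return loc;
}
```

For instance, the two bytes 76 60 (DW_OP_breg6 followed by SLEB128 -32) decode to base register 6 with offset -32, which corresponds to -0x20(%rbp).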
1.2 Instruction Emission
- Input: Splice code with dummy variable declarations (e.g., char *str;) and __attribute__((naked)).
- Process:
  - For each access to a dummy variable in the splice code:
    - Read:
      - If register-based (e.g., %rcx per DW_AT_location): Emit mov %rcx, %reg (e.g., 48 8b c1 for %rax), where %reg is the compiler’s chosen destination register.
      - If stack-based (e.g., a -32 offset from %rbp): Emit mov -0x20(%rbp), %reg (e.g., 48 8b 45 e0), using the exact stack offset from DWARF.
    - Write:
      - If register-based: Emit mov %reg, %rcx (e.g., 48 89 c1).
      - If stack-based: Emit mov %reg, -0x20(%rbp) (e.g., 48 89 45 e0), using the exact stack offset.
  - Use %rbp to access the target function’s stack frame, assuming Shiva preserves the target’s %rbp during splicing.
- Stack Frame for Splice:
  - Since the splice is naked, the plugin must manually emit stack frame setup at the function’s start:
      sub $N, %rsp    # Allocate space for splice’s locals
    where $N is sufficient for the splice’s own local variables (e.g., 32 bytes for temporaries like -0x18(%rsp) in __shiva_splice_fn_name_foo-style splices).
  - Emit stack cleanup and return at the function’s end:
      add $N, %rsp
      ret
  - Do not allocate stack space for target variables, as they reside in the target’s stack frame (accessed via %rbp) or registers.
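The byte sequences quoted above follow directly from the x86-64 MOV encodings (REX.W prefix, opcode 8B for loads and 89 for stores, then ModR/M). The sketch below reproduces them; the helper names emit_mov_reg and emit_mov_rbp_disp8 are hypothetical, and only the low eight registers and 8-bit displacements are handled. In both opcodes the target variable occupies the ModR/M r/m field, while the splice's own register occupies the reg field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hardware encodings of the low eight 64-bit registers. */
enum x86_reg { RAX = 0, RCX = 1, RDX = 2, RBX = 3,
               RSP = 4, RBP = 5, RSI = 6, RDI = 7 };

/*
 * Register-based access to a target variable held in var_reg.
 * is_read != 0: mov %var_reg, %splice_reg   (opcode 8B)
 * is_read == 0: mov %splice_reg, %var_reg   (opcode 89)
 */
static size_t emit_mov_reg(uint8_t out[3], int is_read,
                           enum x86_reg var_reg, enum x86_reg splice_reg)
{
    out[0] = 0x48;                                   /* REX.W: 64-bit operand */
    out[1] = is_read ? 0x8b : 0x89;
    out[2] = (uint8_t)(0xc0 | (splice_reg << 3) | var_reg);  /* Mod=11 */
    return 3;
}

/*
 * Stack-based access to a target variable at disp(%rbp).
 * is_read != 0: mov disp(%rbp), %splice_reg
 * is_read == 0: mov %splice_reg, disp(%rbp)
 */
static size_t emit_mov_rbp_disp8(uint8_t out[4], int is_read,
                                 int8_t disp, enum x86_reg splice_reg)
{
    out[0] = 0x48;
    out[1] = is_read ? 0x8b : 0x89;
    out[2] = (uint8_t)(0x40 | (splice_reg << 3) | RBP);      /* Mod=01, rm=%rbp */
    out[3] = (uint8_t)disp;
    return 4;
}
```

Because the target variable always lands in the r/m field, a later change to its register or offset touches only the low three bits of the ModR/M byte or the displacement, leaving the compiler's chosen register untouched.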
1.3 Custom Relocation Generation
- Relocation Type: R_X86_64_SHIVA_VAR_ACCESS (new ELF relocation type).
- Structure:
  - Stored in .rela.text for standard ELF fields and .shiva_reloc for additional metadata.
  - Fields:
    - r_offset: Address of the instruction (e.g., 0x1000).
    - r_info: Combines R_X86_64_SHIVA_VAR_ACCESS and symbol index (e.g., for str).
    - r_addend: Byte offset within the instruction to patch:
      - 2: ModR/M byte for register patching (SOURCE_REGISTER or DEST_REGISTER).
      - 3: Displacement byte for 8-bit stack offset (SOURCE_OFFSET or DEST_OFFSET).
      - 2: ModR/M byte for 32-bit stack offset (to adjust Mod field and write 4-byte displacement).
    - patch_field: One of SOURCE_REGISTER, SOURCE_OFFSET, DEST_REGISTER, DEST_OFFSET.
    - displacement_size: 8 (8-bit stack offset), 32 (32-bit stack offset), or 0 (register-based).
    - access_size: Size of the access (e.g., 64 for char *, 32 for int).
    - access_type: READ or WRITE.
    - src_register: Source register for writes (e.g., 0 for %rax).
    - dest_register: Destination register for reads (e.g., 0 for %rax).
    - symbol_name: DWARF symbol name (e.g., str).
- Examples:
  - Read, Register-Based (mov %rcx, %rax at 0x1000):
      { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=2, patch_field=SOURCE_REGISTER, access_size=64, access_type=READ }
    Instruction: 48 8b c1 (patch ModR/M at 0x1002 from c1 to c2 for %rdx).
  - Read, Stack-Based (mov -0x20(%rbp), %rax):
      { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=3, patch_field=SOURCE_OFFSET, displacement_size=8, access_size=64, access_type=READ }
    Instruction: 48 8b 45 e0 (patch displacement at 0x1003 from e0 to d0 for -0x30).
  - Write, Register-Based (mov %rax, %rcx):
      { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=2, patch_field=DEST_REGISTER, access_size=64, access_type=WRITE }
    Instruction: 48 89 c1 (patch ModR/M from c1 to c2 for %rdx).
  - Write, Stack-Based (mov %rax, -0x20(%rbp)):
      { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=3, patch_field=DEST_OFFSET, displacement_size=8, access_size=64, access_type=WRITE }
    Instruction: 48 89 45 e0 (patch displacement from e0 to d0).
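One possible in-memory representation of this record is sketched below. Everything here is an assumption for illustration: the numeric value of R_X86_64_SHIVA_VAR_ACCESS is a placeholder (the specification does not assign one), the struct and helper names are hypothetical, and the standard fields and the .shiva_reloc metadata are combined in one struct only for convenience. The r_info packing mirrors the usual ELF64 convention of symbol index in the high 32 bits and type in the low 32 bits.

```c
#include <stdint.h>

/* Placeholder value: the specification does not assign a number. */
#define R_X86_64_SHIVA_VAR_ACCESS 0x1000u

/* Same packing as the standard ELF64_R_INFO / ELF64_R_SYM / ELF64_R_TYPE. */
#define SHIVA_R_INFO(sym, type) (((uint64_t)(sym) << 32) | (uint32_t)(type))
#define SHIVA_R_SYM(info)       ((uint32_t)((info) >> 32))
#define SHIVA_R_TYPE(info)      ((uint32_t)(info))

enum shiva_patch_field {
    SOURCE_REGISTER, SOURCE_OFFSET, DEST_REGISTER, DEST_OFFSET
};
enum shiva_access_type { ACCESS_READ, ACCESS_WRITE };

/* Standard fields, laid out like Elf64_Rela (stored in .rela.text). */
struct shiva_rela {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
};

/* Standard fields plus the extra metadata kept in .shiva_reloc. */
struct shiva_var_reloc {
    struct shiva_rela      rela;
    enum shiva_patch_field patch_field;
    uint8_t                displacement_size;  /* 0, 8, or 32 */
    uint8_t                access_size;        /* in bits, e.g. 64 for char * */
    enum shiva_access_type access_type;
    uint8_t                src_register;       /* meaningful for writes */
    uint8_t                dest_register;      /* meaningful for reads */
    const char            *symbol_name;
};

/* Build the record for a register-based read such as mov %rcx, %rax. */
static struct shiva_var_reloc
make_read_reg_reloc(uint64_t insn_addr, uint32_t sym_idx, const char *name)
{
    struct shiva_var_reloc r = {0};

    r.rela.r_offset = insn_addr;
    r.rela.r_info   = SHIVA_R_INFO(sym_idx, R_X86_64_SHIVA_VAR_ACCESS);
    r.rela.r_addend = 2;               /* ModR/M byte follows REX and opcode */
    r.patch_field   = SOURCE_REGISTER;
    r.access_size   = 64;
    r.access_type   = ACCESS_READ;
    r.dest_register = 0;               /* %rax */
    r.symbol_name   = name;
    return r;
}
```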
1.4 Stack Frame Considerations
- The splice function is naked (declared with __attribute__((naked))), requiring manual stack management.
- The plugin emits:
  - sub $N, %rsp at the function’s start to allocate space for the splice’s local variables (e.g., temporaries like -0x18(%rsp)).
  - add $N, %rsp; ret at the function’s end to clean up and return.
- Target variables are accessed via the target function’s %rbp (for stack-based variables) or registers for optimized code.
2. Shiva Runtime Patching
Shiva resolves R_X86_64_SHIVA_VAR_ACCESS relocations at runtime to handle changes in the target binary’s DWARF information.
2.1 DWARF Parsing
- Load the target executable’s DWARF sections (.debug_info, .debug_loc or .debug_loclists).
- Locate the variable’s DIE (e.g., str in foo) and evaluate DW_AT_location at the splice point’s PC.
- Output: Register (e.g., %rcx via DW_OP_reg2) or stack offset (e.g., -0x30(%rbp) via DW_OP_breg6 -48).
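DWARF register numbers on x86-64 do not match the register numbers used in instruction encodings, so the runtime needs a translation step between evaluating DW_AT_location and patching a ModR/M byte. A minimal sketch follows, with the hypothetical helper name dwarf_to_hw_reg; the table follows the System V x86-64 psABI DWARF register mapping for the general-purpose registers.

```c
#include <stdint.h>

/*
 * Map a DWARF register number to the x86-64 hardware encoding.
 * DWARF order:    rax rdx rcx rbx rsi rdi rbp rsp r8..r15
 * Hardware order: rax rcx rdx rbx rsp rbp rsi rdi r8..r15
 * Returns -1 for anything that is not a general-purpose register.
 */
static int dwarf_to_hw_reg(unsigned dwarf_reg)
{
    static const int8_t low8[8] = { 0, 2, 1, 3, 6, 7, 5, 4 };

    if (dwarf_reg < 8)
        return low8[dwarf_reg];
    if (dwarf_reg < 16)
        return (int)dwarf_reg;   /* r8..r15 use the same number in both */
    return -1;
}
```

With this mapping, DW_OP_reg2 yields hardware register 1 (%rcx) and DW_OP_breg6 yields hardware register 5 (%rbp), matching the ModR/M values used in the instruction examples.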
2.2 Patching Logic
- Register-Based (SOURCE_REGISTER or DEST_REGISTER):
  - Patch the ModR/M byte at r_addend=2 to update the register (e.g., c1 to c2 for %rcx to %rdx in mov %rcx, %rax).
  - Instruction length remains unchanged (3 bytes for register-to-register moves).
- Stack-Based (SOURCE_OFFSET or DEST_OFFSET):
  - For 8-bit displacement (displacement_size=8): Patch the displacement byte at r_addend=3 (e.g., e0 to d0 for -0x20 to -0x30).
  - For 32-bit displacement (displacement_size=32): Patch the ModR/M byte at r_addend=2 (e.g., 45 to 85 for Mod=10) and write a 4-byte displacement (e.g., 00 f0 ff ff for -0x1000).
  - If instruction length changes (e.g., 4 to 7 bytes), pad subsequent bytes with NOPs (90).
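- Make the code page writable with mprotect before patching, and apply each patch so that no thread can execute a partially written instruction (e.g., by patching while the affected threads are stopped), to ensure thread safety in a running process.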
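The three patch cases above reduce to small byte edits. The sketch below applies them to an instruction held in an ordinary buffer; the function names are hypothetical, page protection and thread safety are deliberately left out, and the 32-bit case assumes the caller has already confirmed that the four displacement bytes are available.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Register patch: replace the r/m field (low 3 bits) of the ModR/M byte.
 * Only hardware registers 0-7 are supported, since reaching r8-r15 would
 * also require setting REX.B.
 */
static int patch_modrm_rm(uint8_t *insn, size_t modrm_off, unsigned new_reg)
{
    if (new_reg > 7)
        return -1;
    insn[modrm_off] = (uint8_t)((insn[modrm_off] & 0xf8) | new_reg);
    return 0;
}

/* 8-bit stack offset patch: overwrite the single displacement byte. */
static int patch_disp8(uint8_t *insn, size_t disp_off, int32_t new_disp)
{
    if (new_disp < -128 || new_disp > 127)
        return -1;
    insn[disp_off] = (uint8_t)(int8_t)new_disp;
    return 0;
}

/*
 * 32-bit stack offset patch: switch the Mod field to 10 and write a
 * little-endian 4-byte displacement right after the ModR/M byte.
 * The caller must ensure insn[modrm_off + 1 .. modrm_off + 4] is writable
 * and not occupied by another live instruction.
 */
static void patch_disp32(uint8_t *insn, size_t modrm_off, int32_t new_disp)
{
    uint32_t u = (uint32_t)new_disp;
    int i;

    insn[modrm_off] = (uint8_t)((insn[modrm_off] & 0x3f) | 0x80);
    for (i = 0; i < 4; i++)
        insn[modrm_off + 1 + (size_t)i] = (uint8_t)(u >> (8 * i));
}
```

Each helper takes the byte offset that r_addend supplies (2 for the ModR/M byte, 3 for an 8-bit displacement), so the relocation record alone is enough to select and apply the right edit.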
2.3 Validation
- Verify register indices and stack offsets to prevent invalid instructions (e.g., invalid register or out-of-range offset).
- Check PC ranges in DWARF location lists to ensure the splice point is within the valid scope of the variable’s location.
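These checks can be expressed as three small predicates, sketched below under stated assumptions: the names are hypothetical, register validity is limited to hardware registers 0-7 (what a ModR/M-only patch can encode), and a location-list entry is modeled as a half-open address range, since DWARF treats the ending address of an entry as exclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* One location-list entry: valid for low_pc <= pc < high_pc. */
struct pc_range {
    uint64_t low_pc;
    uint64_t high_pc;
};

/* A ModR/M-only patch can select hardware registers 0-7. */
static bool reg_index_valid(unsigned hw_reg)
{
    return hw_reg <= 7;
}

/* Does the new stack offset fit the displacement width of the instruction? */
static bool offset_fits(int64_t offset, unsigned displacement_size)
{
    switch (displacement_size) {
    case 8:
        return offset >= INT8_MIN && offset <= INT8_MAX;
    case 32:
        return offset >= INT32_MIN && offset <= INT32_MAX;
    default:
        return false;   /* 0 means register-based: no offset to check */
    }
}

/* Is the splice point covered by this location-list entry? */
static bool pc_in_range(const struct pc_range *r, uint64_t pc)
{
    return pc >= r->low_pc && pc < r->high_pc;
}
```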
3. Example Workflow
Target Function (foo)
void foo() {
    char *str = "hello"; // DW_AT_location: DW_OP_breg6 -32 or DW_OP_reg2
...
}