Feature: Symbolic DWARF linking to local variables and function parameters in function splices

# Specification for GCC Plugin and Shiva Runtime for DWARF-Driven Variable Access in Naked Function Splices

## Overview
This specification defines a GCC compiler plugin and Shiva runtime patching enhancements to enable **naked** function splices to access local variables and parameters in a target executable by DWARF symbol name (e.g., `char *str` in function `foo`). The plugin parses the target executable’s DWARF information at compile time to determine variable locations (register or stack offset) and emits precise instructions (register-to-register or memory-to-register) using the exact register or stack offset from `DW_AT_location`. Custom relocations (`R_X86_64_SHIVA_VAR_ACCESS`) allow Shiva to patch the source operand (for reads) or destination operand (for writes) at runtime if the target binary’s DWARF information changes, preserving the compiler’s register allocation for the unpatched operand. The splice function, being naked, requires manual stack frame management (e.g., `sub $N, %rsp`) for its own local variables but accesses target variables via the target’s stack frame (using `%rbp`) or registers.

## Goals
- Enable symbolic variable access in Shiva splices using DWARF `DW_AT_location`.
- Emit precise instructions at compile time using exact register or stack offset from DWARF, avoiding provisional offsets.
- Generate relocations that let Shiva patch only the source operand (for reads) or destination operand (for writes) at runtime when the target function changes and the variable's DWARF location changes with it.
- Manually allocate stack space in the naked splice function for its own locals, using the target function’s `%rbp` for stack-based target variables.
- Ensure compatibility with `__shiva_splice_fn_name_foo`-style splices, adapting to its naked function structure.

## Requirements

### 1. GCC Compiler Plugin
The plugin operates at the RTL pass (`PASS_NAME_expand`) to analyze variable accesses and emit instructions with custom relocations for a naked splice function.

#### 1.1 DWARF Parsing
- **Input**: Target executable’s ELF file containing DWARF sections (`.debug_info`, `.debug_loc` or `.debug_loclists`, `.debug_abbrev`).
- **Process**:
  - Use `libdwarf` to parse DWARF sections.
  - Locate the `DW_TAG_subprogram` DIE for the target function (e.g., `foo`).
  - Find the variable’s DIE (e.g., `DW_TAG_variable` or `DW_TAG_formal_parameter` for `char *str`).
  - Extract `DW_AT_location` (single expression or location list) at the splice point’s program counter (PC).
  - Determine the variable’s location:
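    - **Register-based**: E.g., `DW_OP_reg2` (%rcx; in the x86-64 DWARF register numbering, 1 is %rdx and 2 is %rcx).
    - **Stack-based**: E.g., `DW_OP_breg6 -32` (-0x20(%rbp); DWARF register 6 is %rbp on x86-64, while 5 is %rdi).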
- **Output**: Exact register (e.g., `%rcx`) or stack offset (e.g., `-0x20(%rbp)`) for instruction emission.
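
The classification step above can be sketched without any DWARF library, assuming the raw bytes of a single-operation location expression have already been extracted. The type `struct var_loc` and the function `decode_location` are hypothetical names for this illustration; only the opcode values (`DW_OP_reg0 = 0x50`, `DW_OP_breg0 = 0x70`) and the SLEB128 operand encoding come from the DWARF standard. Frame-base-relative locations (`DW_OP_fbreg`) and multi-operation expressions are deliberately left unsupported here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode bases from the DWARF standard. */
#define DW_OP_reg0  0x50  /* DW_OP_reg0..reg31: value lives in register N */
#define DW_OP_breg0 0x70  /* DW_OP_breg0..breg31: address = register N + SLEB128 */

/* Hypothetical result type for this sketch (not part of Shiva or libdwarf). */
enum var_loc_kind { VAR_LOC_UNSUPPORTED = 0, VAR_LOC_REGISTER, VAR_LOC_STACK };

struct var_loc {
    enum var_loc_kind kind;
    unsigned dwarf_reg;  /* DWARF register number (x86-64: 2 = %rcx, 6 = %rbp) */
    int64_t offset;      /* displacement for VAR_LOC_STACK, otherwise 0 */
};

/* Decode a signed LEB128 value; returns bytes consumed, or 0 if truncated. */
size_t read_sleb128(const uint8_t *p, size_t len, int64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    assert(p != NULL && out != NULL);
    do {
        if (i >= len || shift >= 64)
            return 0;
        byte = p[i++];
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);

    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;  /* sign-extend from the last group */
    *out = (int64_t)result;
    return i;
}

/* Classify a location expression that consists of exactly one operation. */
struct var_loc decode_location(const uint8_t *expr, size_t len)
{
    struct var_loc loc = { VAR_LOC_UNSUPPORTED, 0, 0 };

    if (len == 1 && expr[0] >= DW_OP_reg0 && expr[0] <= DW_OP_reg0 + 31) {
        loc.kind = VAR_LOC_REGISTER;
        loc.dwarf_reg = (unsigned)(expr[0] - DW_OP_reg0);
    } else if (len >= 2 && expr[0] >= DW_OP_breg0 && expr[0] <= DW_OP_breg0 + 31) {
        int64_t off;
        size_t n = read_sleb128(expr + 1, len - 1, &off);
        if (n != 0 && n == len - 1) {  /* operand must end the expression */
            loc.kind = VAR_LOC_STACK;
            loc.dwarf_reg = (unsigned)(expr[0] - DW_OP_breg0);
            loc.offset = off;
        }
    }
    return loc;
}
```

For instance, the two-byte expression `76 60` decodes as DWARF register 6 (%rbp) with offset -32, and the one-byte expression `52` decodes as DWARF register 2 (%rcx).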

#### 1.2 Instruction Emission
- **Input**: Splice code with dummy variable declarations (e.g., `char *str;`) and `__attribute__((naked))`.
- **Process**:
  - For each access to a dummy variable in the splice code:
    - **Read**:
      - If register-based (e.g., `%rcx` per `DW_AT_location`): Emit `mov %rcx, %reg` (e.g., `48 8b c1` for `%rax`), where `%reg` is the compiler’s chosen destination register.
      - If stack-based (e.g., `DW_OP_breg6 -32`): Emit `mov -0x20(%rbp), %reg` (e.g., `48 8b 45 e0`), using the exact stack offset from DWARF.
    - **Write**:
      - If register-based: Emit `mov %reg, %rcx` (e.g., `48 89 c1`).
      - If stack-based: Emit `mov %reg, -0x20(%rbp)` (e.g., `48 89 45 e0`), using the exact stack offset.
  - Use `%rbp` to access the target function’s stack frame, assuming Shiva preserves the target’s `%rbp` during splicing.
- **Stack Frame for Splice**:
  - Since the splice is naked, the plugin must manually emit stack frame setup at the function’s start:
    ```asm
    sub $N, %rsp  # Allocate space for splice’s locals
    ```
    where `$N` is sufficient for the splice’s own local variables (e.g., 32 bytes for temporaries like `-0x18(%rsp)` in `__shiva_splice_fn_name_foo`-style splices).
  - Emit stack cleanup and return at the function’s end:
    ```asm
    add $N, %rsp
    ret
    ```
  - Do not allocate stack space for target variables, as they reside in the target’s stack frame (accessed via `%rbp`) or registers.
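
The byte sequences quoted above can be reproduced with a small encoder. This is a minimal sketch, assuming only the low eight general-purpose registers and a plain `REX.W` prefix (`48`); the function names `emit_reg_access` and `emit_rbp_access` are hypothetical and not part of GCC or Shiva. Note that ModR/M uses hardware register numbers (%rcx is 1, %rbp is 5), which differ from the DWARF numbers (%rcx is 2, %rbp is 6), so a real plugin would need a mapping between the two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware (ModR/M) register numbers, low eight registers only. */
enum { HW_RAX = 0, HW_RCX = 1, HW_RDX = 2, HW_RBX = 3, HW_RBP = 5 };

#define REX_W        0x48  /* 64-bit operand size */
#define OP_MOV_LOAD  0x8b  /* mov r/m64 -> r64 (read of the target variable) */
#define OP_MOV_STORE 0x89  /* mov r64 -> r/m64 (write to the target variable) */

/* Register-to-register access. The target variable's register goes in
 * ModR/M.rm and the splice's own register in ModR/M.reg. Returns 3. */
size_t emit_reg_access(uint8_t *out, int is_write,
                       unsigned target_reg, unsigned splice_reg)
{
    assert(out != NULL && target_reg < 8 && splice_reg < 8);
    out[0] = REX_W;
    out[1] = is_write ? OP_MOV_STORE : OP_MOV_LOAD;
    out[2] = (uint8_t)(0xc0 | (splice_reg << 3) | target_reg);  /* Mod=11 */
    return 3;
}

/* Access to disp(%rbp). Uses the disp8 form when the offset fits,
 * otherwise the disp32 form. Returns the instruction length (4 or 7). */
size_t emit_rbp_access(uint8_t *out, int is_write,
                       int32_t disp, unsigned splice_reg)
{
    assert(out != NULL && splice_reg < 8);
    out[0] = REX_W;
    out[1] = is_write ? OP_MOV_STORE : OP_MOV_LOAD;
    if (disp >= -128 && disp <= 127) {
        out[2] = (uint8_t)(0x40 | (splice_reg << 3) | HW_RBP);  /* Mod=01 */
        out[3] = (uint8_t)disp;
        return 4;
    }
    out[2] = (uint8_t)(0x80 | (splice_reg << 3) | HW_RBP);      /* Mod=10 */
    uint32_t u = (uint32_t)disp;
    for (int i = 0; i < 4; i++)
        out[3 + i] = (uint8_t)(u >> (8 * i));                   /* little-endian */
    return 7;
}
```

Calling `emit_reg_access(buf, 0, HW_RCX, HW_RAX)` yields `48 8b c1`, and `emit_rbp_access(buf, 1, -0x20, HW_RAX)` yields `48 89 45 e0`, matching the read and write examples in this section.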

#### 1.3 Custom Relocation Generation
- **Relocation Type**: `R_X86_64_SHIVA_VAR_ACCESS` (new ELF relocation type).
- **Structure**:
  - Stored in `.rela.text` for standard ELF fields and `.shiva_reloc` for additional metadata.
  - Fields:
    - `r_offset`: Address of the instruction (e.g., `0x1000`).
    - `r_info`: Combines `R_X86_64_SHIVA_VAR_ACCESS` and symbol index (e.g., for `str`).
    - `r_addend`: Byte offset within the instruction to patch:
      - `2`: ModR/M byte for register patching (`SOURCE_REGISTER` or `DEST_REGISTER`).
      - `3`: Displacement byte for 8-bit stack offset (`SOURCE_OFFSET` or `DEST_OFFSET`).
      - `2`: ModR/M byte for 32-bit stack offset (to adjust `Mod` field and write 4-byte displacement).
    - `patch_field`: One of `SOURCE_REGISTER`, `SOURCE_OFFSET`, `DEST_REGISTER`, `DEST_OFFSET`.
    - `displacement_size`: `8` (8-bit stack offset), `32` (32-bit stack offset), or `0` (register-based).
    - `access_size`: Size of the access (e.g., `64` for `char *`, `32` for `int`).
    - `access_type`: `READ` or `WRITE`.
    - `src_register`: Source register for writes (e.g., `0` for `%rax`).
    - `dest_register`: Destination register for reads (e.g., `0` for `%rax`).
    - `symbol_name`: DWARF symbol name (e.g., `str`).
- **Examples**:
  - **Read, Register-Based** (`mov %rcx, %rax` at `0x1000`):
    ```plaintext
    { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=2, patch_field=SOURCE_REGISTER, access_size=64, access_type=READ }
    ```
    Instruction: `48 8b c1` (patch ModR/M at `0x1002` from `c1` to `c2` for `%rdx`).
  - **Read, Stack-Based** (`mov -0x20(%rbp), %rax`):
    ```plaintext
    { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=3, patch_field=SOURCE_OFFSET, displacement_size=8, access_size=64, access_type=READ }
    ```
    Instruction: `48 8b 45 e0` (patch displacement at `0x1003` from `e0` to `d0` for `-0x30`).
  - **Write, Register-Based** (`mov %rax, %rcx`):
    ```plaintext
    { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=2, patch_field=DEST_REGISTER, access_size=64, access_type=WRITE }
    ```
    Instruction: `48 89 c1` (patch ModR/M from `c1` to `c2` for `%rdx`).
  - **Write, Stack-Based** (`mov %rax, -0x20(%rbp)`):
    ```plaintext
    { r_offset=0x1000, r_info=R_X86_64_SHIVA_VAR_ACCESS|symbol_idx(str), r_addend=3, patch_field=DEST_OFFSET, displacement_size=8, access_size=64, access_type=WRITE }
    ```
    Instruction: `48 89 45 e0` (patch displacement from `e0` to `d0`).
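
One possible in-memory shape for these records is sketched below. The struct layouts, the helper names, and in particular the numeric value chosen for `R_X86_64_SHIVA_VAR_ACCESS` are assumptions made for illustration; the specification does not assign a relocation number or a `.shiva_reloc` layout. The `r_info` packing follows the standard ELF64 convention (symbol index in the upper 32 bits, type in the lower 32 bits).

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: placeholder relocation number; the spec does not assign one. */
#define R_X86_64_SHIVA_VAR_ACCESS 0x1000u

enum shiva_patch_field { SOURCE_REGISTER, SOURCE_OFFSET, DEST_REGISTER, DEST_OFFSET };
enum shiva_access_type { READ, WRITE };

/* Standard fields, shaped like Elf64_Rela (stored in .rela.text). */
struct shiva_rela {
    uint64_t r_offset;  /* address of the instruction */
    uint64_t r_info;    /* symbol index and relocation type */
    int64_t  r_addend;  /* byte offset within the instruction to patch */
};
static_assert(sizeof(struct shiva_rela) == 24, "three 8-byte fields, like Elf64_Rela");

/* Extra metadata (stored in .shiva_reloc); hypothetical layout. */
struct shiva_reloc_meta {
    enum shiva_patch_field patch_field;
    enum shiva_access_type access_type;
    uint8_t displacement_size;  /* 0 (register), 8, or 32 */
    uint8_t access_size;        /* in bits, e.g. 64 for char * */
    uint8_t src_register;       /* meaningful for writes */
    uint8_t dest_register;      /* meaningful for reads */
    const char *symbol_name;    /* DWARF name, e.g. "str" */
};

/* Same packing as the ELF64_R_INFO / ELF64_R_SYM / ELF64_R_TYPE macros. */
uint64_t shiva_r_info(uint32_t sym_index, uint32_t type)
{
    return ((uint64_t)sym_index << 32) | type;
}
uint32_t shiva_r_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
uint32_t shiva_r_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }

/* Build the standard part of the "Read, Stack-Based" example above. */
struct shiva_rela make_stack_read_rela(uint64_t insn_addr, uint32_t sym_index)
{
    struct shiva_rela r;
    r.r_offset = insn_addr;
    r.r_info = shiva_r_info(sym_index, R_X86_64_SHIVA_VAR_ACCESS);
    r.r_addend = 3;  /* disp8 is the fourth byte of 48 8b 45 e0 */
    return r;
}
```

Keeping the standard fields in an `Elf64_Rela`-shaped struct means existing ELF tooling can still walk `.rela.text`, while the Shiva-specific fields live in the separate metadata record.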

#### 1.4 Stack Frame Considerations
- The splice function is naked (declared with `__attribute__((naked))`), requiring manual stack management.
- The plugin emits:
  - `sub $N, %rsp` at the function’s start to allocate space for the splice’s local variables (e.g., temporaries like `-0x18(%rsp)`).
  - `add $N, %rsp; ret` at the function’s end to clean up and return.
- Target variables are accessed via the target function’s `%rbp` (for stack-based variables) or registers for optimized code.

### 2. Shiva Runtime Patching
Shiva resolves `R_X86_64_SHIVA_VAR_ACCESS` relocations at runtime to handle changes in the target binary’s DWARF information.

#### 2.1 DWARF Parsing
- Load the target executable’s DWARF sections (`.debug_info`, `.debug_loc` or `.debug_loclists`).
- Locate the variable’s DIE (e.g., `str` in `foo`) and evaluate `DW_AT_location` at the splice point’s PC.
- Output: Register (e.g., `%rcx` via `DW_OP_reg2`) or stack offset (e.g., `-0x30(%rbp)` via `DW_OP_breg6 -48`).

#### 2.2 Patching Logic
- **Register-Based** (`SOURCE_REGISTER` or `DEST_REGISTER`):
  - Patch the ModR/M byte at `r_addend=2` to update the register (e.g., `c1` to `c2` for `%rcx` to `%rdx` in `mov %rcx, %rax`).
  - Instruction length remains unchanged (3 bytes for register-to-register moves).
- **Stack-Based** (`SOURCE_OFFSET` or `DEST_OFFSET`):
  - For 8-bit displacement (`displacement_size=8`): Patch the displacement byte at `r_addend=3` (e.g., `e0` to `d0` for `-0x20` to `-0x30`).
  - For 32-bit displacement (`displacement_size=32`): Patch the ModR/M byte at `r_addend=2` (e.g., `45` to `85` for `Mod=10`) and write a 4-byte displacement (e.g., `00 f0 ff ff` for `-0x1000`).
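  - If the instruction length changes (e.g., 4 to 7 bytes), the extra bytes must already be reserved: growing an instruction in place would otherwise overwrite the instruction that follows it, so the plugin should emit trailing NOP (`90`) padding (or the 32-bit displacement form) at compile time. If an instruction shrinks, pad the leftover bytes with NOPs (`90`).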
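- Make the code page writable before patching (e.g., with `mprotect`) and restore its permissions afterwards. `mprotect` alone does not make the write atomic, so also ensure that no thread can execute the instruction while it is being patched in a running process.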
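
The byte-level part of this logic can be sketched on a plain buffer, leaving page permissions and thread coordination aside. The function names below are hypothetical, and the 32-bit case assumes the instruction slot is already 7 bytes long (for example because padding was reserved when the splice was compiled). For both `8b` reads and `89` writes as emitted in section 1.2, the target variable's register sits in the low three bits (`rm`) of the ModR/M byte, so one helper covers `SOURCE_REGISTER` and `DEST_REGISTER`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace ModR/M.rm with a new hardware register number (0..7).
 * Returns 0 on success, -1 if the byte is not a register-direct form. */
int patch_modrm_register(uint8_t *insn, size_t len, size_t addend, unsigned new_reg)
{
    assert(insn != NULL);
    if (addend >= len || new_reg > 7)
        return -1;
    if ((insn[addend] & 0xc0) != 0xc0)  /* require Mod=11 */
        return -1;
    insn[addend] = (uint8_t)((insn[addend] & 0xf8) | new_reg);
    return 0;
}

/* Overwrite an 8-bit displacement; fails if the new offset does not fit. */
int patch_disp8(uint8_t *insn, size_t len, size_t addend, int32_t new_disp)
{
    assert(insn != NULL);
    if (addend >= len || new_disp < -128 || new_disp > 127)
        return -1;
    insn[addend] = (uint8_t)new_disp;
    return 0;
}

/* Switch the ModR/M byte to Mod=10 and write a 4-byte little-endian
 * displacement after it. slot_len is the number of bytes reserved for
 * this instruction; the call fails if the displacement would not fit. */
int patch_disp32(uint8_t *insn, size_t slot_len, size_t modrm_at, int32_t new_disp)
{
    assert(insn != NULL);
    if (modrm_at + 5 > slot_len)
        return -1;
    insn[modrm_at] = (uint8_t)((insn[modrm_at] & 0x3f) | 0x80);  /* Mod=10 */
    uint32_t u = (uint32_t)new_disp;
    for (int i = 0; i < 4; i++)
        insn[modrm_at + 1 + i] = (uint8_t)(u >> (8 * i));
    return 0;
}
```

Applied to `48 8b c1` with `addend=2` and register 2, the first helper produces `48 8b c2`; applied to a 7-byte slot holding `48 8b 45 e0 90 90 90`, `patch_disp32` with `-0x1000` produces `48 8b 85 00 f0 ff ff`.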

#### 2.3 Validation
- Verify register indices and stack offsets to prevent invalid instructions (e.g., invalid register or out-of-range offset).
- Check PC ranges in DWARF location lists to ensure the splice point is within the valid scope of the variable’s location.
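
These checks might look like the following sketch. `struct loc_range` stands in for an already-decoded location-list entry and is a hypothetical type, as are the function names; the half-open `[low_pc, high_pc)` interpretation matches how DWARF location-list ranges are defined. The register check assumes the `48`-prefixed encodings from section 1.2, which can only name the low eight registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded location-list entry, valid for low_pc <= pc < high_pc. */
struct loc_range {
    uint64_t low_pc;
    uint64_t high_pc;  /* exclusive */
};

/* Return the index of the entry covering pc, or -1 if the variable has
 * no location at the splice point. */
int find_covering_range(const struct loc_range *ranges, size_t count, uint64_t pc)
{
    assert(ranges != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (pc >= ranges[i].low_pc && pc < ranges[i].high_pc)
            return (int)i;
    return -1;
}

/* Only hardware registers 0..7 are encodable without REX.R/REX.B. */
bool register_index_ok(unsigned hw_reg)
{
    return hw_reg <= 7;
}

/* Check that a stack offset fits the displacement width recorded in the
 * relocation (8 or 32 bits). */
bool offset_fits(int64_t offset, unsigned displacement_size)
{
    if (displacement_size == 8)
        return offset >= INT8_MIN && offset <= INT8_MAX;
    if (displacement_size == 32)
        return offset >= INT32_MIN && offset <= INT32_MAX;
    return false;
}
```

A patcher would run these before touching any bytes and refuse to apply the relocation if any check fails, rather than writing a malformed instruction.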

### 3. Example Workflow
#### Target Function (`foo`)
```c
void foo() {
    char *str = "hello"; // DW_AT_location: DW_OP_breg6 -32 (stack) or DW_OP_reg2 (%rcx)
    ...
}
```
