| 1 | +# RTL Reflection Call Overhead Analysis |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document summarizes the flat per-call overhead of RTL's reflective call paths across all benchmarked call types. It groups the results into **non-erased** and **erased** call paths and reports the lowest and highest overhead observed across the entire test suite. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Non-Erased RTL Calls (Fast Path) |
| 10 | + |
| 11 | +### Summary |
| 12 | + |
| 13 | +Non-erased calls (`rtl::function` / `rtl::method`) exhibit near-zero overhead in all realistic workloads and only minimal overhead in microbenchmarks. |
| 14 | + |
| 15 | +### Overhead Range |
| 16 | + |
| 17 | +* **Best case (real workloads):** Effectively *0 ns* overhead. Measurements match direct calls within noise. |
| 18 | +* **Worst case (scale = 0, trivial function body):** |
| 19 | + |
| 20 | + * set: ~+0.4 ns overhead |
| 21 | + * get: ~+1.1 ns overhead |
| 22 | + * Relative cost: ~1.6×–1.8× direct call cost in pure overhead tests. |
| 23 | + |
| 24 | +### Practical Interpretation |
| 25 | + |
| 26 | +Non-erased RTL calls are effectively *free* for practical purposes and safe even for ultra-hot loops. |
| 27 | + |
| 28 | +--- |
| 29 | + |
| 30 | +## Erased RTL Calls (Most Expensive Path) |
| 31 | + |
| 32 | +### Absolute Worst Case (All Benchmarks) |
| 33 | + |
| 34 | +The highest overhead observed occurs in fully erased `get` calls on trivial functions. |
| 35 | + |
| 36 | +* **Fully Erased get:** ~+15–16 ns overhead |
| 37 | +* **Relative:** ~12×–13× slower than a direct call |
| 38 | +* **Condition:** Function body is trivial (scale = 0) |
| 39 | + |
| 40 | +This is the *maximum overhead observed* anywhere in the benchmark suite. |
| 41 | + |
| 42 | +### Typical Hotpath Overhead (Real Workload) |
| 43 | + |
| 44 | +When the function performs meaningful computation (scale ≥ 5): |
| 45 | + |
| 46 | +* **set:** +3–6% overhead |
| 47 | +* **get:** +5–10% overhead |
| 48 | +* **Erased target only:** +1–3% overhead, often nearly at parity with direct |
| 49 | + |
| 50 | +### Practical Interpretation |
| 51 | + |
| 52 | +Erased calls introduce a measurable cost in pure overhead scenarios, but once real work exists, the relative overhead becomes small and predictable. |
| 53 | + |
| 54 | +--- |
| 55 | + |
| 56 | +## Flat Overhead Price Card |
| 57 | + |
| 58 | +### Non-Erased RTL |
| 59 | + |
| 60 | +* **Realistic Overhead:** 0–1 ns |
| 61 | +* **Pathological Worst Case:** ~1.2 ns |
| 62 | +* **Use Case:** Safe for ultra-hot loops; equivalent to direct calls and often faster than `std::function`. |
| 63 | + |
| 64 | +### Erased RTL |
| 65 | + |
| 66 | +* **Maximum Overhead Across All Tests:** ~16 ns |
| 67 | +* **Realistic Overhead:** +3–10% per call |
| 68 | +* **Target-Only Erasure:** +1–3% |
| 69 | +* **Use Case:** Suitable for high-performance code unless the function body is trivial. |
| 70 | + |
| 71 | +--- |
| 72 | + |
| 73 | +## One-Line Summary |
| 74 | + |
| 75 | +Across all callables, RTL's overhead ranges from **effectively zero** (non-erased) to a **maximum of ~16 ns** (fully erased trivial calls), with real-world workloads sitting comfortably at **+3–10%** overhead for erased calls. |