Add clocks and cpubench subcommands; fix #161 PLL decode#162
Merged
Conversation
Closes #161. The new `clocks` subcommand exposes the running SoC's CPU PLL, DDR clock, and per-die HPM (Hardware Performance Monitor) characterization as YAML (default) or JSON (`--json`). Output is also wired into the default `ipctool` survey as a top-level `clocks:` section. The decode is table-driven (`struct pll_info`, `struct mux_info`, `struct hpm_info`, `struct raw_reg_info`) so future SoC families can be added as pure data next to `src/hal/hisi/clocks_v4.c`. V4 family register map, verified register-for-register on three lab boards (hi3516ev300 OpenIPC, gk7205v300 OpenIPC, gk7205v300 XM Sofia): CPU PLL (APLL) register pair (0x12010000, 0x12010004) ctrl_reg1 [23:0] FRACDIV [26:24] POSTDIV1 [30:28] POSTDIV2 ctrl_reg2 [11:0] FBDIV [17:12] REFDIV f = 24 MHz * FBDIV / (REFDIV * POSTDIV1 * POSTDIV2) DDR cksel mux 0x12010080 bits [5:3] (001->450 MHz, DDR3 x4) HPM 0x1202015c + 0x120280d8 aux fingerprint APLL bit layout borrowed from `struct hi3516a_pll_clock` in the HiSilicon SDK kernel patch (Hi3516EV200_SDK_V1.0.1.2 / linux-4.9.37 patch). The V4 SDK kernel patch itself defines no PLL struct -- V4's kernel clock driver treats CPU clock as fixed-rate post-boot, so the mask ROM is the only writer; both vendor u-boots leave the APLL register pair untouched (only 0x12010080 DDR cksel is written by both, confirmed via xxd of the reg_info_*.bin tables). Issue #161 originally identified `0x12010014` as "CPU PLL FBDIV" with `0x01770000 -> 952 MHz` (Board A) vs `0x018F0000 -> 1144 MHz` (Board B). Bench on three boards with three different values at `0x12010014` (`0x00000000`, `0x01770000`, `0x018F0000`) yields identical CPU clock to within 3% noise -- the value at `0x14` is a mask-ROM-written diagnostic that correlates with per-die HPM binning but does NOT drive any active clock. This commit exposes `0x12010014` and `0x1201000c` as `pll_shadow_*` raw entries (still useful for fleet silicon-binning comparisons) with a "not a live FBDIV" note. The multi-pattern triangulation that settled the question is now shipped as `ipctool cpubench` -- runs three inline-asm patterns with known Cortex-A7 throughput (dep_add 1 cyc/op, indep_add 0.5 cyc/op dual-issue, dep_mul 3-cyc latency) and back-calculates the clock from each, reporting median consensus and spread. Useful for future board bring-up when a PLL register decode is suspect. End-to-end on all three lab boards: hi3516ev300 OpenIPC: cpu_pll.freq_mhz=900, cpubench median 882 gk7205v300 OpenIPC: cpu_pll.freq_mhz=900, cpubench median 874 gk7205v300 XM Sofia: cpu_pll.freq_mhz=900, cpubench median 844* * XM has unkillable kernel/userspace background load that drags bench down ~5%; dep_mul (most reliable) still hits 863 MHz. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
widgetii
added a commit
that referenced
this pull request
May 14, 2026
PR review feedback (#163): the YAML emitted by the no-arg `ipctool` survey was too verbose for a daily-driver summary, and the previous commit's `note:` fields + `family:` line added noise without helping anyone read the output. Changes: - `clocks_build_json()` gains a `bool brief` parameter. - `brief = true` (used by `ipctool` no-arg survey): emits only `cpu_pll.freq_mhz`, `ddr.data_rate_mbps`, `hpm.bin`. - `brief = false` (used by `ipctool clocks`): emits the full register-by-register YAML the subcommand has always produced. - Remove the `family:` line; the chip name is already in `chip.model` one level up. - Remove every `note:` field. They were either workarounds for "we haven't fully decoded this register" or restated info already obvious from sibling fields. - Remove `pll_shadow_0c/14` (#162) and `ddr_pll/eth_pll/video_pll/ pll_lock_status` (this PR's earlier commit). Their FBDIV bit layout on V4 differs from APLL (field-shaped bytes sit at [23:16] rather than [11:0]) and the REFDIV/POSTDIV positions aren't confirmed -- shipping the raw words with a "decode TBD" note was noise. Per review: decypher or remove totally -- removed until a DDR-throughput probe and ethernet PHY rate cross-check anchor the layout. - Keep the lock-bit signal that IS reliable: APLL lock from PERI_CRG_PLL122 bit 0. Expose it as `cpu_pll.locked: true/false` in the full output (bit 0 = APLL on V4 confirmed across all three lab boards; other bits' assignment on V4 still TBD). Survey output is now (Goke V300 example): clocks: cpu_pll: freq_mhz: 900 ddr: data_rate_mbps: 1800 hpm: bin: mid Verified end-to-end on all three lab boards (hi3516ev300 OpenIPC, gk7205v300 OpenIPC, gk7205v300 XM Sofia): identical 8-line `clocks:` section in survey mode; full `ipctool clocks` output unchanged except for the dropped notes/family and the added cpu_pll.locked. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
widgetii
added a commit
that referenced
this pull request
May 14, 2026
* clocks: rename pll_shadow_* to ddr_pll/eth_pll, add video_pll + lock status Follow-up to #162. The `pll_shadow_0c` / `pll_shadow_14` entries shipped there were honest empirical observations ("vary by per-die HPM, don't drive CPU clock") but used misleading labels: those registers aren't shadows, they are real PLL configuration pairs for non-CPU domains. Source: the V4A / HI3516CV500 mask-ROM reverse engineering at github.com/widgetii/HI3516CV500-SDK/blob/master/sdk/bootrom/bootrom-re/ regmap-crg.h names them explicitly: APLL_CONFIG_0/1 = CRG_BASE + 0x00/0x04 CPU PLL DPLL_CONFIG_0/1 = CRG_BASE + 0x08/0x0C DDR PLL EPLL_CONFIG_0/1 = CRG_BASE + 0x10/0x14 Ethernet PLL VPLL_CONFIG_0/1 = CRG_BASE + 0x18/0x1C Video PLL PLL_LOCK_STAT = CRG_BASE + 0x1E8 lock status V4A is one generation up from V4 but shares the CRG layout (same APLL definition recurs in the Hi3516A SDK kernel patch and was empirically validated on V4 in #162), so the per-PLL labelling carries over. Changes: - `struct raw_reg_info` gains an optional `reg2` field so a single entry can dump both CONFIG_0 and CONFIG_1 of a PLL pair. - Rename `pll_shadow_0c` -> `ddr_pll`, `pll_shadow_14` -> `eth_pll`. Add `video_pll` at (0x18, 0x1C) which was previously unexposed. - Add `pll_lock_status` reading PERI_CRG_PLL122 (0x120101E8) raw, with a note explaining the V4A bootrom polls `0xB` (bits 0/1/3 for APLL/DPLL/EPLL) but V4 reads `0x5` (bits 0/2) on all three lab boards -- so V4's per-bit assignment likely differs from V4A's; ship the raw value, leave per-bit decode as a TODO. FBDIV decode for DPLL/EPLL/VPLL is also TODO: empirically the FBDIV-shaped bytes sit at bits [23:16] (0x8F/0x97/0x77 in the field data) rather than [11:0] like APLL, so the layout differs. Need a DDR-throughput probe and ethernet PHY rate cross-check before shipping a frequency calculation. The raw config pair is dumped verbatim so users can fleet-compare without misreading. Verified on three lab boards: cpu_pll.freq_mhz still 900 (unchanged), new fields render as expected, no regressions in ddr / hpm / cpu_running / cpubench outputs. The hi3516ev300 (HiSi-branded) firmware reads zero for both DPLL and EPLL config pairs while DDR still runs at 450 MHz, suggesting the HiSi firmware leaves those PLLs gated and derives DDR clock differently; the Goke firmware actively programs DPLL and EPLL. Out of scope for this PR. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * clocks: brief survey output, drop family/notes/non-decoded entries PR review feedback (#163): the YAML emitted by the no-arg `ipctool` survey was too verbose for a daily-driver summary, and the previous commit's `note:` fields + `family:` line added noise without helping anyone read the output. Changes: - `clocks_build_json()` gains a `bool brief` parameter. - `brief = true` (used by `ipctool` no-arg survey): emits only `cpu_pll.freq_mhz`, `ddr.data_rate_mbps`, `hpm.bin`. - `brief = false` (used by `ipctool clocks`): emits the full register-by-register YAML the subcommand has always produced. - Remove the `family:` line; the chip name is already in `chip.model` one level up. - Remove every `note:` field. They were either workarounds for "we haven't fully decoded this register" or restated info already obvious from sibling fields. - Remove `pll_shadow_0c/14` (#162) and `ddr_pll/eth_pll/video_pll/ pll_lock_status` (this PR's earlier commit). Their FBDIV bit layout on V4 differs from APLL (field-shaped bytes sit at [23:16] rather than [11:0]) and the REFDIV/POSTDIV positions aren't confirmed -- shipping the raw words with a "decode TBD" note was noise. Per review: decypher or remove totally -- removed until a DDR-throughput probe and ethernet PHY rate cross-check anchor the layout. - Keep the lock-bit signal that IS reliable: APLL lock from PERI_CRG_PLL122 bit 0. Expose it as `cpu_pll.locked: true/false` in the full output (bit 0 = APLL on V4 confirmed across all three lab boards; other bits' assignment on V4 still TBD). Survey output is now (Goke V300 example): clocks: cpu_pll: freq_mhz: 900 ddr: data_rate_mbps: 1800 hpm: bin: mid Verified end-to-end on all three lab boards (hi3516ev300 OpenIPC, gk7205v300 OpenIPC, gk7205v300 XM Sofia): identical 8-line `clocks:` section in survey mode; full `ipctool clocks` output unchanged except for the dropped notes/family and the added cpu_pll.locked. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
widgetii
added a commit
that referenced
this pull request
May 14, 2026
…165) Closes #160. `ipctool membw` runs three synthetic memory-bandwidth ops against large anonymous DDR buffers (mmap of /dev/zero, NOT malloc) and reports MB/s: write : memset over the buffer (W-only, libc-dependent) read : volatile uint32_t sum loop (R-only, libc-INdependent -- most trustworthy for cross-firmware comparison) copy : memcpy between two buffers (R+W, counted as 2x bytes) CLI matches the existing clocks/cpubench shape: --size MB buffer size per pass (default: 16; must exceed L2) --iters N passes per op (default: 16) --ops a,b,c comma list of write,read,copy (default: all) --json JSON output instead of YAML Output is YAML by default with a `chip:` tag for context: membw: buffer_mb: 16 iters: 16 results: write: mb_per_sec: 2243 duration_s: 0.120 read: mb_per_sec: 421 duration_s: 0.637 copy: mb_per_sec: 1863 duration_s: 0.288 chip: hi3516ev300 Use case (from #161 / #162 debugging): when two boards with the same SoC behave differently, this separates "CPU pipeline is the bottleneck" from "DDR pipeline is the bottleneck" in a few seconds. With APLL decode and HPM bin now in `ipctool clocks` from #162-#164, this PR closes the third leg of the same investigation flow. Verified on four lab boards (all with majestic / vendor App stopped to measure DDR config baseline rather than workload): hi3516ev300 (V4, OpenIPC): write 2243 read 421 copy 1863 MB/s gk7205v300 (V4, OpenIPC): write 2096 read 417 copy 1633 MB/s gk7205v300 (V4, XM Sofia): write 1576 read 370 copy 1302 MB/s [--size 4] hi3516av300 (V4A, OpenIPC): write 2320 read 427 copy 2440 MB/s XM Sofia ran with --size 4 because the board has only 48 MB userspace memory (the rest is mmz_anonymous for the encoder), so the default 32 MB total (2 x 16 MB buffers) doesn't fit -- confirms --size is a genuine knob, not just a tunable. Buffer-via-mmap caveat baked in per the issue: anonymous DDR pages rather than tmpfs / page cache. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
widgetii
added a commit
that referenced
this pull request
May 14, 2026
Replaces the build-toolchain-scp-dump dance from the #161/#162 work with a single `ipctool bootrom` call. Found during issue #160 follow-up when comparing mask-ROM accessibility across Goke V4 / HiSi V4 / V4A. Default (no flags): YAML metadata -- base, size, accessibility, first 16 bytes hex, chip name. bootrom: base: 0x04000000 size: 65536 accessible: 1 first_bytes: 10 00 00 ea 14 f0 9f e5 14 f0 9f e5 14 f0 9f e5 chip: hi3516av300 `accessible: 0` means the region reads as all zeros -- the Goke V4 case where the silicon hides the bootrom from userspace post-boot (see the corresponding kaeru lesson). For HiSi V4 / V4A boards the flag reliably reports 1. `--dump` writes the raw binary region to stdout (intended for redirect/pipe). No metadata preamble. Equivalent to the stand-alone dump_bootrom binary used in the earlier diagnostic but built into ipctool so no separate toolchain is needed. Per-family defaults sourced from the V4A mask-ROM RE (HI3516CV500-SDK/sdk/bootrom/bootrom.cpp: `BOOTROM = 0x04000000`, SRAM = 0x04010000, so 64 KB max). Currently shipping defaults for HISI_V4 (0x3516E300) and HISI_V4A (0x3516C500); other families require explicit --base / --size. Cross-board MD5 of `ipctool bootrom --dump | md5sum` reproduces the stand-alone dumper results exactly: hi3516ev300 (V4 HiSi): 669081159f4fed97ad4db0bfd5251dca (real ROM) gk7205v300 (V4 Goke): fcd6bcb56c1689fcef28b57c22475bad (all zeros) hi3516av300 (V4A): 1c2198a0a30c8258f5ab0c27aa523ff9 (real ROM, different mask) The `accessible` heuristic (all-zero buffer detector) correctly flags the Goke V4 case as `0` and the two real-ROM cases as `1`. CLI matches the existing clocks/cpubench/membw shape: ipctool bootrom [--base ADDR] [--size N] [--dump] [--json] Note: --base/--size accept any address; reading non-bootrom MMIO regions byte-by-byte (e.g. CRG at 0x12010000) will SIGBUS as the kernel signals an unaligned/word-only access. That's appropriate behavior -- the defaults are the supported path. Documented in the --help output. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ipctool clocks(aliasfreq) — reports CPU PLL, DDR clock, and per-die HPM characterization of the running SoC as YAML (default) or JSON. Also surfaces as a top-levelclocks:section in the defaultipctoolno-arg survey.ipctool cpubench— multi-pattern inline-asm CPU clock triangulation. Three Cortex-A7-tuned patterns (dep_add, indep_add, dep_mul) cross-validate the register-based clock decode. Kept as a battle-proven test for future board bring-up when a PLL register decode is suspect.0x12010014is not the CPU PLL FBDIV; the real CPU PLL (APLL) is the register pair(0x12010000, 0x12010004)with multi-field decode (FRACDIV/POSTDIV1/POSTDIV2 + FBDIV/REFDIV). All current V4 boards run at 900 MHz.Closes #161.
Falsifying the issue's register interpretation
The issue body identifies
0x12010014as "CPU PLL FBDIV" with values0x01770000 → 952 MHz(Board A) vs0x018F0000 → 1144 MHz(Board B), attributed to per-die HPM-driven mask-ROM binning.Bench on three V4 lab boards with three different values at
0x12010014:0x12010014rawdep_mulMHz0x000000000x01770000(claimed 952)0x018F0000(claimed 1144)* XM has unkillable kernel/userspace background load that drags estimates ~5% low.
All three boards run at the same CPU clock regardless of the
0x12010014value, so it cannot be the live CPU PLL FBDIV. The value at0x12010014is mask-ROM-written and correlates with per-die HPM binning, but doesn't drive any active clock.What
0x12010004isThe Hi3516A SDK kernel patch (
Hi3516EV200_SDK_V1.0.1.2/osdrv/opensource/kernel/linux-4.9.37.patch) definesstruct hi3516a_pll_clockfor APLL as a register pair (0x0,0x4):The V4 SDK kernel patch itself defines no PLL struct (
clk-hi3516ev300.cdeclares only fixed-rate and mux/gate clocks — the V4 Linux clock driver treats CPU clock as fixed-rate post-boot). Applying the Hi3516A layout to V4's actual reads(0x12010000, 0x12010004) = (0x12000000, 0x0100104B)gives FBDIV=75, REFDIV=1, POSTDIV1=2, POSTDIV2=1 → 900 MHz, within 0.2% of the cpubench triangulation.Cross-checked: both
u-boot-hi3516ev200/reg_info_hi3516ev300.binandu-boot-gk7205v200/reg_info_gk7205v300.binwrite only0x12010080(DDR cksel). Neither writes the APLL register pair — the mask ROM is the sole writer and the value is identical across all V4 boards (HiSi and Goke brandings).V4 family register map shipped in this PR
cpu_pll(APLL)(0x12010000, 0x12010004)ddrcksel0x12010080bits[5:3]hpm0x1202015Cbits[25:16]hpm.aux(HPM_CORE_REG0)0x120280D8pll_shadow_0c(raw)0x1201000Cpll_shadow_14(raw)0x12010014The two
pll_shadow_*entries are kept so users can still spot per-die HPM correlations in the field, but with an explicit note that they don't drive any clock.Sample output (gk7205v300 OpenIPC)
Test plan
arm-openipc-linux-musleabi(CI baseline, hi3516cv100 toolchain) + bare UPX — no warnings under-Wextra -O0/-Os.ipctool clockson hi3516ev300 OpenIPC → 900 MHz, shadow regs both zero (mask ROM didn't write them on the HiSi brand).ipctool clockson gk7205v300 OpenIPC → 900 MHz, shadow regs0x01770000/0x018F0000(HPM-bin shadow only, properly labeled).ipctool clockson gk7205v300 XM Sofia → 900 MHz, shadow regs0x018F0000/0x01970000.ipctoolno-arg →clocks:section appears alongsidechip:,ram:,sensors:, etc.ipctool cpubenchon all three boards → median 844-882 MHz, dep_mul 863-900 MHz (within model noise of 900).ipctool clocks --jsonandipctool cpubench --json→ valid JSON viajq.ipctool freqalias works.Out of scope (TODOs in code comments)
0x120d0000(currently assumes DDR3 quad-pumped viarate_mult = 4).scaling_cur_freqfor SMP variants (cpu0 only).🤖 Generated with Claude Code