
feat(rvf): accept JSONL RVF container in --model loader #810

Open
lockewerks wants to merge 3 commits into ruvnet:main from lockewerks:feat/jsonl-rvf-adapter


@lockewerks

Summary

The README.md documents this gap:

"the HF model ships in JSONL RVF format, but v2/crates/wifi-densepose-sensing-server/src/rvf_container.rs only parses the binary RVF segment format. Pointing --model at model.rvf.jsonl currently errors with invalid magic at offset 0: expected 0x52564653, got 0x7974227B and the live pipeline degrades to null output rather than falling back to heuristic mode"

This PR closes that gap.
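
For readers wondering where the "got" value in that error comes from, here is a minimal sketch. It assumes the loader reads the magic as a little-endian u32, which the reported value suggests but the description does not state; `magic_as_text` is a hypothetical helper, not part of the crate.

```rust
/// Hypothetical helper: turn a u32 magic value back into the four bytes it
/// would have been read from, ASSUMING a little-endian read.
/// Non-printable bytes are shown as '.'.
fn magic_as_text(magic: u32) -> String {
    magic
        .to_le_bytes()
        .iter()
        .map(|&b| if b.is_ascii_graphic() { b as char } else { '.' })
        .collect()
}
```

Under that assumption, `magic_as_text(0x7974227B)` yields `{"ty`, i.e. the opening of a JSON line such as `{"type": ...}`, which is what a JSONL container would present to a binary-only parser.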

Changes

  • v2/crates/wifi-densepose-sensing-server/src/rvf_container.rs (+283/-2):
    • from_bytes now sniffs the first non-whitespace byte and dispatches:
      • R → existing binary path (RVFS magic). Byte-for-byte unchanged; all 14 pre-existing binary tests still green.
      • { → new JSONL path that reads lines, parses each as JSON, builds an in-memory equivalent of binary segments (SEG_MANIFEST / SEG_QUANT / SEG_META), re-uses the existing parser
      • Otherwise → explicit error citing both supported formats
    • Binary "invalid magic" error now hints at the JSONL alternative when the first byte looks like JSON
    • 10 new tests cover the JSONL path including round-trip, error paths, and the ProgressiveLoader::load_layer_a integration that main.rs uses on startup
  • README.md (+2/-2): replaced the "Known gap" paragraph with a description of what the loader now accepts
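
As an illustration of the sniff-and-dispatch step listed above, here is a minimal sketch, not the code in this PR. The `Container` enum, the `sniff` function, and the error wording are hypothetical; only the dispatch rule ('R' for binary, '{' for JSONL, anything else is an explicit error naming both formats) comes from the description.

```rust
/// Hypothetical names for the two container formats the loader accepts.
#[derive(Debug, PartialEq, Eq)]
enum Container {
    Binary,
    Jsonl,
}

/// Look at the first non-whitespace byte and decide which parser to use.
fn sniff(bytes: &[u8]) -> Result<Container, String> {
    match bytes.iter().copied().find(|b| !b.is_ascii_whitespace()) {
        Some(b'R') => Ok(Container::Binary),
        Some(b'{') => Ok(Container::Jsonl),
        // Anything else: fail loudly and name both supported formats.
        Some(other) => Err(format!(
            "unrecognised input (first byte 0x{other:02X}); expected binary RVF or JSONL RVF"
        )),
        None => Err("empty input; expected binary RVF or JSONL RVF".to_string()),
    }
}
```

Sniffing a single byte keeps the binary path untouched: input that starts with 'R' never reaches the JSONL code at all.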

Verification

  • cargo test -p wifi-densepose-sensing-server --no-default-features --lib → 435 passed, 0 failed, 1 ignored (10 new tests included)
  • cargo build -p wifi-densepose-sensing-server --no-default-features → clean (warnings only)
  • End-to-end manually verified: launching sensing-server --model models/wifi-densepose-pretrained/model.rvf.jsonl --source esp32 now logs Layer A ready: model=wifi-densepose-csi-embedding v1.0.0 (3 segments) instead of the previous null-output degrade

Notes

The JSONL bundle is metadata-only by design: the f32 weight matrix lives in the sibling model.safetensors. RvfReader::weights() therefore returns None for JSONL input, the same behaviour as launching without --model. The README change in this PR states this explicitly.

A future patch could stitch the sibling model.safetensors into a synthetic SEG_VEC segment so weights() returns real data — out of scope here. (Note: that future patch additionally needs the safetensors header fix to land first; see PR fix/safetensors-header-padding on this fork.)

Test plan

  • cargo test -p wifi-densepose-sensing-server --no-default-features --lib passes including new jsonl_* tests
  • cargo build -p wifi-densepose-sensing-server --no-default-features clean
  • Launch sensing-server --model <path>/model.rvf.jsonl --source esp32 and confirm Layer A ready log line
  • Launch sensing-server --model <path>/model.rvf (existing binary) and confirm no regression
  • Hand the loader a malformed file (random bytes) and confirm the new explicit error mentions both formats

RvfReader::from_bytes now sniffs the leading non-whitespace byte and
dispatches to a JSONL parser when it sees '{' or '['. The new
from_jsonl_bytes helper walks each line, validates that it is a JSON
object with a "type" field, and maps known types onto in-memory binary
segments so the rest of the pipeline keeps working unchanged:

  type=metadata     -> SEG_MANIFEST (name, version, architecture)
  type=quantization -> SEG_QUANT    (full JSON payload, default
                                     quant_type filled in if absent)
  type=*            -> SEG_META     (verbatim, bundled into one entry)
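
The mapping above can be sketched as a single match. This is an illustration only: the `Segment` enum and `segment_for` are hypothetical stand-ins for the crate's SEG_* segment kinds, and the real adapter also builds the segment payloads, which this sketch omits.

```rust
/// Hypothetical stand-in for the SEG_MANIFEST / SEG_QUANT / SEG_META kinds.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Segment {
    Manifest,
    Quant,
    Meta,
}

/// Map the "type" field of one JSONL line onto a segment kind.
fn segment_for(line_type: &str) -> Segment {
    match line_type {
        "metadata" => Segment::Manifest,
        "quantization" => Segment::Quant,
        // Every other type is kept verbatim as metadata.
        _ => Segment::Meta,
    }
}
```

The catch-all arm means line types the loader does not know about are preserved rather than rejected.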

The binary-path "invalid magic" error now points operators at the JSONL
format so failures are explicit instead of degrading to null output, and
unrecognised content (non-UTF-8, no objects, missing type) returns a
detailed error rather than a silent partial parse.
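
A rough sketch of the "fail explicitly" behaviour described above, using only the standard library. It is not the PR's from_jsonl_bytes: `object_lines` is a hypothetical pre-check, the error wording is invented for illustration, and the brace test is a naive stand-in for real JSON parsing (it does not check for a "type" field).

```rust
/// Hypothetical pre-check: return (1-based line number, trimmed line) for
/// every non-blank line, or an error that names the offending line.
/// NOTE: the `{ ... }` test is a naive placeholder for a real JSON parser.
fn object_lines(bytes: &[u8]) -> Result<Vec<(usize, &str)>, String> {
    let text = std::str::from_utf8(bytes)
        .map_err(|e| format!("JSONL RVF: input is not valid UTF-8: {e}"))?;
    let mut out = Vec::new();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() {
            continue; // blank lines are skipped, not treated as errors
        }
        if !(line.starts_with('{') && line.ends_with('}')) {
            return Err(format!("JSONL RVF: line {}: expected a JSON object", idx + 1));
        }
        out.push((idx + 1, line));
    }
    if out.is_empty() {
        return Err("JSONL RVF: no JSON objects found".to_string());
    }
    Ok(out)
}
```

Returning an error for the first bad line, with its number, is what lets an operator find the problem instead of seeing a partially parsed model.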

The JSONL container intentionally does not carry the f32 weight matrix;
those weights ship as model.safetensors / model-qN.bin in the HuggingFace
bundle, so weights() returns None for JSONL inputs. Callers that need
the convolution weights must still load one of the sibling files.
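
For callers that need the weights, one way to locate the sibling file is sketched below. `sibling_safetensors` is a hypothetical helper, and it assumes the weights sit next to the JSONL file under the name model.safetensors, as in the HuggingFace bundle; it only computes a path and does not read or validate the file.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: given the path passed to --model, return the path
/// of a model.safetensors file in the same directory (assumed layout).
fn sibling_safetensors(model_path: &Path) -> PathBuf {
    model_path.with_file_name("model.safetensors")
}
```

A caller would still need to check that the returned path exists before trying to load it.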

Fixes the documented gap where pointing the sensing-server --model flag
at model.rvf.jsonl from the HuggingFace bundle errored with
"invalid magic at offset 0: expected 0x52564653, got 0x7974227B".

Covers the new sniff + JSONL adapter end-to-end:

  - from_bytes_dispatches_to_jsonl_on_brace: feeds an in-memory copy
    of the exact bytes shipped at ruvnet/wifi-densepose-pretrained
    (model.rvf.jsonl, v1.0.0) through the public API and asserts the
    synthesised manifest exposes the real model id and version.
  - jsonl_sniff_tolerates_leading_whitespace: padding with \n \t still
    dispatches to JSONL.
  - jsonl_quantization_becomes_quant_segment: the quantization line
    surfaces verbatim through quant_info().
  - jsonl_preserves_other_lines_in_metadata: encoder, lora, ewc, and
    metadata lines all round-trip through metadata()["lines"].
  - jsonl_no_weights_segment_present: weights() returns None - the
    JSONL bundle does not carry f32 weights, by design.
  - jsonl_progressive_loader_layer_a_works: covers the integration
    point that previously broke - ProgressiveLoader::new + load_layer_a
    now reports the real model name on JSONL input.
  - jsonl_invalid_json_line_is_explicit / jsonl_missing_type_field /
    jsonl_blank_lines_only: every error path produces a "JSONL RVF"
    prefix and identifies the offending line, so failures surface to
    operators instead of degrading to null output.
  - jsonl_minimal_metadata_only: a single-line bundle still parses.
  - binary_error_mentions_jsonl_hint: corrupt binary input now points
    at the JSONL format in its error text.

The "Pretrained model on Hugging Face" section now reflects that the
sensing-server --model flag accepts the JSONL container shipped at
ruvnet/wifi-densepose-pretrained directly, with no preprocessing.

Replaces the "Known gap" paragraph with a "JSONL loader" paragraph
that documents the segment mapping (metadata -> SEG_MANIFEST,
quantization -> SEG_QUANT, anything else -> SEG_META) and is honest
about the remaining limitation: JSONL does not carry the f32 weight
matrix, so weight-sensitive inference paths still need the sibling
model.safetensors / model-qN.bin file from the HF bundle.