fix(router-core): avoid null bytes in dehydrated SSR match ids #7586
VihAMBR wants to merge 1 commit
dehydrateSsrMatchId replaced "/" with U+0000 so dehydrated ids would not look like crawlable URLs (TanStack#6739). U+0000 is forbidden in the HTML input stream though, so the inlined hydration payload tripped a control-character-in-input-stream parse error and failed markup validation. Encode with U+FFFD instead, which is valid in HTML and which hydrateSsrMatchId already decodes back to "/".
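The encoding described above can be sketched as a simple delimiter swap. This is a minimal illustration, not the router-core source: `encodeMatchId` and `decodeMatchId` are hypothetical stand-ins for `dehydrateSsrMatchId` and `hydrateSsrMatchId`, and the real functions may be implemented differently.

```typescript
// Hypothetical stand-ins for the router-core codec; names and bodies are assumptions.
const ENCODED_SLASH = '\uFFFD' // U+FFFD REPLACEMENT CHARACTER, allowed in HTML
const LEGACY_ENCODED_SLASH = '\u0000' // U+0000, the delimiter used before this change

// Replace every "/" so the dehydrated id no longer looks like a relative URL.
function encodeMatchId(id: string): string {
  return id.split('/').join(ENCODED_SLASH)
}

// Decode the current delimiter and the legacy null byte back to "/".
function decodeMatchId(id: string): string {
  return id
    .split(ENCODED_SLASH)
    .join('/')
    .split(LEGACY_ENCODED_SLASH)
    .join('/')
}

const encoded = encodeMatchId('/posts/1')
console.log(encoded.includes('/')) // false
console.log(decodeMatchId(encoded)) // "/posts/1"
```

Decoding both delimiters in one function is what lets older payloads that still carry null bytes round-trip after the encoder changes.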
📝 Walkthrough
This PR fixes invalid HTML produced during Server-Side Rendering by replacing null-byte (U+0000) encoding in dehydrated SSR match IDs with the Unicode replacement character (U+FFFD). The change modifies the encoding function, adds validation tests, and documents the patch release.
Changes: SSR Match ID Null-Byte Encoding Fix
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
packages/router-core/tests/ssr-match-id.test.ts (1)
27-42: ⚡ Quick win
Consider adding a backward compatibility test for null-byte decoding.
The new test correctly validates that `dehydrateSsrMatchId` no longer emits C0 control characters. However, the PR description states "The existing decode branch for NUL -> "/" is retained for backward compatibility with payloads that still contain null bytes."
To ensure this backward compatibility path remains functional, consider adding an explicit test:
it('decodes legacy null-byte delimiters for backward compatibility', () => { expect(hydrateSsrMatchId('\u0000posts\u00001')).toBe('/posts/1') })
This would verify that old SSR payloads containing null bytes can still be hydrated correctly by the client.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/router-core/tests/ssr-match-id.test.ts` around lines 27-42, add a backward-compatibility unit test to verify hydrateSsrMatchId still decodes legacy NUL delimiters: create a new test case that calls hydrateSsrMatchId with a string containing embedded null characters (e.g., '\u0000posts\u00001') and asserts the result equals '/posts/1'; this complements the existing dehydrateSsrMatchId control-character test and ensures the legacy decode branch in hydrateSsrMatchId remains functional.
📒 Files selected for processing (3)
- .changeset/ssr-match-id-null-byte.md
- packages/router-core/src/ssr/ssr-match-id.ts
- packages/router-core/tests/ssr-match-id.test.ts
Summary
`dehydrateSsrMatchId` encodes match IDs for the SSR hydration payload by replacing `/` with `\0`. That encoding came from #6739, so the dehydrated ids stop looking like relative URLs that crawlers pick up as phantom pages.

The catch is that U+0000 is forbidden in the HTML input stream (the `control-character-in-input-stream` parse error in the HTML spec). The dehydrated ids are inlined into the `$tsr-stream-barrier` `<script>`, so every SSR response ends up with raw null bytes and fails markup validation. `validator.w3.org` reports `Saw U+0000 in stream` (#7581). It only works in browsers today because the parser silently rewrites those null bytes to U+FFFD, which is exactly why `hydrateSsrMatchId` already carries a `� -> /` fallback.

This swaps the delimiter from `\0` to `�` (U+FFFD REPLACEMENT CHARACTER):
- it is not `/`, so the crawler fix from #6739 (SSR dehydrated match IDs are URL-shaped strings; Google crawls them as phantom URLs) holds
- `hydrateSsrMatchId` already decodes `� -> /`, so the client side needs no change

The existing `\0 -> /` decode branch is left in place so any payload still carrying a null byte keeps round-tripping.

Testing
Added a codec test asserting the dehydrated id has no C0 control characters. It fails on `main` and passes with this change.
- `nx run @tanstack/router-core:test:unit` (39 files, 1179 passed)
- `nx run @tanstack/router-core:test:types`
- `nx run @tanstack/router-core:test:eslint`
- `nx run @tanstack/router-core:build`

Fixes #7581
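The check behind the codec test mentioned under Testing could look roughly like this. It is a sketch under assumptions: `hasC0Control` and `encodeMatchId` are hypothetical helpers written for illustration, and the actual test in `packages/router-core/tests/ssr-match-id.test.ts` may assert this differently.

```typescript
// Hypothetical encoder standing in for dehydrateSsrMatchId after this change.
function encodeMatchId(id: string): string {
  return id.split('/').join('\uFFFD')
}

// True when the string contains any C0 control character (U+0000 to U+001F).
function hasC0Control(value: string): boolean {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) <= 0x1f) return true
  }
  return false
}

// The previous encoding used U+0000 as the delimiter, which is a C0 control.
const legacy = '/posts/1'.split('/').join('\u0000')
const current = encodeMatchId('/posts/1')

console.log(hasC0Control(legacy)) // true
console.log(hasC0Control(current)) // false
```

Scanning code units avoids a control-character regular expression, and it covers the whole C0 range, not only U+0000.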