docs(iceberg): update JSON schema translation behavior#1603
docs(iceberg): update JSON schema translation behavior#1603nvartolomei wants to merge 4 commits intov-WIP/26.1from
Conversation
📝 WalkthroughWalkthroughThe change updates JSON Schema validation guidance documentation for Iceberg schema specification. It clarifies and expands supported JSON Schema features: Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can approve the review once all CodeRabbit's comments are resolved.Enable the |
✅ Deploy Preview for redpanda-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
modules/manage/pages/iceberg/specify-iceberg-schema.adoc (3)
313-313: Consider adding inline clarification for "exclusiveoneOfnullable pattern".The term "exclusive
oneOfnullable pattern" is used here but only defined later in the "Keyword behavior" section (Line 347). Readers encountering this in the table may be confused.Consider adding a brief inline example or parenthetical clarification, such as:
The `null` type is only supported as a nullability marker, either in a `type` array (for example, `["string", "null"]`) or in an exclusive `oneOf` nullable pattern (where exactly one branch is `{"type":"null"}` and the other is a non-null schema).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` at line 313, Add a short parenthetical clarification to the sentence about the `null` type so readers immediately understand "exclusive `oneOf` nullable pattern" without jumping to the later section; for example, append a brief phrase after "oneOf nullable pattern" such as "(i.e., an exclusive oneOf where exactly one branch is {"type":"null"} and the other branch is a non-null schema)" referencing the `null` type and `oneOf` examples so the meaning is clear in-place.
329-329: Minor grammar improvement: remove redundant "object".The phrase "A schema-valued
additionalPropertiesobject" is slightly awkward. The word "object" at the end is redundant sinceadditionalPropertiesitself is the keyword.✍️ Suggested rewording
Consider one of these alternatives:
Option 1:
-Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. +Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object.Option 2:
-Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. +Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. When `additionalProperties` is a schema object, it is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` at line 329, The sentence "A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`." uses a redundant trailing "object"; update the wording to remove it so it reads e.g. "A schema-valued `additionalProperties` is translated to an Iceberg `map<string, T>`." Make this change wherever this exact sentence appears (referencing the `additionalProperties` keyword and the Iceberg `map<string, T>` translation) to improve grammar and clarity.
354-355: Improve technical terminology for JSON Schema features.Two phrases use non-standard or ambiguous terminology:
Line 354: "Conditional typing (
if,then,else,dependencieskeywords)" - The term "conditional typing" is not standard JSON Schema terminology. The standard term is "conditional subschemas" or "schema dependencies".Line 355: "Boolean JSON Schema combinations" - This is non-standard terminology. The standard terms are "schema combinators" or "schema composition keywords".
📚 Suggested improvements
-* Conditional typing (`if`, `then`, `else`, `dependencies` keywords) -* Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns) +* Conditional subschemas (`if`, `then`, `else`, `dependencies` keywords) +* Schema combinators (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)Or alternatively:
-* Conditional typing (`if`, `then`, `else`, `dependencies` keywords) -* Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns) +* Conditional schema application (`if`, `then`, `else`, `dependencies` keywords) +* Boolean schema composition (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` around lines 354 - 355, Replace the non-standard phrases in the bulleted list: change "Conditional typing (`if`, `then`, `else`, `dependencies` keywords)" to use standard JSON Schema terminology such as "Conditional subschemas (`if`, `then`, `else`) and schema dependencies (`dependencies`)" and change "Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)" to "Schema combinators / composition keywords (`allOf`, `anyOf`, `oneOf`)" so the document uses "conditional subschemas" and "schema combinators" consistently (update the two phrase occurrences shown in the diff).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc`:
- Line 346: The phrase "references to unknown keywords" in the sentence about
$ref is ambiguous; update the wording around the $ref/$id sentence to enumerate
the exact unsupported cases (e.g., references to external resources, references
to non-existent schema $id values, undefined anchors or unresolvable
fragments/JSON Pointer fragments) and keep the note that a root-level $ref
schema is not supported; reference the $ref and $id tokens as the symbols to
edit so the text explicitly says "References to external resources, non-existent
$id values, or undefined anchors/fragments are not supported" (or similar
concise wording).
---
Nitpick comments:
In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc`:
- Line 313: Add a short parenthetical clarification to the sentence about the
`null` type so readers immediately understand "exclusive `oneOf` nullable
pattern" without jumping to the later section; for example, append a brief
phrase after "oneOf nullable pattern" such as "(i.e., an exclusive oneOf where
exactly one branch is {"type":"null"} and the other branch is a non-null
schema)" referencing the `null` type and `oneOf` examples so the meaning is
clear in-place.
- Line 329: The sentence "A schema-valued `additionalProperties` object is
translated to an Iceberg `map<string, T>`." uses a redundant trailing "object";
update the wording to remove it so it reads e.g. "A schema-valued
`additionalProperties` is translated to an Iceberg `map<string, T>`." Make this
change wherever this exact sentence appears (referencing the
`additionalProperties` keyword and the Iceberg `map<string, T>` translation) to
improve grammar and clarity.
- Around line 354-355: Replace the non-standard phrases in the bulleted list:
change "Conditional typing (`if`, `then`, `else`, `dependencies` keywords)" to
use standard JSON Schema terminology such as "Conditional subschemas (`if`,
`then`, `else`) and schema dependencies (`dependencies`)" and change "Boolean
JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)"
to "Schema combinators / composition keywords (`allOf`, `anyOf`, `oneOf`)" so
the document uses "conditional subschemas" and "schema combinators" consistently
(update the two phrase occurrences shown in the diff).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 73bd4385-5e4f-4e87-b541-1003e1b95486
📒 Files selected for processing (1)
modules/manage/pages/iceberg/specify-iceberg-schema.adoc
kbatuigas
left a comment
There was a problem hiding this comment.
If this is new for 26.1, this PR should be opened against the v-WIP/26.1 branch cc @Feediver1
| | struct | ||
| | The `properties` keyword must be used to define `struct` fields and constrain their types. The `additionalProperties` keyword is accepted only when it is set to `false`. | ||
| | struct or map | ||
| | Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. |
There was a problem hiding this comment.
@nvartolomei do I understand this correctly?
- if object contains
properties(and optionallyadditionalProperties: false), -> struct - if object contains
additionalPropertieswhose value is itself a schema object -> map
There was a problem hiding this comment.
Yes. That's it.
There was a problem hiding this comment.
| | Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. | |
| a| * Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. | |
| * If `additionalProperties` contains a schema, it translates to an Iceberg `map<string, T>`. | |
| * You cannot combine `properties` and `additionalProperties`in an object if `additionalProperties` is set to a schema. |
I'd suggest a tweak to the formatting in this cell so it's a little easier to scan and to see the different conditions in which the type is translated to struct vs map. Please feel free to adjust the wording for accuracy
|
@mattschumpert @nvartolomei Does this need an entry in What's New? |
@nvartolomei Given that this is for 26.1, which I was unaware of (thanks @kbatuigas !), this PR should have been opened against v-WIP/26.1. Can you please update to point to this branch? Thanks! |
Approved but it is based on the main branch instead of v-WIP/26.1. So I had to revoke the approval.
d68c52e to
71aa511
Compare
|
Yes I think so, for 26.1. There was confusion about if the improvement was backported but I am hearing no |
|
@kbatuigas would be cool to add a note on what's new, yes. |
Co-authored-by: Gellért Peresztegi-Nagy <gellert.nagy@redpanda.com> Co-authored-by: Joyce Fee <102751339+Feediver1@users.noreply.github.com>
| * Map type from `additionalProperties` - `additionalProperties` objects that contain subschemas now translate to Iceberg `map<string, T>`. | ||
| * `oneOf` nullable pattern - The `oneOf` keyword is now supported for the standard nullable pattern if exactly one branch is `{"type":"null"}` and the other is a non-null schema. | ||
|
|
||
| See xref:manage:iceberg/specify-iceberg-schema.adoc#how-iceberg-modes-translate-to-table-format[Specify Iceberg Schema] for JSON types mapping and updated requirements. |
There was a problem hiding this comment.
@nvartolomei @mattschumpert Added this to What's New, please let me know if this looks good
|
Thanks @kbatuigas. I need to make some edits. will request reviews once ready. |
Description
Resolves https://redpandadata.atlassian.net/browse/DOC-1957
Review deadline: v26.1 release
Page previews
https://deploy-preview-1603--redpanda-docs-preview.netlify.app/current/manage/iceberg/specify-iceberg-schema/
https://deploy-preview-1603--redpanda-docs-preview.netlify.app/current/get-started/release-notes/redpanda/#improved-json-schema-translation-for-iceberg-topics
Checks