Co-authored-by: sosahi <syousefisahi@nvidia.com>
removed errant conflict branch
Greptile Summary
This PR re-applies 26.03 documentation commits onto a reverted baseline: it rewrites
| Filename | Overview |
|---|---|
| contributing.md | Major rewrite from a minimal sign-off stub to a comprehensive 492-line guide; contains a contradictory license section that says "NVIDIA Proprietary Software License" but shows Apache-2.0 in the SPDX header template. |
| docs/docs/extraction/quickstart-guide.md | Version bump to 26.3.0, added air-gapped deployment section; introduces a duplicate "Step 3" heading due to renaming the former "Step 4" without removing the existing "Step 3: Ingest Documents". |
| docs/docs/extraction/releasenotes-nv-ingest.md | 26.03 release notes added, 26.01 notes removed; "Related Topics" still links to quickstart-library-mode.md which is deleted in this PR, creating a broken link. |
| docs/docs/extraction/support-matrix.md | NIM names updated to nemotron-* branding; LanceDB incorrectly dropped from the retrieval bullet despite remaining the documented default backend. |
| docs/docs/extraction/quickstart-library-mode.md | File deleted (487 lines); content referenced by releasenotes-nv-ingest.md "Related Topics" link — deletion without updating that link creates a 404. |
| docs/docs/extraction/helm.md | Stub file deleted; content was a simple pointer to external Helm charts — no issues with the deletion itself. |
| docs/docs/extraction/contributing.md | Stub file deleted; was a short redirect to the GitHub CONTRIBUTING.md — no issues with the deletion. |
| docs/docs/extraction/user-defined-functions.md | Single-line note updated to reflect the NV-Ingest → NeMo Retriever Library rename — straightforward and correct. |
Comments Outside Diff (1)
docs/docs/extraction/support-matrix.md, line 26
**LanceDB silently dropped from retrieval description**
The old bullet read "Enables embedding and indexing into LanceDB (default) or Milvus." The new text removes LanceDB entirely. However, `quickstart-guide.md` still references LanceDB as the default vector database backend, and the 26.03 release notes list "Enabled hybrid search with Lancedb" as a new feature, so LanceDB remains supported. Dropping it here creates a documentation inconsistency that could confuse users about which backends are available.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: docs/docs/extraction/releasenotes-nv-ingest.md
Line: 51
Comment:
**Broken link to deleted file**
`quickstart-library-mode.md` is deleted in this PR, but the "Related Topics" section at the bottom still links to it. Any reader following this link will get a 404.
Remove or update the `quickstart-library-mode.md` reference before merging.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 26
Comment:
**LanceDB silently dropped from retrieval description**
The old bullet read "Enables embedding and indexing into LanceDB (default) or Milvus." The new text removes LanceDB entirely. However, `quickstart-guide.md` still references LanceDB as the default vector database backend, and the 26.03 release notes list "Enabled hybrid search with Lancedb" as a new feature — so LanceDB remains supported. Dropping it here creates a documentation inconsistency that could confuse users about which backends are available.
```suggestion
- retrieval — Enables embedding and indexing into LanceDB (default) or Milvus.
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: contributing.md
Line: 456-460
Comment:
**Contradictory license statements**
The "Licensing" section states "NV-Ingest is licensed under the **NVIDIA Proprietary Software License**", but the SPDX header template shown immediately below says `SPDX-License-Identifier: Apache-2.0`. These two statements are mutually exclusive; one of them is wrong and will mislead contributors about the project's actual license terms.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: docs/docs/extraction/quickstart-guide.md
Line: 364
Comment:
**Step numbering regression**
"Step 4: Inspecting and Consuming Results" has been relabeled "Step 3", but "Step 3: Ingest Documents" already exists earlier in the file. This creates two sections both labeled "Step 3", which will confuse readers following the numbered walkthrough.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "Delete docs/docs/extraction/quickstart-l..."
docs/docs/extraction/releasenotes-nv-ingest.md, line 51

    ## Related Topics

    - [Prerequisites](prerequisites.md)
    - [Deploy Without Containers (Library Mode)](quickstart-library-mode.md)

`quickstart-library-mode.md` is deleted in this PR, but the "Related Topics" section at the bottom still links to it. Any reader following this link will get a 404.
Remove or update the `quickstart-library-mode.md` reference before merging.
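
A check along these lines could catch this class of problem before merge. This is a minimal sketch, not part of the PR: the function name `find_broken_links` and the idea of passing in the set of files that still exist are assumptions made for illustration, and the regex only handles simple inline links without titles.

```python
import re

# Matches inline Markdown links such as [text](target.md) or [text](target.md#anchor).
# Link text may contain parentheses; the target may not contain spaces.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+)(?:#[^)\s]*)?\)")


def find_broken_links(markdown_text, existing_files):
    """Return relative link targets that are not present in existing_files.

    existing_files is a set of relative paths that still exist after the PR
    (hypothetical input; a real check would build it from the docs tree).
    """
    broken = []
    for match in LINK_RE.finditer(markdown_text):
        target = match.group(1)
        # Skip external links; only local files can be deleted by the PR.
        if "://" in target or target.startswith("mailto:"):
            continue
        if target not in existing_files:
            broken.append(target)
    return broken


# The "Related Topics" lines quoted in this comment, checked against a file
# set in which quickstart-library-mode.md has been deleted.
related_topics = (
    "- [Prerequisites](prerequisites.md)\n"
    "- [Deploy Without Containers (Library Mode)](quickstart-library-mode.md)\n"
)
broken = find_broken_links(related_topics, {"prerequisites.md"})
```

In a repository, the file set would more likely come from walking the docs directory, and the check would run over every `.md` file rather than one snippet.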
contributing.md, lines 456-460

    To ensure high-quality documentation, follow these best practices:

    1. **Use headings and subheadings**: Organize your content with clear headings and subheadings to facilitate scanning and navigation.
    2. **Use bullet points and lists**: Break up complex information into easy-to-read lists and bullet points.

**Contradictory license statements**
The "Licensing" section states "NV-Ingest is licensed under the **NVIDIA Proprietary Software License**", but the SPDX header template shown immediately below says `SPDX-License-Identifier: Apache-2.0`. These two statements are mutually exclusive; one of them is wrong and will mislead contributors about the project's actual license terms.
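
One way to surface this kind of mismatch automatically is to compare SPDX identifiers found in a document against the license names its prose uses. The sketch below is illustrative only: the `EXPECTED_PHRASE` lookup table and the function name `license_mismatches` are hypothetical, and the sample strings paraphrase the two statements quoted in this comment rather than reproducing `contributing.md`.

```python
import re

# Captures the identifier that follows an SPDX license tag.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")

# Hypothetical lookup: the phrase we expect the prose to use for each SPDX id.
EXPECTED_PHRASE = {
    "Apache-2.0": "Apache License",
    "MIT": "MIT License",
}


def license_mismatches(text):
    """Return SPDX ids in text whose expected license name never appears in it."""
    lowered = text.lower()
    mismatches = []
    for spdx_id in sorted(set(SPDX_RE.findall(text))):
        phrase = EXPECTED_PHRASE.get(spdx_id)
        # Unknown ids are reported too, since they cannot be verified.
        if phrase is None or phrase.lower() not in lowered:
            mismatches.append(spdx_id)
    return mismatches


# Paraphrase of the contradictory section described in this comment.
contradictory = (
    "NV-Ingest is licensed under the **NVIDIA Proprietary Software License**.\n"
    "# SPDX-License-Identifier: Apache-2.0\n"
)
# A consistent variant, for comparison.
consistent = (
    "This project is licensed under the Apache License, Version 2.0.\n"
    "# SPDX-License-Identifier: Apache-2.0\n"
)
```

The check cannot say which of the two statements is correct; it only flags that they disagree, which is the point the comment raises.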
docs/docs/extraction/quickstart-guide.md, line 364

    ## Step 3: Inspecting and Consuming Results

    After the ingestion steps above have been completed, you should be able to find the `text` and `image` subfolders inside your processed docs folder. Each will contain JSON-formatted extracted content and metadata.

"Step 4: Inspecting and Consuming Results" has been relabeled "Step 3", but "Step 3: Ingest Documents" already exists earlier in the file. This creates two sections both labeled "Step 3", which will confuse readers following the numbered walkthrough.
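
The duplicate step label described above is straightforward to detect mechanically. The following is a minimal sketch under stated assumptions: the function name `duplicate_steps` is hypothetical, and the heading level of "Step 3: Ingest Documents" is not shown in this review, so the pattern accepts any heading level.

```python
import re
from collections import defaultdict

# Matches headings such as "## Step 3: Ingest Documents" at any level.
STEP_RE = re.compile(r"^#{1,6}\s+Step\s+(\d+)\s*:\s*(.+?)\s*$")


def duplicate_steps(markdown_text):
    """Return {step_number: [titles]} for step numbers used more than once."""
    seen = defaultdict(list)
    for line in markdown_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            seen[int(match.group(1))].append(match.group(2))
    return {num: titles for num, titles in seen.items() if len(titles) > 1}


# Headings as described in the comment; the level of the first one is assumed.
sample = (
    "## Step 3: Ingest Documents\n"
    "Some body text.\n"
    "## Step 3: Inspecting and Consuming Results\n"
)
dupes = duplicate_steps(sample)
```

A stricter variant could also verify that step numbers increase by one through the file, which would catch skipped numbers as well as repeats.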
added PR commits to reverted baseline doc.