Kheiss/bugfix reapply #1666

Closed
kheiss-uwzoo wants to merge 17 commits into NVIDIA:26.03 from kheiss-uwzoo:kheiss/bugfix-reapply

Conversation

@kheiss-uwzoo
Collaborator

Added PR commits to the reverted baseline doc.

@kheiss-uwzoo kheiss-uwzoo marked this pull request as ready for review April 22, 2026 19:05
@kheiss-uwzoo kheiss-uwzoo requested a review from a team as a code owner April 22, 2026 19:05
@kheiss-uwzoo kheiss-uwzoo requested review from drobison00 and removed request for a team April 22, 2026 19:05
@greptile-apps
Contributor

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR re-applies 26.03 documentation commits onto a reverted baseline: it rewrites contributing.md, updates release notes, version-bumps install instructions to 26.3.0, renames NIMs to their new nemotron-* branding, adds an air-gapped deployment section, and removes three stub/outdated doc files.

  • Broken link: releasenotes-nv-ingest.md "Related Topics" still references quickstart-library-mode.md, which is deleted in this PR.
  • Duplicate step heading: quickstart-guide.md now has two "Step 3" sections after the former "Step 4" was renumbered.
  • Dropped backend: support-matrix.md no longer mentions LanceDB in the retrieval bullet, although LanceDB remains the documented default vector database backend.
  • Contradictory license text: contributing.md says "NVIDIA Proprietary Software License" in prose but shows SPDX-License-Identifier: Apache-2.0 in the code block immediately below.

Confidence Score: 3/5

Not safe to merge as-is: four P1 documentation correctness issues remain.

All changes are documentation-only, but four distinct P1 issues exist: a broken link to a deleted file, a duplicate Step 3 heading that breaks the quickstart walkthrough, LanceDB incorrectly removed from the support matrix while still being the documented default, and contradictory license text in contributing.md.

contributing.md (license contradiction), docs/docs/extraction/quickstart-guide.md (duplicate Step 3), docs/docs/extraction/releasenotes-nv-ingest.md (broken link to deleted file), docs/docs/extraction/support-matrix.md (LanceDB omission)

Important Files Changed

| Filename | Overview |
|---|---|
| contributing.md | Major rewrite from a minimal sign-off stub to a comprehensive 492-line guide; contains a contradictory license section that says "NVIDIA Proprietary Software License" but shows Apache-2.0 in the SPDX header template. |
| docs/docs/extraction/quickstart-guide.md | Version bump to 26.3.0, added air-gapped deployment section; introduces a duplicate "Step 3" heading due to renaming the former "Step 4" without removing the existing "Step 3: Ingest Documents". |
| docs/docs/extraction/releasenotes-nv-ingest.md | 26.03 release notes added, 26.01 notes removed; "Related Topics" still links to quickstart-library-mode.md, which is deleted in this PR, creating a broken link. |
| docs/docs/extraction/support-matrix.md | NIM names updated to nemotron-* branding; LanceDB incorrectly dropped from the retrieval bullet despite remaining the documented default backend. |
| docs/docs/extraction/quickstart-library-mode.md | File deleted (487 lines). The "Related Topics" section of releasenotes-nv-ingest.md still links to it, so deleting it without updating that link creates a 404. |
| docs/docs/extraction/helm.md | Stub file deleted; content was a simple pointer to external Helm charts. No issues with the deletion itself. |
| docs/docs/extraction/contributing.md | Stub file deleted; was a short redirect to the GitHub CONTRIBUTING.md. No issues with the deletion. |
| docs/docs/extraction/user-defined-functions.md | Single-line note updated to reflect the NV-Ingest → NeMo Retriever Library rename; straightforward and correct. |

Comments Outside Diff (1)

  1. docs/docs/extraction/support-matrix.md, line 26 (link)

    P1 LanceDB silently dropped from retrieval description

    The old bullet read "Enables embedding and indexing into LanceDB (default) or Milvus." The new text removes LanceDB entirely. However, quickstart-guide.md still references LanceDB as the default vector database backend, and the 26.03 release notes list "Enabled hybrid search with Lancedb" as a new feature — so LanceDB remains supported. Dropping it here creates a documentation inconsistency that could confuse users about which backends are available.

    ```suggestion
    - retrieval — Enables embedding and indexing into LanceDB (default) or Milvus.
    ```

Reviews (1): Last reviewed commit: "Delete docs/docs/extraction/quickstart-l..."

## Related Topics

- [Prerequisites](prerequisites.md)
- [Deploy Without Containers (Library Mode)](quickstart-library-mode.md)

P1 Broken link to deleted file

quickstart-library-mode.md is deleted in this PR, but the "Related Topics" section at the bottom still links to it. Any reader following this link will get a 404.

Remove or update the quickstart-library-mode.md reference before merging.
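
A check along the following lines could catch this class of problem before merge. It is a minimal sketch, not part of this PR or of the repository's tooling: the function names `relative_md_targets` and `broken_links` are hypothetical, and it assumes links use the inline `[text](target.md)` form with targets resolved relative to the linking file.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target.md) or [text](target.md#anchor).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def relative_md_targets(markdown_text):
    """Return the relative .md link targets in a Markdown string, anchors removed."""
    targets = []
    for raw in LINK_RE.findall(markdown_text):
        target = raw.split("#", 1)[0]
        # Skip pure in-page anchors, external URLs, and mailto links.
        if not target or "://" in target or target.startswith("mailto:"):
            continue
        if target.endswith(".md"):
            targets.append(target)
    return targets


def broken_links(docs_dir):
    """Yield (source_file, target) for each relative .md link whose file is missing."""
    for source in sorted(Path(docs_dir).rglob("*.md")):
        text = source.read_text(encoding="utf-8")
        for target in relative_md_targets(text):
            if not (source.parent / target).exists():
                yield source, target
```

Running `list(broken_links("docs/docs/extraction"))` from the repository root would, under those assumptions, list the `quickstart-library-mode.md` reference once the file is deleted.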

Comment thread contributing.md
Comment on lines +456 to +460

To ensure high-quality documentation, follow these best practices:

1. **Use headings and subheadings**: Organize your content with clear headings and subheadings to facilitate scanning and navigation.
2. **Use bullet points and lists**: Break up complex information into easy-to-read lists and bullet points.

P1 Contradictory license statements

The "Licensing" section states "NV-Ingest is licensed under the NVIDIA Proprietary Software License", but the SPDX header template shown immediately below says SPDX-License-Identifier: Apache-2.0. These two statements are mutually exclusive; one of them is wrong and will mislead contributors about the project's actual license terms.
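
One way to keep the prose and the SPDX template from drifting apart is a small consistency check. The sketch below is illustrative only: `license_conflicts` is a hypothetical helper, and the expected identifier and the list of conflicting phrases are inputs the maintainers would have to supply, since this review does not establish which of the two statements is correct.

```python
import re

# Captures the identifier that follows an SPDX-License-Identifier tag.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def license_conflicts(text, expected_spdx, conflicting_phrases):
    """Return a list of problems: unexpected SPDX identifiers or conflicting prose."""
    problems = []
    for found in SPDX_RE.findall(text):
        if found != expected_spdx:
            problems.append(f"SPDX identifier {found} does not match {expected_spdx}")
    lowered = text.lower()
    for phrase in conflicting_phrases:
        # Case-insensitive substring match against the prose.
        if phrase.lower() in lowered:
            problems.append(f"prose mentions '{phrase}'")
    return problems
```

If, purely as an assumption for illustration, Apache-2.0 were the intended license, calling `license_conflicts(text, "Apache-2.0", ["Proprietary Software License"])` on the "Licensing" section would flag the prose sentence.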

## Step 3: Inspecting and Consuming Results

After the ingestion steps above have been completed, you should be able to find the `text` and `image` subfolders inside your processed docs folder. Each will contain JSON-formatted extracted content and metadata.

P1 Step numbering regression

"Step 4: Inspecting and Consuming Results" has been relabeled "Step 3", but "Step 3: Ingest Documents" already exists earlier in the file. This creates two sections both labeled "Step 3", which will confuse readers following the numbered walkthrough.

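The duplicate numbering described above can be detected mechanically. The following is a rough sketch under the assumption that walkthrough headings follow the `## Step N: Title` pattern seen in the excerpt; `duplicate_steps` is a hypothetical name, not an existing project utility.

```python
import re
from collections import defaultdict

# Matches Markdown headings of the form "## Step 3: Title" at any heading level.
STEP_RE = re.compile(r"^#{1,6}\s+Step\s+(\d+)\s*:\s*(.+?)\s*$")


def duplicate_steps(markdown_text):
    """Map each step number used more than once to the titles that share it."""
    titles_by_step = defaultdict(list)
    in_fence = False
    for line in markdown_text.splitlines():
        # Ignore anything inside fenced code blocks, where "#" starts a comment.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = STEP_RE.match(line)
        if match:
            titles_by_step[int(match.group(1))].append(match.group(2))
    return {n: titles for n, titles in titles_by_step.items() if len(titles) > 1}
```

An empty result means every step number is unique; for the quickstart guide as changed in this PR, the result would contain an entry for step 3 with both section titles.
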
3 participants