Document revised OCR processing #4940

Open
JorjMcKie wants to merge 4 commits into main from ocr-doc-updates

@JorjMcKie
Collaborator

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

Adds/updates PyMuPDF4LLM documentation to describe the revised OCR plugin system and clarify OCR-related API behavior in the generated docs.

Changes:

  • Adds a new documentation page describing default OCR plugins, selection order, hybrid OCR workflow, and how to provide custom OCR functions.
  • Updates the PyMuPDF4LLM feature list to mention automatic OCR-benefit page detection and multiple OCR engines.
  • Updates API docs for force_ocr, ocr_dpi, and ocr_function to reflect default OCR engine/plugin behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| docs/pymupdf4llm/ocr-plugins.rst | New page documenting OCR plugin options, selection logic, and customization. |
| docs/pymupdf4llm/index.rst | Adds OCR-related capability to the feature list. |
| docs/pymupdf4llm/api.rst | Clarifies OCR-related parameter documentation and defaults. |

Comment on lines +1 to +6
Default OCR Functions
======================

PyMuPDF4LLM supports default OCR functions. They come in the form of plugins in its `ocr` subpackage and are currently based on three popular OCR engines: Tesseract OCR, RapidOCR, and PaddleOCR. Some engines can be combined to make use of their strengths and mitigate their weaknesses. For example, Tesseract OCR is very good at **recognizing** text, while RapidOCR is better at **detecting** text bounding boxes in images with complex backgrounds. By combining the two engines, we can achieve better overall OCR results while at the same time reducing the overall OCR processing time.
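
The detect-then-recognize idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `hybrid_ocr`, `OcrSpan`, and the two stand-in engine functions are hypothetical names invented for this sketch, not part of the PyMuPDF4LLM `ocr` subpackage or of any OCR engine's API. A real plugin would pass page images to an actual detector and recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (x0, y0, x1, y1) in image pixels; the coordinate convention is an assumption.
Box = Tuple[int, int, int, int]


@dataclass
class OcrSpan:
    """One recognized piece of text and the box it was found in (hypothetical)."""
    box: Box
    text: str


def hybrid_ocr(
    image: object,
    detect: Callable[[object], List[Box]],
    recognize: Callable[[object, Box], str],
) -> List[OcrSpan]:
    """Find text boxes with one engine, then read each box with another."""
    spans = []
    for box in detect(image):
        text = recognize(image, box).strip()
        if text:  # drop boxes where the recognizer found no text
            spans.append(OcrSpan(box, text))
    return spans


# Stand-in engines so the sketch runs without any OCR library installed.
def fake_detect(image: object) -> List[Box]:
    return [(0, 0, 50, 10), (0, 20, 50, 30)]


def fake_recognize(image: object, box: Box) -> str:
    return {(0, 0, 50, 10): "Hello ", (0, 20, 50, 30): ""}[box]


result = hybrid_ocr(None, fake_detect, fake_recognize)
```

Keeping detection and recognition behind separate callables means either side can be swapped for a different engine without touching the loop. Restricting the recognizer to the detected boxes is one plausible reason a combination could also save processing time.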

Here is an overview of the available default plugins:
JorjMcKie requested a review from jamie-lemon March 14, 2026 16:55
JorjMcKie and others added 2 commits March 14, 2026 12:58
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

2 participants