fix: prevent PDF binary content from being included in scrape output by dan-and · Pull Request #42 · devflowinc/firecrawl-simple

dan-and · 2025-09-27T20:33:44Z

Add PDF detection to skip processing PDF files in fetch and playwright scrapers. This prevents raw PDF binary data from being dumped into HTML/markdown fields.

Fixes #28
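
For readers without the diff at hand, here is a minimal sketch of what such a PDF check could look like. It is not the code from this PR: the function name `looksLikePdf`, its parameters, and the three signals it combines (Content-Type header, `.pdf` path suffix, `%PDF-` magic bytes) are assumptions chosen for illustration.

```typescript
// Hypothetical helper, not taken from the PR diff.
// "%PDF-" as bytes: every PDF file starts with this marker.
const PDF_MAGIC: number[] = [0x25, 0x50, 0x44, 0x46, 0x2d];

function looksLikePdf(
  url: string,
  contentType: string | null,
  bodyPrefix?: Uint8Array,
): boolean {
  // 1. Content-Type header, ignoring case and parameters such as "; charset=...".
  const mime = (contentType ?? "").toLowerCase().split(";")[0]?.trim();
  if (mime === "application/pdf") return true;

  // 2. URL path ending in ".pdf", with any query string or fragment removed.
  const path = (url.split(/[?#]/, 1)[0] ?? "").toLowerCase();
  if (path.endsWith(".pdf")) return true;

  // 3. Magic bytes at the start of the body, when a prefix is available.
  if (bodyPrefix !== undefined && bodyPrefix.length >= PDF_MAGIC.length) {
    return PDF_MAGIC.every((byte, i) => bodyPrefix[i] === byte);
  }
  return false;
}

// Usage: a scraper could bail out early instead of filling HTML/markdown fields.
const isPdf = looksLikePdf("https://example.com/report.pdf", "application/pdf");
console.log(isPdf ? "skip: PDF detected" : "continue scraping");
```

Checking the header first is cheap and covers most servers; the URL suffix and magic-byte checks are fallbacks for servers that send a generic or missing Content-Type.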

fix: prevent PDF binary content from being included in scrape output

4028efa

Add PDF detection to skip processing PDF files in fetch and playwright scrapers. This prevents raw PDF binary data from being dumped into HTML/markdown fields. Fixes devflowinc#28

dan-and wants to merge 1 commit into devflowinc:main from dan-and:pdf_detection
