feat: Add post_navigation_hooks to crawlers #1795

Open
Mantisus wants to merge 7 commits into apify:master from Mantisus:new-hooks
@Mantisus
Collaborator

Description

  • Add post_navigation_hooks that run after navigation.

Issues

Testing

  • Tests for navigation hooks have been added and updated.

Contributor

Copilot AI left a comment

Pull request overview

Adds support for post_navigation_hooks across the crawler stack (HTTP, Playwright, and Adaptive Playwright) so users can run logic after navigation completes but before the request handler executes.
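The ordering described above (pre-navigation hooks, then navigation, then post-navigation hooks, then the request handler) can be illustrated with a minimal, self-contained sketch. `ToyCrawler`, `run_one`, and the dictionary-based context are hypothetical stand-ins invented for this illustration; they are not the crawlee API, and the real registration method names may differ.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable

# A hook receives the shared context and returns nothing.
Hook = Callable[[dict], Awaitable[None]]


@dataclass
class ToyCrawler:
    """Hypothetical stand-in for a crawler; not the crawlee API."""

    pre_hooks: list[Hook] = field(default_factory=list)
    post_hooks: list[Hook] = field(default_factory=list)

    async def run_one(self, url: str, handler: Hook) -> dict:
        context: dict = {'url': url, 'events': []}
        for hook in self.pre_hooks:
            await hook(context)
        # Simulated navigation: only after this step does a response exist.
        context['response'] = {'status': 200}
        context['events'].append('navigate')
        for hook in self.post_hooks:
            await hook(context)
        await handler(context)
        return context


async def pre_hook(context: dict) -> None:
    context['events'].append('pre')


async def post_hook(context: dict) -> None:
    # The response is available here because navigation has finished.
    assert context['response']['status'] == 200
    context['events'].append('post')


async def handler(context: dict) -> None:
    context['events'].append('handler')


def demo_ordering() -> list[str]:
    crawler = ToyCrawler(pre_hooks=[pre_hook], post_hooks=[post_hook])
    context = asyncio.run(crawler.run_one('https://example.com', handler))
    return context['events']
```

In this sketch the post-navigation hook can inspect the simulated response, which a pre-navigation hook cannot, since the response does not exist yet at that point.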

Changes:

  • Introduces post-navigation hook registration/execution in AbstractHttpCrawler and PlaywrightCrawler.
  • Adds PlaywrightPostNavCrawlingContext and updates context inheritance so the post-nav context includes response.
  • Extends AdaptivePlaywrightCrawler with post-nav hooks and a wrapper AdaptivePlaywrightPostNavCrawlingContext, plus new/updated unit tests.
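The context inheritance mentioned in the list above can be sketched roughly as follows. The class and field names (`ToyPreNavContext`, `ToyPostNavContext`, `ToyCrawlingContext`, `response_status`, `extracted_title`) are assumptions made up for this illustration and do not mirror the actual crawlee classes; the point is only that the post-navigation context adds the response and the main context builds on it.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ToyPreNavContext:
    """Hypothetical context available before navigation."""

    url: str


@dataclass(frozen=True)
class ToyPostNavContext(ToyPreNavContext):
    """Hypothetical context after navigation: adds the response."""

    response_status: int


@dataclass(frozen=True)
class ToyCrawlingContext(ToyPostNavContext):
    """Hypothetical handler context: inherits everything from post-nav."""

    extracted_title: str


def field_names(cls: type) -> list[str]:
    # Dataclass fields are listed base class first, then subclasses.
    return [f.name for f in fields(cls)]
```

With this layering, a hook typed against the post-navigation context also accepts the full handler context, because the latter is a subclass of the former.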

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `tests/unit/crawlers/_playwright/test_playwright_crawler.py` | Adds Playwright pre/post-nav hook tests and ordering assertions. |
| `tests/unit/crawlers/_http/test_http_crawler.py` | Updates HTTP crawler hook tests and adds post-nav hook coverage + ordering. |
| `tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawler.py` | Adds Adaptive Playwright post-nav hook tests and updates hook-only test naming. |
| `src/crawlee/crawlers/_playwright/_playwright_post_nav_crawling_context.py` | New post-nav context type holding the Playwright response. |
| `src/crawlee/crawlers/_playwright/_playwright_crawling_context.py` | Makes main Playwright context inherit from post-nav context. |
| `src/crawlee/crawlers/_playwright/_playwright_crawler.py` | Inserts post-nav hook execution into the Playwright pipeline and exposes registration API. |
| `src/crawlee/crawlers/_playwright/__init__.py` | Exports PlaywrightPostNavCrawlingContext. |
| `src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawling_context.py` | Adds AdaptivePlaywrightPostNavCrawlingContext wrapper + conversion helper. |
| `src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawler.py` | Delegates post-nav hooks to subcrawlers and adds a public registration API. |
| `src/crawlee/crawlers/_adaptive_playwright/__init__.py` | Exports AdaptivePlaywrightPostNavCrawlingContext. |
| `src/crawlee/crawlers/_abstract_http/_abstract_http_crawler.py` | Adds post-nav hook list, pipeline step, and registration method for HTTP crawlers. |
| `src/crawlee/crawlers/__init__.py` | Re-exports the new crawling context types from the top-level crawlers package. |

Mantisus and others added 3 commits March 16, 2026 01:02
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Mantisus requested review from Pijukatel and vdusek March 15, 2026 23:39
Collaborator

@Pijukatel left a comment

With more hooks being added, there is some code duplication. I guess more hooks are on the way, so it would be good to start thinking about code reuse and some refactoring, in case many hooks end up sharing near-duplicate code.
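One possible direction for the reuse mentioned above, shown as a rough sketch only: a small generic registry that owns the hook list, the registration decorator, and the run loop, so each hook kind becomes one instance instead of a repeated list-plus-method pair. `HookRegistry` and everything around it is hypothetical and not part of crawlee.

```python
import asyncio
from typing import Awaitable, Callable, Generic, TypeVar

TContext = TypeVar('TContext')


class HookRegistry(Generic[TContext]):
    """Hypothetical helper holding one kind of hook; not part of crawlee."""

    def __init__(self) -> None:
        self._hooks: list[Callable[[TContext], Awaitable[None]]] = []

    def register(
        self, hook: Callable[[TContext], Awaitable[None]]
    ) -> Callable[[TContext], Awaitable[None]]:
        # Returning the hook lets this method be used as a decorator.
        self._hooks.append(hook)
        return hook

    async def run(self, context: TContext) -> None:
        # Hooks run sequentially, in registration order.
        for hook in self._hooks:
            await hook(context)


def demo_registry() -> list[str]:
    pre_navigation: HookRegistry[list] = HookRegistry()
    post_navigation: HookRegistry[list] = HookRegistry()

    @pre_navigation.register
    async def mark_pre(events: list) -> None:
        events.append('pre')

    @post_navigation.register
    async def mark_post(events: list) -> None:
        events.append('post')

    async def pipeline() -> list[str]:
        events: list[str] = []
        await pre_navigation.run(events)
        await post_navigation.run(events)
        return events

    return asyncio.run(pipeline())
```

Whether such a helper fits the existing crawler pipeline is a design question for the maintainers; the sketch only shows that registration and execution logic need not be duplicated per hook type.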

Collaborator

@vdusek left a comment

Currently, I don't have enough time to review it properly manually, but here are a few comments from Claude. Consider them, please.

Collaborator

@vdusek left a comment

Also, could you please consider whether we should document this feature somewhere?

4 participants