- ...
0.1.0 (2024-07-09)
- new project bootstrapping via pipx run crawlee create
- improve error handling in project bootstrapping
0.0.7 (2024-06-27)
- selector handling for RETRY_CSS_SELECTORS in _handle_blocked_request in BeautifulSoupCrawler
- selector handling in enqueue_links in BeautifulSoupCrawler
- improve AutoscaledPool state management
0.0.6 (2024-06-25)
- BREAKING: BasicCrawler.export_data helper method which replaces BasicCrawler.export_to
- Configuration.get_global_configuration method
- Automatic logging setup
- Context helper for logging (context.log)
- Handling of relative URLs in add_requests
- Graceful exit in BasicCrawler.run
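
The relative-URL handling noted for 0.0.6 can be illustrated with the standard library alone. The following is a minimal sketch of the general idea, not crawlee's actual implementation; the helper name resolve_urls and the example URLs are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def resolve_urls(base_url: str, urls: list[str]) -> list[str]:
    """Resolve each URL against base_url (hypothetical helper, not a crawlee API)."""
    resolved = []
    for url in urls:
        # urljoin returns absolute URLs unchanged and resolves relative ones
        # against the base URL.
        absolute = urljoin(base_url, url)
        # Keep only results that can be fetched over HTTP(S).
        if urlparse(absolute).scheme in ("http", "https"):
            resolved.append(absolute)
    return resolved
```

Resolving against the URL of the page that produced the links, rather than against the site root, keeps path-relative links such as "next" pointing at the right directory.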
0.0.5 (2024-06-21)
- Add explicit error messages for missing package extras during import
- Better browser abstraction:
  - BrowserController: wraps a single browser instance and maintains its state.
  - BrowserPlugin: manages the browser automation framework and acts as a factory for controllers.
- Browser rotation with a maximum number of pages opened per browser.
- Add emit persist state event to event manager
- Add batched request addition in RequestQueue
- Add start requests option to BasicCrawler
- Add storage-related helpers get_data, push_data and export_to to BasicCrawler and BasicContext
- Add enqueue links helper to PlaywrightCrawler
- Add max requests per crawl option to BasicCrawler
- Fix type error in persist state of statistics
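
The browser rotation added in 0.0.5 (a maximum number of pages opened per browser) can be pictured with a small stand-alone model. This is a sketch under assumptions, not crawlee's code: FakeBrowser, RotatingBrowserPool, and max_pages_per_browser are hypothetical names, and no real browser is launched.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeBrowser:
    """Stand-in for a real browser instance (hypothetical)."""

    browser_id: int
    pages_opened: int = 0


@dataclass
class RotatingBrowserPool:
    """Hands out pages and retires a browser once it reaches its page limit."""

    max_pages_per_browser: int
    retired: list[FakeBrowser] = field(default_factory=list)
    _active: FakeBrowser | None = None
    _ids: count = field(default_factory=lambda: count(1))

    def new_page(self) -> int:
        """Open a page and return the ID of the browser that serves it."""
        # Start a fresh browser when none is active or the active one is full.
        if self._active is None or self._active.pages_opened >= self.max_pages_per_browser:
            if self._active is not None:
                self.retired.append(self._active)
            self._active = FakeBrowser(browser_id=next(self._ids))
        self._active.pages_opened += 1
        return self._active.browser_id
```

With a limit of two pages, five consecutive new_page() calls are served by browsers 1, 1, 2, 2 and 3, and the first two browsers end up in retired.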
0.0.4 (2024-05-30)
- Another internal release, adding statistics capturing, proxy configuration and the initial version of browser management and PlaywrightCrawler.
  - Statistics
  - ProxyConfiguration
  - BrowserPool
  - PlaywrightCrawler
0.0.3 (2024-05-15)
- Another internal release, adding mainly session management and BeautifulSoupCrawler.
  - HttpxClient
  - SessionPool
  - BeautifulSoupCrawler
  - BaseStorageClient
  - Storages and MemoryStorageClient were refactored
0.0.2 (2024-04-11)
- The first internal release with BasicCrawler and HttpCrawler.
  - EventManager & LocalEventManager
  - Snapshotter
  - AutoscaledPool
  - MemoryStorageClient
  - Storages
  - BasicCrawler & HttpCrawler
0.0.1 (2024-01-30)
- Dummy package crawlee was released on PyPI.