Skip to content

OpenAI Batch API support for SmartScraperMultiGraph #1036

@Ceirced

Description

@Ceirced

Is your feature request related to a problem? Please describe.

I'm using SmartScraperMultiGraph and the API costs get pretty expensive. Since I don't need real-time results anyway, it feels like I'm paying extra for speed I don't actually need.

Describe the solution you'd like

Support for OpenAI's Batch API in SmartScraperMultiGraph.
The Batch API gives a 50% discount on token costs - you just have to wait up to 24 hours for results instead of getting them immediately.
The implementation could be something like a config flag (use_batch_api = True) or a separate class. The scraper would still fetch and parse all the HTML normally, but instead of making individual LLM calls, it would bundle them into a batch job and let you retrieve results later.

Describe alternatives you've considered

  • Writing a wrapper myself that collects all the prompts, submits them via batch, and maps the results back - but this feels like it should be built into the library
  • Just accepting the higher costs and using the regular sync API

Additional context

Batch API docs: https://platform.openai.com/docs/guides/batch
Pricing showing 50% discount: https://openai.com/api/pricing/

Let me know if you are interested or what you think!

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions