Closed
Description
Is your feature request related to a problem? Please describe.
Gemini models offer free generations, but the free tier is capped in requests per minute (RPM) and requests per day (RPD).
- LLM providers or the generate method should offer a way to deliberately slow down generation to accommodate a low RPM.
- This could be done via an `rpm_limit` argument.
Imagine two providers, OpenAI and Gemini:
- OpenAI is not rate limited.
- For Gemini, we want to leverage the free tier, so it is rate limited: say 10 RPM.
We should set `rpm_limit=10` either in the `GeminiProvider` class, in `LLMProvider`, or in `generate()`. I have the feeling it should be at the provider level. TBC.
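To make the provider-level option concrete, here is a minimal sketch. It is an illustration, not the project's implementation: `RpmLimiter`, the `_send` hook, and the constructor signature are all hypothetical, and the `LLMProvider` shown is a stand-in whose real interface may differ. It spaces requests evenly (one every `60 / rpm_limit` seconds) rather than tracking a sliding window, which is stricter than the quota requires but simple to reason about.

```python
import time
from typing import Callable, Optional


class RpmLimiter:
    """Hypothetical helper: spaces calls at least 60 / rpm_limit seconds apart."""

    def __init__(
        self,
        rpm_limit: Optional[int] = None,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm_limit is not None and rpm_limit <= 0:
            raise ValueError("rpm_limit must be a positive integer or None")
        # Minimum gap between two requests; 0.0 means "not rate limited".
        self.min_interval = 60.0 / rpm_limit if rpm_limit else 0.0
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
            now = self._clock()  # re-read in case sleep overshot
        self._next_allowed = now + self.min_interval
        return delay


class LLMProvider:
    """Stand-in base class; the project's real provider API may differ."""

    def __init__(self, rpm_limit: Optional[int] = None) -> None:
        self.limiter = RpmLimiter(rpm_limit)

    def generate(self, prompt: str) -> str:
        self.limiter.wait()  # throttle before every request
        return self._send(prompt)

    def _send(self, prompt: str) -> str:
        # Assumed hook where a subclass would call the actual API.
        raise NotImplementedError
```

With this shape, a Gemini provider would be constructed with `rpm_limit=10` (one request every 6 seconds) and an OpenAI provider with the default `rpm_limit=None` (no throttling). The sketch is not thread-safe and does not cover RPD; concurrent callers would need a lock around `wait()`.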