Prefer RateLimitError on 429 with rate-limit body by juanmanuelramallo · Pull Request #775 · crmne/ruby_llm

juanmanuelramallo · 2026-05-19T03:33:14Z

What this does

Some providers include token-budget context in their HTTP 429 rate-limit bodies. Anthropic, for example, returns messages like "...rate limit of 5,000,000 input tokens per minute...". The "input tokens" substring matches the existing /input[_\s-]?token/i entry in CONTEXT_LENGTH_PATTERNS, so the current 429 branch classifies these responses as ContextLengthExceededError. Callers therefore cannot tell a token-quota rate limit (retriable with backoff) apart from a true context-window overflow (non-retriable).
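The overlap is easy to reproduce in isolation. The snippet below is a minimal sketch: the pattern is the one quoted above, while the constant names and the helper are hypothetical, and the sample text is only the fragment quoted above, not a full provider response.

```ruby
# Pattern quoted in the description above (one entry of CONTEXT_LENGTH_PATTERNS).
INPUT_TOKEN_PATTERN = /input[_\s-]?token/i

# Only the fragment quoted above; the rest of the provider message is elided.
SAMPLE_429_FRAGMENT = 'rate limit of 5,000,000 input tokens per minute'

# Hypothetical helper: true when a message would be treated as
# context-length-shaped by this single pattern.
def looks_like_context_length?(message)
  message.match?(INPUT_TOKEN_PATTERN)
end

puts looks_like_context_length?(SAMPLE_429_FRAGMENT) # prints true
```

The pattern matches even though the message describes a per-minute quota, which is the misclassification this PR addresses.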

On 429, this change prefers RateLimitError whenever the body carries an explicit rate-limit signal (the phrase "rate limit"), and falls back to ContextLengthExceededError only when the message is unambiguously context-length-shaped. A single regex is sufficient, because the Anthropic body and the other documented rate-limit responses use the literal phrase "rate limit".
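The ordering described here can be sketched as follows. This is an illustration under stated assumptions, not the code in the diff: the error classes are local stand-ins for the library's, the exact regex and the "request too large" pattern are guesses based on the phrases quoted in this description, and treating an unmatched 429 as RateLimitError is assumed.

```ruby
# Local stand-ins so the sketch runs without the gem (hypothetical).
class RateLimitError < StandardError; end
class ContextLengthExceededError < StandardError; end

# Assumed form of the rate-limit signal; the description only says the
# literal phrase "rate limit" is checked.
RATE_LIMIT_SIGNAL = /rate limit/i

# First entry is quoted in the description; the second is an assumed
# pattern for the "Request too large for model" case.
CONTEXT_LENGTH_PATTERNS = [
  /input[_\s-]?token/i,
  /request too large/i
].freeze

# Returns the error class a 429 body would map to in this sketch.
def classify_429(message)
  # An explicit rate-limit signal wins, even if token wording is present.
  return RateLimitError if message.match?(RATE_LIMIT_SIGNAL)
  # Without that signal, context-length-shaped messages keep their class.
  return ContextLengthExceededError if CONTEXT_LENGTH_PATTERNS.any? { |p| message.match?(p) }

  # Assumed default for any other 429 body.
  RateLimitError
end

puts classify_429('rate limit of 5,000,000 input tokens per minute') # prints RateLimitError
puts classify_429('Request too large for model')                     # prints ContextLengthExceededError
```

Checking the rate-limit signal first is what resolves the ambiguity: the context-length patterns stay untouched, and only their precedence on 429 changes.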

The existing "429 + Request too large for model" → ContextLengthExceededError behavior is preserved, since that phrase carries no rate-limit signal. The 400 branch and all other status-code branches are unchanged.
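From the caller's side, the practical benefit is that the two errors can be handled differently. The following is a rough sketch of that pattern, assuming stand-in error classes and a hypothetical `with_rate_limit_retry` helper that is not part of the library.

```ruby
# Local stand-ins so the sketch runs without the gem (hypothetical).
class RateLimitError < StandardError; end
class ContextLengthExceededError < StandardError; end

# Hypothetical helper: retries the block on RateLimitError with exponential
# backoff. ContextLengthExceededError is not rescued, so it propagates on the
# first attempt. The sleeper is injectable to keep the sketch testable.
def with_rate_limit_retry(max_attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield
  rescue RateLimitError
    # Give up once the attempt budget is spent.
    raise if attempts >= max_attempts

    # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
    sleeper.call(base_delay * (2**(attempts - 1)))
    retry
  end
end
```

A caller would wrap the request in `with_rate_limit_retry { ... }` and rescue ContextLengthExceededError separately, for example to shorten the prompt rather than wait.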

Type of change

Scope check

I read the Contributing Guide
This aligns with RubyLLM's focus on LLM communication
This isn't application-specific logic that belongs in user code
This benefits most users, not just my specific use case

Required for new features

I opened an issue before writing code and received maintainer approval
Linked issue: #___

PRs for new features or enhancements without a prior approved issue will be closed.

Quality check

I ran overcommit --install and all hooks pass
I tested my changes thoroughly
- For provider changes: Re-recorded VCR cassettes with bundle exec rake vcr:record[provider_name]
- All tests pass: bundle exec rspec
I updated documentation if needed
I didn't modify auto-generated files manually (models.json, aliases.json)

AI-generated code

I used AI tools to help write this code
I have reviewed and understand all generated code (required if above is checked)

API changes

Breaking change
New public methods/classes
Changed method signatures
No API changes

codecov · 2026-05-19T03:36:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.22%. Comparing base (5bdda1a) to head (aa2a77f).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #775   +/-   ##
=======================================
  Coverage   87.21%   87.22%           
=======================================
  Files         121      121           
  Lines        5703     5707    +4     
  Branches     1442     1443    +1     
=======================================
+ Hits         4974     4978    +4     
  Misses        729      729


juanmanuelramallo · 2026-05-22T02:54:38Z

@crmne does it make sense to merge this?

Merge branch 'main' into fix-anthropic-rate-limit-misclassification

aa2a77f

juanmanuelramallo wants to merge 2 commits into crmne:main from juanmanuelramallo:fix-anthropic-rate-limit-misclassification
