Add token estimation for chat input feature#1

Merged
mohammadumar-dev merged 1 commit into main from develop on May 12, 2026

Conversation

@mohammadumar-dev (Owner)

PR Description

In a Nutshell

This PR introduces token budget management and retry logic for API calls to prevent context overflow issues. It optimizes prompt construction by sending only essential file metadata (paths, status, area classification) instead of full diffs, and adds automatic retry handling for rate-limited API requests.


Changes

Core Improvements:

  1. Token Budget Management

    • Added new constants: maxCommitRequestTokens (1400) and maxGroupingRequestTokens (1200) to prevent exceeding API limits
    • Introduced limitChatPayload() function to intelligently truncate user messages while preserving system instructions
    • Added token estimation functions: estimateChatInputTokens(), estimateTokens(), and estimatedCharsPerToken constant (4 chars/token)
  2. Optimized Grouping Prompt

    • Replaced full diff context with a lightweight file graph (buildChangedFilesGraph())
    • Replaced buildManagedChangeContext() with file-only metadata: path hierarchy, change status, and inferred file area
    • Added formatGraphPath() helper for consistent tree-style path formatting
    • Reduced MaxTokens from 900 to 500 for commit grouping requests
    • Updated system prompt to clarify "Use only path, status, and area metadata; do not assume diff contents"
  3. Retry Logic with Rate Limit Handling

    • Refactored sendGroqChat() to wrap sendGroqChatOnce() with retry mechanism (up to 4 attempts, configurable via maxGroqAttempts)
    • Added groqHTTPError type for structured error handling with HTTP status codes
    • Added groqRetryDelay() function to parse rate-limit headers and calculate smart backoff delays
    • Exponential backoff from 500ms up to 30s; server-provided retry guidance takes precedence over the computed delay
  4. Commit Message Generation Enhancement

    • Added payload limiting before the second refinement pass in generateCommitMessage()
    • Ensures all AI requests stay within token budgets

Code Alignment:

  • Improved system prompt clarity for commit grouping
  • Reduced redundancy by eliminating expensive diff-context building for grouping operations
  • Clearer error context through typed HTTP errors

Why This Matters

  • Reliability: Automatic retries keep transient API failures from derailing the entire operation
  • Cost Efficiency: Reduced token usage per request lowers API costs
  • Stability: Token budget enforcement prevents "context overflow" errors from the AI API
  • Debuggability: Structured error types and rate-limit parsing make troubleshooting easier

Suggested Next Steps

  1. Review the retry logic — Ensure backoff delays are appropriate for your use case
  2. Validate token estimates — The 4 chars/token estimate may need tuning based on actual Groq API behavior
  3. Test with large changesets — Verify that the file graph approach works well for repos with 50+ changed files
  4. Check system prompts — Confirm that the AI still produces quality grouping recommendations with path-only input
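For step 2, one hypothetical way to tune the heuristic is to compare the characters you sent against the prompt token count the API reports in its usage metadata (the figures below are illustrative, not real API output):

```go
package main

import "fmt"

// observedCharsPerToken computes the actual chars-per-token ratio from a
// request's prompt length and the token count reported by the API, so it
// can be compared against the assumed 4 chars/token.
func observedCharsPerToken(promptChars, promptTokens int) float64 {
	if promptTokens == 0 {
		return 0 // avoid division by zero for empty usage data
	}
	return float64(promptChars) / float64(promptTokens)
}

func main() {
	// Example figures only; substitute real prompt sizes and usage counts.
	ratio := observedCharsPerToken(5200, 1400)
	fmt.Printf("observed ≈ %.2f chars/token (heuristic assumes 4)\n", ratio)
}
```

If the observed ratio is consistently below 4, the estimator undercounts tokens and the budgets should be tightened accordingly.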

@mohammadumar-dev mohammadumar-dev self-assigned this May 12, 2026
@mohammadumar-dev mohammadumar-dev added bug Something isn't working enhancement New feature or request labels May 12, 2026
@mohammadumar-dev mohammadumar-dev merged commit 00c4b76 into main May 12, 2026
7 checks passed