[BOT ISSUE] Groq: stale token limits on llama-3.1-8b-instant (max_output 8K → 131K) and openai/gpt-oss-20b (max_output 32K → 65K) #751

@github-actions

Description

Gap

Two Groq entries in packages/proxy/schema/model_list.json carry stale token limits. Both understate max_output_tokens relative to the official Groq documentation, and llama-3.1-8b-instant also understates max_input_tokens.

Current vs correct

| Entry | Field | Current | Correct | Source |
| --- | --- | --- | --- | --- |
| llama-3.1-8b-instant (line 5336) | max_output_tokens | 8,192 | 131,072 | Groq models page |
| llama-3.1-8b-instant (line 5336) | max_input_tokens | 128,000 | 131,072 | Groq models page |
| openai/gpt-oss-20b (line ~3838) | max_output_tokens | 32,768 | 65,536 | Groq models page |

The llama-3.1-8b-instant gap is the larger of the two: the documented max output is 16x the catalog value (8,192 → 131,072 tokens), while openai/gpt-oss-20b doubles (32,768 → 65,536).
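The ratios can be checked directly from the values in the "Current vs correct" table. This is only an arithmetic sketch; the `limits` dictionary and `growth_factor` helper are illustrative names, not part of the repository.

```python
# Values copied from the "Current vs correct" table: (current, correct).
limits = {
    ("llama-3.1-8b-instant", "max_output_tokens"): (8_192, 131_072),
    ("llama-3.1-8b-instant", "max_input_tokens"): (128_000, 131_072),
    ("openai/gpt-oss-20b", "max_output_tokens"): (32_768, 65_536),
}


def growth_factor(current, correct):
    """Return how many times larger the documented limit is than the catalog value."""
    return correct / current


factors = {key: growth_factor(*pair) for key, pair in limits.items()}

for (model, field), factor in factors.items():
    print(f"{model} {field}: {factor:.3f}x")
```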

Suggested changes

Update llama-3.1-8b-instant:

"llama-3.1-8b-instant": {
  "format": "openai",
  "flavor": "chat",
  "input_cost_per_mil_tokens": 0.05,
  "output_cost_per_mil_tokens": 0.08,
  "displayName": "Llama 3.1 8B Instant 128k",
  "max_input_tokens": 131072,
  "max_output_tokens": 131072,
  "available_providers": [
    "groq"
  ]
}

Update openai/gpt-oss-20b:

"openai/gpt-oss-20b": {
  "format": "openai",
  "flavor": "chat",
  "input_cost_per_mil_tokens": 0.075,
  "output_cost_per_mil_tokens": 0.3,
  "displayName": "GPT-OSS 20B",
  "max_input_tokens": 131072,
  "max_output_tokens": 65536,
  "available_providers": [
    "groq"
  ]
}
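One way to apply the two updates programmatically is sketched below. It assumes the catalog is a top-level JSON object keyed by model ID, as the snippets above suggest; `LIMIT_UPDATES` and `apply_limit_updates` are hypothetical names, and the inline `sample` stands in for the real file so the sketch runs on its own.

```python
import json

# Token limits proposed in this issue; all other fields are left untouched.
LIMIT_UPDATES = {
    "llama-3.1-8b-instant": {"max_input_tokens": 131072, "max_output_tokens": 131072},
    "openai/gpt-oss-20b": {"max_output_tokens": 65536},
}


def apply_limit_updates(catalog, updates):
    """Return a patched copy of the catalog; fail loudly if a model ID is missing."""
    patched = dict(catalog)
    for model_id, fields in updates.items():
        if model_id not in patched:
            raise KeyError(f"model not found in catalog: {model_id}")
        # Build a new entry so the caller's original dictionaries are not mutated.
        patched[model_id] = {**patched[model_id], **fields}
    return patched


# Stand-in for model_list.json, reduced to the fields relevant here (stale values).
sample = {
    "llama-3.1-8b-instant": {
        "displayName": "Llama 3.1 8B Instant 128k",
        "max_input_tokens": 128000,
        "max_output_tokens": 8192,
    },
    "openai/gpt-oss-20b": {
        "displayName": "GPT-OSS 20B",
        "max_input_tokens": 131072,
        "max_output_tokens": 32768,
    },
}

patched = apply_limit_updates(sample, LIMIT_UPDATES)
print(json.dumps(patched, indent=2))
```

Raising on a missing ID rather than silently adding an entry matches the checklist note that these are existing entries with no ID change.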

Verification checklist

  • Cross-source: Token limits confirmed on the Groq documentation page, which serves as both the model listing and the pricing reference:
    1. Groq models page: the production models table lists the context window and max completion tokens for each model
    2. The same page lists pricing ($0.05/$0.08 for llama-3.1-8b-instant, $0.075/$0.30 for gpt-oss-20b); it matches the catalog, which confirms the models were identified correctly
  • Recent commits: No recent commit corrects these values
  • ID format: Existing entries, no ID change needed

Verification notes

| Field | Source | Notes |
| --- | --- | --- |
| llama-3.1-8b-instant context (131,072) | Groq models page | Listed under "CONTEXT WINDOW (TOKENS)" |
| llama-3.1-8b-instant max completion (131,072) | Groq models page | Listed under "MAX COMPLETION TOKENS" |
| openai/gpt-oss-20b context (131,072) | Groq models page | Already correct in catalog |
| openai/gpt-oss-20b max completion (65,536) | Groq models page | Listed under "MAX COMPLETION TOKENS" |
| Pricing (both models) | Groq models page | Already correct in catalog |
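The comparison behind these notes could be expressed as a small diff of catalog values against the documented ones. This is a sketch under the assumption that each entry exposes `max_input_tokens` and `max_output_tokens`; `EXPECTED`, `catalog`, and `stale_fields` are hypothetical names, and the catalog values are the ones reported in this issue.

```python
# Documented limits from the Groq models page, as recorded in the notes above.
EXPECTED = {
    "llama-3.1-8b-instant": {"max_input_tokens": 131072, "max_output_tokens": 131072},
    "openai/gpt-oss-20b": {"max_input_tokens": 131072, "max_output_tokens": 65536},
}

# Catalog values as reported in this issue (before any fix).
catalog = {
    "llama-3.1-8b-instant": {"max_input_tokens": 128000, "max_output_tokens": 8192},
    "openai/gpt-oss-20b": {"max_input_tokens": 131072, "max_output_tokens": 32768},
}


def stale_fields(entry, expected):
    """Map each mismatching field to a (catalog value, documented value) pair."""
    return {
        field: (entry.get(field), value)
        for field, value in expected.items()
        if entry.get(field) != value
    }


report = {model: stale_fields(catalog[model], EXPECTED[model]) for model in EXPECTED}

for model, fields in report.items():
    for field, (have, want) in fields.items():
        print(f"{model}: {field} is {have}, documented as {want}")
```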

Local files inspected

  • packages/proxy/schema/model_list.json:
    • llama-3.1-8b-instant (line 5336): max_input_tokens: 128000, max_output_tokens: 8192 (both stale)
    • openai/gpt-oss-20b (line ~3838): max_output_tokens: 32768 (stale)
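Because the line numbers above can drift as the catalog changes, a reviewer may want to relocate the entries by key. The sketch below scans the file text for a line that opens a given entry; `find_entry_line` is a hypothetical helper, and it assumes each entry starts on its own line in the form `"<model id>": {`.

```python
def find_entry_line(text, model_id):
    """Return the 1-based line number where the entry for model_id opens, or None."""
    needle = f'"{model_id}": {{'
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith(needle):
            return number
    return None


# Small stand-in for the file contents; the real file is
# packages/proxy/schema/model_list.json.
sample_text = """{
  "openai/gpt-oss-20b": {
    "max_output_tokens": 32768
  },
  "llama-3.1-8b-instant": {
    "max_output_tokens": 8192
  }
}"""

for model_id in ("llama-3.1-8b-instant", "openai/gpt-oss-20b"):
    print(model_id, find_entry_line(sample_text, model_id))
```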

Source URLs

{
  "kind": "cost_update",
  "provider": "groq",
  "models": ["llama-3.1-8b-instant", "openai/gpt-oss-20b"],
  "status": "active",
  "model_specs": {
    "llama-3.1-8b-instant": {
      "format": "openai",
      "flavor": "chat",
      "input_cost_per_mil_tokens": 0.05,
      "output_cost_per_mil_tokens": 0.08,
      "displayName": "Llama 3.1 8B Instant 128k",
      "max_input_tokens": 131072,
      "max_output_tokens": 131072,
      "available_providers": ["groq"]
    },
    "openai/gpt-oss-20b": {
      "format": "openai",
      "flavor": "chat",
      "input_cost_per_mil_tokens": 0.075,
      "output_cost_per_mil_tokens": 0.3,
      "displayName": "GPT-OSS 20B",
      "max_input_tokens": 131072,
      "max_output_tokens": 65536,
      "available_providers": ["groq"]
    }
  },
  "source_urls": [
    "https://console.groq.com/docs/models"
  ]
}
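A payload like the one above can be sanity-checked before it is acted on. The checks below are illustrative only: the payload schema is not documented in this issue, so `check_payload` and the two rules it enforces (the `models` list matches the `model_specs` keys, and token limits are positive integers) are assumptions.

```python
def check_payload(payload):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    listed = set(payload["models"])
    specified = set(payload["model_specs"])
    if listed != specified:
        problems.append(f"models and model_specs disagree: {sorted(listed ^ specified)}")

    for model_id, spec in payload["model_specs"].items():
        for field in ("max_input_tokens", "max_output_tokens"):
            value = spec.get(field)
            if not isinstance(value, int) or value <= 0:
                problems.append(f"{model_id}: {field} must be a positive integer, got {value!r}")

    return problems


# Payload from this issue, reduced to the fields the checks read.
payload = {
    "models": ["llama-3.1-8b-instant", "openai/gpt-oss-20b"],
    "model_specs": {
        "llama-3.1-8b-instant": {"max_input_tokens": 131072, "max_output_tokens": 131072},
        "openai/gpt-oss-20b": {"max_input_tokens": 131072, "max_output_tokens": 65536},
    },
}

print(check_payload(payload))
```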
