Gap
Two Groq models in packages/proxy/schema/model_list.json have stale token limits that significantly understate their current capabilities per official Groq documentation: max_output_tokens for both models, and max_input_tokens for llama-3.1-8b-instant.
Current vs correct
| Entry | Field | Current | Correct | Source |
| --- | --- | --- | --- | --- |
| llama-3.1-8b-instant (line 5336) | max_output_tokens | 8,192 | 131,072 | Groq models page |
| llama-3.1-8b-instant (line 5336) | max_input_tokens | 128,000 | 131,072 | Groq models page |
| openai/gpt-oss-20b (line ~3838) | max_output_tokens | 32,768 | 65,536 | Groq models page |
The llama-3.1-8b-instant gap is especially significant: max output has increased 16x, from 8,192 to 131,072 tokens.
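As a quick sanity check on the size of each gap, the ratios can be computed directly from the table values. This is only an illustrative sketch: the `LIMITS` mapping and the `understatement_ratios` helper are hypothetical names introduced here, not part of the repository.

```python
# (current, correct) pairs copied from the "Current vs correct" table.
# LIMITS and understatement_ratios are hypothetical names for illustration.
LIMITS = {
    ("llama-3.1-8b-instant", "max_output_tokens"): (8_192, 131_072),
    ("llama-3.1-8b-instant", "max_input_tokens"): (128_000, 131_072),
    ("openai/gpt-oss-20b", "max_output_tokens"): (32_768, 65_536),
}


def understatement_ratios(limits):
    """Return correct / current for each (model, field) pair."""
    return {key: correct / current for key, (current, correct) in limits.items()}


ratios = understatement_ratios(LIMITS)
for (model, field), ratio in ratios.items():
    print(f"{model} {field}: {ratio:.3f}x")
```

The output limit for llama-3.1-8b-instant is understated by a factor of 16 and the one for openai/gpt-oss-20b by a factor of 2, while the llama-3.1-8b-instant input limit is off by only about 2.4%.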
Suggested changes
Update llama-3.1-8b-instant:
"llama-3.1-8b-instant": {
  "format": "openai",
  "flavor": "chat",
  "input_cost_per_mil_tokens": 0.05,
  "output_cost_per_mil_tokens": 0.08,
  "displayName": "Llama 3.1 8B Instant 128k",
  "max_input_tokens": 131072,
  "max_output_tokens": 131072,
  "available_providers": [
    "groq"
  ]
}
Update openai/gpt-oss-20b:
"openai/gpt-oss-20b": {
  "format": "openai",
  "flavor": "chat",
  "input_cost_per_mil_tokens": 0.075,
  "output_cost_per_mil_tokens": 0.3,
  "displayName": "GPT-OSS 20B",
  "max_input_tokens": 131072,
  "max_output_tokens": 65536,
  "available_providers": [
    "groq"
  ]
}
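The two changes touch only the token-limit fields, so they could also be applied as a targeted patch rather than by replacing whole entries. The sketch below assumes the catalog is a JSON object keyed by model name, as the snippets above suggest; `UPDATES`, `apply_updates`, and the trimmed `sample` catalog are hypothetical and exist only for illustration.

```python
import json

# Corrected limits from this issue. UPDATES and apply_updates are
# hypothetical names, not existing helpers in the repository.
UPDATES = {
    "llama-3.1-8b-instant": {"max_input_tokens": 131072, "max_output_tokens": 131072},
    "openai/gpt-oss-20b": {"max_output_tokens": 65536},
}


def apply_updates(catalog, updates):
    """Return a copy of the catalog with the listed fields overwritten.

    Assumes the catalog is a dict keyed by model name. Fields that are
    not listed in `updates` (pricing, displayName, ...) are left as-is.
    """
    patched = {name: dict(entry) for name, entry in catalog.items()}
    for name, fields in updates.items():
        if name not in patched:
            raise KeyError(f"model not in catalog: {name}")
        patched[name].update(fields)
    return patched


# Trimmed stand-in for model_list.json holding the current (stale) values.
sample = {
    "llama-3.1-8b-instant": {
        "displayName": "Llama 3.1 8B Instant 128k",
        "max_input_tokens": 128000,
        "max_output_tokens": 8192,
    },
    "openai/gpt-oss-20b": {
        "displayName": "GPT-OSS 20B",
        "max_input_tokens": 131072,
        "max_output_tokens": 32768,
    },
}

patched = apply_updates(sample, UPDATES)
print(json.dumps(patched, indent=2))
```

Returning a patched copy instead of mutating the input keeps the stale values available for a before/after comparison.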
Verification checklist
Verification notes
| Field | Source | Notes |
| --- | --- | --- |
| llama-3.1-8b-instant context (131,072) | Groq models page | Listed under "CONTEXT WINDOW (TOKENS)" |
| llama-3.1-8b-instant max completion (131,072) | Groq models page | Listed under "MAX COMPLETION TOKENS" |
| openai/gpt-oss-20b context (131,072) | Groq models page | Already correct in catalog |
| openai/gpt-oss-20b max completion (65,536) | Groq models page | Listed under "MAX COMPLETION TOKENS" |
| Pricing (both models) | Groq models page | Already correct in catalog |
Local files inspected
packages/proxy/schema/model_list.json:
llama-3.1-8b-instant (line 5336): max_input_tokens: 128000, max_output_tokens: 8192 (both stale)
openai/gpt-oss-20b (line ~3838): max_output_tokens: 32768 (stale)
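The inspection above could be repeated mechanically by comparing catalog entries against the documented limits. The following is a minimal sketch under the same assumption that the catalog is a dict keyed by model name; `EXPECTED`, `find_stale`, and the inline `catalog` stand-in are hypothetical, and in practice the dict would come from loading packages/proxy/schema/model_list.json.

```python
# Limits as listed on the Groq models page, per the verification notes.
# EXPECTED and find_stale are hypothetical names for illustration.
EXPECTED = {
    "llama-3.1-8b-instant": {"max_input_tokens": 131072, "max_output_tokens": 131072},
    "openai/gpt-oss-20b": {"max_input_tokens": 131072, "max_output_tokens": 65536},
}


def find_stale(catalog, expected):
    """Return (model, field, current, expected) for every mismatching field."""
    stale = []
    for name, fields in expected.items():
        entry = catalog.get(name, {})
        for field, want in fields.items():
            have = entry.get(field)  # None if the model or field is missing
            if have != want:
                stale.append((name, field, have, want))
    return stale


# Stand-in for the inspected catalog entries, using the values noted above.
catalog = {
    "llama-3.1-8b-instant": {"max_input_tokens": 128000, "max_output_tokens": 8192},
    "openai/gpt-oss-20b": {"max_input_tokens": 131072, "max_output_tokens": 32768},
}

stale = find_stale(catalog, EXPECTED)
for name, field, have, want in stale:
    print(f"{name}: {field} is {have}, expected {want}")
```

Run against the values noted above, this reports the same three stale fields and leaves the openai/gpt-oss-20b context limit alone, since it already matches.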
Source URLs
{
  "kind": "cost_update",
  "provider": "groq",
  "models": ["llama-3.1-8b-instant", "openai/gpt-oss-20b"],
  "status": "active",
  "model_specs": {
    "llama-3.1-8b-instant": {
      "format": "openai",
      "flavor": "chat",
      "input_cost_per_mil_tokens": 0.05,
      "output_cost_per_mil_tokens": 0.08,
      "displayName": "Llama 3.1 8B Instant 128k",
      "max_input_tokens": 131072,
      "max_output_tokens": 131072,
      "available_providers": ["groq"]
    },
    "openai/gpt-oss-20b": {
      "format": "openai",
      "flavor": "chat",
      "input_cost_per_mil_tokens": 0.075,
      "output_cost_per_mil_tokens": 0.3,
      "displayName": "GPT-OSS 20B",
      "max_input_tokens": 131072,
      "max_output_tokens": 65536,
      "available_providers": ["groq"]
    }
  },
  "source_urls": [
    "https://console.groq.com/docs/models"
  ]
}