
feat: add MiniMax as generation backend #365

Closed
octo-patch wants to merge 1 commit into OpenBMB:main from octo-patch:feature/add-minimax-provider

Conversation


@octo-patch octo-patch commented Mar 21, 2026

Summary

Add MiniMax as a first-class LLM generation backend alongside the existing vllm, openai, and hf backends.

MiniMax provides OpenAI-compatible cloud APIs with models featuring up to 1M context windows, making them well-suited for RAG workloads that require processing large retrieved contexts.

Changes

  • servers/generation/src/generation.py — Added minimax backend with:
    • Auto-detection of MINIMAX_API_KEY environment variable
    • Temperature clamping to MiniMax's accepted (0, 1] range
    • Automatic <think>...</think> tag stripping (configurable via strip_think_tags)
    • Default model: MiniMax-M2.7 (1M context)
    • Concurrent request support with exponential backoff retry
    • Two static helper methods: _clamp_temperature() and _strip_think_tags()
  • servers/generation/parameter.yaml — Added MiniMax config section with all available options
  • examples/minimax_rag.yaml — Example RAG pipeline using MiniMax backend
  • examples/parameter/minimax_generation_parameter.yaml — Full parameter reference
  • README.md / docs/README_zh.md — Added "Supported Cloud LLM Backends" table documenting all four backends with MiniMax usage instructions
  • tests/test_minimax_generation.py — 38 unit tests covering temperature clamping, think-tag stripping, initialization, and generation
  • tests/test_minimax_integration.py — 3 integration tests (auto-skipped when MINIMAX_API_KEY is not set)
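The two static helpers named above could look like the following minimal sketch. The method names `_clamp_temperature()` and `_strip_think_tags()` come from the PR description; the signatures, the `eps` lower bound, and the surrounding class are assumptions for illustration, not the actual implementation.

```python
import re


class MiniMaxBackend:
    """Sketch of the two static helpers described in the PR (bodies assumed)."""

    @staticmethod
    def _clamp_temperature(temperature: float, eps: float = 0.01) -> float:
        # MiniMax accepts temperatures in (0, 1]; pull out-of-range
        # values back into that interval (eps keeps it strictly > 0).
        return min(max(temperature, eps), 1.0)

    @staticmethod
    def _strip_think_tags(text: str) -> str:
        # Remove <think>...</think> reasoning blocks from model output.
        return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
```

With this sketch, `_clamp_temperature(0.0)` would return `0.01`, `_clamp_temperature(1.5)` would return `1.0`, and `_strip_think_tags("<think>plan</think>answer")` would return `"answer"`.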

Supported Models

| Model | Context | Notes |
| --- | --- | --- |
| MiniMax-M2.7 | 1M tokens | Latest, default |
| MiniMax-M2.7-highspeed | 1M tokens | Fast variant |
| MiniMax-M2.5 | 256K tokens | Previous generation |
| MiniMax-M2.5-highspeed | 204K tokens | Fast, long context |

Usage

```shell
export MINIMAX_API_KEY="your-api-key"
ultrarag run examples/minimax_rag.yaml
```

Or set `backend: minimax` in your generation parameter file.
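A parameter file selecting this backend might look like the following hypothetical sketch. Only `backend: minimax`, the model name, and the `strip_think_tags` option are taken from the PR description; the exact key names and file layout are illustrative assumptions, not the actual `parameter.yaml` schema.

```yaml
# Hypothetical generation parameter fragment (field names illustrative)
backend: minimax
model: MiniMax-M2.7      # default model per the PR; 1M context
temperature: 0.7         # clamped to MiniMax's (0, 1] range by the backend
strip_think_tags: true   # remove <think>...</think> blocks from outputs
```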

Test Plan

  • 38 unit tests pass (temperature clamping, think-tag stripping, init validation, mock generation)
  • 3 integration tests pass against live MiniMax API
  • Verify no regression on existing vllm/openai/hf backends
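The auto-skip behavior of the integration tests can be sketched with the standard library as follows; the actual test suite may use a different framework, and the test body here is a placeholder.

```python
import os
import unittest

# True only when a MiniMax API key is available in the environment.
HAS_KEY = bool(os.environ.get("MINIMAX_API_KEY"))


class TestMiniMaxIntegration(unittest.TestCase):
    @unittest.skipUnless(HAS_KEY, "MINIMAX_API_KEY is not set")
    def test_live_generation(self):
        # A real integration test would call the live MiniMax API here.
        self.assertTrue(HAS_KEY)
```

Run without `MINIMAX_API_KEY` set, the test is reported as skipped rather than failed, so CI stays green for contributors without API access.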

9 files changed, 857 additions(+), 3 deletions(-)

Add MiniMax as a first-class LLM provider in the generation server,
alongside vllm, openai, and hf backends. MiniMax provides
OpenAI-compatible cloud APIs with M2.7 and M2.5 model series.

Features:
- Dedicated minimax backend with auto-detection of MINIMAX_API_KEY
- Temperature clamping to MiniMax's (0, 1] range
- Automatic <think>...</think> tag stripping (configurable)
- Default model: MiniMax-M2.7 (1M context window)
- Concurrent request support with retry logic
- Example YAML pipeline and parameter configuration
- 38 unit tests + 3 integration tests
- Documentation in both English and Chinese READMEs

Supported models: MiniMax-M2.7, MiniMax-M2.7-highspeed,
MiniMax-M2.5, MiniMax-M2.5-highspeed
@xhd0728
Collaborator

xhd0728 commented Apr 8, 2026

@octo-patch Thanks for the PR~

The main issue is that this introduces provider-specific logic and a dedicated parameter setup for MiniMax in the generation framework. We'd prefer not to maintain a separate parameter path for one specific provider, and instead keep the backend interface and configuration as unified as possible.

Therefore, we won't merge this PR in its current form.

@xhd0728 xhd0728 closed this Apr 8, 2026
