feat(ai-rag): support multiple embedding providers, add Cohere rerank, and standardize chat interface#12941
ChuanFF wants to merge 5 commits into apache:master
Conversation
Hi @ChuanFF, could you explain these breaking changes? Is it necessary to introduce them?
Pull request overview
This PR significantly enhances the ai-rag plugin by adding multi-provider support for embeddings, introducing optional reranking capability via Cohere, and standardizing the chat interface to work with native LLM request formats. The changes represent a major architectural improvement but introduce breaking changes in the API.
Changes:
- Added support for multiple embedding providers (OpenAI, Azure OpenAI, OpenAI-compatible) through a unified schema and a shared `openai-base` implementation
- Implemented optional Cohere rerank capability with graceful fallback behavior
- Removed the custom `ai_rag` field requirement from requests; now works with the standard `messages` array format
- Enhanced Azure AI Search configuration with additional options (`fields`, `select`, `k`, `exhaustive`)
- Renamed providers to use consistent kebab-case naming (`azure-openai`, `azure-ai-search`)
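The graceful fallback of the rerank stage noted above can be sketched as follows. This is a Python illustration of the described behavior, not the plugin's Lua code; the `call_rerank` callable and its return shape (a list of indices into `docs`, best match first) are assumptions for the sketch:

```python
def rerank_with_fallback(docs, query, top_n, call_rerank):
    """Try to rerank docs via an external rerank service; if the call
    fails for any reason, keep the original vector-search order instead
    of failing the whole request (graceful degradation)."""
    try:
        # assume the service returns indices into `docs`, best match first
        order = call_rerank(query=query, documents=docs, top_n=top_n)
        return [docs[i] for i in order]
    except Exception:
        # API failure: fall back to the first top_n results as retrieved
        return docs[:top_n]
```

The design choice here is that rerank is strictly an enhancement: a rerank outage degrades result ordering but never breaks the RAG pipeline.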
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| `apisix/plugins/ai-rag.lua` | Core plugin rewrite with schema redesign using `oneOf`, dynamic driver loading, standardized message handling, and context injection before the last user message |
| `apisix/plugins/ai-rag/embeddings/openai-base.lua` | Unified embeddings implementation supporting the OpenAI API format with Bearer token authentication, plus `model` and `dimensions` configuration |
| `apisix/plugins/ai-rag/embeddings/openai.lua` | Wrapper returning `openai-base` for the OpenAI provider |
| `apisix/plugins/ai-rag/embeddings/azure-openai.lua` | Wrapper returning `openai-base` for the Azure OpenAI provider |
| `apisix/plugins/ai-rag/embeddings/openai-compatible.lua` | Wrapper returning `openai-base` for OpenAI-compatible providers |
| `apisix/plugins/ai-rag/vector-search/azure-ai-search.lua` | Enhanced Azure AI Search with configurable `fields`, `select`, `k`, and `exhaustive`; returns parsed document strings instead of raw JSON |
| `apisix/plugins/ai-rag/rerank/cohere.lua` | New Cohere rerank implementation with graceful fallback on API failures |
| `t/plugin/ai-rag.t` | Comprehensive test suite with mock servers for embeddings, vector search, and rerank; tests authentication, error handling, and context injection |
| `docs/en/latest/plugins/ai-rag.md` | Updated documentation with new provider configurations, attributes table, and usage examples |
| `docs/zh/latest/plugins/ai-rag.md` | Chinese translation of the updated documentation |
| `Makefile` | Added installation rules for the rerank directory |
Comments suppressed due to low confidence (5)
`apisix/plugins/ai-rag/embeddings/openai-base.lua:98`
- The `model` field has a default value in the schema (line 38), but when the request body is constructed at line 98, `conf.model` is used directly without a nil check. If a user doesn't specify a model, the default should be applied during schema validation; however, if for any reason the default isn't applied, this could result in sending `model: null` to the API. Consider adding a fallback or verifying that schema defaults are always applied.
`apisix/plugins/ai-rag/vector-search/azure-ai-search.lua:37`
- The description for the `fields` property says "Comma-separated list of fields to retrieve" (line 37), but based on the Azure AI Search API documentation and the usage at line 66, this should be a single field name for the vector search, not a comma-separated list. The description should be updated to state that it is the target field name for the vector search query.
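For reference, with the `api-version=2023-11-01` endpoint used elsewhere in this PR, a vector search request shaped by these options would look roughly like the following (field names per the Azure AI Search REST API as I understand it; the embedding vector is truncated for brevity):

```json
{
  "select": "content",
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [0.0123, -0.0456],
      "fields": "vector",
      "k": 5,
      "exhaustive": true
    }
  ]
}
```

Note that `fields` inside the vector query names the single vector field to search against, which is why a comma-separated description is misleading.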
`apisix/plugins/ai-rag/embeddings/openai-base.lua:99`
- The `dimensions` field is optional in the schema (no default, not required), but it is always included in the request body (line 99). When `conf.dimensions` is nil, this results in `dimensions: null` in the JSON body sent to the API. While OpenAI's API may tolerate null values by ignoring them, it is cleaner to include the field only when it is explicitly set. Consider using conditional logic: `if conf.dimensions then body.dimensions = conf.dimensions end`.
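The suggested conditional can be sketched like this (Python pseudologic for the Lua plugin; the `build_embeddings_body` helper and the `input` key are illustrative, not the plugin's actual code):

```python
def build_embeddings_body(conf: dict, text: str) -> dict:
    """Build an OpenAI-style embeddings request body.

    Optional fields are added only when they are configured, so the
    serialized JSON never contains explicit nulls such as
    "dimensions": null.
    """
    body = {
        "input": text,
        "model": conf.get("model"),  # schema default should already be applied here
    }
    if conf.get("dimensions") is not None:
        body["dimensions"] = conf["dimensions"]
    return body
```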
`apisix/plugins/ai-rag/vector-search/azure-ai-search.lua:105`
- The code accesses `item[conf.select]` (line 105) without checking whether the field exists in the item. If Azure AI Search returns items without the specified `select` field, this can put nil values in the docs array, which may cause issues downstream in the rerank or context injection stages. Consider validating that the field exists, or filtering out items that lack it.
`apisix/plugins/ai-rag/vector-search/azure-ai-search.lua:46`
- The description for the `select` property says "field to select in the response" (singular), but the Azure AI Search API and documentation both indicate this can be a comma-separated list of fields. The implementation at line 105 (`item[conf.select]`) treats it as a single field key, so either the implementation should handle comma-separated values or the description should explicitly state that it is a single field name.
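The missing-field concern above could be addressed by guarding field access when flattening the search hits, along the lines of this sketch (Python illustration; `extract_docs` is a hypothetical helper, not the plugin's code):

```python
def extract_docs(items: list, select_field: str) -> list:
    """Collect document strings from search hits, skipping any hit that
    lacks the selected field, so that no nil/None values propagate to
    the rerank or context-injection stages."""
    return [item[select_field] for item in items if item.get(select_field) is not None]
```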
@Baoyuantop I've addressed the feedback from Copilot in this PR. Could you please take another look?
Description
This PR enhances the `ai-rag` plugin with multi-provider support, optional reranking capability, and a standardized chat interface that works with native LLM request formats.

Key Improvements
1. Multiple Embedding Provider Support
- `openai` – OpenAI embeddings API
- `azure-openai` – Azure OpenAI Service
- `openai-compatible` – any OpenAI-compatible endpoint
- Shared `openai-base.lua` implementation with provider-specific configuration
- `model` and `dimensions` options for flexibility

2. Optional Rerank Stage
- Cohere rerank with a configurable `top_n` parameter

3. Standardized Chat Interface
- No more custom `ai_rag` field in the request body
- Works with the standard messages format: `{ "messages": [ {"role": "user", "content": "What is APISIX?"} ] }`
- `rag_config.input_strategy` controls text extraction:
  - `last` (default): uses only the latest user message
  - `all`: concatenates all user messages with newlines

4. Improved Context Injection
- Retrieved context is injected immediately before the last user message
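Input extraction and context injection together behave roughly like this sketch (a Python illustration of the described behavior; the function names and the use of a `system` role for the injected context are assumptions, not the plugin's actual Lua code):

```python
def extract_input(messages, strategy="last"):
    """Pull the text to embed from the chat messages."""
    user_texts = [m["content"] for m in messages if m["role"] == "user"]
    if not user_texts:
        return ""
    if strategy == "all":
        return "\n".join(user_texts)   # concatenate every user message
    return user_texts[-1]              # "last": latest user message only

def inject_context(messages, context):
    """Insert retrieved context immediately before the last user
    message, leaving the rest of the conversation untouched."""
    for i in range(len(messages) - 1, -1, -1):
        if messages[i]["role"] == "user":
            return messages[:i] + [{"role": "system", "content": context}] + messages[i:]
    return list(messages)
```

Injecting just before the last user message keeps earlier turns intact while placing the retrieved context where the LLM will weigh it most directly against the current question.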
5. Vector Search Enhancements (Azure AI Search)
- Renamed to `azure-ai-search` (kebab-case consistency)
- `fields` – fields to search against
- `select` – fields to return in results
- `k` – number of nearest neighbors (default: 5)
- `exhaustive` – whether to perform exhaustive search (default: true)

6. Schema Improvements
- `oneOf` for provider selection
- Removed the `ai_rag` payload requirement

Backward Compatibility
- Old: `ai_rag` field with nested configs → New: `messages` array only
- Old: `azure_openai` (snake_case) → New: `azure-openai` (kebab-case)

This redesign is intentional to support more flexible and production-ready RAG workflows.
Example Configuration
```json
{
  "embeddings_provider": {
    "openai": {
      "endpoint": "https://api.openai.com/v1/embeddings",
      "api_key": "sk-xxx",
      "model": "text-embedding-3-large",
      "dimensions": 1536
    }
  },
  "vector_search_provider": {
    "azure-ai-search": {
      "endpoint": "https://xxx.search.windows.net/indexes/xxx/docs/search?api-version=2023-11-01",
      "api_key": "xxx",
      "fields": "vector",
      "select": "content",
      "k": 5,
      "exhaustive": true
    }
  },
  "rerank_provider": {
    "cohere": {
      "endpoint": "https://api.cohere.ai/v2/rerank",
      "api_key": "xxx",
      "model": "rerank-english-v3.0",
      "top_n": 3
    }
  },
  "rag_config": {
    "input_strategy": "last"
  }
}
```

Motivation
The original `ai-rag` implementation had several limitations: it required a custom `ai_rag` field in the request body, making integration with standard LLM APIs cumbersome.

This enhancement addresses these issues by: