
Per-persona multi-language support (translate-for-retrieval + reply-in-user-language)#40

Open
rajivml wants to merge 6 commits into feature/darwin from feature/multilanguage-support

Conversation


@rajivml rajivml commented May 4, 2026

Summary

Adds a per-persona toggle that makes Darwin work for non-English (Japanese / Chinese / Korean) users without changing the indexing pipeline, embedding model, or any infra. The flag is opt-in per assistant — English-only assistants pay nothing extra, and customers running primarily English traffic aren't impacted.

When the flag is on:

  1. Retrieval — a non-English query is translated to English before hitting Vespa, so the existing English-corpus index returns the right docs (resolution: persona flag → MULTILINGUAL_QUERY_EXPANSION env var → off).
  2. Answer language — after the answering LLM produces a (possibly English) reply, a small second-pass LLM call translates the answer back into the user's original language. Citation markers ([1], [2], [[1]](url)), URLs, and code blocks are preserved verbatim.
  3. Chat session naming — chat titles follow the user's language too.

Both code paths are covered:

  • Chat UI (/chat) — chat/process_message.py
  • Slack one-shot (the path the slackbot listener uses) — one_shot_answer/answer_question.py

What was implemented

Backend

  • Persona.multilingual_query_expansion: bool — new column with default false. Alembic migration a3f1d7c4e9b2_persona_multilingual_query_expansion.py.
  • CreatePersonaRequest / PersonaSnapshot carry the field through the persona admin API.
  • upsert_persona / create_update_persona accept and persist the flag.
  • PromptConfig carries the flag forward so prompt builders can decide whether to append LANGUAGE_HINT.
  • SearchPipeline reads the persona flag and (when on) sets multilingual_expansion_str="English" so retrieval translates to English before search.
  • citations_prompt.py and quotes_prompt.py source the language-hint from prompt_config, with the global env var as fallback.
  • chat_session_naming.get_renamed_conversation_name accepts use_language_hint; the rename endpoint fetches the chat-session's persona and forwards it.
  • chat/multilingual_translation.py (new module): detect_query_language (Unicode-script heuristic), language_name, translate_answer_to_language (LLM call that preserves citations, URLs, code blocks; falls back to English on failure).
  • chat/process_message.py — when persona has the flag on AND the query is non-English: buffer DanswerAnswerPiece tokens during streaming, run the translate pass after stream-end, emit translated text as a single piece, persist translated text in the DB. Other packets (citations, tool responses) flow in real time.
  • one_shot_answer/answer_question.py — same buffer-and-translate logic for the Slack path. CitationInfo packets keep flowing during the buffer, so the slackbot's existing 5-attempt citation-required retry loop works unchanged.

Frontend

  • Persona TS interface, PersonaCreationRequest / PersonaUpdateRequest, and buildPersonaAPIBody carry the field.
  • Assistant editor: a BooleanFormField checkbox in the Misc section labeled "Enable multi-language support" with subtext explicitly noting the cost (+1 LLM call per non-English query).

Drive-by fixes (rolled into this branch by request)

  • web/src/app/admin/bot/SlackBotConfigCreationForm.tsx — fix silent-submit on the Slack-bot config create form. The Yup schema unconditionally required curated_response_config.response_message, but the matching input is only rendered when the integration toggle is on. Result: with the default toggle off, the field was empty, validation failed silently, and Create did nothing. Fixed by gating the validation with .when(...) to mirror the jira_config pattern.
  • web/src/components/table/DragHandle.tsx — destructure isDragging before spreading onto the DOM <div>, silencing React's "unknown DOM attribute" warning that fired on every PersonasTable render.

What was tested

backend/scripts/test_multilanguage_e2e.py — end-to-end smoke test driving the real local stack (Postgres + Vespa + the configured GenAI provider). Five phases:

  1. Setup — create test connector + cc-pair, seed 3 English docs about a unique fictional brand ("Zorblax") into Vespa via the real indexing pipeline, create two test personas (flag on / flag off).
  2. English baseline — sanity check that retrieval and entity-in-answer work for English queries against the seeded corpus.
  3. Non-English (chat-UI path) — for ja / zh / ko, hard-asserts (a) retrieval brought back the expected English doc, (b) the answer contains the factual entity (numerals / proper nouns survive translation), (c) the final answer text is in the user's language.
  4. Control persona — same non-English queries against the flag-OFF persona; logged informationally to show the contrast.
  5. Slack one-shot path — drives get_search_answer (the entry point the slackbot listener uses) with the same hard contract as Phase 3, proving the Slack flow honors the flag end-to-end.

The test is safe to re-run: a cascading SQL teardown deletes the chat_message__search_doc, tool_call, chat_feedback, and document_retrieval_feedback FK dependents before deleting chat_message / chat_session / persona.
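The teardown ordering is the load-bearing part of re-run safety: FK dependents must go before their parents. A sketch of that ordering contract (table names are from this PR; the helper and the scope_clause parameter are illustrative stand-ins for the real script's scoped deletes):

```python
# Delete FK dependents first, then their parents, so a re-run never trips
# a foreign-key violation on leftover rows from a failed previous run.
TEARDOWN_ORDER = [
    "chat_message__search_doc",
    "tool_call",
    "chat_feedback",
    "document_retrieval_feedback",
    "chat_message",
    "chat_session",
    "persona",
]

def teardown_statements(scope_clause: str) -> list[str]:
    """Emit DELETEs in dependency order; the real script scopes each
    statement to the rows the test itself created."""
    return [f"DELETE FROM {table} WHERE {scope_clause}" for table in TEARDOWN_ORDER]
```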

Local results

Stability runs before option C (the post-translation pass) was wired in:

  • 6/6 retrieval contract: works every time
  • 0/9 non-English answers came back in the user's language — gpt-4o-mini in the gateway routinely ignored LANGUAGE_HINT when context was English-heavy

After option C (post-translation pass):

  • 27/27 hard assertions PASS (12 retrieval + 12 entity-in-answer + 3 English-baseline)
  • 9/9 non-English answers in the user's language via the chat-UI path
  • 3/3 non-English answers in the user's language via the Slack one-shot path (Phase 5)
  • Control persona (flag off) still answers in English — confirms the post-pass only fires for opted-in personas

Re-run on this branch: before merging, please run cd backend && PYTHONPATH=$(pwd) python scripts/test_multilanguage_e2e.py --yes against your local stack to verify.

Trade-offs / things to know

  • Streaming UX for non-English replies: token-by-token streaming is suspended for opted-in personas because we buffer the answer and translate it before emitting. The user sees the whole translated answer at once after one extra LLM round-trip. English replies are unaffected.
  • Cost: one extra LLM call per non-English query (translate). Only on personas with the flag on. English personas / queries pay zero overhead.
  • Slack bot citation gate is preserved: the [1]/[2] citation retry loop still works because the translation prompt instructs the LLM to keep markers verbatim, and CitationInfo packets flow in real time even when answer pieces are buffered.
  • LLM non-determinism: on occasional borderline queries (e.g. a Japanese office-printer question when the corpus holds multiple similar docs), the LLM can hedge on which doc to cite; observed in 1 of 6 stability runs. Not a wiring issue; a stronger model would help.
  • Resolution precedence: persona flag → MULTILINGUAL_QUERY_EXPANSION env var → off. Existing global-env-var deployments are untouched.

Process bounce required after merge + deploy

Per CLAUDE.md ("Modify a SQLAlchemy model → run alembic upgrade head, then bounce API server + background jobs"):

  1. alembic upgrade head from backend/ (adds the new column with server_default false).
  2. Bounce dapi (api-server) — new ORM mapping + new module imports.
  3. Bounce dbe (background jobs).
  4. Bounce dsl (Slack listener) — needed for the Slack post-translation path to take effect.

Test plan

  • Pull, run alembic, bounce all three services
  • Run python scripts/test_multilanguage_e2e.py --yes against the dev stack — expect exit 0 with all 5 phases green
  • In the chat UI: pick a multilingual-flagged assistant, ask a question in Japanese; expect a Japanese answer with [1]/[2] citations preserved
  • In a Slack channel bound to the same persona: post the same question; expect a Japanese reply
  • Sanity-check English flow on a non-flagged persona — should still stream tokens normally
  • Smoke-test the drive-by fixes: open /admin/bot/new, click Create with all fields default — should now show backend validation errors instead of doing nothing silently. PersonasTable should no longer log the isDragging warning in dev console.

🤖 Generated with Claude Code

rajivml and others added 6 commits May 4, 2026 22:49
Adds a per-persona `multilingual_query_expansion` boolean. When on:
non-English queries are translated to English for retrieval and the
answer-side prompt gets the LANGUAGE_HINT directive so the LLM is
asked to reply in the user's original language.

Resolution precedence: persona flag > MULTILINGUAL_QUERY_EXPANSION
env var > off. Existing global behavior is preserved.

Touched call sites:
  - Persona model + alembic migration (new column, default false)
  - CreatePersonaRequest / PersonaSnapshot pydantic
  - upsert_persona / create_update_persona
  - PromptConfig (carries the flag through to answer prompts)
  - process_message threads persona.flag into PromptConfig
  - SearchPipeline reads persona flag, falls back to env var
  - citations_prompt / quotes_prompt source language-hint from prompt
    config (with env-var fallback)
  - chat_session_naming accepts use_language_hint; chat_backend
    rename fetches persona and forwards the flag
  - Assistant editor: checkbox + initial value + yup validation +
    payload field

Tests:
  backend/scripts/test_multilanguage_e2e.py drives the real stack:
  seeds 3 English Vespa docs about a unique fictional brand
  ("Zorblax"), creates two personas (flag on / off), and for en/ja/
  zh/ko asserts retrieval lands the right doc and the answer
  contains the factual entity. Answer-language detection is logged
  but not asserted (the LLM's adherence to LANGUAGE_HINT varies by
  model and isn't part of the wiring contract). Re-run safe;
  cascading SQL teardown handles all FK dependents.

Locally: 3 back-to-back runs, 27/27 hard assertions pass on 2 of 3
runs, 26/27 on the third (one LLM hedge on a borderline query).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The persona's `multilingual_query_expansion` flag already added the
LANGUAGE_HINT directive to the answering prompt, but gpt-4o-mini in
practice ignores it and answers in English when context is English-
heavy (0/9 non-English answers came back in the user's language
across 6 prior runs).

This adds a deterministic second pass: when the persona has the flag
on AND the user query is detected as ja/zh/ko, we buffer the
streamed DanswerAnswerPiece tokens during generation, then make one
extra LLM call after stream end to translate the assembled English
answer into the user's language. The translated text is then
emitted as a single answer piece, and the DB-saved message is the
translated version too.

Trade-offs:
- Non-English replies lose token-by-token streaming (one extra
  round-trip; user sees a brief delay then the whole answer at once).
- One extra LLM call per non-English query — only for personas
  opted in via the flag.
- Citations preserved: the translation prompt is directive about
  keeping [1]/[2] markers, URLs, and code blocks verbatim.
- English queries are entirely unaffected (translate_target stays
  None, no buffering, normal streaming).

Files:
- backend/danswer/chat/multilingual_translation.py (new):
    detect_query_language() Unicode-script heuristic, language_name()
    code→display-name lookup, translate_answer_to_language() that
    falls back to English on any LLM failure rather than dropping
    the response.
- backend/danswer/chat/process_message.py:
    Detect query language up front; intercept DanswerAnswerPiece in
    the stream loop when in translate mode; emit translated text and
    persist it as the assistant message.
- backend/scripts/test_multilanguage_e2e.py:
    Promote the answer-language detection from informational to a
    hard assertion. Local run with this commit: 9/9 non-English
    answers came back in the user's language (vs 0/9 prior).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…one-shot path

The chat-UI flow honors the persona's multilingual_query_expansion
flag end-to-end, but the Slack-bot path (one_shot_answer.py) was
never updated:
  - PromptConfig.from_model() was called without the flag, so
    LANGUAGE_HINT was never appended to Slack-served answers.
  - There was no post-translation pass, so the answering LLM's
    English output was sent to Slack verbatim.

This commit mirrors what process_message.py does:
  1. Read chat_session.persona.multilingual_query_expansion and
     thread it into both PromptConfig.from_model() call sites.
  2. Detect non-English query language up front; if matched and the
     persona has the flag on, set translate_target.
  3. In the streaming loop, buffer DanswerAnswerPiece tokens (instead
     of yielding them) when in translate mode. CitationInfo packets
     still flow in real time so the slackbot's citation-required
     retry loop in get_search_answer keeps working — and the
     translation prompt preserves [1]/[2] markers verbatim, so the
     emitted translated text still satisfies the citation gate.
  4. After stream end, run the translate LLM pass and yield the
     translated answer as a single DanswerAnswerPiece (+ None
     terminator).
  5. Persist the translated text in the saved ChatMessage and
     re-count its tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drives `get_search_answer` (the entry point the slack listener uses)
through the same query set as Phase 3 and asserts the same hard
contract: retrieval hits the expected doc, the factual entity
appears in the answer, and the answer is in the user's language.

Local run with this commit: 9/9 hard PASS in Phase 5; 3/3 non-
English answers came back in the user's language via the Slack code
path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yup schema unconditionally required curated_response_config.
response_message, but the matching text input is only rendered when
enable_curated_response_integration is true. Default is false, so on
a fresh /admin/bot/new the field was empty, validation failed
silently, the Create button did nothing, and no error rendered
because the errored field wasn't on screen.

Mirror the jira_config pattern: only require when the toggle is
enabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dnd-kit/sortable passes a logical `isDragging` prop; spreading it
onto the underlying <div> tripped React's "unknown DOM attribute"
warning on every PersonasTable render. Destructure it before spread.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>