
feat(js/plugins/compat-oai): add translation adapter and add translation flow in testapps #4786

Open
7hokerz wants to merge 4 commits into firebase:main from 7hokerz:7hokerz/compat-oai-translation-adapter-openai

Conversation

@7hokerz
Contributor

@7hokerz 7hokerz commented Feb 20, 2026

Description here... Help the reviewer by:

  • Adds support for OpenAI's audio translation models.
    (Specifically, requests that use whisper-1-translate are mapped to whisper-1 before they reach the OpenAI API, so that STT and translation workflows can be told apart.)
    (I'm not sure this is the best approach, and I'm open to suggestions if you see any issues.)
  • Implements the new feature by following existing logic and conventions to keep the code consistent.
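
For readers who want to picture the name mapping described above, here is a minimal sketch. It is not the code from this PR: `resolveAudioModel`, `ResolvedAudioModel`, and `TRANSLATE_SUFFIX` are hypothetical names chosen for illustration, and the suffix rule is only an assumption about how the pseudo model name could be recognized.

```typescript
// Hypothetical sketch: split a Genkit-facing model name into the name sent
// to the OpenAI API and the task it implies. Not the PR's implementation.
type AudioTask = 'transcribe' | 'translate';

interface ResolvedAudioModel {
  apiModel: string; // name forwarded to the OpenAI API
  task: AudioTask; // which workflow the request should use
}

// Assumed convention: a "-translate" suffix marks the pseudo translation model.
const TRANSLATE_SUFFIX = '-translate';

function resolveAudioModel(requestedName: string): ResolvedAudioModel {
  if (requestedName.endsWith(TRANSLATE_SUFFIX)) {
    return {
      // Drop the suffix so the API only ever sees a real model name.
      apiModel: requestedName.slice(0, -TRANSLATE_SUFFIX.length),
      task: 'translate',
    };
  }
  return { apiModel: requestedName, task: 'transcribe' };
}
```

With this shape, `resolveAudioModel('whisper-1-translate')` yields the API model `whisper-1` with the `translate` task, while plain `whisper-1` stays a transcription request.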

Checklist (if applicable):

@gemini-code-assist
Contributor

Summary of Changes

Hello @7hokerz, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the capability to perform audio translation using OpenAI models within the compat-oai plugin. It defines new types, model configurations, and integration points to handle translation requests, specifically managing the distinction between translation and transcription models when communicating with the OpenAI API. A new test flow has also been added to validate this functionality.

Highlights

  • OpenAI Audio Translation Support: Added comprehensive support for OpenAI's audio translation models within the compat-oai plugin.
  • Model Mapping for Translation: Implemented dynamic mapping for requests utilizing whisper-1-translate to whisper-1 before interacting with the OpenAI API, distinguishing translation from speech-to-text workflows.
  • Code Consistency: Ensured new features were implemented by referencing existing logic and conventions to maintain code consistency.


Changelog
  • js/plugins/compat-oai/src/audio.ts
    • Added TranslationCreateParams and TranslationCreateResponse types from openai/resources/audio/index.mjs.
    • Introduced TranslationRequestBuilder type for custom request modifications.
    • Defined TRANSLATION_MODEL_INFO for translation model capabilities.
    • Created TranslationConfigSchema for translation-specific configuration options.
    • Implemented toTranslationRequest function to convert Genkit GenerateRequest to OpenAI TranslationCreateParams.
    • Added translationToGenerateResponse function to convert OpenAI translation results to Genkit GenerateResponseData.
    • Exported defineCompatOpenAITranslationModel to define new Genkit models compatible with OpenAI Translation API.
    • Provided compatOaiTranslationModelRef helper for creating translation model references with reasonable defaults.
  • js/plugins/compat-oai/src/index.ts
    • Exported TranslationConfigSchema, compatOaiTranslationModelRef, defineCompatOpenAITranslationModel, and TranslationRequestBuilder from ./audio.js.
  • js/plugins/compat-oai/src/openai/index.ts
    • Imported defineCompatOpenAITranslationModel and TranslationConfigSchema from ../audio.js.
    • Imported openAITranslationModelRef and SUPPORTED_TRANSLATION_MODELS from the newly created ./translation.js.
    • Modified createResolver to include logic for resolving 'translate' actions, dynamically mapping whisper-1-translate to whisper-1.
    • Updated listActions to include metadata for translation models.
    • Extended openAIPlugin to define and register translation models, including the whisper-1-translate mapping.
    • Updated the OpenAIPlugin type definition to include model overloads for translation models.
  • js/plugins/compat-oai/src/openai/translation.ts
    • Added a new file to define openAITranslationModelRef for OpenAI-specific translation model references.
    • Defined SUPPORTED_TRANSLATION_MODELS, including whisper-1-translate, with a note on its mapping to whisper-1.
  • js/testapps/compat-oai/src/index.ts
    • Added a new translate flow to demonstrate audio translation using the whisper-1-translate model with a Korean audio file.
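
To illustrate the response conversion listed in the changelog, the following is a rough sketch of what a function like translationToGenerateResponse might do. The types `TranslationResult` and `GenerateResponseLike` are simplified stand-ins written for this example, not the real OpenAI or Genkit types, and the function name `translationToResponse` is hypothetical.

```typescript
// Simplified stand-in for an OpenAI translation result (assumption):
// either an object with a text field or a plain string.
interface TranslationResult {
  text: string;
}

// Simplified stand-in for a Genkit-style response (assumption).
interface GenerateResponseLike {
  message: { role: 'model'; content: Array<{ text: string }> };
  finishReason: 'stop';
  raw: TranslationResult | string;
}

// Hypothetical converter: wrap the translated text in a single model message
// and keep the original result for callers that need it.
function translationToResponse(
  result: TranslationResult | string
): GenerateResponseLike {
  const text = typeof result === 'string' ? result : result.text;
  return {
    message: { role: 'model', content: [{ text }] },
    finishReason: 'stop',
    raw: result,
  };
}
```

Accepting both a string and an object in one function keeps the caller from having to branch on the shape of the result.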
Activity
  • The pull request was created by 7hokerz.
  • The PR title adheres to conventional commits specification.
  • The changes have been manually and unit tested.
  • The author is seeking feedback on the approach for distinguishing between STT and translation workflows for whisper-1.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces support for OpenAI's audio translation models, specifically mapping whisper-1-translate to whisper-1 for translation workflows. The changes include adding new types, model information, and configuration schemas for translation, along with corresponding functions to handle translation requests and responses. The openai/index.ts file is updated to integrate these new translation functionalities, and a new openai/translation.ts file defines the OpenAI-specific translation model references. A test application file is also updated to include a translation sample. Overall, the changes are well-structured and follow existing patterns within the codebase.

@7hokerz 7hokerz changed the title from "feat(js/plugins/compat-oai): Added translation adapter and add translation flow in testapps" to "feat(js/plugins/compat-oai): add translation adapter and add translation flow in testapps" Feb 20, 2026
Contributor

@xavidop xavidop left a comment

LGTM!

@ssbushi
Contributor

ssbushi commented Feb 23, 2026

Thank you @7hokerz for adding this. I agree with the changes in the compat-oai plugin, but I'm unsure about the whisper-1-translate approach.

On one hand, this is a "pseudo" model, which makes it confusing for the user to discover. Also, OpenAI does not currently offer any other *-translate models, which makes the patterns implemented here forward-looking rather than relevant today.
On the other hand, I can see that this approach should work since we handle the model name override.

We could approach this in one of the following ways:

  • Not handle the conversion based on the name, but have the user overload the config? (Not a clean approach IMO)
  • Have a new translateModel overload on the OpenAI plugin to specify this type of model (Not consistent with other plugins)
  • Have model somehow take an extra optional param that denotes a specific type/flavor of a model, and use it to decide whether whisper-1 is a transcription or translation model.

Curious what you think? @pavelgj @ifielker

@7hokerz
Contributor Author

7hokerz commented Feb 23, 2026

@ssbushi, I totally see your point, and I've actually been having the same concerns myself. It’s a tricky trade-off, and I agree it needs a better solution.

@ssbushi
Contributor

ssbushi commented Feb 23, 2026

> @ssbushi, I totally see your point, and I've actually been having the same concerns myself. It’s a tricky trade-off, and I agree it needs a better solution.

Yeah, I think the config based fix makes sense here, but for that to work we may need to do a bit of a refactor. Here's the plan:

  1. Introduce a new "type" of model / config called WhisperConfigSchema. This can live alongside the current transcription model helper and will only cover the whisper model family (which I think only has one model in it?)
  2. The WhisperConfigSchema can have an optional boolean translate field (translate: z.boolean().optional()) which denotes whether the whisper model is to be used for translation or transcription (the default). If it is set to true, we define the model as a translation model using the compatOai helpers you wrote; otherwise, it keeps the current transcription behavior.

This refactor means we will have to separate out the whisper-* parts of the model resolver / dynamic model definitions into this new logic, but it should not make a difference to the user.
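
As a rough illustration of the config-based plan above, the sketch below routes a request on a translate flag. It deliberately uses a plain TypeScript interface instead of zod so it stays self-contained; `WhisperConfig`, `WhisperEndpoint`, and `toWhisperCall` are hypothetical names, and the endpoint strings are just labels for the OpenAI audio transcription and translation APIs.

```typescript
// Hypothetical plain-TypeScript stand-in for the proposed WhisperConfigSchema.
interface WhisperConfig {
  translate?: boolean; // true selects translation; default is transcription
  temperature?: number; // example of an option forwarded unchanged
}

// Labels for the two OpenAI audio workflows (assumed naming for this sketch).
type WhisperEndpoint = 'audio.transcriptions' | 'audio.translations';

interface WhisperCall {
  endpoint: WhisperEndpoint;
  params: { model: string; temperature?: number };
}

// Decide which workflow to use and strip the Genkit-only translate flag
// so it is never forwarded to the API.
function toWhisperCall(model: string, config: WhisperConfig = {}): WhisperCall {
  const { translate = false, ...rest } = config;
  return {
    endpoint: translate ? 'audio.translations' : 'audio.transcriptions',
    params: { model, ...rest },
  };
}
```

Under this design the user always asks for `whisper-1`, and only the config decides the workflow, so no pseudo model name is needed.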

WDYT?

@7hokerz
Contributor Author

7hokerz commented Feb 24, 2026

@ssbushi, I'll give it a try based on your suggestion.

@7hokerz 7hokerz requested a review from xavidop February 24, 2026 10:35
@7hokerz
Contributor Author

7hokerz commented Feb 24, 2026

@ssbushi, could you please take a look at this when you have a moment?

@ssbushi
Contributor

ssbushi commented Feb 24, 2026

Thank you @7hokerz, added a quick comment.

@ssbushi
Contributor

ssbushi commented Feb 24, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for OpenAI's audio translation feature by introducing a new defineCompatOpenAIWhisperModel and related components. The changes correctly separate whisper models from transcription-only models. The implementation is largely solid, but I've identified a few areas for improvement. Specifically, there's a typo in a constant name, significant code duplication that could be refactored for better maintainability, and a type-safety issue with requestBuilder that could lead to errors. I've provided detailed comments with suggestions for each of these points.

@7hokerz 7hokerz requested a review from ssbushi February 25, 2026 08:24
@7hokerz
Contributor Author

7hokerz commented Feb 25, 2026

I've finished the refactoring as discussed and added tests for Whisper. @ssbushi, could you please take a look at this when you have a moment?

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for OpenAI's audio translation models by dynamically mapping requests utilizing whisper-1-translate to whisper-1 before hitting the OpenAI API. It includes new files for translation functionality and updates existing files to incorporate the new translation flow. The changes also include corresponding test cases to ensure the new feature works as expected.


3 participants