Conversation
pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml
```jinja
## Mode-specific guidelines:

{% if mode == "academic" %}
```
I am confused. This includes only a specific section depending on the mode BUT at the end there's a combined mode. How will it know all the modes if we exclude most of them? Examples below also include all of them.
Also not sure if I understand the combined mode. In the class it's explicitly listed as a mode, but here it's a catchall, so "foobar" would resolve to a combined prompt. I feel like it would be better to just drop "combined" and refer to the default/wildcard as a combined mode
I added the combined mode to combine a couple of the methods into one. I did change to an "elif" rather than a catch-all "else", since any other mode should get caught as an exception!
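For reference, the elif structure being described would look roughly like this (section bodies elided; this is a sketch, not the actual template):

```jinja
{% if mode == "academic" %}
  ...academic guidelines...
{% elif mode == "technical" %}
  ...technical guidelines...
{% elif mode == "combined" %}
  ...guidelines combining several of the methods...
{% endif %}
```

With no final `else`, an unrecognized mode renders none of the sections, so validation has to happen in the converter class instead.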
```python
    Raises:
        ValueError: If an invalid mode is provided.
    """
    valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")
```
Is there an easier way to check for this, given that it's a literal that's defined above?
I suggested one below by just attaching the valid modes to the class itself, but it's a nit so feel free to disregard
I suppose you could have both.
```python
from typing import Literal, get_args

ObfuscationMode = Literal[
    "academic", "technical", "smiles", "research", "reaction", "combined"
]
OBFUSCATION_MODES = set(get_args(ObfuscationMode))

def is_valid_mode(value: str) -> bool:
    return value in OBFUSCATION_MODES
```
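Combining this with the suggestion to attach the valid modes to the class itself might look like the following minimal sketch. The class body here is a simplified stand-in (the real converter subclasses `LLMGenericTextConverter` and takes more arguments); the point is that the `Literal` remains the single source of truth for both typing and runtime validation.

```python
from typing import Literal, get_args

# The Literal drives both the static type and the runtime check.
ObfuscationMode = Literal["academic", "technical", "smiles", "research", "reaction", "combined"]

class ScientificObfuscationConverter:
    # Attach the allowed modes to the class so validation, typing, and
    # error messages can never drift out of sync.
    VALID_MODES = frozenset(get_args(ObfuscationMode))

    def __init__(self, mode: str = "academic") -> None:
        if mode not in self.VALID_MODES:
            raise ValueError(f"Invalid mode '{mode}'. Must be one of: {sorted(self.VALID_MODES)}")
        self.mode = mode
```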
Pull request overview
Adds a new LLM-based prompt converter that rewrites prompts into “scientific/technical” phrasing across multiple modes, along with the seed prompt template, documentation wiring, and unit tests.
Changes:
- Introduces `ScientificObfuscationConverter` (mode-driven) backed by a new YAML seed prompt template.
- Exposes the converter via `pyrit.prompt_converter` exports and API docs.
- Adds unit tests and an example usage snippet in the converters documentation notebook.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `pyrit/prompt_converter/scientific_obfuscation_converter.py` | Implements the new LLM-based converter and identifier construction. |
| `pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml` | Adds the mode-parameterized system prompt template used by the converter. |
| `pyrit/prompt_converter/__init__.py` | Exports the new converter from the `prompt_converter` package. |
| `tests/unit/converter/test_scientific_obfuscation_converter.py` | Adds unit tests validating initialization, mode validation, and conversion behavior. |
| `doc/code/converters/1_text_to_text_converters.py` | Documents example usage of the new converter in the text-to-text converters notebook source. |
| `doc/code/converters/1_text_to_text_converters.ipynb` | Adds the corresponding notebook cell content for the new converter example. |
| `doc/api.rst` | Adds the converter to the API reference list. |
Comments suppressed due to low confidence (2)
pyrit/prompt_converter/scientific_obfuscation_converter.py:23
- The PR title/description refer to a "Scientific Translation Converter", but the implementation and dataset are named "ScientificObfuscationConverter" / "scientific_obfuscation_converter". If this is intended to be a translation-style converter, consider aligning the naming (or update the PR description) to avoid confusion for API consumers and documentation readers.
```python
class ScientificObfuscationConverter(LLMGenericTextConverter):
    """
    Uses an LLM to transform simple or direct prompts into
```
pyrit/prompt_converter/scientific_obfuscation_converter.py:67
`valid_modes` duplicates the allowed values already defined in `ObfuscationMode`. To avoid the tuple and the type alias drifting out of sync, derive the runtime list from the type (e.g., `typing.get_args(ObfuscationMode)`) or centralize the allowed modes as a single constant reused for both validation and typing.
```python
valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")
if mode not in valid_modes:
    raise ValueError(f"Invalid mode '{mode}'. Must be one of: {valid_modes}")
```
```python
if any(var not in kwargs for var in for_vars):
# Don't render if we're missing loop collection variables - preserve the template as-is
# Extract variable names from {% if var ... %} and {% elif var ... %} patterns
if_vars = re.findall(r"\{%[-\s]*(?:el)?if\s+(\w+)", self.value)
```
Rather than parsing here and having one YAML, is it possible to have multiple YAMLs, one per mode? Then we could leave seed.py untouched (I'm assuming this parsing is specific to this converter, and I'd prefer not to put it in seed since Seed is a generic data structure) and redirect to the corresponding YAML based on the mode in scientific_translation_converter.py.
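A rough sketch of that per-mode layout (the file names and helper here are hypothetical, not the actual dataset files):

```python
import pathlib

# Hypothetical per-mode template files, e.g.
# pyrit/datasets/prompt_converters/scientific_obfuscation_academic.yaml
TEMPLATE_DIR = pathlib.Path("pyrit/datasets/prompt_converters")

VALID_MODES = ("academic", "technical", "smiles", "research", "reaction", "combined")

def template_path_for_mode(mode: str) -> pathlib.Path:
    """Map a mode to its dedicated YAML template instead of parsing one big template."""
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid mode '{mode}'. Must be one of: {VALID_MODES}")
    return TEMPLATE_DIR / f"scientific_obfuscation_{mode}.yaml"
```

This keeps Seed generic: the converter picks a template up front, and no Jinja parsing has to happen in the seed data structure.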
```python
# Load default template if not provided
prompt_template = (
    prompt_template
```
prompt_template as an input is weird to me, because we also have translation_mode: a user could pass a prompt_template that doesn't correspond to a mode at all, and translation_mode is useless if the prompt_template doesn't use it. So IMO we should either enforce the translation_mode in the prompt template, or not allow prompt_template at all (I'm on the fence on whether prompt_template is useful; you and Cristian would probably have more insight into that than me).
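If the template parameter is kept, one way to enforce the first option would be a small guard like the following. This is a sketch under the assumption that the template is Jinja and references the variable as `mode` (the function name and regex are illustrative, not existing PyRIT API):

```python
import re

def template_uses_mode(template_text: str) -> bool:
    # Hypothetical guard: reject custom templates that never reference the
    # `mode` variable, since translation_mode would then have no effect.
    # Matches either an expression ({{ mode }}) or a conditional ({% if mode ... %}).
    return re.search(r"\{\{\s*mode\s*\}\}|\{%[-\s]*(?:el)?if\s+mode\b", template_text) is not None
```

The converter could then raise a ValueError at construction time when a caller-supplied template fails this check, instead of silently ignoring the mode.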
Description
Adds a scientific translation converter to translate queries into various "scientific" modes.
Tests and Documentation
Added unit tests and added the converter to the text-to-text LLM converters notebook.