FEAT: Add modality support detection for targets #1383
Open: fitzpr wants to merge 13 commits into Azure:main from fitzpr:feature/modality-detection-v2
Commits
2a403c6
FEAT: Add modality support detection with set[frozenset[PromptDataTyp…
d72b2b4
CLEAN: Remove temporary verification scripts
6be87e6
FEAT: Add SUPPORTED_OUTPUT_MODALITIES and make output method use vari…
4905948
FEAT: Implement Roman's static API + verification architecture
3202098
Fix: Move modality verification to prompt_target and add proper typing
94b9aa8
Fix: Improve error logging in modality verification
8ecc5bc
Fix: Use 'value' instead of 'data' in MessagePiece constructor
0d7277c
Fix: Use actual file paths for image/audio/video modalities
56b68c8
Fix: Rename verify_actual_capabilities to verify_actual_modalities
02f6f1e
Merge branch 'main' into feature/modality-detection-v2
fitzpr cce207c
Rewrite modality support detection for all prompt targets
romanlutz 51a20bb
Merge remote-tracking branch 'origin/main' into pr-1383
romanlutz aa036f2
Merge remote-tracking branch 'origin/main' into pr-1383
romanlutz
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| # Modality Test Assets | ||
| | ||
| Benign, minimal test files used by `pyrit.prompt_target.modality_verification` to | ||
| verify which modalities a target actually supports at runtime. | ||
| | ||
| - **test_image.png** — 1×1 white pixel PNG | ||
| - **test_audio.wav** — TTS-generated speech: "raccoons are extraordinary creatures" | ||
| - **test_video.mp4** — 1-frame, 16×16 solid color video | ||
| | ||
| These are intentionally simple and non-controversial so they won't be blocked by | ||
| content filters during modality verification. |
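The PR does not show how the binary assets were produced. As a rough illustration only, a 1×1 white pixel PNG like the one described above could be generated with the Python standard library; the helper names `_chunk` and `make_white_pixel_png` are hypothetical and are not part of this PR.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def make_white_pixel_png() -> bytes:
    """Return the bytes of a 1x1 white RGB PNG (hypothetical helper)."""
    signature = b"\x89PNG\r\n\x1a\n"
    # IHDR: width=1, height=1, bit depth 8, color type 2 (RGB),
    # compression 0, filter 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    # One scanline: filter type 0 followed by a single white RGB pixel.
    scanline = b"\x00\xff\xff\xff"
    idat = zlib.compress(scanline)
    return signature + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")


png_bytes = make_white_pixel_png()
```

Writing `png_bytes` to a file opened in binary mode would give a file of the kind the README describes; the actual asset in the PR may have been created differently.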
Binary file not shown.
Binary file not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,156 @@ | ||
| # Copyright (c) Microsoft Corporation. | ||
| # Licensed under the MIT license. | ||
| | ||
| """ | ||
| Optional modality verification system for prompt targets. | ||
| | ||
| This module provides runtime modality discovery to determine what modalities | ||
| a specific target actually supports, beyond what the API declares as possible. | ||
| | ||
| Usage: | ||
|     from pyrit.prompt_target.modality_verification import verify_target_modalities | ||
| | ||
|     # Get static API modalities | ||
|     api_modalities = target.SUPPORTED_INPUT_MODALITIES | ||
| | ||
|     # Optionally verify actual model modalities | ||
|     actual_modalities = await verify_target_modalities(target) | ||
| """ | ||
| | ||
| import logging | ||
| import os | ||
| from typing import Optional | ||
| | ||
| from pyrit.common.path import DATASETS_PATH | ||
| from pyrit.models import Message, MessagePiece, PromptDataType | ||
| from pyrit.prompt_target.common.prompt_target import PromptTarget | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| # Path to the assets directory containing test files for modality verification | ||
| _ASSETS_DIR = DATASETS_PATH / "modality_test_assets" | ||
| | ||
| # Mapping from PromptDataType to test asset file paths | ||
| _TEST_ASSETS: dict[str, str] = { | ||
|     "image_path": str(_ASSETS_DIR / "test_image.png"), | ||
|     "audio_path": str(_ASSETS_DIR / "test_audio.wav"), | ||
|     "video_path": str(_ASSETS_DIR / "test_video.mp4"), | ||
| } | ||
| | ||
| | ||
| async def verify_target_modalities( | ||
|     target: PromptTarget, | ||
|     test_modalities: Optional[set[frozenset[PromptDataType]]] = None, | ||
| ) -> set[frozenset[PromptDataType]]: | ||
|     """ | ||
|     Verify which modality combinations a target actually supports. | ||
| | ||
|     This function tests the target with minimal requests to determine actual | ||
|     modalities, trimming down from the static API declarations. | ||
| | ||
|     Args: | ||
|         target: The prompt target to test | ||
|         test_modalities: Specific modalities to test (defaults to target's declared modalities) | ||
| | ||
|     Returns: | ||
|         Set of actually supported input modality combinations | ||
| | ||
|     Example: | ||
|         actual = await verify_target_modalities(openai_target) | ||
|         # Returns: {frozenset(["text"])} or {frozenset(["text"]), frozenset(["text", "image_path"])} | ||
|     """ | ||
|     if test_modalities is None: | ||
|         test_modalities = target.SUPPORTED_INPUT_MODALITIES | ||
| | ||
|     verified_modalities: set[frozenset[PromptDataType]] = set() | ||
| | ||
|     for modality_combination in test_modalities: | ||
|         try: | ||
|             is_supported = await _test_modality_combination(target, modality_combination) | ||
|             if is_supported: | ||
|                 verified_modalities.add(modality_combination) | ||
|         except Exception as e: | ||
|             logger.info(f"Failed to verify {modality_combination}: {e}") | ||
| | ||
|     return verified_modalities | ||
| | ||
| | ||
| async def _test_modality_combination( | ||
|     target: PromptTarget, | ||
|     modalities: frozenset[PromptDataType], | ||
| ) -> bool: | ||
|     """ | ||
|     Test a specific modality combination with a minimal API request. | ||
| | ||
|     Args: | ||
|         target: The target to test | ||
|         modalities: The combination of modalities to test | ||
| | ||
|     Returns: | ||
|         True if the combination is supported, False otherwise | ||
|     """ | ||
|     test_message = _create_test_message(modalities) | ||
| | ||
|     try: | ||
|         responses = await target.send_prompt_async(message=test_message) | ||
| | ||
|         # Check if the response itself indicates an error | ||
|         for response in responses: | ||
|             for piece in response.message_pieces: | ||
|                 if piece.response_error != "none": | ||
|                     logger.info(f"Modality {modalities} returned error response: {piece.converted_value}") | ||
|                     return False | ||
| | ||
|         return True | ||
| | ||
|     except Exception as e: | ||
|         logger.info(f"Modality {modalities} not supported: {e}") | ||
|         return False | ||
| | ||
| | ||
| def _create_test_message(modalities: frozenset[PromptDataType]) -> Message: | ||
|     """ | ||
|     Create a minimal test message for the specified modalities. | ||
| | ||
|     Args: | ||
|         modalities: The modalities to include in the test message | ||
| | ||
|     Returns: | ||
|         A Message object with minimal content for each requested modality | ||
| | ||
|     Raises: | ||
|         FileNotFoundError: If a required test asset file is missing | ||
|         ValueError: If a modality has no configured test asset or no pieces could be created | ||
|     """ | ||
|     pieces: list[MessagePiece] = [] | ||
|     conversation_id = "modality-verification-test" | ||
| | ||
|     for modality in modalities: | ||
|         if modality == "text": | ||
|             pieces.append( | ||
|                 MessagePiece( | ||
|                     role="user", | ||
|                     original_value="test", | ||
|                     original_value_data_type="text", | ||
|                     conversation_id=conversation_id, | ||
|                 ) | ||
|             ) | ||
|         elif modality in _TEST_ASSETS: | ||
|             asset_path = _TEST_ASSETS[modality] | ||
|             if not os.path.isfile(asset_path): | ||
|                 raise FileNotFoundError(f"Test asset not found for modality '{modality}': {asset_path}") | ||
|             pieces.append( | ||
|                 MessagePiece( | ||
|                     role="user", | ||
|                     original_value=asset_path, | ||
|                     original_value_data_type=modality, | ||
|                     conversation_id=conversation_id, | ||
|                 ) | ||
|             ) | ||
|         else: | ||
|             raise ValueError(f"No test asset configured for modality: {modality}") | ||
| | ||
|     if not pieces: | ||
|         raise ValueError(f"Could not create test message for modalities: {modalities}") | ||
| | ||
|     return Message(pieces) | ||
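To make the "trim down from declared modalities" idea in the module above easier to follow, here is a minimal, self-contained sketch of the same loop. It does not use PyRIT: `FakeTarget`, its `send` method, and `verify` are hypothetical stand-ins, and only the attribute name `SUPPORTED_INPUT_MODALITIES` is taken from the diff.

```python
import asyncio

Modalities = frozenset[str]


class FakeTarget:
    """Hypothetical stand-in for a prompt target; not a PyRIT class."""

    # Statically declared combinations, mirroring the attribute used in the diff.
    SUPPORTED_INPUT_MODALITIES: set[Modalities] = {
        frozenset({"text"}),
        frozenset({"text", "image_path"}),
    }

    async def send(self, modalities: Modalities) -> str:
        # Assumption for the sketch: the deployed model rejects image input
        # even though the API declares it as possible.
        if "image_path" in modalities:
            raise ValueError("image input not supported by this deployment")
        return "ok"


async def verify(target: FakeTarget) -> set[Modalities]:
    """Keep only the declared combinations that the target accepts."""
    verified: set[Modalities] = set()
    for combo in target.SUPPORTED_INPUT_MODALITIES:
        try:
            await target.send(combo)
        except Exception:
            # A failed request is treated as "not supported", as in the PR.
            continue
        verified.add(combo)
    return verified


result = asyncio.run(verify(FakeTarget()))
```

In this sketch the declared set has two combinations and the verified set keeps only the text-only one. The module in the PR additionally inspects `response_error` on each returned piece, which the sketch omits.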