1 change: 1 addition & 0 deletions Makefile
@@ -18,6 +18,7 @@ mypy:

docs-build:
	uv run jb build -W -v ./doc
	cp -r assets doc/_build/assets
	uv run ./build_scripts/generate_rss.py

# Because of import time, "auto" seemed to actually go slower than just using 4 processes
86 changes: 77 additions & 9 deletions doc/code/scenarios/0_scenarios.ipynb
@@ -220,7 +220,7 @@
"Available Scenarios:\n",
"================================================================================\n",
"\u001b[1m\u001b[36m\n",
-" airt.content_harms\u001b[0m\n",
+" content_harms\u001b[0m\n",
" Class: ContentHarms\n",
" Description:\n",
" Content Harms Scenario implementation for PyRIT. This scenario contains\n",
@@ -235,7 +235,7 @@
" airt_hate, airt_fairness, airt_violence, airt_sexual, airt_harassment,\n",
" airt_misinformation, airt_leakage\n",
"\u001b[1m\u001b[36m\n",
-" airt.cyber\u001b[0m\n",
+" cyber\u001b[0m\n",
" Class: Cyber\n",
" Description:\n",
" Cyber scenario implementation for PyRIT. This scenario tests how willing\n",
@@ -250,7 +250,67 @@
" Default Datasets (1, max 4 per dataset):\n",
" airt_malware\n",
"\u001b[1m\u001b[36m\n",
-" airt.scam\u001b[0m\n",
+" jailbreak\u001b[0m\n",
+" Class: Jailbreak\n",
+" Description:\n",
+" Jailbreak scenario implementation for PyRIT. This scenario tests how\n",
+" vulnerable models are to jailbreak attacks by applying various\n",
+" single-turn jailbreak templates to a set of test prompts. The responses\n",
+" are scored to determine if the jailbreak was successful.\n",
+" Aggregate Strategies:\n",
+" - all\n",
+" Available Strategies (1):\n",
+" pyrit\n",
+" Default Strategy: all\n",
+" Default Datasets (1, max 4 per dataset):\n",
+" airt_harms\n",
+"\u001b[1m\u001b[36m\n",
+" leakage\u001b[0m\n",
+" Class: Leakage\n",
+" Description:\n",
+" Leakage scenario implementation for PyRIT. This scenario tests how\n",
+" susceptible models are to leaking training data, PII, intellectual\n",
+" property, or other confidential information. The LeakageScenario class\n",
+" contains different attack variations designed to extract sensitive\n",
+" information from models.\n",
+" Aggregate Strategies:\n",
+" - all, single_turn, multi_turn, ip, sensitive_data\n",
+" Available Strategies (4):\n",
+" first_letter, image, role_play, crescendo\n",
+" Default Strategy: all\n",
+" Default Datasets (1, max 4 per dataset):\n",
+" airt_leakage\n",
+"\u001b[1m\u001b[36m\n",
+" psychosocial\u001b[0m\n",
+" Class: Psychosocial\n",
+" Description:\n",
+" Psychosocial Harms Scenario implementation for PyRIT. This scenario\n",
+" contains various psychosocial harm-based checks that evaluate model\n",
+" behavior for responses that may harm the well-being or mental health of\n",
+" users. Examples include mistreating users in a mental health crisis or\n",
+" aiding them in self-harm. All datasets in this scenario share the\n",
+" unified 'psychosocial' harm category, but each dataset (e.g.,\n",
+" imminent_crisis, dependency) can have unique: - Multiturn attack\n",
+" strategies - Scoring rubrics - Seed objectives The scenario tests\n",
+" whether models: - Recognize mental health crises and respond\n",
+" appropriately - Avoid facilitating self-harm or harmful behaviors -\n",
+" Provide crisis resources and encourage professional help Example usage\n",
+" with custom configurations: # Define custom configurations per subharm\n",
+" category custom_configs = { \"airt_imminent_crisis\": SubharmConfig(\n",
+" crescendo_system_prompt_path=\"path/to/custom_escalation.yaml\",\n",
+" scoring_rubric_path=\"path/to/custom_rubric.yaml\", ), } scenario =\n",
+" PsychosocialHarmsScenario(subharm_configs=custom_configs) await\n",
+" scenario.initialize_async( objective_target=target_llm,\n",
+" scenario_strategies=[PsychosocialHarmsStrategy.MULTI_TURN], )\n",
+" Aggregate Strategies:\n",
+" - all\n",
+" Available Strategies (2):\n",
+" imminent_crisis, licensed_therapist\n",
+" Default Strategy: all\n",
+" Default Datasets (1, max 4 per dataset):\n",
+" airt_imminent_crisis\n",
+"\u001b[1m\u001b[36m\n",
+" scam\u001b[0m\n",
" Class: Scam\n",
" Description:\n",
" Scam scenario evaluates an endpoint's ability to generate scam-related\n",
@@ -264,11 +324,19 @@
" Default Datasets (1, max 4 per dataset):\n",
" airt_scams\n",
"\u001b[1m\u001b[36m\n",
-" foundry.foundry\u001b[0m\n",
-" Class: FoundryScenario\n",
+" red_team_agent\u001b[0m\n",
+" Class: RedTeamAgent\n",
" Description:\n",
-" Deprecated alias for Foundry. This class is deprecated and will be\n",
-" removed in version 0.13.0. Use `Foundry` instead.\n",
+" RedTeamAgent is a preconfigured scenario that automatically generates\n",
+" multiple AtomicAttack instances based on the specified attack\n",
+" strategies. It supports both single-turn attacks (with various\n",
+" converters) and multi-turn attacks (Crescendo, RedTeaming), making it\n",
+" easy to quickly test a target against multiple attack vectors. The\n",
+" scenario can expand difficulty levels (EASY, MODERATE, DIFFICULT) into\n",
+" their constituent attack strategies, or you can specify individual\n",
+" strategies directly. This scenario is designed for use with the Foundry\n",
+" AI Red Teaming Agent library, providing a consistent PyRIT contract for\n",
+" their integration.\n",
" Aggregate Strategies:\n",
" - all, easy, moderate, difficult\n",
" Available Strategies (25):\n",
@@ -280,7 +348,7 @@
" Default Datasets (1, max 4 per dataset):\n",
" harmbench\n",
"\u001b[1m\u001b[36m\n",
-" garak.encoding\u001b[0m\n",
+" encoding\u001b[0m\n",
" Class: Encoding\n",
" Description:\n",
" Encoding Scenario implementation for PyRIT. This scenario tests how\n",
@@ -305,7 +373,7 @@
"\n",
"================================================================================\n",
"\n",
-"Total scenarios: 5\n"
+"Total scenarios: 8\n"
]
},
{
1 change: 0 additions & 1 deletion pyrit/scenario/core/atomic_attack.py
@@ -110,7 +110,6 @@ def __init__(
         self._seed_groups = seed_groups
         self._adversarial_chat = adversarial_chat
         self._objective_scorer = objective_scorer
-        self._objective_scorer = objective_scorer
         self._memory_labels = memory_labels or {}
         self._attack_execute_params = attack_execute_params

4 changes: 2 additions & 2 deletions pyrit/scenario/core/scenario.py
@@ -52,7 +52,7 @@ class Scenario(ABC):
     def __init__(
         self,
         *,
-        name: str,
+        name: str = "",
         version: int,
         strategy_class: Type[ScenarioStrategy],
         objective_scorer: Scorer,
@@ -103,7 +103,7 @@ def __init__(
         self._objective_scorer = objective_scorer
         self._objective_scorer_identifier = objective_scorer.get_identifier()

-        self._name = name
+        self._name = name if name else type(self).__name__
         self._memory = CentralMemory.get_memory_instance()
         self._atomic_attacks: List[AtomicAttack] = []
         self._scenario_result_id: Optional[str] = str(scenario_result_id) if scenario_result_id else None
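
The `name` fallback introduced in this hunk can be tried in isolation. The following is a minimal sketch, not the PyRIT implementation: `ScenarioSketch` and its `Cyber` subclass are hypothetical stand-ins that drop every constructor argument except `name`.

```python
class ScenarioSketch:
    """Hypothetical, stripped-down stand-in for the Scenario base class."""

    def __init__(self, *, name: str = "") -> None:
        # Fall back to the concrete subclass name when no name is given.
        self._name = name if name else type(self).__name__


class Cyber(ScenarioSketch):
    """Hypothetical subclass that relies on the default name."""


default_name = Cyber()._name
explicit_name = Cyber(name="custom")._name
print(default_name, explicit_name)
```

Because the fallback uses `type(self).__name__`, a subclass that passes no `name` reports its own class name rather than the base class name.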
35 changes: 32 additions & 3 deletions pyrit/scenario/core/scenario_strategy.py
@@ -11,14 +11,43 @@
 It also provides ScenarioCompositeStrategy for representing composed attack strategies.
 """

-from enum import Enum
-from typing import List, Sequence, Set, TypeVar
+from enum import Enum, EnumType
+from typing import Any, List, Sequence, Set, TypeVar

+from pyrit.common.deprecation import print_deprecation_message
+
 # TypeVar for the enum subclass itself
 T = TypeVar("T", bound="ScenarioStrategy")


-class ScenarioStrategy(Enum):
+class _DeprecatedEnumMeta(EnumType):
+    """
+    Custom Enum metaclass that supports deprecated member aliases.
+
+    Subclasses of ScenarioStrategy can define deprecated member name mappings
+    by setting ``__deprecated_members__`` on the class after definition.
+    Each entry maps the old name to a ``(new_name, removed_in)`` tuple::
+
+        MyStrategy.__deprecated_members__ = {"OLD_NAME": ("NewName", "0.13.0")}
+
+    Accessing ``MyStrategy.OLD_NAME`` will emit a DeprecationWarning and return
+    the same enum member as ``MyStrategy.NewName``.
+    """
+
+    def __getattr__(cls, name: str) -> Any:
+        deprecated = cls.__dict__.get("__deprecated_members__")
+        if deprecated and name in deprecated:
+            new_name, removed_in = deprecated[name]
+            print_deprecation_message(
+                old_item=f"{cls.__name__}.{name}",
+                new_item=f"{cls.__name__}.{new_name}",
+                removed_in=removed_in,
+            )
+            return cls[new_name]
+        raise AttributeError(name)
+
+
+class ScenarioStrategy(Enum, metaclass=_DeprecatedEnumMeta):
     """
     Base class for attack strategies with tag-based categorization and aggregation.
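
The alias mechanism described in the `_DeprecatedEnumMeta` docstring can be exercised outside PyRIT. This is a sketch under stated assumptions, not the code from this PR: it swaps `print_deprecation_message` for the standard library's `warnings.warn`, imports `EnumMeta` (the older name for `EnumType`) so it also runs on Python versions before 3.11, and adds a fallback to the base metaclass lookup for those older versions. `SketchStrategy` and its member names are hypothetical.

```python
import warnings
from enum import Enum, EnumMeta
from typing import Any


class DeprecatedEnumMeta(EnumMeta):
    """Enum metaclass sketch that resolves deprecated member names."""

    def __getattr__(cls, name: str) -> Any:
        deprecated = cls.__dict__.get("__deprecated_members__")
        if deprecated and name in deprecated:
            new_name, removed_in = deprecated[name]
            # Assumption: warnings.warn stands in for print_deprecation_message.
            warnings.warn(
                f"{cls.__name__}.{name} is deprecated and will be removed in "
                f"{removed_in}; use {cls.__name__}.{new_name} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            return cls[new_name]
        # Older Python versions resolve members through the base metaclass
        # __getattr__, so defer to it when it exists.
        base_getattr = getattr(EnumMeta, "__getattr__", None)
        if base_getattr is not None:
            return base_getattr(cls, name)
        raise AttributeError(name)


class SketchStrategy(Enum, metaclass=DeprecatedEnumMeta):
    """Hypothetical strategy enum with one renamed member."""

    MULTI_TURN = "multi_turn"


# Old name -> (new name, version in which the old name is removed).
SketchStrategy.__deprecated_members__ = {"MULTITURN": ("MULTI_TURN", "0.13.0")}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    member = SketchStrategy.MULTITURN

resolved_to_new_member = member is SketchStrategy.MULTI_TURN
warned = any(issubclass(w.category, DeprecationWarning) for w in caught)
```

The metaclass `__getattr__` only runs when normal attribute lookup on the class fails, so access to current member names does not pay for the deprecation check.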

6 changes: 4 additions & 2 deletions pyrit/scenario/scenarios/airt/__init__.py
@@ -9,19 +9,21 @@
 )
 from pyrit.scenario.scenarios.airt.cyber import Cyber, CyberStrategy
 from pyrit.scenario.scenarios.airt.jailbreak import Jailbreak, JailbreakStrategy
-from pyrit.scenario.scenarios.airt.leakage_scenario import LeakageScenario, LeakageStrategy
-from pyrit.scenario.scenarios.airt.psychosocial_scenario import PsychosocialScenario, PsychosocialStrategy
+from pyrit.scenario.scenarios.airt.leakage import Leakage, LeakageScenario, LeakageStrategy
+from pyrit.scenario.scenarios.airt.psychosocial import Psychosocial, PsychosocialScenario, PsychosocialStrategy
 from pyrit.scenario.scenarios.airt.scam import Scam, ScamStrategy

 __all__ = [
     "ContentHarms",
     "ContentHarmsStrategy",
+    "Psychosocial",
     "PsychosocialScenario",

Contributor review comment: If these were released that way, it could be breaking.

     "PsychosocialStrategy",
     "Cyber",
     "CyberStrategy",
     "Jailbreak",
     "JailbreakStrategy",
+    "Leakage",
     "LeakageScenario",
     "LeakageStrategy",
     "Scam",
Loading
Loading
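
On the review comment about breaking changes: one common way to keep an already-released name importable is a deprecated alias class. The sketch below is hypothetical and not taken from this PR. `Leakage` and `LeakageScenario` here are stand-ins with a made-up `label` argument, and `warnings.warn` is used in place of any PyRIT helper.

```python
import warnings


class Leakage:
    """Hypothetical stand-in for the renamed scenario class."""

    def __init__(self, label: str = "leakage") -> None:
        self.label = label


class LeakageScenario(Leakage):
    """Hypothetical deprecated alias that keeps the old name working."""

    def __init__(self, *args, **kwargs) -> None:
        # Warn on construction, then behave exactly like the new class.
        warnings.warn(
            "LeakageScenario is deprecated; use Leakage instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = LeakageScenario()

is_leakage = isinstance(legacy, Leakage)
warned = any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Exporting both names from `__all__`, as this hunk does, keeps existing imports working while callers migrate.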