Integrate the CBT-Bench psychotherapy benchmark dataset from HuggingFace (Psychotherapy-LLM/CBT-Bench) into PyRIT.
Contributor
Pull request overview
This pull request integrates the CBT-Bench (Cognitive Behavioral Therapy benchmark) dataset from HuggingFace into PyRIT, enabling evaluation of LLM safety and alignment in psychotherapy contexts. The implementation supersedes stale PR #888 and addresses issue #865.
Changes:
- Added a new remote dataset loader for CBT-Bench with support for 39 HuggingFace subsets
- Registered the new loader in the remote datasets init file
- Added comprehensive unit tests with 7 test cases covering initialization, fetching, edge cases, and metadata validation
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `pyrit/datasets/seed_datasets/remote/cbt_bench_dataset.py` | New dataset loader class implementing the fetch logic, combining situation and thoughts into the prompt value, and storing core beliefs in metadata |
| `pyrit/datasets/seed_datasets/remote/__init__.py` | Registered `_CBTBenchDataset` in the imports and `__all__` list, following alphabetical ordering |
| `tests/unit/datasets/test_cbt_bench_dataset.py` | Unit tests with fixtures and mocking covering normal operation, custom configs, edge cases, and metadata validation |
Comments suppressed due to low confidence (1)
pyrit/datasets/seed_datasets/remote/cbt_bench_dataset.py:117
- The metadata field for `core_belief_fine_grained` is being set to a list, but the `Seed.metadata` field is typed as `dict[str, Union[str, int]]`. This creates a type mismatch. To fix this, either convert the list to a string (e.g., a JSON string or comma-separated) before storing it in metadata, or use a local variable annotation like other datasets do (`from typing import Any; metadata: dict[str, Any] = {...}`).
`metadata["core_belief_fine_grained"] = core_beliefs`
Contributor
Oh, and we need to rerun the notebook that lists all the datasets. It's the one in doc/code/datasets/ with index 0, I think. It should add CBT-Bench as a new line after execution.
Integrate the CBT-Bench psychotherapy benchmark dataset from HuggingFace
(Psychotherapy-LLM/CBT-Bench) into PyRIT.
Changes
- `pyrit/datasets/seed_datasets/remote/cbt_bench_dataset.py` — `_CBTBenchDataset` loader
- `pyrit/datasets/seed_datasets/remote/__init__.py` — registered the new loader
- `tests/unit/datasets/test_cbt_bench_dataset.py` — 7 unit tests

Key Design Decisions
- Prompt value combines `situation` + `thoughts` (per reviewer feedback on FEAT: CBT-Bench Dataset #888)
- Harm category `["psycho-social harms"]` (per @jbolor21 and @romanlutz discussion on FEAT: CBT-Bench Dataset #888)
- Defaults to the `core_fine_seed` subset but supports all 39 HuggingFace subsets via the `config` parameter
- `core_belief_fine_grained` stored in metadata for downstream use

Closes #865 (supersedes stale PR #888)
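The first design decision, combining situation and thoughts into the prompt value, might look roughly like the sketch below. The field names `situation` and `thoughts` and the output format are assumptions for illustration; the actual CBT-Bench row schema and the loader's formatting may differ:

```python
# Hypothetical row schema: the real CBT-Bench fields may be named differently.
def build_prompt_value(row: dict) -> str:
    """Combine a situation and the accompanying thoughts into one prompt value."""
    situation = row.get("situation", "").strip()
    thoughts = row.get("thoughts", "").strip()
    return f"Situation: {situation}\nThoughts: {thoughts}"


example_row = {
    "situation": "A friend cancels plans at the last minute.",
    "thoughts": "They must not want to spend time with me.",
}
prompt_value = build_prompt_value(example_row)
```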
@romanlutz
Tests and Documentation
Unit tests live in `tests/unit/datasets/test_cbt_bench_dataset.py`. All tests mock `_fetch_from_huggingface` (no network calls). All 98 dataset unit tests pass (91 existing + 7 new). Ruff lint is clean.
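Mocking the fetch so no network call happens can be sketched as below. The class here is a simplified stand-in with an injectable fetch function, not PyRIT's actual `_CBTBenchDataset` signature; the real tests patch `_fetch_from_huggingface` instead:

```python
from unittest.mock import MagicMock


# Simplified stand-in for the loader, with the fetch function injected so a
# unit test can stub out the HuggingFace call entirely (no network access).
class _CBTBenchDataset:
    def __init__(self, fetch_fn):
        self._fetch = fetch_fn

    def load(self) -> list[str]:
        rows = self._fetch("Psychotherapy-LLM/CBT-Bench")
        return [f"{r['situation']} {r['thoughts']}" for r in rows]


# The mock stands in for the real HuggingFace fetch.
mock_fetch = MagicMock(return_value=[{"situation": "s", "thoughts": "t"}])
dataset = _CBTBenchDataset(mock_fetch)
prompts = dataset.load()

mock_fetch.assert_called_once()  # the fetch was stubbed, never run for real
```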
Dataset loader classes are private (`_`-prefixed) and auto-register via `SeedDatasetProvider.__init_subclass__`; `api.rst` only lists `SeedDatasetProvider`, not individual dataset loaders.