Red Team Agent Scenario Integration#44551
Open
slister1001 wants to merge 55 commits intoAzure:mainfrom
Open
Conversation
romanlutz
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
| ) | ||
|
|
||
| # Plus any context prompts | ||
| context_prompts = [...] |
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
rlundeen2
reviewed
Jan 6, 2026
- Update imports from initialize_pyrit to CentralMemory - Change PromptRequestResponse to Message, PromptRequestPiece to MessagePiece - Update request_pieces to message_pieces throughout - Change orchestrator_identifier to attack_identifier - Fix PyritException instantiation to use keyword argument - Add skip decorators for tests relying on removed orchestrator module - Add skip decorators for tests when scorer class is abstract
- test_foundry_basic_execution: Basic Foundry execution path - test_foundry_indirect_jailbreak: XPIA attacks with context - test_foundry_multiple_risk_categories: Multiple risk categories - test_foundry_with_application_scenario: Application scenario context - test_foundry_strategy_combination: Multiple attack strategies
- Update RAIServiceScorer to inherit from TrueFalseScorer and match new score_async signature - Update _CallbackChatTarget.send_prompt_async to use Message parameter and return List[Message] - Fix AttackScoringConfig to use objective_scorer and use_score_as_feedback parameters - Change scenario.run_attack_async() to scenario.run_async() - Fix _scenario_result.attack_results access pattern - Add None handling for context_type in dataset builder - Remove context prompts for standard attacks (converters only support text) - Update memory API from get_chat_messages_with_conversation_id to get_conversation - Handle both dict and object message formats in strategy_utils - Pass adversarial_chat_target to FoundryExecutionManager - Update unit tests for new API signatures
- Add include_baseline parameter to ScenarioOrchestrator.execute() - Log warning when baseline-only is requested (PyRIT requires Foundry strategy) - Update tests to use Baseline + Base64 (workaround for PyRIT limitation) - Fix _red_team.py to pass flattened_attack_strategies for proper Baseline detection - Update get_attack_results() and get_memory() to not raise when no scenario executed
- Point pyrit dependency to slister1001/PyRIT@feature/baseline-only-execution (TODO: Revert to @main once PR Azure#1321 is merged) - Simplify scenario orchestrator by removing baseline-only workaround - Update e2e tests to use baseline-only execution (AttackStrategy.Baseline)
The fork URL was causing CI failures. Reverting to main branch with baseline+Base64 workaround until baseline-only support is merged. Added TODO comments to track where changes are needed once PR Azure#1321 lands.
Create separate dev_requirements_redteam.txt for redteam tests to avoid dependency conflict between promptflow-devkit (pillow<=11.3.0) and pyrit (pillow>=12.1.0). - dev_requirements.txt: removes [redteam] extra, used for regular CI - dev_requirements_redteam.txt: includes [redteam] but excludes promptflow-devkit
Add run_redteam_tests.py script that installs from dev_requirements_redteam.txt and runs the red team e2e tests. This allows developers to run redteam tests locally without the pillow version conflicts from promptflow-devkit. Usage: python scripts/run_redteam_tests.py [pytest_args...]
Add a separate matrix entry (redteam_Ubuntu2404_310) that runs red team tests with dev_requirements_redteam.txt to avoid pillow version conflicts. The redteam job: - Uses Python 3.10 (required by pyrit) - Skips all standard tox environments - Installs from dev_requirements_redteam.txt (without promptflow-devkit) - Runs red team e2e tests with the [redteam] extra
Add tests for context file creation, extension mapping, data type determination, and cleanup functionality.
…test assertions - Fix NoneType crash when eval_result.results is None in _rai_service_eval_chat_target.py - Guard orchestrator instantiation with _ORCHESTRATOR_AVAILABLE checks - Validate callback response structure before key access in _callback_chat_target.py - Add __del__ and debug logging to DatasetConfigurationBuilder cleanup - Add try/except with helpful message for FoundryStrategy import - Add changelog note for pyrit/promptflow-devkit pillow version conflict - Fix tautological >= 0 assertions in test_foundry.py - Add assertion to cleanup test in test_dataset_builder_binary_path.py - Strengthen e2e test assertions in test_red_team_foundry.py
…am CI matrix - Add log line for orchestrator-based execution path (legacy PyRIT) - Add log suggesting upgrade to PyRIT 0.11+ for Foundry execution - Extract shared _read_seed_content() to deduplicate file-reading logic between _rai_scorer.py and _foundry_result_processor.py - Extract redteam matrix entry into separate platform-matrix-redteam.json with its own MatrixConfig in ci.yml so it always gets a PR build job
nagkumar91
approved these changes
Feb 9, 2026
Remove separate redteam MatrixConfig, AfterTestSteps, and platform-matrix-redteam.json. These don't work in the shared PR pipeline and the eng sys team is building InjectedPackages support for conflicting dependency scenarios. Added TODO comments in platform-matrix.json and dev_requirements.txt for what to change when the feature is delivered.
Bug fixes: - Map PromptSendingAttack to indirect_jailbreak in Foundry result/execution processors - Remap hate_fairness to hate_unfairness for Sync API in RAI scorer - Accept binary_path and image_path data types in callback chat target validation - Fix context KeyError in evaluation processor for messages without context field - Fix test callback to handle messages as list (not dict) Formatting: - Applied black 24.4.0 via tox to all red_team source and test files
The redteam extra conflicts with promptflow-devkit due to pillow version incompatibility (pyrit requires >=12.1, promptflow <11). The redteam extra is installed via InjectedPackages in platform-matrix.json for the dedicated CI job instead.
Member
Author
|
/check-enforcer evaluate |
PROXY_URL in devtools_testutils.config was changed from a module-level constant to a function in commit 9233cd8. This fix calls it properly as PROXY_URL() to get the string value instead of passing the function object.
… excluded, update assertion for changed error message
- Validate imported promptflow Configuration accepts override_config kwarg before using it; fall back to local impl on TypeError (fixes sk job where semantic-kernel brings incompatible promptflow version) - Add body key sanitizer for query field in sync_evals requests to handle dynamic adversarial prompt content in test recordings (fixes 5 red team foundry e2e test recording mismatches) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- _check.py: Also verify promptflow.client.PFClient is importable so MISSING_LEGACY_SDK is True when promptflow-devkit 1.18.1 drops the promptflow namespace package. Tests that depend on PFClient now correctly skip. - conftest.py: Use (?s).+ regex in the query body sanitizer so multi-line adversarial prompt values are fully replaced. The default .+ regex doesn't match newlines, causing recording/playback body mismatches for hate_unfairness queries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PyRIT Foundry Integration - Technical Specification
This specification documents the integration of PyRIT's FoundryScenario into Azure AI Evaluation's Red Teaming module. This architecture delegates attack orchestration entirely to PyRIT while maintaining Azure-specific scoring and result processing.
Why FoundryScenario?
The previous integration approach used PyRIT's lower-level orchestrator APIs, which:
FoundryScenario (also known as
RedTeamAgent) is PyRIT's high-level scenario API designed for exactly this use case. It provides:DatasetConfiguration,SeedGroup,SeedObjectivefor structured dataKey Architecture Decisions
AttackStrategyandFoundryStrategyenumsAdvantages Over Previous Approach
Architecture Overview
High-Level Data Flow
Component Responsibilities
_foundry/_execution_manager.py_foundry/_scenario_orchestrator.py_foundry/_dataset_builder.py_foundry/_rai_scorer.py_foundry/_foundry_result_processor.py_foundry/_strategy_mapping.pyKey Integration Points
1. Strategy Mapping
Azure SDK's
AttackStrategyenum maps to PyRIT'sFoundryStrategy:2. Data Model Transformation
RAI Service objectives are transformed to PyRIT's native data model:
3. FoundryScenario Configuration
4. Baseline-Only Execution
With PyRIT PR #1321, baseline-only execution is now supported:
XPIA (Indirect Jailbreak) Handling
For
AttackStrategy.IndirectJailbreak, attack strings are injected into context "attack vehicles":Result Processing
AttackResult to JSONL
PyRIT's
AttackResultobjects are converted to JSONL format:ASR Calculation
Attack Success Rate is calculated from
AttackResult.outcome:File Structure
Dependencies
PyRIT Requirements
Baseline-Only Support
Requires PyRIT PR #1321 (or later version that includes it):
prepare_scenario_strategies()_get_baseline()method handles both first-attack-derived and standalone baselinesCI/Build Considerations
Separate Dev Requirements
Due to dependency conflicts between
promptflow-devkit(requirespillow<=11.3.0) andpyrit(requirespillow>=12.1.0), red team tests use separate requirements:dev_requirements.txt[redteam]extra)dev_requirements_redteam.txtpromptflow-devkit)Dedicated CI Job
Red team tests run in a dedicated CI job (
redteam_Ubuntu2404_310) configured in:platform-matrix.json- Matrix entry withIsRedteamJob: trueci.yml-AfterTestStepsto install redteam requirements and run testsSpell Check (cspell)
The
cspell.jsonfile includes red team–specific words:pyrit,Pyrit- PyRIT library namee2etests,etests- Test directory namesredteam- Module and job namesXPIA- Cross-prompt injection attack acronymSphinx Documentation
The
red_team/__init__.pyhandles optionalpyritdependency gracefully for documentation builds:This allows Sphinx to document the module without requiring the optional
pyritdependency, while still raising the proper error when users try to use the module without installing it.Testing
Unit Tests
Location:
tests/unittests/test_redteam/test_foundry.pyTestDatasetConfigurationBuilderTestStrategyMapperTestRAIServiceScorerTestScenarioOrchestratorTestFoundryResultProcessorTestFoundryExecutionManagerE2E Tests
Location:
tests/e2etests/test_red_team_foundry.pytest_foundry_basic_execution- Basic attack strategiestest_foundry_indirect_jailbreak- XPIA attackstest_foundry_multiple_risk_categories- Baseline-only with multiple categoriestest_foundry_with_application_scenario- Baseline-only with app contexttest_foundry_strategy_combination- Multiple converter strategies