Release v0.3.14 #284

Merged: placerda merged 25 commits into main from release/v0.3.14 on Jun 9, 2026

@placerda (Contributor) commented Jun 9, 2026

Release v0.3.14

Automated release branch created from develop.

What happened

  • Branch release/v0.3.14 created from develop
  • CHANGELOG.md updated: versioned section [0.3.14] added
  • Plugin versions synced to 0.3.14 (package.json, plugin.json, marketplace.json)
  • Staging pipeline triggered automatically (build → TestPyPI + VSIX pre-release → verify)

Next steps

  1. Wait for the Staging pipeline to pass
  2. Review and approve this PR
  3. Merge to main
  4. Tag and push: git tag v0.3.14 && git push origin v0.3.14
  5. Approve the PyPI publish and VSIX stable publish in the Release workflow
  6. Sync develop: git checkout develop && git merge main && git push origin develop

Checklist

  • Staging pipeline passes (build + TestPyPI + VSIX pre-release + verify)
  • CHANGELOG entries reviewed
  • PR approved and merged to main
  • Tag v0.3.14 pushed
  • PyPI publish approved
  • VSIX stable publish approved
  • develop synced from main

placerda and others added 25 commits June 9, 2026 08:23
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rial

docs: showcase ASSERT evidence in tutorial
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ases

docs: clarify synthetic multi-turn tutorial rows
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-step

docs: add Foundry full multi-turn evaluation step
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-note

docs: clarify CLI conversation gate scope
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…taset-source

docs: explain Foundry full conversation data source
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…step

docs: remove manual Foundry evaluation URL step
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…l-guided

docs: require rubric gate in prompt tutorial
docs: mark full multi-turn evaluation optional
docs: explain rubric placeholders with concrete file lookups
…283)

* feat: add agentops assert run and agentops redteam run as active CI gates

Turn ASSERT (open-source assert-ai framework) and the Foundry/PyRIT AI Red
Teaming agent from passive evidence-only references into active, gated CI
steps that AgentOps orchestrates end-to-end.

New commands

- agentops assert run invokes the assert-ai CLI as a subprocess, locates
  the run output, parses metrics.json and scores.jsonl, and writes a
  normalized summary at .agentops/assert/latest.json. Exits 2 on any
  policy violation unless --no-gate or assert.fail_on_violations: false.
- agentops redteam run invokes azure.ai.evaluation.red_team.RedTeam
  against an Azure OpenAI deployment, Foundry agent, or HTTP endpoint,
  then aggregates per-category and per-strategy attack-success-rate into
  .agentops/redteam/latest.json. Exits 2 when ASR exceeds
  redteam.fail_on_attack_success_rate unless --no-gate.
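
The gating rules described for the two commands can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names are hypothetical, and treating a missing threshold as "gate disabled" is an assumption. Only the exit code 2, the `--no-gate` override, and the two config keys come from the description above.

```python
from typing import Optional

EXIT_OK = 0
EXIT_GATE_FAILED = 2  # exit code named in the commit description


def assert_exit_code(
    violations: int,
    fail_on_violations: bool = True,
    no_gate: bool = False,
) -> int:
    """Hypothetical gate for `agentops assert run`."""
    # --no-gate or assert.fail_on_violations: false turns the gate off.
    if no_gate or not fail_on_violations:
        return EXIT_OK
    return EXIT_GATE_FAILED if violations > 0 else EXIT_OK


def redteam_exit_code(
    attack_success_rate: float,
    threshold: Optional[float],
    no_gate: bool = False,
) -> int:
    """Hypothetical gate for `agentops redteam run`."""
    # Assumption: an unset threshold means no gating is configured.
    if no_gate or threshold is None:
        return EXIT_OK
    # The gate trips only when ASR strictly exceeds the threshold.
    return EXIT_GATE_FAILED if attack_success_rate > threshold else EXIT_OK
```

Under these assumptions, a run with three violations returns 2 by default and 0 when `no_gate=True`, which matches the exit-code behaviour described above.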

Schema

- Adds AssertRunConfig and RedTeamRunConfig Pydantic models.
- Adds assert_run / redteam_run fields on AgentOpsConfig with aliases
  assert / redteam so YAML keys stay natural while the Python attribute
  avoids the reserved keyword assert. Enables populate_by_name on the
  root model so both the alias and the field name are accepted.
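
To show why the alias is needed, here is a standard-library sketch of the same idea. It deliberately swaps Pydantic for plain dataclasses and a manual alias map, so it is not the PR's implementation; `SketchConfig` and `load_config` are hypothetical names, and the block contents are left as untyped dicts.

```python
import keyword
from dataclasses import dataclass
from typing import Any, Dict, Optional

# YAML key -> Python attribute name, mirroring the aliases named above.
ALIASES = {"assert": "assert_run", "redteam": "redteam_run"}


@dataclass
class SketchConfig:
    """Hypothetical stand-in for AgentOpsConfig (the real model is Pydantic)."""

    assert_run: Optional[Dict[str, Any]] = None
    redteam_run: Optional[Dict[str, Any]] = None


def load_config(raw: Dict[str, Any]) -> SketchConfig:
    """Accept either the YAML alias or the Python field name for each block."""
    kwargs = {ALIASES.get(key, key): value for key, value in raw.items()}
    return SketchConfig(**kwargs)


# `assert` cannot be an attribute name in Python source, hence the alias.
print(keyword.iskeyword("assert"))  # True
print(load_config({"assert": {"fail_on_violations": False}}).assert_run)
```

Accepting both spellings is the behaviour that populate_by_name provides on the Pydantic side: parsed YAML uses `assert`, while Python callers use `assert_run`.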

Services

- src/agentops/services/assert_runner.py: subprocess wrapper, run-output
  locator with suite/run/most-recent fallback, dimension summarizer,
  normalized JSON writer.
- src/agentops/services/redteam_runner.py: lazy import of the Foundry
  Red Team SDK, target callback builder for deployment/agent/endpoint
  shapes, per-category and per-strategy aggregation, normalized JSON
  writer.
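
The per-category and per-strategy aggregation mentioned above could look something like this. It is a sketch under assumptions: the record shape (`category`, `strategy`, and `attack_success` keys) is invented for illustration and is not taken from the Red Team SDK output or from redteam_runner.py.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def attack_success_rate(
    records: Iterable[Mapping[str, Any]], key: str
) -> Dict[str, float]:
    """Group attack attempts by `key` and return successes / attempts per group."""
    attempts: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for record in records:
        group = str(record[key])
        attempts[group] += 1
        if record["attack_success"]:  # assumed boolean field
            successes[group] += 1
    return {group: successes[group] / attempts[group] for group in attempts}


# Hypothetical attack records; real SDK output will differ.
records = [
    {"category": "violence", "strategy": "base64", "attack_success": True},
    {"category": "violence", "strategy": "flip", "attack_success": False},
    {"category": "hate", "strategy": "base64", "attack_success": False},
]

by_category = attack_success_rate(records, "category")
by_strategy = attack_success_rate(records, "strategy")
```

With these sample records, `by_category` is `{"violence": 0.5, "hate": 0.0}` and `by_strategy` is `{"base64": 0.5, "flip": 0.0}`. Rates of this kind are what a threshold such as redteam.fail_on_attack_success_rate would be compared against.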

CLI

- New assert_app and redteam_app Typer groups with run and explain
  subcommands.
- Long-form manuals added to EXPLAIN_PAGES for both groups and surfaced
  via agentops explain.
- Fixes a stale loaded.config access in the new command handlers.

Tutorial

- docs/tutorial-prompt-agent-quickstart.md replaces the passive
  assert_path evidence section with active 10a/10b/10c subsections that
  install assert-ai and azure-ai-evaluation[redteam], scaffold
  assert/eval_config.yaml and the redteam block, and pull both runners
  into the evidence pack.
- Success criteria updated accordingly.

README

- Repositions the accelerator as an open-source framework + CLI that
  orchestrates continuous evaluation, safety testing, and release
  readiness (rather than reinventing them).
- Tagline, six-step release loop, core-outputs table, and exit-code
  contract reworked. Foundry boundary table now lists ASSERT and the
  AI Red Teaming agent under "Probe safety" with active commands.

Tests

- tests/unit/test_assert_and_redteam_runners.py covers schema aliases,
  run-output discovery, dimension summarization, totals aggregation,
  target callback resolution, normalized JSON writing, gating, and CLI
  smoke (missing config block, missing dependency, explain manuals).
- Full suite: 921 passed, 1 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: resolve mypy errors in new assert/redteam runners

- assert_runner._aggregate_totals: narrow the Optional dict returned by
  metrics.get for totals before subscripting it, by binding the result
  to a typed local.
- redteam_runner.run_redteam: validate azure_ai_project is not None
  before passing it to the RedTeam SDK (raises RedTeamRunnerError with
  a clear hint when project metadata is missing).
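
Both fixes follow a common pattern for satisfying mypy with Optional values. The sketch below is illustrative only: the `passed` and `failed` keys, the `str` type for the project value, the error message, and the helper name `require_project` are all assumptions. Only the `RedTeamRunnerError` name and the two techniques come from the commit message.

```python
from typing import Any, Dict, Optional


class RedTeamRunnerError(RuntimeError):
    """Stand-in for the error type named in the commit message."""


def aggregate_totals(metrics: Dict[str, Any]) -> Dict[str, int]:
    # Bind the .get() result to a typed local so mypy can narrow it.
    totals: Optional[Dict[str, Any]] = metrics.get("totals")
    if totals is None:
        return {"passed": 0, "failed": 0}
    # After the None check, subscripting `totals` type-checks cleanly.
    return {
        "passed": int(totals.get("passed", 0)),
        "failed": int(totals.get("failed", 0)),
    }


def require_project(azure_ai_project: Optional[str]) -> str:
    # Validate before handing the value to an API that expects non-None.
    if azure_ai_project is None:
        raise RedTeamRunnerError(
            "Project metadata is missing; configure the Azure AI project first."
        )
    return azure_ai_project
```

Raising a dedicated error at the boundary keeps the failure message actionable instead of surfacing a less specific error from inside the SDK call.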

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@placerda placerda merged commit 9667179 into main Jun 9, 2026
5 checks passed
@placerda placerda deleted the release/v0.3.14 branch June 9, 2026 17:45