From ee9da1a77c752cbc0f449df77716853b0b253841 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Wed, 10 Jun 2026 16:11:15 -0300 Subject: [PATCH 01/16] docs(tutorial): copy-pasteable prompt blocks for governance skill Option A Replaced the blockquote-style example prompts with fenced code blocks for both ASSERT and Red Team Option A subsections in tutorial step 12, so users can click the GitHub copy button and paste directly into Copilot Chat without editing the > markers or wrapping. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 38e0f7a..3e329c3 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1080,11 +1080,13 @@ You have two ways to wire up ASSERT — pick whichever fits your workflow. If you installed the AgentOps coding-agent skills in step 4 (`agentops skills install`), the `agentops-governance` skill knows the full -recipe. In Copilot Chat (or Claude Code), say: +recipe. In Copilot Chat (or Claude Code), paste this prompt: -> Use the `agentops-governance` skill to scaffold ASSERT for this workspace. -> Target the `gpt-4o-mini` deployment, cover prompt_injection / pii_leak / -> jailbreak, 5 cases per dimension. +```text +Use the agentops-governance skill to scaffold ASSERT for this workspace. +Target the gpt-4o-mini deployment, cover prompt_injection / pii_leak / +jailbreak, 5 cases per dimension. +``` Copilot will install `assert-ai`, create `./assert/eval_config.yaml`, and append the `assert:` block to `agentops.yaml` for you. Skip to **Run it @@ -1145,8 +1147,13 @@ Same pattern: Copilot can do it, or you can run the commands yourself. #### Option A — Ask Copilot -> Use the `agentops-governance` skill to scaffold the Red Team runner. 
-> Target `gpt-4o-mini`, fail when attack success rate exceeds 20%. +Paste this prompt into Copilot Chat (or Claude Code): + +```text +Use the agentops-governance skill to scaffold the Red Team runner for this +workspace. Target the gpt-4o-mini deployment, fail when attack success rate +exceeds 20%. +``` #### Option B — Run the commands yourself From 547b6ffe609d7ddcdaaa362708bad191c02d7510 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 00:12:05 -0300 Subject: [PATCH 02/16] docs: harden OIDC tenant setup guidance Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/ci-github-actions.md | 6 +++ docs/tutorial-end-to-end.md | 5 ++- docs/tutorial-hosted-agent-quickstart.md | 13 ++++--- docs/tutorial-prompt-agent-quickstart.md | 14 +++++-- .../skills/agentops-workflow/SKILL.md | 39 ++++++++++++++----- .../skills/agentops-workflow/SKILL.md | 39 ++++++++++++++----- 6 files changed, 85 insertions(+), 31 deletions(-) diff --git a/docs/ci-github-actions.md b/docs/ci-github-actions.md index 0274ad5..0dde1f9 100644 --- a/docs/ci-github-actions.md +++ b/docs/ci-github-actions.md @@ -119,6 +119,12 @@ In Settings → Secrets and variables → Actions → **Variables**, add: | `AZURE_OPENAI_DEPLOYMENT` | Model deployment used by local evaluators and AgentOps cloud eval judges | | `APPLICATIONINSIGHTS_CONNECTION_STRING` | Optional fallback when the Foundry project's App Insights connection cannot be auto-discovered | +Set `AZURE_TENANT_ID` to the tenant that owns the app registration / federated +credential used by `AZURE_CLIENT_ID`. Do not use a subscription +`managedByTenants` tenant id unless the app registration and federated +credential are also visible in that tenant; otherwise `azure/login` can fail at +token issuance before AgentOps starts. + Then on the Azure side, configure Workload Identity Federation (federated credentials) on the app registration so it can be assumed from GitHub Actions runs. 
See diff --git a/docs/tutorial-end-to-end.md b/docs/tutorial-end-to-end.md index e5475a1..07231f0 100644 --- a/docs/tutorial-end-to-end.md +++ b/docs/tutorial-end-to-end.md @@ -524,8 +524,9 @@ environment variable or equivalent Azure DevOps pipeline variable, verify the OIDC principal has **both** Foundry User access on the dev Foundry project **and** Cognitive Services OpenAI User access on the underlying Azure AI Services account that hosts the evaluator model (both are required — without -the OpenAI User role, every cloud eval metric returns null), and show me the -plan before changing GitHub or Azure. +the OpenAI User role, every cloud eval metric returns null), verify +AZURE_TENANT_ID is the tenant that owns the Entra app registration and its +federated credential, and show me the plan before changing GitHub or Azure. ``` That value is not an `agentops init` answer. It tells the Foundry cloud eval diff --git a/docs/tutorial-hosted-agent-quickstart.md b/docs/tutorial-hosted-agent-quickstart.md index d68b906..84a7483 100644 --- a/docs/tutorial-hosted-agent-quickstart.md +++ b/docs/tutorial-hosted-agent-quickstart.md @@ -786,12 +786,13 @@ hosted-agent project. Create or connect the GitHub repo if needed, set AGENTOPS_AGENT_ENDPOINT in the `dev` environment to the deployed HTTPS endpoint, wire Azure OIDC and required -Actions variables in the `dev` environment, and set any required endpoint token -as a secret. The PR gate uses --doctor-gate critical so the workflow blocks on -critical Doctor findings (regressions or other strict signals). Do not add -scheduled Doctor, QA, or production workflows yet. Show me the plan before -changing GitHub or Azure, and call out anything that needs owner/admin -permission. +Actions variables in the `dev` environment, verify AZURE_TENANT_ID is the tenant +that owns the Entra app registration and its federated credential, and set any +required endpoint token as a secret. 
The PR gate uses --doctor-gate critical so +the workflow blocks on critical Doctor findings (regressions or other strict +signals). Do not add scheduled Doctor, QA, or production workflows yet. Show me +the plan before changing GitHub or Azure, and call out anything that needs +owner/admin permission. ``` Open both Doctor outputs. The report explains the findings; the evidence pack diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 3e329c3..22291c0 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1326,8 +1326,10 @@ principal has **both** Foundry User access on the **dev** Foundry project **and** Cognitive Services OpenAI User on the underlying Azure AI Services account that hosts the evaluator model (both roles are required — without the OpenAI User role, the Foundry cloud graders fail with a 401 and every -metric comes back null), and do not set up `qa`, `production`, scheduled -Doctor, or hosted deployment workflows yet. +metric comes back null), verify `AZURE_TENANT_ID` is the tenant that owns +the Entra app registration and its federated credential (not just a +subscription `managedByTenants` value), and do not set up `qa`, +`production`, scheduled Doctor, or hosted deployment workflows yet. I am using trunk-based development with `main` as both my trunk and dev branch. The generator's stock dev-deploy trigger is `push: branches: @@ -1353,6 +1355,11 @@ it skips: - Set Actions variables `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`, `AZURE_CLIENT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` (the dev endpoint), and `APPLICATIONINSIGHTS_CONNECTION_STRING` if available. +- Verify `AZURE_TENANT_ID` against the app registration / federated + credential tenant before the first run. 
A subscription can be associated + with another tenant through `managedByTenants`; do not copy that tenant id + into the GitHub environment unless the app registration and federated + credential are actually visible there. - **Rewrite the dev deploy trigger to `main`.** The generator emits the stock GitFlow defaults (`pull_request: branches: [develop, "release/**", main]` on `agentops-pr.yml`, `push: branches: [develop]` on @@ -1413,7 +1420,8 @@ If you want to wait on the first PR-workflow verification run from the terminal instead of the Actions UI: ```powershell -$runId = gh run list --workflow agentops-pr.yml --branch main --limit 1 --json databaseId --jq '.[0].databaseId' +$prBranch = gh pr view --json headRefName --jq '.headRefName' +$runId = gh run list --workflow agentops-pr.yml --branch $prBranch --event pull_request --limit 1 --json databaseId --jq '.[0].databaseId' gh run view $runId --web gh run watch $runId --exit-status ``` diff --git a/plugins/agentops/skills/agentops-workflow/SKILL.md b/plugins/agentops/skills/agentops-workflow/SKILL.md index db15e44..c70dedf 100644 --- a/plugins/agentops/skills/agentops-workflow/SKILL.md +++ b/plugins/agentops/skills/agentops-workflow/SKILL.md @@ -81,7 +81,21 @@ by discovering the whole Azure subscription. `azd env get-values` values before `az account show`. - `az account show` only as a proposal for tenant/subscription; confirm before writing it to GitHub variables. -6. Copy CI variables from local AgentOps/azd configuration into the GitHub +6. For GitHub OIDC, treat `AZURE_TENANT_ID` as the tenant that owns the app + registration / federated credential, not merely the tenant associated with + the subscription or a `managedByTenants` entry. Before writing + `AZURE_TENANT_ID`, verify the chosen tenant can see the app registration and + the exact federated credential: + - `az ad app show --id ` in the active tenant, or an + equivalent Microsoft Graph query scoped to the proposed tenant. 
+ - `az ad app federated-credential list --id ` and confirm + the `subject`, `issuer`, and `audiences`. + If the app is visible in one tenant but the Azure subscription is associated + with another tenant, use the app/federated-credential tenant for + `AZURE_TENANT_ID`; the subscription id remains `AZURE_SUBSCRIPTION_ID`. + Do not copy a `managedByTenants[*].tenantId` value into GitHub variables + unless the app and federated credential are verified there too. +7. Copy CI variables from local AgentOps/azd configuration into the GitHub environment used by the workflow. Reuse local values for `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, and optional @@ -89,17 +103,17 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -7. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +8. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -8. For GitHub OIDC, derive the federated credential subject from the generated +9. For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. Do not assume branch or `pull_request` subjects without reading the workflow. -9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +10. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. 
Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -118,7 +132,7 @@ by discovering the whole Azure subscription. metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -10. If either RBAC assignment is missing, do not run the workflow yet. +11. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -134,25 +148,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -11. Ask before creating or updating GitHub repos, GitHub environments, +12. Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -12. When creating federated credentials from PowerShell, avoid fragile +13. When creating federated credentials from PowerShell, avoid fragile interpolation. Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -13. After creating or updating a federated credential, read it back and verify +14. 
After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -14. Do not dispatch `gh workflow run` as a surprise validation step. First show +15. After setting GitHub environment variables, read them back and verify + `AZURE_TENANT_ID` still matches the app/federated-credential tenant before + triggering a run. If `azure/login` fails with `AADSTS53003`, first re-check + this tenant/app alignment before assuming Conditional Access is the root + cause. +16. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -15. Avoid broad discovery unless local config is missing. Do **not** run broad +17. Avoid broad discovery unless local config is missing. Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. If the diff --git a/src/agentops/templates/skills/agentops-workflow/SKILL.md b/src/agentops/templates/skills/agentops-workflow/SKILL.md index db15e44..c70dedf 100644 --- a/src/agentops/templates/skills/agentops-workflow/SKILL.md +++ b/src/agentops/templates/skills/agentops-workflow/SKILL.md @@ -81,7 +81,21 @@ by discovering the whole Azure subscription. `azd env get-values` values before `az account show`. - `az account show` only as a proposal for tenant/subscription; confirm before writing it to GitHub variables. -6. Copy CI variables from local AgentOps/azd configuration into the GitHub +6. 
For GitHub OIDC, treat `AZURE_TENANT_ID` as the tenant that owns the app + registration / federated credential, not merely the tenant associated with + the subscription or a `managedByTenants` entry. Before writing + `AZURE_TENANT_ID`, verify the chosen tenant can see the app registration and + the exact federated credential: + - `az ad app show --id ` in the active tenant, or an + equivalent Microsoft Graph query scoped to the proposed tenant. + - `az ad app federated-credential list --id ` and confirm + the `subject`, `issuer`, and `audiences`. + If the app is visible in one tenant but the Azure subscription is associated + with another tenant, use the app/federated-credential tenant for + `AZURE_TENANT_ID`; the subscription id remains `AZURE_SUBSCRIPTION_ID`. + Do not copy a `managedByTenants[*].tenantId` value into GitHub variables + unless the app and federated credential are verified there too. +7. Copy CI variables from local AgentOps/azd configuration into the GitHub environment used by the workflow. Reuse local values for `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, and optional @@ -89,17 +103,17 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -7. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +8. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -8. For GitHub OIDC, derive the federated credential subject from the generated +9. 
For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. Do not assume branch or `pull_request` subjects without reading the workflow. -9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +10. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -118,7 +132,7 @@ by discovering the whole Azure subscription. metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -10. If either RBAC assignment is missing, do not run the workflow yet. +11. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -134,25 +148,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -11. Ask before creating or updating GitHub repos, GitHub environments, +12. Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -12. When creating federated credentials from PowerShell, avoid fragile +13. When creating federated credentials from PowerShell, avoid fragile interpolation. 
Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -13. After creating or updating a federated credential, read it back and verify +14. After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -14. Do not dispatch `gh workflow run` as a surprise validation step. First show +15. After setting GitHub environment variables, read them back and verify + `AZURE_TENANT_ID` still matches the app/federated-credential tenant before + triggering a run. If `azure/login` fails with `AADSTS53003`, first re-check + this tenant/app alignment before assuming Conditional Access is the root + cause. +16. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -15. Avoid broad discovery unless local config is missing. Do **not** run broad +17. Avoid broad discovery unless local config is missing. Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. 
If the From 8324affa0eb6cfc519ea23aae8398967323d2780 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 13:18:11 -0300 Subject: [PATCH 03/16] docs: make prompt regression branch setup safe Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 22291c0..7537f5f 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1589,9 +1589,9 @@ thresholds are loose enough that a regression slips through, Doctor still catches it. ```powershell -git switch main -git pull -git switch -c feature/regress-travel-agent +git fetch origin +$branch = "feature/regress-travel-agent-step16-$((Get-Date).ToString('yyyyMMddHHmmss'))" +git switch -c $branch origin/main ``` Edit `.agentops/prompts/travel-agent.md` to this intentionally vague @@ -1607,8 +1607,8 @@ Commit and push: ```powershell git add .agentops\prompts\travel-agent.md git commit -m "Intentional regression: vague travel prompt" -git push -u origin feature/regress-travel-agent -gh pr create --base main --head feature/regress-travel-agent --title "Test AgentOps regression gate" --body "Evaluates an intentionally regressed travel-agent prompt." +git push -u origin $branch +gh pr create --base main --head $branch --title "Test AgentOps regression gate" --body "Evaluates an intentionally regressed travel-agent prompt." 
``` Watch the PR check: From 0794ad1bff3e27a48606d0bd9875889cf0508969 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 13:20:08 -0300 Subject: [PATCH 04/16] docs: require upstream tracking in workflow setup Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 26 ++++++++------- .../skills/agentops-workflow/SKILL.md | 33 ++++++++++++------- .../skills/agentops-workflow/SKILL.md | 33 ++++++++++++------- 3 files changed, 59 insertions(+), 33 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 7537f5f..12da808 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1320,16 +1320,17 @@ project. This may be a brand-new folder with no Git repo or GitHub remote yet. Keep the scope to the PR gate and dev deploy only: create or connect the -GitHub repo if needed, wire Azure OIDC and required Actions -variables/secrets, create only the `dev` environment, verify the OIDC -principal has **both** Foundry User access on the **dev** Foundry project -**and** Cognitive Services OpenAI User on the underlying Azure AI Services -account that hosts the evaluator model (both roles are required — without -the OpenAI User role, the Foundry cloud graders fail with a 401 and every -metric comes back null), verify `AZURE_TENANT_ID` is the tenant that owns -the Entra app registration and its federated credential (not just a -subscription `managedByTenants` value), and do not set up `qa`, -`production`, scheduled Doctor, or hosted deployment workflows yet. 
+GitHub repo if needed, ensure local `main` tracks `origin/main` after the +first push/connect, wire Azure OIDC and required Actions variables/secrets, +create only the `dev` environment, verify the OIDC principal has **both** +Foundry User access on the **dev** Foundry project **and** Cognitive Services +OpenAI User on the underlying Azure AI Services account that hosts the +evaluator model (both roles are required — without the OpenAI User role, the +Foundry cloud graders fail with a 401 and every metric comes back null), +verify `AZURE_TENANT_ID` is the tenant that owns the Entra app registration +and its federated credential (not just a subscription `managedByTenants` +value), and do not set up `qa`, `production`, scheduled Doctor, or hosted +deployment workflows yet. I am using trunk-based development with `main` as both my trunk and dev branch. The generator's stock dev-deploy trigger is `push: branches: @@ -1349,7 +1350,10 @@ that needs owner/admin permission. The workflow skill will normally do the following, but call out anything it skips: -- Create/connect the GitHub remote. +- Create/connect the GitHub remote and ensure local `main` tracks + `origin/main` (`git branch -vv` should show `[origin/main]`). If the skill + skips this, run `git branch --set-upstream-to=origin/main main` before the + later tutorial steps that use `git pull`. - Create the `dev` GitHub environment. - Configure OIDC federated credentials between GitHub and Entra ID. - Set Actions variables `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`, diff --git a/plugins/agentops/skills/agentops-workflow/SKILL.md b/plugins/agentops/skills/agentops-workflow/SKILL.md index c70dedf..ef1e201 100644 --- a/plugins/agentops/skills/agentops-workflow/SKILL.md +++ b/plugins/agentops/skills/agentops-workflow/SKILL.md @@ -76,6 +76,8 @@ by discovering the whole Azure subscription. - optional `APPLICATIONINSIGHTS_CONNECTION_STRING`. 5. 
Prefer existing values and exact checks: - `git remote get-url origin` and `gh repo view --json nameWithOwner`. + - `git branch -vv` to confirm the local trunk branch tracks + `origin/main` when the tutorial uses trunk-based `main`. - `gh variable list --env ` and `gh secret list --env `. - `agentops init show`, local `.agentops/.env` or `.azure//.env`, and `azd env get-values` values before `az account show`. @@ -95,7 +97,16 @@ by discovering the whole Azure subscription. `AZURE_TENANT_ID`; the subscription id remains `AZURE_SUBSCRIPTION_ID`. Do not copy a `managedByTenants[*].tenantId` value into GitHub variables unless the app and federated credential are verified there too. -7. Copy CI variables from local AgentOps/azd configuration into the GitHub +7. When creating or connecting the GitHub remote for the prompt-agent tutorial, + make sure the local trunk branch tracks the remote trunk before telling the + user to continue: + - If `main` is newly pushed, use `git push -u origin main`. + - If `origin/main` already exists, use + `git branch --set-upstream-to=origin/main main`. + - Verify with `git branch -vv`; `main` must show `[origin/main]`. + Without this, a later `git pull` on `main` can fetch but not update the + local branch. +8. Copy CI variables from local AgentOps/azd configuration into the GitHub environment used by the workflow. Reuse local values for `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, and optional @@ -103,17 +114,17 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -8. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +9. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. 
If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -9. For GitHub OIDC, derive the federated credential subject from the generated +10. For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. Do not assume branch or `pull_request` subjects without reading the workflow. -10. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +11. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -132,7 +143,7 @@ by discovering the whole Azure subscription. metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -11. If either RBAC assignment is missing, do not run the workflow yet. +12. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -148,30 +159,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -12. Ask before creating or updating GitHub repos, GitHub environments, +13. 
Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -13. When creating federated credentials from PowerShell, avoid fragile +14. When creating federated credentials from PowerShell, avoid fragile interpolation. Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -14. After creating or updating a federated credential, read it back and verify +15. After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -15. After setting GitHub environment variables, read them back and verify +16. After setting GitHub environment variables, read them back and verify `AZURE_TENANT_ID` still matches the app/federated-credential tenant before triggering a run. If `azure/login` fails with `AADSTS53003`, first re-check this tenant/app alignment before assuming Conditional Access is the root cause. -16. Do not dispatch `gh workflow run` as a surprise validation step. First show +17. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -17. Avoid broad discovery unless local config is missing. Do **not** run broad +18. Avoid broad discovery unless local config is missing. 
Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. If the diff --git a/src/agentops/templates/skills/agentops-workflow/SKILL.md b/src/agentops/templates/skills/agentops-workflow/SKILL.md index c70dedf..ef1e201 100644 --- a/src/agentops/templates/skills/agentops-workflow/SKILL.md +++ b/src/agentops/templates/skills/agentops-workflow/SKILL.md @@ -76,6 +76,8 @@ by discovering the whole Azure subscription. - optional `APPLICATIONINSIGHTS_CONNECTION_STRING`. 5. Prefer existing values and exact checks: - `git remote get-url origin` and `gh repo view --json nameWithOwner`. + - `git branch -vv` to confirm the local trunk branch tracks + `origin/main` when the tutorial uses trunk-based `main`. - `gh variable list --env ` and `gh secret list --env `. - `agentops init show`, local `.agentops/.env` or `.azure//.env`, and `azd env get-values` values before `az account show`. @@ -95,7 +97,16 @@ by discovering the whole Azure subscription. `AZURE_TENANT_ID`; the subscription id remains `AZURE_SUBSCRIPTION_ID`. Do not copy a `managedByTenants[*].tenantId` value into GitHub variables unless the app and federated credential are verified there too. -7. Copy CI variables from local AgentOps/azd configuration into the GitHub +7. When creating or connecting the GitHub remote for the prompt-agent tutorial, + make sure the local trunk branch tracks the remote trunk before telling the + user to continue: + - If `main` is newly pushed, use `git push -u origin main`. + - If `origin/main` already exists, use + `git branch --set-upstream-to=origin/main main`. + - Verify with `git branch -vv`; `main` must show `[origin/main]`. + Without this, a later `git pull` on `main` can fetch but not update the + local branch. +8. 
Copy CI variables from local AgentOps/azd configuration into the GitHub environment used by the workflow. Reuse local values for `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, and optional @@ -103,17 +114,17 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -8. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +9. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -9. For GitHub OIDC, derive the federated credential subject from the generated +10. For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. Do not assume branch or `pull_request` subjects without reading the workflow. -10. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +11. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -132,7 +143,7 @@ by discovering the whole Azure subscription. metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -11. If either RBAC assignment is missing, do not run the workflow yet. 
+12. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -148,30 +159,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -12. Ask before creating or updating GitHub repos, GitHub environments, +13. Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -13. When creating federated credentials from PowerShell, avoid fragile +14. When creating federated credentials from PowerShell, avoid fragile interpolation. Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -14. After creating or updating a federated credential, read it back and verify +15. After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -15. After setting GitHub environment variables, read them back and verify +16. After setting GitHub environment variables, read them back and verify `AZURE_TENANT_ID` still matches the app/federated-credential tenant before triggering a run. 
If `azure/login` fails with `AADSTS53003`, first re-check this tenant/app alignment before assuming Conditional Access is the root cause. -16. Do not dispatch `gh workflow run` as a surprise validation step. First show +17. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -17. Avoid broad discovery unless local config is missing. Do **not** run broad +18. Avoid broad discovery unless local config is missing. Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. If the From 80b3c2342040368a4c61323a3d8d8a5e993b88b3 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 16:59:54 -0300 Subject: [PATCH 05/16] docs: clarify Foundry observability checkout Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 28 ++++++++++++++++-------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 12da808..55dfa39 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1694,9 +1694,9 @@ remember to look at a dashboard. ## 18. Brief observability checkout (Foundry side) -The Foundry side of the loop is worth a short tour, even though it is -not what AgentOps owns. This is the "Foundry tells you what happened" -side of the conversation. +Take a short tour of the Foundry runtime view: this is where you inspect +the traces, spans, latency, model calls, and input/output details that show +what actually happened during an eval or live conversation. 1. Open the `travel-agent-dev` project in the Foundry portal. 2. 
Open the `travel-agent` agent and switch to the **Traces** tab. If @@ -1713,13 +1713,23 @@ side of the conversation. the last 24 hours. ``` -5. Optionally, sample the same operation through Application Insights - Logs (KQL) for the engineer-level view. +5. Optional deep dive: open the connected **Application Insights** resource, + go to **Logs**, set the time range to **Last 24 hours**, and run a small + KQL query to inspect the raw telemetry behind the trace view: + + ```kusto + AppTraces + | where TimeGenerated > ago(24h) + | where Message has_any ("travel-agent", "travel") + or tostring(Properties) has_any ("travel-agent", "travel") + | project TimeGenerated, Message, SeverityLevel, Properties + | order by TimeGenerated desc + | take 50 + ``` -This is the observability surface AgentOps does **not** replace. Doctor -will check whether this telemetry is wired (App Insights connection -string, recent traces, etc.) and include it in the readiness call, but -the runtime view itself lives in Foundry. +Foundry gives you the runtime trace view; AgentOps Doctor checks that the +telemetry is wired and includes those signals in the release-readiness +evidence. ## 19. Sync local evidence and create the release evidence pack From 399b683a5c3880cd850f259cbaa17ef31874ce53 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 17:26:07 -0300 Subject: [PATCH 06/16] docs: explain trace sampling in observability checkout Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 76 ++++++++++++++++-------- 1 file changed, 52 insertions(+), 24 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 55dfa39..ef6e848 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1692,44 +1692,72 @@ regressions that thresholds alone miss, and the merge promotes through the deploy workflow. 
None of those gates require the developer to remember to look at a dashboard. -## 18. Brief observability checkout (Foundry side) +## 18. Observability checkout: traces into continuous evaluation -Take a short tour of the Foundry runtime view: this is where you inspect -the traces, spans, latency, model calls, and input/output details that show -what actually happened during an eval or live conversation. +Take a short tour of the Foundry runtime view, then turn the same production +signal into evaluation coverage. This is the bridge from "what happened in +real traces" to "what should keep getting evaluated." 1. Open the `travel-agent-dev` project in the Foundry portal. 2. Open the `travel-agent` agent and switch to the **Traces** tab. If Application Insights is not yet connected, connect or create the resource now. -3. Find the most recent eval run in **Conversations** or +3. Find a recent eval or playground run in **Conversations** or **Responses** and click the **Trace ID**. Inspect spans, latency, - model call, and the input/output panes. -4. Switch to **Operate → Overview** and use **Ask AI** for a - dashboard-level summary. Example: + model calls, and the input/output panes. +4. Switch to **Operate → Overview** and use **Ask AI** for a dashboard-level + summary. Example: ```text Help me identify any issues or anomalies in my agent metrics for the last 24 hours. ``` -5. Optional deep dive: open the connected **Application Insights** resource, - go to **Logs**, set the time range to **Last 24 hours**, and run a small - KQL query to inspect the raw telemetry behind the trace view: - - ```kusto - AppTraces - | where TimeGenerated > ago(24h) - | where Message has_any ("travel-agent", "travel") - or tostring(Properties) has_any ("travel-agent", "travel") - | project TimeGenerated, Message, SeverityLevel, Properties - | order by TimeGenerated desc - | take 50 - ``` +5. Now use the traces as evaluation signal. 
In the project, open + **Data Generation**, then select **Create dataset → From traces**. +6. In **Create dataset**, configure: + + | Field | Value | + |---|---| + | **Dataset usage** | `Evaluation` | + | **Name** | `travel-agent-traces-step18` | + | **Agent** | `travel-agent` | + | **Date range** | Last day or last 7 days | + | **Maximum samples** | At least `15` | + + Leave **Intelligent sampling** enabled when the time-range UI shows it. + Foundry will filter noisy traces, deduplicate near-identical prompts, and + select a representative sample instead of evaluating every request. +7. Select **Create** and track the background job on the **Data Generation** + tab. When it finishes, open the generated dataset from the **Data** tab and + preview the rows. This is the evaluation-ready sample created from real + traces. +8. If the portal offers to start an evaluation from the completed job, open it + and confirm the generated dataset is selected. You do not need to finish a + new eval for this tutorial step; the point is to see how Foundry turns + traced behavior into a dataset you can evaluate continuously. + +> **Public preview.** Trace-to-dataset generation and intelligent sampling are +> currently preview Foundry features. If your region or project does not show +> **Create dataset → From traces**, continue with step 19 and treat this section +> as a product tour. + +Optional KQL deep dive: use Application Insights **Logs** only when you want to +debug raw telemetry. Set the time range to **Last 24 hours** and run: + +```kusto +AppTraces +| where TimeGenerated > ago(24h) +| where Message has_any ("travel-agent", "travel") + or tostring(Properties) has_any ("travel-agent", "travel") +| project TimeGenerated, Message, SeverityLevel, Properties +| order by TimeGenerated desc +| take 50 +``` -Foundry gives you the runtime trace view; AgentOps Doctor checks that the -telemetry is wired and includes those signals in the release-readiness -evidence. 
+Foundry gives you the runtime trace view and trace-sampled evaluation datasets; +AgentOps Doctor checks that telemetry and release evidence are wired into the +readiness story. ## 19. Sync local evidence and create the release evidence pack From 247066ee012c37670e285edd150ec72ccd35c604 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 18:17:14 -0300 Subject: [PATCH 07/16] docs: require App Insights Reader for trace sampling Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-end-to-end.md | 2 +- docs/tutorial-prompt-agent-quickstart.md | 32 ++++++++++++++++++++++-- 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/docs/tutorial-end-to-end.md b/docs/tutorial-end-to-end.md index 07231f0..73a91df 100644 --- a/docs/tutorial-end-to-end.md +++ b/docs/tutorial-end-to-end.md @@ -122,7 +122,7 @@ prompts. | Azure CLI is installed and `az login` succeeds with the tenant that owns the Foundry project. | AgentOps, Foundry SDK calls, Doctor, Cockpit, and CI setup all need the same Azure identity context. | | You have the Foundry project endpoint and can create or publish one Travel Agent target. | The target is either `travel-agent:` for prompt agents or an HTTP endpoint for hosted agents. | | You have a chat-capable Azure OpenAI deployment, for example `gpt-4o-mini`. | Local evals and CI variables need a judge model for evaluator calls. | -| Application Insights is connected to the Foundry project or agent runtime, or you can create/attach it. | Foundry Traces, Operate metrics/Ask AI when available, Azure Monitor, Doctor, Cockpit, and evidence links need telemetry. | +| Application Insights is connected to the Foundry project or agent runtime, or you can create/attach it. For Foundry trace-to-dataset flows, you can also grant Reader on App Insights to the Foundry project managed identity. 
| Foundry Traces, Operate metrics/Ask AI when available, trace sampling, Azure Monitor, Doctor, Cockpit, and evidence links need telemetry. | | You can deploy or expose any hosted endpoint that CI will call. | `localhost` works for local eval; remote CI needs a reachable HTTPS URL. | | You can push to the tutorial GitHub repository and run GitHub Actions or Azure Pipelines. | PR and environment workflows only run after the repo is published. | | GitHub CLI is authenticated with `gh auth login` if you use GitHub PR commands while testing CI. | The regression and release-gate steps are smoother when repo, PR, and Actions access are already confirmed. | diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index ef6e848..2e4c27f 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -57,7 +57,7 @@ permission prompts. | You can create **two** Foundry projects in the same Azure subscription (or have two existing projects you can use). | The tutorial uses a sandbox project for authoring and experimentation plus a shared dev project for the PR gate. You only need to publish the agent in sandbox — CI auto-bootstraps it in dev (and later qa / prod). | | You can publish a prompt agent in the **sandbox** Foundry project. | The tutorial seeds `travel-agent:2` only in sandbox (Foundry portal typically numbers the first published version `:2`, not `:1`). Dev / qa / prod start empty; the prompt-agent deploy workflow creates the first version in those projects automatically using `prompt_agent_bootstrap` defaults plus `prompt_file`. | | The **same model deployment name** (for example `gpt-4o-mini`) exists in every Foundry project you plan to deploy to. | `prompt_agent_bootstrap.model` is a single value reused for every environment. If dev does not have that deployment, the first auto-bootstrap fails. | -| You can create or attach Application Insights for at least the dev Foundry project. 
| Foundry Traces, the Operate dashboard, Doctor, and Cockpit need telemetry to tell the observability story. Sandbox observability is optional. | +| You can create or attach Application Insights for at least the dev Foundry project, and can grant Reader to the dev project's managed identity on that App Insights resource. | Foundry Traces, the Operate dashboard, trace-to-dataset generation, Doctor, and Cockpit need telemetry to tell the observability story. Sandbox observability is optional. | | You can push to the tutorial GitHub repository and run GitHub Actions. | The PR gate only runs after the repo is pushed. | | GitHub CLI is authenticated with `gh auth login` if you use the PR commands in this tutorial. | The regression step opens PRs and sends the reader directly to the workflow run. | | You can create a GitHub environment named `dev` and add Actions variables/secrets. | The generated workflow uses that environment for Azure auth and the dev Foundry project endpoint. | @@ -336,6 +336,11 @@ For each project, please: uses a single bootstrap model value for every environment. - Attach or create an Application Insights resource for telemetry, starting with the dev project. +- Grant or verify **Reader** on that Application Insights resource to the + **managed identity of the `travel-agent-dev` Foundry project**. Foundry's + trace-to-dataset flow runs as the project identity when it reads traces; the + Operate dashboard may still render for my signed-in user even when this + project identity permission is missing. - Grant or verify `Foundry User` access for my signed-in user on the parent Foundry / AI Services account so I can build agents in the Foundry UI. Some portal screens still call this role `Azure AI User`. @@ -610,7 +615,7 @@ build the prompt agent. One of two things will be true: | What you see | What it means | What to do | |---|---|---| -| An `appinsights` row with category `AppInsights` | The resource exists and is connected to the dev project. 
Auto-discovery will pick it up. | **You are done.** Skip the rest of this subsection and continue to section 9. | +| An `appinsights` row with category `AppInsights` | The resource exists and is connected to the dev project. Auto-discovery will pick it up. | Continue with the trace-to-dataset access check below. | | No App Insights row in **Connected resources** | The resource was not connected in step 3. | Click **Add connection**, connect or create an Application Insights resource for the dev project, or paste a connection string manually. | **If Connected resources does not show App Insights**, the fastest fix is @@ -620,6 +625,22 @@ in the same resource group as the dev project. Once an `appinsights` row appears under **Connected resources**, you can again skip the manual env variable — auto-discovery will pick it up. +**Also verify trace-to-dataset access now.** For the step 18 +trace-sampling flow, the **managed identity of the `travel-agent-dev` +Foundry project** needs **Reader** on the connected Application Insights +resource. This is separate from your signed-in user's portal access and +separate from GitHub OIDC. If you connected App Insights manually, open the +Application Insights resource in Azure Portal → **Access control (IAM)** and +add: + +| Field | Value | +|---|---| +| **Role** | Reader | +| **Assign access to** | Managed identity | +| **Managed identity** | `travel-agent-dev` Foundry project | + +Wait a few minutes for RBAC propagation before creating a dataset from traces. + **Only if you specifically want to override which resource telemetry goes to** (advanced case, e.g. you have a dedicated observability resource group), grab the connection string and paste it into @@ -1728,6 +1749,13 @@ real traces" to "what should keep getting evaluated." Leave **Intelligent sampling** enabled when the time-range UI shows it. 
Foundry will filter noisy traces, deduplicate near-identical prompts, and select a representative sample instead of evaluating every request. + + If the dialog shows **Setup incomplete: Assign the Foundry project's managed + identity the Reader role on Application Insights**, click **Resolve** if you + have permission. Otherwise ask an Azure admin to grant **Reader** on the + connected Application Insights resource to the **managed identity of the + `travel-agent-dev` Foundry project**, then wait a few minutes for RBAC to + propagate and reopen the dialog. 7. Select **Create** and track the background job on the **Data Generation** tab. When it finishes, open the generated dataset from the **Data** tab and preview the rows. This is the evaluation-ready sample created from real From 2147ed1d1908eb5a6fac899f58304d4b4491a7bd Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 18:23:13 -0300 Subject: [PATCH 08/16] docs: cover workspace-backed App Insights trace access Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-end-to-end.md | 2 +- docs/tutorial-prompt-agent-quickstart.md | 24 ++++++++++++++++-------- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/docs/tutorial-end-to-end.md b/docs/tutorial-end-to-end.md index 73a91df..2dbaacc 100644 --- a/docs/tutorial-end-to-end.md +++ b/docs/tutorial-end-to-end.md @@ -122,7 +122,7 @@ prompts. | Azure CLI is installed and `az login` succeeds with the tenant that owns the Foundry project. | AgentOps, Foundry SDK calls, Doctor, Cockpit, and CI setup all need the same Azure identity context. | | You have the Foundry project endpoint and can create or publish one Travel Agent target. | The target is either `travel-agent:` for prompt agents or an HTTP endpoint for hosted agents. | | You have a chat-capable Azure OpenAI deployment, for example `gpt-4o-mini`. | Local evals and CI variables need a judge model for evaluator calls. 
| -| Application Insights is connected to the Foundry project or agent runtime, or you can create/attach it. For Foundry trace-to-dataset flows, you can also grant Reader on App Insights to the Foundry project managed identity. | Foundry Traces, Operate metrics/Ask AI when available, trace sampling, Azure Monitor, Doctor, Cockpit, and evidence links need telemetry. | +| Application Insights is connected to the Foundry project or agent runtime, or you can create/attach it. For Foundry trace-to-dataset flows, you can also grant Reader on App Insights and its backing Log Analytics workspace to the Foundry project managed identity. | Foundry Traces, Operate metrics/Ask AI when available, trace sampling, Azure Monitor, Doctor, Cockpit, and evidence links need telemetry. | | You can deploy or expose any hosted endpoint that CI will call. | `localhost` works for local eval; remote CI needs a reachable HTTPS URL. | | You can push to the tutorial GitHub repository and run GitHub Actions or Azure Pipelines. | PR and environment workflows only run after the repo is published. | | GitHub CLI is authenticated with `gh auth login` if you use GitHub PR commands while testing CI. | The regression and release-gate steps are smoother when repo, PR, and Actions access are already confirmed. | diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 2e4c27f..9934776 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -57,7 +57,7 @@ permission prompts. | You can create **two** Foundry projects in the same Azure subscription (or have two existing projects you can use). | The tutorial uses a sandbox project for authoring and experimentation plus a shared dev project for the PR gate. You only need to publish the agent in sandbox — CI auto-bootstraps it in dev (and later qa / prod). | | You can publish a prompt agent in the **sandbox** Foundry project. 
| The tutorial seeds `travel-agent:2` only in sandbox (Foundry portal typically numbers the first published version `:2`, not `:1`). Dev / qa / prod start empty; the prompt-agent deploy workflow creates the first version in those projects automatically using `prompt_agent_bootstrap` defaults plus `prompt_file`. | | The **same model deployment name** (for example `gpt-4o-mini`) exists in every Foundry project you plan to deploy to. | `prompt_agent_bootstrap.model` is a single value reused for every environment. If dev does not have that deployment, the first auto-bootstrap fails. | -| You can create or attach Application Insights for at least the dev Foundry project, and can grant Reader to the dev project's managed identity on that App Insights resource. | Foundry Traces, the Operate dashboard, trace-to-dataset generation, Doctor, and Cockpit need telemetry to tell the observability story. Sandbox observability is optional. | +| You can create or attach Application Insights for at least the dev Foundry project, and can grant Reader to the dev project's managed identity on that App Insights resource and its backing Log Analytics workspace when workspace-based. | Foundry Traces, the Operate dashboard, trace-to-dataset generation, Doctor, and Cockpit need telemetry to tell the observability story. Sandbox observability is optional. | | You can push to the tutorial GitHub repository and run GitHub Actions. | The PR gate only runs after the repo is pushed. | | GitHub CLI is authenticated with `gh auth login` if you use the PR commands in this tutorial. | The regression step opens PRs and sends the reader directly to the workflow run. | | You can create a GitHub environment named `dev` and add Actions variables/secrets. | The generated workflow uses that environment for Azure auth and the dev Foundry project endpoint. | @@ -340,7 +340,8 @@ For each project, please: **managed identity of the `travel-agent-dev` Foundry project**. 
Foundry's trace-to-dataset flow runs as the project identity when it reads traces; the Operate dashboard may still render for my signed-in user even when this - project identity permission is missing. + project identity permission is missing. If Application Insights is + workspace-based, also grant Reader on the backing Log Analytics workspace. - Grant or verify `Foundry User` access for my signed-in user on the parent Foundry / AI Services account so I can build agents in the Foundry UI. Some portal screens still call this role `Azure AI User`. @@ -628,10 +629,11 @@ variable — auto-discovery will pick it up. **Also verify trace-to-dataset access now.** For the step 18 trace-sampling flow, the **managed identity of the `travel-agent-dev` Foundry project** needs **Reader** on the connected Application Insights -resource. This is separate from your signed-in user's portal access and -separate from GitHub OIDC. If you connected App Insights manually, open the -Application Insights resource in Azure Portal → **Access control (IAM)** and -add: +resource. If the App Insights component is workspace-based, grant the same +Reader role on the backing Log Analytics workspace too. This is separate from +your signed-in user's portal access and separate from GitHub OIDC. If you +connected App Insights manually, open the Application Insights resource in +Azure Portal → **Access control (IAM)** and add: | Field | Value | |---|---| @@ -639,6 +641,11 @@ add: | **Assign access to** | Managed identity | | **Managed identity** | `travel-agent-dev` Foundry project | +Then open the Application Insights resource → **Properties** and check +**Workspace Resource ID**. If it points to a Log Analytics workspace, open that +workspace and repeat the same **Reader** assignment for the `travel-agent-dev` +managed identity. + Wait a few minutes for RBAC propagation before creating a dataset from traces. 
**Only if you specifically want to override which resource telemetry @@ -1754,8 +1761,9 @@ real traces" to "what should keep getting evaluated." identity the Reader role on Application Insights**, click **Resolve** if you have permission. Otherwise ask an Azure admin to grant **Reader** on the connected Application Insights resource to the **managed identity of the - `travel-agent-dev` Foundry project**, then wait a few minutes for RBAC to - propagate and reopen the dialog. + `travel-agent-dev` Foundry project**. If Application Insights is + workspace-based, grant Reader on its backing Log Analytics workspace too. + Then wait a few minutes for RBAC to propagate and reopen the dialog. 7. Select **Create** and track the background job on the **Data Generation** tab. When it finishes, open the generated dataset from the **Data** tab and preview the rows. This is the evaluation-ready sample created from real From e787c040d6a5c7151af7e90732bebfb015c8e2bb Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 18:26:37 -0300 Subject: [PATCH 09/16] docs: add trace sampling RBAC skill guidance Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/agentops-workflow/SKILL.md | 38 ++++++++++++++----- .../skills/agentops-workflow/SKILL.md | 38 ++++++++++++++----- 2 files changed, 56 insertions(+), 20 deletions(-) diff --git a/plugins/agentops/skills/agentops-workflow/SKILL.md b/plugins/agentops/skills/agentops-workflow/SKILL.md index ef1e201..90cb635 100644 --- a/plugins/agentops/skills/agentops-workflow/SKILL.md +++ b/plugins/agentops/skills/agentops-workflow/SKILL.md @@ -114,17 +114,28 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -9. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +9. 
For prompt-agent tutorials that use Foundry trace sampling / trace-to-dataset, + verify observability RBAC before telling the user step 18 is ready: + - Resolve the dev Foundry project managed identity principal id. + - Resolve the connected Application Insights resource. + - Grant or verify **Reader** on that Application Insights resource to the dev + Foundry project managed identity. + - If the App Insights component is workspace-based, also grant or verify + **Reader** on the backing Log Analytics workspace. + This is separate from GitHub OIDC and separate from the signed-in user's + portal access. Operate dashboards can still render while trace-to-dataset + fails if the project identity cannot read App Insights. +10. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -10. For GitHub OIDC, derive the federated credential subject from the generated +11. For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. Do not assume branch or `pull_request` subjects without reading the workflow. -11. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +12. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -143,7 +154,7 @@ by discovering the whole Azure subscription. 
metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -12. If either RBAC assignment is missing, do not run the workflow yet. +13. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -159,30 +170,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -13. Ask before creating or updating GitHub repos, GitHub environments, +14. Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -14. When creating federated credentials from PowerShell, avoid fragile +15. When creating federated credentials from PowerShell, avoid fragile interpolation. Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -15. After creating or updating a federated credential, read it back and verify +16. After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -16. 
After setting GitHub environment variables, read them back and verify +17. After setting GitHub environment variables, read them back and verify `AZURE_TENANT_ID` still matches the app/federated-credential tenant before triggering a run. If `azure/login` fails with `AADSTS53003`, first re-check this tenant/app alignment before assuming Conditional Access is the root cause. -17. Do not dispatch `gh workflow run` as a surprise validation step. First show +18. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -18. Avoid broad discovery unless local config is missing. Do **not** run broad +19. Avoid broad discovery unless local config is missing. Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. If the @@ -347,6 +358,13 @@ across environments, set: Insights from the Foundry project endpoint; this value makes eval and Doctor telemetry explicit. +For Foundry prompt-agent projects that use trace sampling or +**Create dataset → From traces**, also verify the Foundry project managed +identity can read telemetry: grant or verify **Reader** on the connected +Application Insights resource, and on the backing Log Analytics workspace when +the App Insights component is workspace-based. This permission is not covered by +the GitHub OIDC service principal roles above. + Then configure Workload Identity Federation on the Azure side (`federated-credentials` on the app registration) for **each branch / environment** the workflows will run from. 
See diff --git a/src/agentops/templates/skills/agentops-workflow/SKILL.md b/src/agentops/templates/skills/agentops-workflow/SKILL.md index ef1e201..90cb635 100644 --- a/src/agentops/templates/skills/agentops-workflow/SKILL.md +++ b/src/agentops/templates/skills/agentops-workflow/SKILL.md @@ -114,17 +114,28 @@ by discovering the whole Azure subscription. them again. Explain `AZURE_OPENAI_DEPLOYMENT` only if it is missing: it is the Azure OpenAI deployment used as the evaluator/judge model, not the user's agent. -9. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or +9. For prompt-agent tutorials that use Foundry trace sampling / trace-to-dataset, + verify observability RBAC before telling the user step 18 is ready: + - Resolve the dev Foundry project managed identity principal id. + - Resolve the connected Application Insights resource. + - Grant or verify **Reader** on that Application Insights resource to the dev + Foundry project managed identity. + - If the App Insights component is workspace-based, also grant or verify + **Reader** on the backing Log Analytics workspace. + This is separate from GitHub OIDC and separate from the signed-in user's + portal access. Operate dashboards can still render while trace-to-dataset + fails if the project identity cannot read App Insights. +10. Do not enumerate subscriptions, Foundry projects, Azure OpenAI resources, or model deployments to guess missing values. If `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`, or `AZURE_OPENAI_DEPLOYMENT` is absent from AgentOps/azd/local env, ask the user to choose or provide it. Only run a scoped Azure query after the user confirms the subscription and the exact missing value. -10. For GitHub OIDC, derive the federated credential subject from the generated +11. For GitHub OIDC, derive the federated credential subject from the generated workflow. If the job has `environment: dev`, the subject is normally `repo:/:environment:dev`. 
Do not assume branch or `pull_request` subjects without reading the workflow. -11. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / +12. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app / service principal has **two** RBAC assignments. Both are required; the eval step fails silently (every metric returns `null`) if only one is in place. 1. **Foundry User** on the Foundry project (or the Foundry resource scope @@ -143,7 +154,7 @@ by discovering the whole Azure subscription. metric scores" warning so the cause is visible in CI logs, but the workflow still fails the gate. Grant this role **before** the first run. Azure **Reader** is not enough for either step. -12. If either RBAC assignment is missing, do not run the workflow yet. +13. If either RBAC assignment is missing, do not run the workflow yet. Show the exact GitHub OIDC client ID / service principal, desired role, target scope (project for Foundry User, AI Services account for Cognitive Services OpenAI User), then ask the user to approve the role assignment or @@ -159,30 +170,30 @@ by discovering the whole Azure subscription. `/subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/` and can be derived from `az cognitiveservices account list --resource-group --query "[?kind=='AIServices'].id" -o tsv`. -13. Ask before creating or updating GitHub repos, GitHub environments, +14. Ask before creating or updating GitHub repos, GitHub environments, variables/secrets, Entra app registrations/service principals, federated credentials, managed identities, or Azure RBAC assignments. -14. When creating federated credentials from PowerShell, avoid fragile +15. When creating federated credentials from PowerShell, avoid fragile interpolation. Do **not** write `"repo:$repo:environment:$envName"` because `$repo:` can be parsed as a scoped variable. 
Use `"repo:${repo}:environment:${envName}"` or `("repo:{0}:environment:{1}" -f $repo, $envName)`, then build JSON from a PowerShell object with `ConvertTo-Json`. -15. After creating or updating a federated credential, read it back and verify +16. After creating or updating a federated credential, read it back and verify before triggering a workflow: - `subject` exactly matches the generated workflow subject. - `issuer` is `https://token.actions.githubusercontent.com`. - `audiences` includes `api://AzureADTokenExchange`. If any value differs, fix the credential before running GitHub Actions. -16. After setting GitHub environment variables, read them back and verify +17. After setting GitHub environment variables, read them back and verify `AZURE_TENANT_ID` still matches the app/federated-credential tenant before triggering a run. If `azure/login` fails with `AADSTS53003`, first re-check this tenant/app alignment before assuming Conditional Access is the root cause. -17. Do not dispatch `gh workflow run` as a surprise validation step. First show +18. Do not dispatch `gh workflow run` as a surprise validation step. First show that the GitHub environment, variables/secrets, federated credential, and Foundry RBAC are ready, then ask the user before triggering workflows. -18. Avoid broad discovery unless local config is missing. Do **not** run broad +19. Avoid broad discovery unless local config is missing. Do **not** run broad `az resource list`, `az graph query`, SDK inspection, or web search to find the Foundry project when `agentops init show`, `.agentops/.env`, or `.azure//.env` already has `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT`. If the @@ -347,6 +358,13 @@ across environments, set: Insights from the Foundry project endpoint; this value makes eval and Doctor telemetry explicit. 
+For Foundry prompt-agent projects that use trace sampling or +**Create dataset → From traces**, also verify the Foundry project managed +identity can read telemetry: grant or verify **Reader** on the connected +Application Insights resource, and on the backing Log Analytics workspace when +the App Insights component is workspace-based. This permission is not covered by +the GitHub OIDC service principal roles above. + Then configure Workload Identity Federation on the Azure side (`federated-credentials` on the app registration) for **each branch / environment** the workflows will run from. See From 2e7e4af0666af78dc241fdae2175e625c3b7bf94 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 19:00:36 -0300 Subject: [PATCH 10/16] docs: make optional App Insights KQL table-safe Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 26 +++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 9934776..eb5daca 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1779,15 +1779,27 @@ real traces" to "what should keep getting evaluated." > as a product tour. Optional KQL deep dive: use Application Insights **Logs** only when you want to -debug raw telemetry. Set the time range to **Last 24 hours** and run: +debug raw telemetry. App Insights can expose either the classic `traces` table +or the workspace-backed `AppTraces` table, depending on where you opened Logs. 
+Set the time range to **Last 24 hours** and run this table-safe query: ```kusto -AppTraces -| where TimeGenerated > ago(24h) -| where Message has_any ("travel-agent", "travel") - or tostring(Properties) has_any ("travel-agent", "travel") -| project TimeGenerated, Message, SeverityLevel, Properties -| order by TimeGenerated desc +union isfuzzy=true traces, AppTraces +| extend EventTime = coalesce( + column_ifexists("TimeGenerated", datetime(null)), + column_ifexists("timestamp", datetime(null)) +) +| extend MessageText = tostring(column_ifexists("Message", "")) +| extend MessageText = iff(isempty(MessageText), tostring(column_ifexists("message", "")), MessageText) +| extend PropertiesText = tostring(column_ifexists("Properties", "")) +| extend PropertiesText = iff(isempty(PropertiesText), tostring(column_ifexists("customDimensions", "")), PropertiesText) +| extend SeverityText = tostring(column_ifexists("SeverityLevel", "")) +| extend SeverityText = iff(isempty(SeverityText), tostring(column_ifexists("severityLevel", "")), SeverityText) +| where EventTime > ago(24h) +| where MessageText has_any ("travel-agent", "travel") + or PropertiesText has_any ("travel-agent", "travel") +| project EventTime, MessageText, SeverityText, PropertiesText +| order by EventTime desc | take 50 ``` From 60a585b3cbea0b730492b19de680408631706618 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 19:35:15 -0300 Subject: [PATCH 11/16] docs: query gen_ai evaluation metrics from AppEvents in step 18 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 61 ++++++++++++++++-------- 1 file changed, 40 insertions(+), 21 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index eb5daca..ae0fa9f 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1778,31 +1778,50 @@ real traces" to "what should keep getting 
evaluated." > **Create dataset → From traces**, continue with step 19 and treat this section > as a product tour. -Optional KQL deep dive: use Application Insights **Logs** only when you want to -debug raw telemetry. App Insights can expose either the classic `traces` table -or the workspace-backed `AppTraces` table, depending on where you opened Logs. -Set the time range to **Last 24 hours** and run this table-safe query: +Optional KQL deep dive: query the evaluation metrics Foundry emits as +`gen_ai.evaluation.result` events. These land in the **`AppEvents`** table, which +only resolves in the **Log Analytics workspace** that backs your Application +Insights resource — not in the App Insights *scoped* Logs blade. Open +**Monitor → Logs** (or the connected Log Analytics workspace), set **Time range** +to **Set in query** (the query below uses `ago(30d)`), and run: ```kusto -union isfuzzy=true traces, AppTraces -| extend EventTime = coalesce( - column_ifexists("TimeGenerated", datetime(null)), - column_ifexists("timestamp", datetime(null)) -) -| extend MessageText = tostring(column_ifexists("Message", "")) -| extend MessageText = iff(isempty(MessageText), tostring(column_ifexists("message", "")), MessageText) -| extend PropertiesText = tostring(column_ifexists("Properties", "")) -| extend PropertiesText = iff(isempty(PropertiesText), tostring(column_ifexists("customDimensions", "")), PropertiesText) -| extend SeverityText = tostring(column_ifexists("SeverityLevel", "")) -| extend SeverityText = iff(isempty(SeverityText), tostring(column_ifexists("severityLevel", "")), SeverityText) -| where EventTime > ago(24h) -| where MessageText has_any ("travel-agent", "travel") - or PropertiesText has_any ("travel-agent", "travel") -| project EventTime, MessageText, SeverityText, PropertiesText -| order by EventTime desc -| take 50 +AppEvents +| where TimeGenerated > ago(30d) +| where Name == "gen_ai.evaluation.result" +| extend p = parse_json(tostring(Properties)) +| extend 
Conversation = tostring(p["gen_ai.conversation.id"]), + Agent = tostring(p["gen_ai.agent.id"]), + Evaluator = tostring(p["gen_ai.evaluation.name"]), + Score = todouble(p["gen_ai.evaluation.score.value"]) +| summarize Time = max(TimeGenerated), AvgScore = round(avg(Score), 2), + Metrics = make_bag(pack(Evaluator, Score)) + by Conversation, Agent +| order by Time desc +| take 20 ``` +Each row is one conversation with its average score and a `Metrics` bag holding +every evaluator score side by side. For a per-day rollup of average scores by +evaluator, pivot instead: + +```kusto +AppEvents +| where TimeGenerated > ago(30d) +| where Name == "gen_ai.evaluation.result" +| extend p = parse_json(tostring(Properties)) +| extend Evaluator = tostring(p["gen_ai.evaluation.name"]), + Score = todouble(p["gen_ai.evaluation.score.value"]) +| summarize AvgScore = round(avg(Score), 2) by Day = bin(TimeGenerated, 1d), Evaluator +| evaluate pivot(Evaluator, any(AvgScore)) +| order by Day desc +``` + +> **Empty results?** Telemetry can be sparse, so `Last 24 hours` / `Last 7 days` +> may return nothing. Widen the time range (`ago(30d)` with **Set in query**, or +> **Last 30 days**) and confirm you are in the **Log Analytics workspace**, where +> `AppEvents` resolves. + Foundry gives you the runtime trace view and trace-sampled evaluation datasets; AgentOps Doctor checks that telemetry and release evidence are wired into the readiness story. 
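Reviewer note on the first `AppEvents` query in the patch above: its per-conversation rollup can be sanity-checked offline. The following is a minimal Python sketch, assuming the events have been exported as dictionaries with a `Name` field and a `Properties` mapping. The sample rows and the `summarize_by_conversation` helper are hypothetical and exist only for illustration; only the `gen_ai.*` property keys are taken from the query itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows standing in for exported AppEvents records.
# Only the gen_ai.* keys mirror the KQL query; the values are made up.
SAMPLE_EVENTS = [
    {"Name": "gen_ai.evaluation.result",
     "Properties": {"gen_ai.conversation.id": "conv-1",
                    "gen_ai.agent.id": "travel-agent:2",
                    "gen_ai.evaluation.name": "coherence",
                    "gen_ai.evaluation.score.value": "4"}},
    {"Name": "gen_ai.evaluation.result",
     "Properties": {"gen_ai.conversation.id": "conv-1",
                    "gen_ai.agent.id": "travel-agent:2",
                    "gen_ai.evaluation.name": "fluency",
                    "gen_ai.evaluation.score.value": "5"}},
    {"Name": "gen_ai.evaluation.result",
     "Properties": {"gen_ai.conversation.id": "conv-2",
                    "gen_ai.agent.id": "travel-agent:2",
                    "gen_ai.evaluation.name": "coherence",
                    "gen_ai.evaluation.score.value": "3"}},
    # A different event name, dropped like `where Name == ...` in the query.
    {"Name": "some.other.event", "Properties": {}},
]


def summarize_by_conversation(events):
    """Group evaluation events by (conversation, agent) and average scores."""
    scores = defaultdict(list)   # every score, used for the average
    metrics = defaultdict(dict)  # evaluator -> score, like the Metrics bag
    for event in events:
        if event.get("Name") != "gen_ai.evaluation.result":
            continue
        props = event.get("Properties", {})
        raw_score = props.get("gen_ai.evaluation.score.value")
        if raw_score is None:
            continue  # skip rows without a score
        key = (props.get("gen_ai.conversation.id", ""),
               props.get("gen_ai.agent.id", ""))
        score = float(raw_score)
        scores[key].append(score)
        metrics[key][props.get("gen_ai.evaluation.name", "")] = score
    return {
        key: {"avg_score": round(mean(values), 2), "metrics": metrics[key]}
        for key, values in scores.items()
    }


if __name__ == "__main__":
    for key, row in summarize_by_conversation(SAMPLE_EVENTS).items():
        print(key, row)
```

In this sketch a repeated evaluator name within one conversation keeps only the last score in `metrics`, while the average still counts every score. Treat it as a local cross-check of the query's shape, not as a substitute for running the KQL in the Log Analytics workspace.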
From 5aad54402c2c2d177c89b5f400a6a92fd7b3e732 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 22:27:10 -0300 Subject: [PATCH 12/16] docs: make step 20 Cockpit walkthrough concrete and accurate Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 29 ++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index ae0fa9f..83bacd8 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1880,10 +1880,31 @@ Guardrail setup, and red-team scans still happen in their owning tools. agentops cockpit --workspace . ``` -Open the local URL printed by the command. The Cockpit should show -Foundry connection (sandbox by default; you can switch in the URL), -AgentOps cloud-eval readiness, Doctor findings, release evidence, the -PR and dev deploy CI pipelines, and next actions. +Cockpit starts a read-only local web server and prints +`http://127.0.0.1:8090`. Open that URL in your browser; press `Ctrl+C` +in the terminal to stop it. It reflects the **active azd environment** +(`sandbox`, from `defaultEnvironment` in `.azure/config.json`) — there is +no URL switch. To inspect `dev` instead, stop Cockpit, point the active +env at `dev` (set `defaultEnvironment: dev` in `.azure/config.json`, or +export `AZURE_ENV_NAME=dev`), then rerun the command. + +Read the page top to bottom and confirm each card against what you built: + +| Section | What to confirm in this run | +|---|---| +| **Foundry connection** | Foundry project = `travel-agent-sandbox`, your Azure tenant is resolved (`az login`), and Agent = `travel-agent:2`. | +| **Open in Foundry** | The deep-links open your sandbox project in the correct tenant. | +| **Observability readiness** | Trace setup / sampling status pulled from the latest Doctor analysis. 
| +| **AgentOps Doctor** | The same finding rollup you saw in step 19 — **2 critical** (`latency.p95_production`, `errors.production_rate`), plus warnings. | +| **Local eval history** | Your `agentops eval run` from step 19 appears as the latest entry. | +| **Quality metrics** | coherence / fluency / similarity / response_completeness trend cards from your runs. | +| **Production telemetry** | App Insights p95 latency (~11.7s) and error rate (~12%) — the source of the two criticals. | +| **CI/CD Pipelines** | The `pr` and `dev` workflows you generated are listed; `qa`/`prod`/scheduled are absent (expected). | +| **Next actions** | The prioritized backlog Cockpit derives from the open findings. | + +Cockpit does not run checks or mutate anything — it renders the latest +`results.json`, Doctor report, and evidence pack you already produced, and +links out to Foundry / Azure Monitor for live runtime data. ## Success criteria From 62915862e72e18f6fc4bae7fed45a54daa133474 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 23:25:01 -0300 Subject: [PATCH 13/16] docs: explain expected production-telemetry criticals in step 19 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-prompt-agent-quickstart.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md index 83bacd8..905975f 100644 --- a/docs/tutorial-prompt-agent-quickstart.md +++ b/docs/tutorial-prompt-agent-quickstart.md @@ -1861,6 +1861,23 @@ deploys, explicit thresholds, or red-team/governance evidence. Treat those as th hardening backlog. The eval gates and the dev deploy loop are production-ready. 
+You will likely also see **two critical findings** here, and that is expected +in this tutorial: + +| Critical finding | Why it shows up | +|---|---| +| `latency.p95_production` | App Insights p95 latency exceeds the 5s default (a prompt agent reasoning over each request runs ~9–12s). | +| `errors.production_rate` | Your own tutorial traffic (including the earlier `az login` / token retries) pushed the production error rate above the 5% default. | + +These criticals come from **real production telemetry of your own test +traffic**, not from the release candidate's eval gate (which passed). They are +honest signals: a real release would investigate latency and errors before +promoting. For the tutorial they simply demonstrate that Doctor reads live +runtime data. If you want to relax them for a demo, raise the Doctor thresholds +in `.agentops/agent.yaml` (`checks.latency.p95_threshold_seconds` and +`checks.errors.rate_threshold`) — these are separate from the `agentops.yaml` +eval-gate thresholds. + If you want to show the governance evidence path in the video, keep it as a short optional callout: From 454970f2b731f457ba0e462773cd2ace8f5d8117 Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Thu, 11 Jun 2026 23:28:19 -0300 Subject: [PATCH 14/16] docs: replicate concrete Cockpit walkthrough and Doctor threshold pointer to end-to-end and hosted tutorials Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/tutorial-end-to-end.md | 40 ++++++++++++++++++------ docs/tutorial-hosted-agent-quickstart.md | 35 +++++++++++++++++++-- 2 files changed, 64 insertions(+), 11 deletions(-) diff --git a/docs/tutorial-end-to-end.md b/docs/tutorial-end-to-end.md index 2dbaacc..e9b4bb7 100644 --- a/docs/tutorial-end-to-end.md +++ b/docs/tutorial-end-to-end.md @@ -878,6 +878,13 @@ may not have live traffic, scheduled workflows may not have history, and trace regression candidates may not exist yet. That is useful tutorial feedback, not a failure of Doctor. 
+If production telemetry *does* carry enough live traffic to trip latency or +error criticals, those are honest signals — not tutorial noise. The thresholds +that decide critical-vs-warning live in `.agentops/agent.yaml` +(`checks.latency.p95_threshold_seconds`, `checks.errors.rate_threshold`) and are +separate from the `agentops.yaml` eval-gate thresholds; raise them only if you +deliberately want to relax the production gate for a demo. + ## 10. Run Foundry red-team scans Red-team scans are a Foundry capability. Run them from Foundry Observability / @@ -953,16 +960,31 @@ reviews and accepts them. agentops cockpit --workspace . ``` -Use Cockpit as the local command center: +Cockpit starts a read-only local web server and prints +`http://127.0.0.1:8090`. Open that URL in your browser; press `Ctrl+C` in +the terminal to stop it. It reflects the **active azd environment** +(`sandbox`, from `defaultEnvironment` in `.azure/config.json`) — there is no +URL switch. To inspect `dev`, stop Cockpit, point the active env at `dev` +(set `defaultEnvironment: dev` in `.azure/config.json`, or export +`AZURE_ENV_NAME=dev`), then rerun the command. -- Foundry connection and deep links; -- Microsoft Foundry eval or AgentOps local eval gate status; -- Doctor findings; -- release evidence; -- local eval history; -- production telemetry snapshot; -- CI/CD workflow status; -- next actions. +Read the page top to bottom and confirm each card: + +| Section | What to confirm | +|---|---| +| **Foundry connection** | The Foundry project and tenant resolve, and the agent identity matches your `agentops.yaml` target. | +| **Open in Foundry** | The deep-links open your project in the correct tenant. | +| **Observability readiness** | Trace setup / sampling status from the latest Doctor analysis. | +| **AgentOps Doctor** | The same finding rollup from the Doctor / evidence-pack step (criticals first, then warnings). 
| +| **Local eval history** | Your `agentops eval run` baseline and regression reruns appear. | +| **Quality metrics** | Evaluator score trends from your runs. | +| **Production telemetry** | App Insights latency / error snapshot (or a clear "no live traffic" state in a fresh workspace). | +| **CI/CD Pipelines** | The workflows you generated are listed. | +| **Next actions** | The prioritized backlog Cockpit derives from the open findings. | + +Cockpit does not run checks or mutate anything — it renders the latest +`results.json`, Doctor report, and evidence pack you already produced, and +links out to Foundry / Azure Monitor for live runtime data. ## Completion checklist diff --git a/docs/tutorial-hosted-agent-quickstart.md b/docs/tutorial-hosted-agent-quickstart.md index 84a7483..f51461b 100644 --- a/docs/tutorial-hosted-agent-quickstart.md +++ b/docs/tutorial-hosted-agent-quickstart.md @@ -801,6 +801,13 @@ In a fresh tutorial workspace, warnings about production telemetry, CI history, regression history are expected and useful: they show what remains before this local endpoint becomes an operated service. +If production telemetry *does* carry enough live traffic to trip latency or +error criticals, those are honest signals. The thresholds that decide +critical-vs-warning live in `.agentops/agent.yaml` +(`checks.latency.p95_threshold_seconds`, `checks.errors.rate_threshold`) and are +separate from the `agentops.yaml` eval-gate thresholds; raise them only if you +deliberately want to relax the production gate for a demo. + If you later want a separate cadence outside PRs, generate the optional Doctor workflow with `agentops workflow generate --kinds doctor --force`. @@ -817,8 +824,32 @@ look self-contained inside AgentOps. agentops cockpit --workspace . ``` -Cockpit shows the endpoint readiness, eval history, Doctor findings, telemetry -status, release evidence, CI/CD, and next actions. 
+Cockpit starts a read-only local web server and prints +`http://127.0.0.1:8090` (this is the Cockpit UI port, not your agent's +`:8000`). Open that URL in your browser; press `Ctrl+C` in the terminal to +stop it. It reflects the **active azd environment** (`sandbox`, from +`defaultEnvironment` in `.azure/config.json`) — there is no URL switch. To +inspect `dev`, stop Cockpit, point the active env at `dev` (set +`defaultEnvironment: dev` in `.azure/config.json`, or export +`AZURE_ENV_NAME=dev`), then rerun the command. + +Read the page top to bottom and confirm each card: + +| Section | What to confirm | +|---|---| +| **Foundry connection** | The Foundry project / tenant resolve, and the agent is your hosted endpoint URL. | +| **Open in Foundry** | The deep-links open your project in the correct tenant. | +| **Observability readiness** | Trace setup / sampling status from the latest Doctor analysis. | +| **AgentOps Doctor** | The same finding rollup from the Doctor / evidence-pack step (criticals first, then warnings). | +| **Local eval history** | Your `agentops eval run` baseline, regressed, and fixed reruns appear. | +| **Quality metrics** | Evaluator score trends from your runs. | +| **Production telemetry** | App Insights latency / error snapshot for the `travel-agent.chat` operation (or a "no live traffic" state in a fresh workspace). | +| **CI/CD Pipelines** | The PR and dev deploy workflows you generated are listed. | +| **Next actions** | The prioritized backlog Cockpit derives from the open findings. | + +Cockpit does not run checks or mutate anything — it renders the latest +`results.json`, Doctor report, and evidence pack you already produced, and +links out to Foundry / Azure Monitor for live runtime data. 
## Success criteria From 0ad2c66a59d7574c9c0303ce35f47b6fea28b26a Mon Sep 17 00:00:00 2001 From: Paulo Lacerda Date: Fri, 12 Jun 2026 09:41:15 -0300 Subject: [PATCH 15/16] docs: record Unreleased changes since 0.3.20 (workflow skill OIDC/RBAC/upstream + tutorial hardening) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- CHANGELOG.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 590823e..afb9e65 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,27 @@ This format follows [Keep a Changelog](https://keepachangelog.com/) and adheres ## [Unreleased] +### Changed +- **`agentops-workflow` skill now verifies OIDC tenant, branch upstream + tracking, and trace-sampling RBAC before wiring CI.** The packaged skill + instructs agents to treat `AZURE_TENANT_ID` as the tenant that owns the Entra + app registration / federated credential (not the subscription tenant), to set + and verify the local trunk branch upstream (`git branch -vv` must show + `[origin/main]`), and to grant **Reader** on Application Insights (and its + backing Log Analytics workspace) to the Foundry project managed identity for + trace-to-dataset flows. + +### Docs +- **Prompt-agent, hosted-agent, and end-to-end tutorials hardened end to end.** + OIDC setup calls out the app-registration tenant; observability steps require + App Insights Reader for trace sampling and cover workspace-backed App Insights; + the telemetry step queries `gen_ai.evaluation` results from `AppEvents` + (table-safe, no hard-coded dates); the evidence step explains expected + production-telemetry criticals and where the Doctor thresholds live + (`.agentops/agent.yaml`); and the Cockpit step is now a concrete walkthrough + (exact `http://127.0.0.1:8090` URL, read-only note, per-section checks, and + azd-env switching instead of a non-existent URL switch). 
+ ## [0.3.20] - 2026-06-10 ### Changed From 395d4ea71631aca3b0e95c8e93ae2d76fba38ffc Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Fri, 12 Jun 2026 12:44:22 +0000 Subject: [PATCH 16/16] chore: prepare release 0.3.21 --- .claude-plugin/marketplace.json | 2 +- .github/plugin/marketplace.json | 2 +- CHANGELOG.md | 2 ++ plugins/agentops/package.json | 2 +- plugins/agentops/plugin.json | 2 +- 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index a641416..b28acd1 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -13,7 +13,7 @@ "name": "agentops-accelerator", "source": "../../plugins/agentops", "description": "Copilot agent skills for running standardized evaluation workflows with AgentOps Toolkit and Microsoft Foundry agents.", - "version": "0.3.20", + "version": "0.3.21", "keywords": [ "agentops", "evaluation", diff --git a/.github/plugin/marketplace.json b/.github/plugin/marketplace.json index a641416..b28acd1 100644 --- a/.github/plugin/marketplace.json +++ b/.github/plugin/marketplace.json @@ -13,7 +13,7 @@ "name": "agentops-accelerator", "source": "../../plugins/agentops", "description": "Copilot agent skills for running standardized evaluation workflows with AgentOps Toolkit and Microsoft Foundry agents.", - "version": "0.3.20", + "version": "0.3.21", "keywords": [ "agentops", "evaluation", diff --git a/CHANGELOG.md b/CHANGELOG.md index afb9e65..7c786b0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,8 @@ This format follows [Keep a Changelog](https://keepachangelog.com/) and adheres ## [Unreleased] +## [0.3.21] - 2026-06-12 + ### Changed - **`agentops-workflow` skill now verifies OIDC tenant, branch upstream tracking, and trace-sampling RBAC before wiring CI.** The packaged skill diff --git a/plugins/agentops/package.json b/plugins/agentops/package.json index 
29781f6..105e5b4 100644 --- a/plugins/agentops/package.json +++ b/plugins/agentops/package.json @@ -2,7 +2,7 @@ "name": "agentops-accelerator", "displayName": "AgentOps Accelerator — Skills for GitHub Copilot", "description": "Copilot agent skills for running standardized evaluation workflows with AgentOps Accelerator and Microsoft Foundry agents.", - "version": "0.3.20", + "version": "0.3.21", "publisher": "AgentOpsAccelerator", "icon": "icon.png", "license": "MIT", diff --git a/plugins/agentops/plugin.json b/plugins/agentops/plugin.json index 9aea4e5..fbd9382 100644 --- a/plugins/agentops/plugin.json +++ b/plugins/agentops/plugin.json @@ -1,7 +1,7 @@ { "name": "agentops-accelerator", "description": "Copilot agent skills for running standardized evaluation workflows with AgentOps Accelerator and Microsoft Foundry agents.", - "version": "0.3.20", + "version": "0.3.21", "author": { "name": "AgentOps Accelerator", "url": "https://github.com/Azure/agentops"