From 21c8d7c578dbaa59bcf20c36ab826e900050db42 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 13:26:48 +0800 Subject: [PATCH 01/11] docs: add technical design spec for unifying Foundry config in azure.yaml Engineering design that backs the unified azure.yaml product brief. Covers the end to end command experience and the technical design (schema composition, config binding, the microsoft.foundry service target, ref includes, templating, init rework, reconciliation), plus a provision dependency callout and new open questions. --- docs/specs/unify-azure-yaml/spec.md | 490 ++++++++++++++++++++++++++++ 1 file changed, 490 insertions(+) create mode 100644 docs/specs/unify-azure-yaml/spec.md diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md new file mode 100644 index 00000000000..97de213f71e --- /dev/null +++ b/docs/specs/unify-azure-yaml/spec.md @@ -0,0 +1,490 @@ +# Technical design: unify Foundry agent config in `azure.yaml` + + + +## Overview + +This document is the engineering design for the unified `azure.yaml` proposal. It +covers two things the product brief does not: the **technical design** that azd core +and the `azure.ai.agents` extension need, and the **end to end experience** a user +gets at the command line. + +It does not restate the product brief. For the problem framing, the chosen file shape, +the decision table, and the product level open questions, read those sources directly: + +- Product brief and sample file shapes: `therealjohn/foundry-azd-config-preview` + (the `simple` and `complex` branches, and the proposed schemas under `schemas/`). +- RFC issue [#7962](https://github.com/Azure/azure-dev/issues/7962): Unify Foundry + agent configuration in `azure.yaml`. +- RFC issue [#8049](https://github.com/Azure/azure-dev/issues/8049): Composition + commands (`azd ai project add ...`), which depends on this work. + +One note on shape, because it affects everything below. 
The earlier issue #7962 +proposed two host kinds, `azure.ai.project` for shared state and `azure.ai.agent` per +agent, linked with `uses:`. The current brief replaces that with a single +`host: microsoft.foundry` service that owns all Foundry state, with agents nested as an +array inside that one entry. This design follows the single service shape. The two +service shape is treated as superseded. + +### Scope + +In scope: the schema change, the new service target, the config binding, `$ref` +includes, templating, the consolidated `init`, the deprecation of the old host and old +files, and the lifecycle behavior. + +Out of scope, tracked elsewhere: + +- Built in Bicep and the provision-less flow. This has its own RFC (not yet filed). + This document only flags where it touches the provision experience. +- The `azd ai project add` composition commands. That is issue #8049, and it builds on + this work. +- The Foundry Toolkit for VS Code parser switch. That is owned by the Toolkit team. + +## Part 1: End to end experience + +This part describes what a developer sees. The product brief defines the file shape but +does not walk the command flows, so they are spelled out here. + +### 1.1 First run with `azd ai agent init` + +Today `init` writes three files: `azure.yaml`, `agent.yaml`, and +`agent.manifest.yaml`. After this change it writes one file. The Foundry project, its +model deployments, and its agents all live in a single `services:` entry. + +```yaml +services: + agent-project: + host: microsoft.foundry + deployments: + - name: gpt-4.1-mini + model: { format: OpenAI, name: gpt-4.1-mini, version: "2025-04-14" } + sku: { name: GlobalStandard, capacity: 10 } + agents: + - name: basic-agent + kind: hosted + description: A basic agent hosted by Foundry. 
+ project: src/basic-agent + docker: { path: Dockerfile, remoteBuild: true } + protocols: + - { protocol: responses, version: "1.0.0" } + startupCommand: python main.py + env: + FOUNDRY_MODEL_DEPLOYMENT_NAME: gpt-4.1-mini +``` + +There is no `agent.yaml` and no `agent.manifest.yaml`. A developer who opens the +project sees one source of truth. + +### 1.2 Inner loop: provision then deploy + +The verbs do not change. The mental model is provision creates the project, deploy +fills it in. + +1. `azd provision` creates the Foundry project at the ARM level. With the built in + Bicep flow (separate RFC), no `infra/` folder is required for a Foundry only + project. If the service sets an `endpoint:` value, provision is skipped and azd uses + the existing project (see 1.4). +2. `azd deploy` runs the Foundry service target. For each agent that has source code it + builds an artifact, publishes it, then writes the project state and posts each agent + to Foundry. +3. `azd up` does both in order. + +For a project with several agents, the deploy output groups work under the one Foundry +service and shows progress per agent, for example one line for building `support-agent` +and another for `research-agent`. azd core sees a single service. The per agent detail +comes from the extension (see Part 2.6). + +### 1.3 A Foundry project plus a normal service + +A non Foundry service can sit next to the Foundry service and consume it. The frontend +declares `uses:` on the Foundry service, which orders deploy and injects the project +endpoint as an environment value. + +```yaml +services: + support-platform: + host: microsoft.foundry + # deployments, connections, toolboxes, skills, routines, agents ... + webapp: + project: src/webapp + host: containerapp + uses: [support-platform] + env: + FOUNDRY_PROJECT_ENDPOINT: ${FOUNDRY_PROJECT_ENDPOINT} +``` + +No new ordering logic is needed. `uses:` already exists on every service. 
+ +### 1.4 Use an existing Foundry project + +When `endpoint:` is set on the Foundry service, azd connects to that project instead of +creating one. This is the path for teams that provision infrastructure on their own, and +for reusing a shared or private network bound account. + +```yaml +services: + agent-project: + host: microsoft.foundry + endpoint: https://my-account.services.ai.azure.com/api/projects/my-project + agents: + - { name: basic-agent, kind: hosted, project: src/basic-agent, ... } +``` + +The absence of `endpoint:` is the signal to provision a new project. This keeps the +common case simple and makes reuse explicit. + +### 1.5 Opening an older project (migration) + +When a project still has `agent.yaml` or `agent.manifest.yaml` next to `azure.yaml`, or +still uses `host: azure.ai.agent`, azd keeps working during a deprecation window: + +1. It prints a warning that points at the migration guide. +2. It reads the old files and builds the same in memory Foundry service the new shape + would produce, so build and deploy still run. +3. It records a telemetry signal so the team can watch how fast old projects fade. + +After the window closes, the old path is removed and the warning becomes an error with a +link to the guide. + +### 1.6 Teardown with `azd down` + +The rule for Foundry data plane state is the same one issue #8049 uses: removing an +entry from `azure.yaml` means stop using it, not destroy it. The destructive path is +explicit and runs through `azd down` or the per resource `azd ai` commands. + +`azd down` removes the Foundry project that azd provisioned. If the service used an +`endpoint:` to point at an existing project, `azd down` does not delete that project, +because azd did not create it. + +### 1.7 When part of a deploy fails + +A Foundry service can hold several agents. If three agents build and the fourth fails, +the run stops with an error that names the failing agent, and the state already written +to Foundry stays in place. 
Re running `azd deploy` is safe: the service target upserts, +so finished work is detected and skipped or refreshed rather than duplicated. Part 2.8 +covers the reconcile rules. + +## Part 2: Technical design + +### 2.1 How the config reaches the extension + +The new top level keys (`deployments`, `connections`, `toolboxes`, `skills`, +`routines`, `agents`, and `endpoint`) do not need new fields on `ServiceConfig`. They +land in the existing inline map and travel to the extension unchanged. + +- `pkg/project/service_config.go` declares + `AdditionalProperties map[string]any` with the `yaml:",inline"` tag. Any key on the + service entry that is not a known field is captured here. The Foundry keys parse today + with no core struct change. +- `pkg/project/mapper_registry.go` converts both `Config` and `AdditionalProperties` + into a `google.protobuf.Struct` and sends them to the service target extension over + gRPC. The extension receives the Foundry keys as structured data. + +The only required core change for parsing is the JSON Schema (2.3). The Go side is +already wired. + +On the extension side, the provider unmarshals the struct into typed Go values. A sketch +of the shape it binds to: + +```go +type FoundryProjectConfig struct { + Endpoint string `json:"endpoint,omitempty"` + Deployments []Deployment `json:"deployments,omitempty"` + Connections []Connection `json:"connections,omitempty"` + Toolboxes []Toolbox `json:"toolboxes,omitempty"` + Skills []Skill `json:"skills,omitempty"` + Routines []Routine `json:"routines,omitempty"` + Agents []Agent `json:"agents,omitempty"` +} +``` + +`Agent` is the union of a hosted agent and a prompt agent. A hosted agent carries +`project`, plus one of `docker`, `runtime`, or a prebuilt `image`. A prompt agent +carries none of those and is pure config. + +### 2.2 Wiring the new host to a service target + +The extension already registers one service target. It adds a second one next to it. 
+ +- `cli/azd/extensions/azure.ai.agents/internal/cmd/listen.go` calls + `WithServiceTarget("azure.ai.agent", ...)`. Add a sibling call + `WithServiceTarget("microsoft.foundry", ...)` that returns the new Foundry provider. +- Declare the new provider in `extension.yaml` under `providers` with + `type: service-target`. + +Core dispatch needs no change. When azd reads `host: microsoft.foundry`, the service +manager in `pkg/project/service_manager.go` resolves the host string against the IoC +container. Extension hosts are registered through the gRPC path in +`internal/grpcserver/service_target_service.go`, which wraps the extension in an +`ExternalServiceTarget`. The same path already serves `azure.ai.agent`. + +Optional rollout control: `service_manager.go` checks `alpha.IsFeatureKey(host)` before +it resolves a host. If the team wants the new shape behind a flag during preview, register +`microsoft.foundry` as an alpha feature so it only activates after the user enables it. +This is optional, because installing the extension is already an opt in step. + +### 2.3 Schema composition + +Add one conditional to `schemas/v1.0/azure.yaml.json`. It differs from the +`azure.ai.agent` block, which references an extension schema into the `config:` field. +For the Foundry host the schema is composed at the service level, because the Foundry +keys are direct properties of the entry, not nested under `config:`. + +```json +{ + "if": { "properties": { "host": { "const": "microsoft.foundry" } } }, + "then": { + "allOf": [ + { "$ref": "https://raw.githubusercontent.com/Azure/azure-dev/refs/heads/main/cli/azd/extensions/azure.ai.agents/schemas/microsoft.foundry.json" } + ], + "properties": { + "project": false, + "runtime": false, + "docker": false, + "image": false, + "config": false + } + } +} +``` + +The service level `project`, `runtime`, `docker`, and `image` are turned off because +those belong to each agent, not to the project. 
`config:` is turned off because the +Foundry schema is composed at the service level instead. + +The extension publishes `microsoft.foundry.json` plus the per resource files +(`Deployment.json`, `Connection.json`, `Toolbox.json`, `Skill.json`, `Routine.json`, +`Agent.json`, and `FileRef.json`). The composed schema points its arrays at those files. +Each array item is `oneOf` an inline object or a file reference, which is what enables +the split file layout in 2.4. + +One detail to settle: the preview `microsoft.foundry.json` sets +`additionalProperties: false` at the top, while the brief text asks for `true` so future +resource types (eval datasets, vector indexes, memories) can be added without a schema +break. Recommend `true` at the project level. + +### 2.4 `$ref` file includes and overlay overrides + +The `complex` sample splits large entries into their own files under `agents/`, +`toolboxes/`, and `skills/`, and pulls them in with `$ref`: + +```yaml +agents: + - name: triage-agent + kind: prompt + # ... inline ... + - $ref: ./agents/support-agent.yaml +``` + +`FileRef.json` allows sibling keys next to `$ref`, which act as overrides layered on top +of the loaded file. This is not an azd feature today and the brief does not call it out, +so the design defines it here. + +**Who resolves it.** Two options: + +- Core resolves `$ref` includes generically for any service field. This helps every + extension but is new core behavior and forces core to own the merge and validation + rules. +- The extension resolves `$ref` while it parses its own keys. This is contained to + Foundry and needs no core change. + +Recommend the extension owns resolution for the first version, with a clear path to move +it into core later if other extensions need includes. + +**Resolution rules.** + +- A `$ref` path resolves relative to the file that holds it, so nested includes work. + Absolute paths and URLs are also accepted. +- Sibling keys overlay on the loaded file. 
Use a shallow overlay at the top level of the + object. Scalars and arrays from the sibling replace the loaded value. This keeps the + result easy to predict. +- A loaded file is validated against the same per resource schema as an inline entry. + +**One concrete dependency.** The extension receives its keys as already parsed data, so +the `$ref` strings arrive as plain values. To open the files it needs the directory that +holds `azure.yaml`, not the agent source path. The provider must get the project root +from the azd client or environment. Call this out in the implementation, because the +gRPC `ServiceConfig` carries the agent `project` path, not the project root. + +**Interaction with #8049.** The composition commands write into these same arrays. If a +section is split into a file, the writer has to decide whether to append an inline entry +or edit the referenced file. Define the default as: append inline, and only edit a split +file when the command explicitly targets it. Both this design and #8049 should share one +YAML edit helper that understands `$ref` entries, so reads and writes agree. + +### 2.5 Templating: `${VAR}` and `${{...}}` + +Two resolvers have to coexist without stepping on each other. + +- `${VAR}` is an azd environment value, resolved on the client before anything is sent + to Foundry. Example: a connection secret read from the azd environment. +- `${{...}}` is resolved by Foundry on the server, for example + `${{connections.x.credentials.key}}`. It must reach Foundry unchanged. + +Core already leaves this alone. `pkg/osutil/expandable_string.go` runs `${VAR}` +expansion with `drone/envsubst`, but core only applies it to typed fields such as the +resource name and image. It does not expand `Config` or `AdditionalProperties`, so +`${{...}}` passes through core untouched. + +That moves the work to the extension. The risk is that if the extension expands `${VAR}` +with the same library, the library may try to read `${{...}}` as well. 
The extension +needs an expander that touches only `${VAR}` and leaves `${{...}}` intact. A simple and +safe approach: protect every `${{...}}` span with a placeholder, run `${VAR}` expansion, +then restore the placeholders. Put this in one shared helper so every Foundry field +expands the same way. + +Which fields take which: + +| Field | `${VAR}` | `${{...}}` | +|---|---|---| +| Agent `env` values | yes | yes | +| Connection credentials | yes (secret from azd env) | yes (Foundry managed) | +| Routine `input` values | yes | yes | +| Model deployment names and SKUs | yes | no | + +### 2.6 Service target lifecycle and per agent fan out + +The new provider implements the same interface as the current agent provider +(`Initialize`, `Package`, `Publish`, `Deploy`, `Endpoints`, `GetTargetResource`), but a +single service now stands for a whole project, so several methods fan out across the +agents inside it. + +| Method | Behavior | +|---|---| +| `Initialize` | Validate the Foundry config. Check agent kinds, that each agent has exactly one of `docker`, `runtime`, or `image`, and that every named toolbox, skill, connection, and model deployment an agent references exists. Resolve the project, provisioned or via `endpoint`. | +| `Package` | For each agent that has `project` plus `docker` or `runtime`, build. `docker` builds an image, local or in ACR when `remoteBuild` is set. `runtime` zips the source. Prompt agents and prebuilt `image` agents skip this. | +| `Publish` | Push each artifact. Images go to ACR. Zips upload to Foundry storage. | +| `Deploy` | First reconcile project state: deployments, connections, toolboxes, skills, routines, and prompt agents through Foundry APIs. Then post each agent with `createAgentVersion` and the published artifact reference. | +| `Endpoints` | Return the project endpoint, plus per agent endpoints when they are known. | +| `GetTargetResource` | Resolve the ARM resource for the Foundry project. 
| + +Deploy order inside the project matters, because later items reference earlier ones. +Apply deployments and connections first, then toolboxes and skills, then agents and +routines. + +Fan out details to define in code: + +- Build and publish can run agents concurrently. Bound the concurrency and aggregate the + results. +- A single agent failure stops the run with an error that names the agent, so the user + is not left guessing which one broke. Core only sees one service fail, so this naming + has to come from the extension. +- The project state reconcile in `Deploy` lifts out of the current post provision and + post deploy hook handlers in `listen.go`, which already do this work for the old host. + +### 2.7 Reworking `init` + +Today `init` reconciles data across three files. The functions +`extractToolboxAndConnectionConfigs` and `extractConnectionConfigs` in +`cli/azd/extensions/azure.ai.agents/internal/cmd/init.go` read the manifest, normalize +auth types, move secrets into the azd environment, and feed typed arrays that then get +written across `agent.yaml` and `azure.yaml`. The file write for `agent.yaml` happens in +`writeAgentDefinitionFile`. + +After this change `init` writes the Foundry service entry directly. The roughly two +hundred lines of cross file reconciliation collapse into building the in memory config +and writing it to the service top level fields. `agent.yaml` and `agent.manifest.yaml` +are no longer produced for new projects. + +Deprecation detection lives here too. If the old files are present, or the project uses +`host: azure.ai.agent`, print the warning, run the fallback that produces the equivalent +in memory Foundry service, and emit the telemetry signal. The error code +`CodeInvalidAgentManifest` in `internal/exterrors/codes.go` stays while the manifest path +is read. Rename or retire it when that path is removed. 
+ +### 2.8 State reconciliation and idempotency + +Deploy compares the declared state in `azure.yaml` against the live state in Foundry and +applies the difference. The model is upsert: create what is missing, update what changed, +and leave the rest. Removing an entry from `azure.yaml` stops azd from managing it, but +does not delete it from Foundry. Deletion is the job of `azd down` or the per resource +`azd ai` commands. + +Two Foundry behaviors to confirm against the API contracts, because they decide how the +upsert is written: + +- Whether create calls are idempotent, or whether the provider has to check first and + then choose create or update. +- Whether a re run after a partial failure can safely repeat the calls that already + succeeded. + +This area overlaps with existing reports, noted in Part 3. + +### 2.9 Telemetry and errors + +- Emit a telemetry event when the old file fallback runs and when `host: azure.ai.agent` + is seen, so the deprecation window length can be driven by data. +- Surface per agent failures as structured errors that name the agent. Core sees one + service, so the attribution has to be added by the extension. +- Keep `CodeInvalidAgentManifest` while the manifest path exists, and plan its rename or + removal with that path. + +## Part 3: Provision experience dependencies (PM confirmation) + +These items change or touch the provision experience. They are not designed here. Please +confirm whether each one belongs inside the unification milestone or stays separate. + +- **Built in Bicep and the provision-less flow.** A Foundry only project should not need + an `infra/` folder. The extension would carry templates and generate them in memory at + provision time. This has its own RFC and is not yet filed. It clearly changes what + `azd provision` does for these projects. 
+- **Provision layers in multi service projects.** Issue + [#8587](https://github.com/Azure/azure-dev/issues/8587) reports + `azd provision ` failing with "no layers defined in azure.yaml", which left a + toolbox unprovisioned. The single service shape and the in memory Bicep both interact + with how provision layers are built, so confirm the intended behavior. +- **Reusing an existing project.** The `endpoint:` field and the private network case in + issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an + account it did not create. Confirm whether reuse is in scope for the first version. +- **Idempotency.** Issues [#8349](https://github.com/Azure/azure-dev/issues/8349) and + [#8350](https://github.com/Azure/azure-dev/issues/8350) report that provision does not + recreate a deleted resource and that tool connection changes are not applied. The new + model moves data plane changes to deploy, which may sidestep some of this. Confirm. +- **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) + notes that `azure.yaml` only represents the latest state of an agent, even though + agents are versioned. Deploy posts a new version each run, so the intended meaning of + the YAML, latest only or pinned, needs a decision. + +## Part 4: Open questions (new) + +These are specific to this design. The product level questions, such as the host kind +name and the schema ownership choice, already live in the product brief and are not +repeated here. + +1. **Who owns `$ref` resolution, and what are the overlay rules?** Core loader or the + extension. Shallow overlay or deep merge. Arrays replaced or merged. 2.4 recommends + extension owned with a shallow overlay, but this should be ratified. +2. **How are split files validated?** The language server can follow a `$ref` to a local + file for editor hints, but runtime validation of a loaded file against the per + resource schema needs to be confirmed. +3. 
**How large can the inline config get over gRPC?** A big project becomes a large + protobuf struct. Confirm there is no practical size limit, or define how to chunk it. +4. **How are per agent actions addressed at the CLI?** `azd deploy ` targets the + whole project. Is `azd ai agent deploy ` enough for per agent actions, or does + core need real sub service targets. This is tied to the fan out choice in 2.6. +5. **Naming for the composition surface.** Issue #8049 places the `add` commands in an + `azd ai project` surface, but an `azure.ai.projects` extension already exists. Confirm + where the schema and the `add` commands live so the two do not collide. + +## Summary of required changes + +azd core: + +- Add the `host: microsoft.foundry` conditional to `schemas/v1.0/azure.yaml.json`, + composing the extension schema at the service level and turning off `project`, + `runtime`, `docker`, `image`, and `config`. +- Optional: register `microsoft.foundry` as an alpha feature for a gated preview. +- Optional: a shared convention or helper for preserving `${{...}}` if core ever expands + these fields. + +`azure.ai.agents` extension: + +- Publish `microsoft.foundry.json` and the per resource schema files. +- Add the `microsoft.foundry` service target with per agent fan out for package, + publish, and deploy. +- Resolve `$ref` includes and overlay overrides while parsing the Foundry keys. +- Expand `${VAR}` while preserving `${{...}}` through a shared helper. +- Rework `init` to write the single Foundry service and stop emitting `agent.yaml` and + `agent.manifest.yaml`. +- Add the deprecation fallback and telemetry for the old files and `host: azure.ai.agent`. +- Add skills and routines reconciliation. 
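
The shared expander listed above, described in 2.5 as protect, expand, restore, might look like the following Go sketch. It uses only the standard library, so the `clientVar` regular expression stands in for whatever `${VAR}` library the extension ends up using. `expandClientVars`, `serverExpr`, `clientVar`, and `lookup` are hypothetical names, and the sketch assumes the input contains no NUL bytes, since those delimit the placeholders.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// serverExpr matches Foundry-side expressions such as ${{connections.x.credentials.key}}.
	serverExpr = regexp.MustCompile(`\$\{\{.*?\}\}`)
	// clientVar matches ${VAR} references whose name is a plain identifier.
	clientVar = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)
)

// expandClientVars replaces ${VAR} using lookup and leaves every ${{...}} span
// byte-for-byte intact, including any ${VAR} text nested inside it.
func expandClientVars(input string, lookup func(name string) (string, bool)) string {
	// Step 1: swap each ${{...}} span for a numbered placeholder.
	var saved []string
	protected := serverExpr.ReplaceAllStringFunc(input, func(span string) string {
		saved = append(saved, span)
		return fmt.Sprintf("\x00%d\x00", len(saved)-1)
	})

	// Step 2: expand ${VAR}; unknown names are left as written.
	expanded := clientVar.ReplaceAllStringFunc(protected, func(ref string) string {
		name := clientVar.FindStringSubmatch(ref)[1]
		if value, ok := lookup(name); ok {
			return value
		}
		return ref
	})

	// Step 3: put the protected spans back.
	for i, span := range saved {
		expanded = strings.ReplaceAll(expanded, fmt.Sprintf("\x00%d\x00", i), span)
	}
	return expanded
}

func main() {
	env := map[string]string{"FOUNDRY_MODEL_DEPLOYMENT_NAME": "gpt-4.1-mini"}
	lookup := func(name string) (string, bool) { v, ok := env[name]; return v, ok }
	in := "model=${FOUNDRY_MODEL_DEPLOYMENT_NAME} key=${{connections.x.credentials.key}}"
	fmt.Println(expandClientVars(in, lookup))
}
```

Whether an unknown `${VAR}` should stay as written, expand to an empty string, or raise an error is a policy choice the spec does not settle. The sketch keeps it as written so that nothing is silently lost.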
From f38e2d41f6944c43f5a5890d04e7288eb69020c1 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 13:31:53 +0800 Subject: [PATCH 02/11] docs: remove external username reference from spec --- docs/specs/unify-azure-yaml/spec.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 97de213f71e..521220ae1ad 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -1,6 +1,6 @@ # Technical design: unify Foundry agent config in `azure.yaml` - + ## Overview @@ -12,8 +12,8 @@ gets at the command line. It does not restate the product brief. For the problem framing, the chosen file shape, the decision table, and the product level open questions, read those sources directly: -- Product brief and sample file shapes: `therealjohn/foundry-azd-config-preview` - (the `simple` and `complex` branches, and the proposed schemas under `schemas/`). +- Product brief and sample file shapes: the `simple` and `complex` branches of the + preview repo, and the proposed schemas under `schemas/`. - RFC issue [#7962](https://github.com/Azure/azure-dev/issues/7962): Unify Foundry agent configuration in `azure.yaml`. - RFC issue [#8049](https://github.com/Azure/azure-dev/issues/8049): Composition From 1df8039b0b60cceda9aa72aa645aab1a30cbf3f6 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 14:11:16 +0800 Subject: [PATCH 03/11] docs: fix review comments on unify-azure-yaml spec --- docs/specs/unify-azure-yaml/spec.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 521220ae1ad..63f9b8b53c4 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -160,7 +160,7 @@ because azd did not create it. A Foundry service can hold several agents. 
If three agents build and the fourth fails, the run stops with an error that names the failing agent, and the state already written -to Foundry stays in place. Re running `azd deploy` is safe: the service target upserts, +to Foundry stays in place. Re-running `azd deploy` is safe: the service target upserts, so finished work is detected and skipped or refreshed rather than duplicated. Part 2.8 covers the reconcile rules. @@ -172,11 +172,11 @@ The new top level keys (`deployments`, `connections`, `toolboxes`, `skills`, `routines`, `agents`, and `endpoint`) do not need new fields on `ServiceConfig`. They land in the existing inline map and travel to the extension unchanged. -- `pkg/project/service_config.go` declares +- `cli/azd/pkg/project/service_config.go` declares `AdditionalProperties map[string]any` with the `yaml:",inline"` tag. Any key on the service entry that is not a known field is captured here. The Foundry keys parse today with no core struct change. -- `pkg/project/mapper_registry.go` converts both `Config` and `AdditionalProperties` +- `cli/azd/pkg/project/mapper_registry.go` converts both `Config` and `AdditionalProperties` into a `google.protobuf.Struct` and sends them to the service target extension over gRPC. The extension receives the Foundry keys as structured data. @@ -213,9 +213,9 @@ The extension already registers one service target. It adds a second one next to `type: service-target`. Core dispatch needs no change. When azd reads `host: microsoft.foundry`, the service -manager in `pkg/project/service_manager.go` resolves the host string against the IoC -container. Extension hosts are registered through the gRPC path in -`internal/grpcserver/service_target_service.go`, which wraps the extension in an +manager in `cli/azd/pkg/project/service_manager.go` resolves the host string against the +IoC container. 
Extension hosts are registered through the gRPC path in +`cli/azd/internal/grpcserver/service_target_service.go`, which wraps the extension in an `ExternalServiceTarget`. The same path already serves `azure.ai.agent`. Optional rollout control: `service_manager.go` checks `alpha.IsFeatureKey(host)` before @@ -293,8 +293,10 @@ it into core later if other extensions need includes. **Resolution rules.** -- A `$ref` path resolves relative to the file that holds it, so nested includes work. - Absolute paths and URLs are also accepted. +- A `$ref` path must be a relative path, resolved relative to the file that holds it, + so nested includes work. Absolute paths and remote URLs are not accepted by default; + restrict to project-local paths to avoid reading arbitrary files or fetching remote + content. Broader support can be added behind an explicit opt-in if needed. - Sibling keys overlay on the loaded file. Use a shallow overlay at the top level of the object. Scalars and arrays from the sibling replace the loaded value. This keeps the result easy to predict. From 393252b24817a90ce2d4402b13d508c9225b64c5 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 15:08:22 +0800 Subject: [PATCH 04/11] docs: revise overview to focus on what the spec provides --- docs/specs/unify-azure-yaml/spec.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 63f9b8b53c4..36e3203f6e8 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -4,13 +4,11 @@ ## Overview -This document is the engineering design for the unified `azure.yaml` proposal. It -covers two things the product brief does not: the **technical design** that azd core -and the `azure.ai.agents` extension need, and the **end to end experience** a user -gets at the command line. +This document is the engineering design for the unified `azure.yaml` proposal. 
It covers +the **technical design** that azd core and the `azure.ai.agents` extension need, and the +**end to end experience** a user gets at the command line. -It does not restate the product brief. For the problem framing, the chosen file shape, -the decision table, and the product level open questions, read those sources directly: +Background and product spec can be found in the following sources: - Product brief and sample file shapes: the `simple` and `complex` branches of the preview repo, and the proposed schemas under `schemas/`. @@ -42,8 +40,7 @@ Out of scope, tracked elsewhere: ## Part 1: End to end experience -This part describes what a developer sees. The product brief defines the file shape but -does not walk the command flows, so they are spelled out here. +This part describes what a developer sees at the command line. ### 1.1 First run with `azd ai agent init` From c31ba7f16b924487019a4ae495706bfa7a7509f6 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 17:30:34 +0800 Subject: [PATCH 05/11] docs: align spec with brief, fill gaps, merge open questions Align $ref path handling with FileRef.json (accept absolute paths and URLs), document path rebasing and instruction/prompt file loading, model agent-level tools, scope deploy-mode validation to hosted agents, order routine reconciliation after agents, add a sibling-extension schema-ownership section (Option A), and merge Parts 3 and 4 into a single flat Open questions section. Trim the summary to deltas on the brief. 
--- docs/specs/unify-azure-yaml/spec.md | 193 ++++++++++++++++++---------- 1 file changed, 122 insertions(+), 71 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 36e3203f6e8..933bad4d0b0 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -1,6 +1,6 @@ # Technical design: unify Foundry agent config in `azure.yaml` - + ## Overview @@ -195,9 +195,13 @@ type FoundryProjectConfig struct { } ``` -`Agent` is the union of a hosted agent and a prompt agent. A hosted agent carries -`project`, plus one of `docker`, `runtime`, or a prebuilt `image`. A prompt agent -carries none of those and is pure config. +`Agent` is the union of a hosted agent and a prompt agent. Both kinds may carry +`toolboxes` (named references into the project `toolboxes`), `tools` (tools attached +directly to the agent), a single `skill` reference, `env`, and `metadata`. A hosted +agent additionally carries `project` plus exactly one deploy mode: `docker`, `runtime`, +or a prebuilt `image`. A prompt agent carries none of those build fields; instead it +carries `instructions`, an inline string or a path to a prompt file (see 2.4). These +references are validated in 2.6. ### 2.2 Wiring the new host to a service target @@ -290,20 +294,35 @@ it into core later if other extensions need includes. **Resolution rules.** -- A `$ref` path must be a relative path, resolved relative to the file that holds it, - so nested includes work. Absolute paths and remote URLs are not accepted by default; - restrict to project-local paths to avoid reading arbitrary files or fetching remote - content. Broader support can be added behind an explicit opt-in if needed. +- A relative `$ref` path resolves relative to the file that holds it, so nested includes + work. Per `FileRef.json`, absolute paths and URLs are also accepted. 
The security + tradeoff of remote and absolute includes (reading arbitrary files, fetching remote + content) is raised as an open question rather than restricted here, to stay consistent + with the brief. - Sibling keys overlay on the loaded file. Use a shallow overlay at the top level of the object. Scalars and arrays from the sibling replace the loaded value. This keeps the result easy to predict. - A loaded file is validated against the same per resource schema as an inline entry. -**One concrete dependency.** The extension receives its keys as already parsed data, so -the `$ref` strings arrive as plain values. To open the files it needs the directory that -holds `azure.yaml`, not the agent source path. The provider must get the project root -from the azd client or environment. Call this out in the implementation, because the -gRPC `ServiceConfig` carries the agent `project` path, not the project root. +**Path resolution and rebasing.** The extension receives its keys as already parsed data, +so the `$ref` strings arrive as plain values. To open a top level `$ref` it needs the +directory that holds `azure.yaml`, not the agent source path, so the provider must get the +project root from the azd client or environment (the gRPC `ServiceConfig` carries the +agent `project` path, not the project root). Once a split file is loaded, every relative +path inside it rebases to that file's own directory, not to `azure.yaml`. The `complex` +sample is explicit: in `agents/support-agent.yaml`, `project: ../src/support-agent` is +relative to the `agents/` folder, and in `skills/code-review.yaml`, +`instructions: ../prompts/code-review.md` is relative to the `skills/` folder. The +resolver must track each loaded file's base directory and apply it to that file's +`project`, `instructions`, and any nested `$ref`. 
+ +**Instruction and prompt files.** Separate from `$ref`, a skill's `instructions` and a +prompt agent's `instructions` accept either an inline string or a path to a `.md` or +`.txt` file the extension reads at deploy time; the `complex` sample uses both forms. +This is a file read, not a structural include: the file's text becomes the field value, +with no overlay. The path follows the same rebasing rule above, so an `instructions:` +path inside a split agent or skill file resolves against that file's directory. `${VAR}` +and `${{...}}` expansion (2.5) applies to the loaded text. **Interaction with #8049.** The composition commands write into these same arrays. If a section is split into a file, the writer has to decide whether to append an inline entry @@ -350,16 +369,16 @@ agents inside it. | Method | Behavior | |---|---| -| `Initialize` | Validate the Foundry config. Check agent kinds, that each agent has exactly one of `docker`, `runtime`, or `image`, and that every named toolbox, skill, connection, and model deployment an agent references exists. Resolve the project, provisioned or via `endpoint`. | +| `Initialize` | Validate the Foundry config. Check agent kinds; for each `kind: hosted` agent require exactly one deploy mode (`docker`, `runtime`, or a prebuilt `image`), while `kind: prompt` agents carry none; and check that every toolbox, `tools` entry, `skill`, connection, and model deployment an agent references resolves. Resolve the project, provisioned or via `endpoint`. | | `Package` | For each agent that has `project` plus `docker` or `runtime`, build. `docker` builds an image, local or in ACR when `remoteBuild` is set. `runtime` zips the source. Prompt agents and prebuilt `image` agents skip this. | | `Publish` | Push each artifact. Images go to ACR. Zips upload to Foundry storage. | -| `Deploy` | First reconcile project state: deployments, connections, toolboxes, skills, routines, and prompt agents through Foundry APIs. 
Then post each agent with `createAgentVersion` and the published artifact reference. | +| `Deploy` | First reconcile project state: deployments, connections, toolboxes, skills, and prompt agents through Foundry APIs. Then post each hosted agent with `createAgentVersion` and the published artifact reference. Reconcile routines last, since they reference an agent by name. | | `Endpoints` | Return the project endpoint, plus per agent endpoints when they are known. | | `GetTargetResource` | Resolve the ARM resource for the Foundry project. | Deploy order inside the project matters, because later items reference earlier ones. -Apply deployments and connections first, then toolboxes and skills, then agents and -routines. +Apply deployments and connections first, then toolboxes and skills, then agents, and +routines last, since a routine references an agent by name. Fan out details to define in code: @@ -407,7 +426,7 @@ upsert is written: - Whether a re run after a partial failure can safely repeat the calls that already succeeded. -This area overlaps with existing reports, noted in Part 3. +This area overlaps with existing reports, noted in the open questions. ### 2.9 Telemetry and errors @@ -418,70 +437,102 @@ This area overlaps with existing reports, noted in Part 3. - Keep `CodeInvalidAgentManifest` while the manifest path exists, and plan its rename or removal with that path. -## Part 3: Provision experience dependencies (PM confirmation) - -These items change or touch the provision experience. They are not designed here. Please -confirm whether each one belongs inside the unification milestone or stays separate. - -- **Built in Bicep and the provision-less flow.** A Foundry only project should not need - an `infra/` folder. The extension would carry templates and generate them in memory at - provision time. This has its own RFC and is not yet filed. It clearly changes what - `azd provision` does for these projects. 
-- **Provision layers in multi service projects.** Issue - [#8587](https://github.com/Azure/azure-dev/issues/8587) reports - `azd provision ` failing with "no layers defined in azure.yaml", which left a - toolbox unprovisioned. The single service shape and the in memory Bicep both interact - with how provision layers are built, so confirm the intended behavior. -- **Reusing an existing project.** The `endpoint:` field and the private network case in - issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an - account it did not create. Confirm whether reuse is in scope for the first version. -- **Idempotency.** Issues [#8349](https://github.com/Azure/azure-dev/issues/8349) and - [#8350](https://github.com/Azure/azure-dev/issues/8350) report that provision does not - recreate a deleted resource and that tool connection changes are not applied. The new - model moves data plane changes to deploy, which may sidestep some of this. Confirm. -- **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) - notes that `azure.yaml` only represents the latest state of an agent, even though - agents are versioned. Deploy posts a new version each run, so the intended meaning of - the YAML, latest only or pinned, needs a decision. - -## Part 4: Open questions (new) - -These are specific to this design. The product level questions, such as the host kind -name and the schema ownership choice, already live in the product brief and are not -repeated here. - -1. **Who owns `$ref` resolution, and what are the overlay rules?** Core loader or the - extension. Shallow overlay or deep merge. Arrays replaced or merged. 2.4 recommends - extension owned with a shallow overlay, but this should be ratified. -2. **How are split files validated?** The language server can follow a `$ref` to a local - file for editor hints, but runtime validation of a loaded file against the per - resource schema needs to be confirmed. -3. 
**How large can the inline config get over gRPC?** A big project becomes a large - protobuf struct. Confirm there is no practical size limit, or define how to chunk it. -4. **How are per agent actions addressed at the CLI?** `azd deploy ` targets the - whole project. Is `azd ai agent deploy ` enough for per agent actions, or does - core need real sub service targets. This is tied to the fan out choice in 2.6. -5. **Naming for the composition surface.** Issue #8049 places the `add` commands in an - `azd ai project` surface, but an `azure.ai.projects` extension already exists. Confirm - where the schema and the `add` commands live so the two do not collide. +### 2.10 Sibling extensions and schema ownership + +Foundry already ships per resource extensions, bundled by the `microsoft.foundry` +meta-package: `azure.ai.toolboxes`, `azure.ai.connections`, `azure.ai.projects`, +`azure.ai.skills`, and `azure.ai.routines`. Each owns a data-plane CLI (`azd ai toolbox`, +`azd ai connection`, and so on) that acts on a live project. None of them participate in +`azure.yaml` today. + +That raises who owns the `microsoft.foundry.json` schema slices and their reconciliation. +Following the brief, v1 takes **Option A**: the `azure.ai.agents` extension owns the full +schema and reconciles every slice (deployments, connections, toolboxes, skills, routines, +agents), while the sibling extensions keep their imperative CLIs unchanged. The +alternative, Option B, has the meta-package own the schema and each sibling register a +slice contribution, which needs a new "one extension contributes part of another's schema" +mechanism in core. Recommend A for v1 and re-evaluate B once the shape is in users' hands. + +One naming note: `microsoft.foundry` is both the host kind string and the existing +meta-package extension id. That overlap is intentional in the brief, but the host kind is +resolved by `azure.ai.agents`, not by the meta-package. 
+ +## Open questions + +Decisions the team needs to make. Provision and scope items sit in the same list as +design items. Product level questions already settled in the brief (host kind name, the +Option A versus B ownership choice) are referenced where relevant rather than re-asked. + +1. **Built in Bicep and the provision-less flow.** A Foundry only project should not need + an `infra/` folder; the extension would carry templates and generate them in memory at + provision time. This has its own RFC, not yet filed, and changes what `azd provision` + does for these projects. Confirm whether it belongs in this milestone. +2. **Provision layers in multi service projects.** Issue + [#8587](https://github.com/Azure/azure-dev/issues/8587) reports `azd provision ` + failing with "no layers defined in azure.yaml", which left a toolbox unprovisioned. The + single service shape and the in memory Bicep both interact with how provision layers are + built, so confirm the intended behavior. +3. **Reusing an existing project.** The `endpoint:` field and the private network case in + issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an + account it did not create. Confirm whether reuse is in scope for the first version. +4. **Idempotency across provision and deploy.** Issues + [#8349](https://github.com/Azure/azure-dev/issues/8349) and + [#8350](https://github.com/Azure/azure-dev/issues/8350) report that provision does not + recreate a deleted resource and that tool connection changes are not applied. The new + model moves data plane changes to deploy (2.8), which may sidestep some of this. + Confirm, and confirm whether create calls are idempotent or need check-then-create. +5. **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) + notes that `azure.yaml` only represents the latest state of an agent, even though agents + are versioned. 
Deploy posts a new version each run, so the intended meaning of the YAML, + latest only or pinned, needs a decision. +6. **`$ref` resolution and overlay rules.** Core loader or extension. Shallow overlay or + deep merge. Arrays replaced or merged. 2.4 recommends extension owned with a shallow + overlay, but this should be ratified. +7. **Absolute and remote `$ref` paths.** `FileRef.json` accepts absolute paths and URLs. + Confirm whether to keep that, accepting reading arbitrary files and fetching remote + content, or restrict to project-local paths behind an opt-in. The design follows the + brief and accepts them for now (2.4). +8. **Instruction and prompt file format.** Skill and prompt agent `instructions` accept an + inline string or a file path (2.4). Confirm both forms are supported and settle the + accepted file extensions (`.md`, `.txt`). +9. **Routines ownership and triggers.** Does the routines schema live in `azure.ai.agents` + (Option A) or in `azure.ai.routines`? `Routine.json` allows `schedule`, `webhook`, and + `event` triggers; confirm which the first version supports beyond cron schedules. +10. **Split file validation.** The language server can follow a `$ref` to a local file for + editor hints, but runtime validation of a loaded file against the per resource schema + needs to be confirmed. +11. **Inline config size over gRPC.** A big project becomes a large protobuf struct. + Confirm there is no practical size limit, or define how to chunk it. +12. **Per agent CLI addressability.** `azd deploy ` targets the whole project. Is + `azd ai agent deploy ` enough for per agent actions, or does core need real sub + service targets (2.6)? The brief raises this too; it is tied to the fan out choice. +13. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai + project` surface, but an `azure.ai.projects` extension already exists. Confirm where the + schema and the `add` commands live so the two do not collide. 
## Summary of required changes +The brief's "Missing functionality" table already lists the baseline work. This spec +keeps that list and adds the deltas below. + azd core: - Add the `host: microsoft.foundry` conditional to `schemas/v1.0/azure.yaml.json`, composing the extension schema at the service level and turning off `project`, `runtime`, `docker`, `image`, and `config`. -- Optional: register `microsoft.foundry` as an alpha feature for a gated preview. -- Optional: a shared convention or helper for preserving `${{...}}` if core ever expands - these fields. +- Recommend `additionalProperties: true` at the project level in `microsoft.foundry.json` + (the preview currently has `false`). +- Optional: register `microsoft.foundry` as an alpha feature for a gated preview, and a + shared helper if core ever expands these fields while preserving `${{...}}`. -`azure.ai.agents` extension: +`azure.ai.agents` extension (v1 owns the full schema and reconciliation, Option A in 2.10): - Publish `microsoft.foundry.json` and the per resource schema files. -- Add the `microsoft.foundry` service target with per agent fan out for package, - publish, and deploy. -- Resolve `$ref` includes and overlay overrides while parsing the Foundry keys. +- Add the `microsoft.foundry` service target with per agent fan out for package, publish, + and deploy; require a deploy mode only for `kind: hosted` agents; reconcile routines + after agents. +- Resolve `$ref` includes with shallow overlay overrides, rebasing each loaded file's + paths to its own directory, and load `instructions` prompt files at deploy time. - Expand `${VAR}` while preserving `${{...}}` through a shared helper. - Rework `init` to write the single Foundry service and stop emitting `agent.yaml` and `agent.manifest.yaml`. 
From 5e4bbac7cad752adcbea4fc72f411b096028f657 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 19:50:27 +0800 Subject: [PATCH 06/11] docs: drop open questions already decided in the brief Remove the built-in Bicep, idempotency, and per-agent CLI items from the open questions list, since the brief raises each with a recommended option (opt-in in-memory Bicep, Bicep-like reconciliation semantics, and 3a per-agent build). Fold the #8349/#8350 references into the 2.8 reconciliation section and note where the brief leaves instructions-format and routines open. --- docs/specs/unify-azure-yaml/spec.md | 63 +++++++++++++---------------- 1 file changed, 29 insertions(+), 34 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 933bad4d0b0..7695179df99 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -426,7 +426,10 @@ upsert is written: - Whether a re run after a partial failure can safely repeat the calls that already succeeded. -This area overlaps with existing reports, noted in the open questions. +This area overlaps with existing reports [#8349](https://github.com/Azure/azure-dev/issues/8349) +and [#8350](https://github.com/Azure/azure-dev/issues/8350); the upsert model above and the +brief's "drop from config stops management, does not destroy" semantics are meant to cover +them. ### 2.9 Telemetry and errors @@ -459,54 +462,46 @@ resolved by `azure.ai.agents`, not by the meta-package. ## Open questions -Decisions the team needs to make. Provision and scope items sit in the same list as -design items. Product level questions already settled in the brief (host kind name, the -Option A versus B ownership choice) are referenced where relevant rather than re-asked. +Decisions still open after the brief. 
Where the brief already raises a question **and** +recommends an option, that call is taken as given and not re-asked here: schema ownership +(Option A), per-agent build orchestration (3a), idempotency semantics (Bicep-like, "drop +from config stops management, does not destroy", see 2.8), built-in Bicep (opt-in, +in-memory), and the per-agent CLI direction that follows from 3a. Host kind naming stays +with the brief. The items below are what this design adds or leaves unresolved; provision +and scope items sit in the same list as design items. -1. **Built in Bicep and the provision-less flow.** A Foundry only project should not need - an `infra/` folder; the extension would carry templates and generate them in memory at - provision time. This has its own RFC, not yet filed, and changes what `azd provision` - does for these projects. Confirm whether it belongs in this milestone. -2. **Provision layers in multi service projects.** Issue +1. **Provision layers in multi service projects.** Issue [#8587](https://github.com/Azure/azure-dev/issues/8587) reports `azd provision ` failing with "no layers defined in azure.yaml", which left a toolbox unprovisioned. The single service shape and the in memory Bicep both interact with how provision layers are built, so confirm the intended behavior. -3. **Reusing an existing project.** The `endpoint:` field and the private network case in +2. **Reusing an existing project.** The `endpoint:` field and the private network case in issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an account it did not create. Confirm whether reuse is in scope for the first version. -4. **Idempotency across provision and deploy.** Issues - [#8349](https://github.com/Azure/azure-dev/issues/8349) and - [#8350](https://github.com/Azure/azure-dev/issues/8350) report that provision does not - recreate a deleted resource and that tool connection changes are not applied. 
The new - model moves data plane changes to deploy (2.8), which may sidestep some of this. - Confirm, and confirm whether create calls are idempotent or need check-then-create. -5. **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) +3. **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) notes that `azure.yaml` only represents the latest state of an agent, even though agents are versioned. Deploy posts a new version each run, so the intended meaning of the YAML, latest only or pinned, needs a decision. -6. **`$ref` resolution and overlay rules.** Core loader or extension. Shallow overlay or +4. **`$ref` resolution and overlay rules.** Core loader or extension. Shallow overlay or deep merge. Arrays replaced or merged. 2.4 recommends extension owned with a shallow overlay, but this should be ratified. -7. **Absolute and remote `$ref` paths.** `FileRef.json` accepts absolute paths and URLs. +5. **Absolute and remote `$ref` paths.** `FileRef.json` accepts absolute paths and URLs. Confirm whether to keep that, accepting reading arbitrary files and fetching remote content, or restrict to project-local paths behind an opt-in. The design follows the brief and accepts them for now (2.4). -8. **Instruction and prompt file format.** Skill and prompt agent `instructions` accept an - inline string or a file path (2.4). Confirm both forms are supported and settle the - accepted file extensions (`.md`, `.txt`). -9. **Routines ownership and triggers.** Does the routines schema live in `azure.ai.agents` - (Option A) or in `azure.ai.routines`? `Routine.json` allows `schedule`, `webhook`, and - `event` triggers; confirm which the first version supports beyond cron schedules. -10. **Split file validation.** The language server can follow a `$ref` to a local file for - editor hints, but runtime validation of a loaded file against the per resource schema - needs to be confirmed. -11. 
**Inline config size over gRPC.** A big project becomes a large protobuf struct. - Confirm there is no practical size limit, or define how to chunk it. -12. **Per agent CLI addressability.** `azd deploy ` targets the whole project. Is - `azd ai agent deploy ` enough for per agent actions, or does core need real sub - service targets (2.6)? The brief raises this too; it is tied to the fan out choice. -13. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai +6. **Instruction and prompt file format.** Skill and prompt agent `instructions` accept an + inline string or a file path (2.4). The brief lists this as open too; confirm both forms + are supported and settle the accepted file extensions (`.md`, `.txt`). +7. **Routines ownership and triggers.** The brief leaves this open: does the routines schema + live in `azure.ai.agents` (Option A) or in `azure.ai.routines`? `Routine.json` allows + `schedule`, `webhook`, and `event` triggers; confirm which the first version supports + beyond cron schedules. +8. **Split file validation.** The language server can follow a `$ref` to a local file for + editor hints, but runtime validation of a loaded file against the per resource schema + needs to be confirmed. +9. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm + there is no practical size limit, or define how to chunk it. +10. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai project` surface, but an `azure.ai.projects` extension already exists. Confirm where the schema and the `add` commands live so the two do not collide. 
From 02df33f5b959795e908e1a0568d5dc738a0d5571 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 19:54:04 +0800 Subject: [PATCH 07/11] docs: simplify open questions intro to a plain lead-in --- docs/specs/unify-azure-yaml/spec.md | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 7695179df99..7dd9e05e546 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -462,13 +462,8 @@ resolved by `azure.ai.agents`, not by the meta-package. ## Open questions -Decisions still open after the brief. Where the brief already raises a question **and** -recommends an option, that call is taken as given and not re-asked here: schema ownership -(Option A), per-agent build orchestration (3a), idempotency semantics (Bicep-like, "drop -from config stops management, does not destroy", see 2.8), built-in Bicep (opt-in, -in-memory), and the per-agent CLI direction that follows from 3a. Host kind naming stays -with the brief. The items below are what this design adds or leaves unresolved; provision -and scope items sit in the same list as design items. +Decisions this design surfaces that the brief does not already settle. Provision and scope +items sit in the same list as the design items. 1. **Provision layers in multi service projects.** Issue [#8587](https://github.com/Azure/azure-dev/issues/8587) reports `azd provision ` From 1f5b0a62b081713f2dc4cf9f1eb944746415ab1c Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 19:59:33 +0800 Subject: [PATCH 08/11] docs: drop routines and instructions open questions covered by the brief John's brief already raises skill instructions format (open question 5) and routines ownership and triggers (open question 6). The instructions item also contradicted section 2.4, which already decides both inline and file forms. Remove both echoes; open questions now list only items this design adds. 
--- docs/specs/unify-azure-yaml/spec.md | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 7dd9e05e546..27bee7d479b 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -484,21 +484,14 @@ items sit in the same list as the design items. Confirm whether to keep that, accepting reading arbitrary files and fetching remote content, or restrict to project-local paths behind an opt-in. The design follows the brief and accepts them for now (2.4). -6. **Instruction and prompt file format.** Skill and prompt agent `instructions` accept an - inline string or a file path (2.4). The brief lists this as open too; confirm both forms - are supported and settle the accepted file extensions (`.md`, `.txt`). -7. **Routines ownership and triggers.** The brief leaves this open: does the routines schema - live in `azure.ai.agents` (Option A) or in `azure.ai.routines`? `Routine.json` allows - `schedule`, `webhook`, and `event` triggers; confirm which the first version supports - beyond cron schedules. -8. **Split file validation.** The language server can follow a `$ref` to a local file for +6. **Split file validation.** The language server can follow a `$ref` to a local file for editor hints, but runtime validation of a loaded file against the per resource schema needs to be confirmed. -9. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm +7. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm there is no practical size limit, or define how to chunk it. -10. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai - project` surface, but an `azure.ai.projects` extension already exists. Confirm where the - schema and the `add` commands live so the two do not collide. +8. 
**Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai + project` surface, but an `azure.ai.projects` extension already exists. Confirm where the + schema and the `add` commands live so the two do not collide. ## Summary of required changes From 877587ee1ab2b4675fd3793930d582d4aa61a9f7 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 20:04:50 +0800 Subject: [PATCH 09/11] docs: drop both ref open questions the brief already decides FileRef.json states absolute paths and URLs are accepted and that sibling properties act as overlay overrides, and the brief recommends Option A (azure.ai.agents owns schema and reconciliation). That settles ref ownership, overlay rules, and absolute/remote paths, so remove both from open questions and fold the accepted-paths decision into section 2.4. Open questions now list only items the brief leaves open. --- docs/specs/unify-azure-yaml/spec.md | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 27bee7d479b..f1304148a6f 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -295,10 +295,9 @@ it into core later if other extensions need includes. **Resolution rules.** - A relative `$ref` path resolves relative to the file that holds it, so nested includes - work. Per `FileRef.json`, absolute paths and URLs are also accepted. The security - tradeoff of remote and absolute includes (reading arbitrary files, fetching remote - content) is raised as an open question rather than restricted here, to stay consistent - with the brief. + work. Per `FileRef.json`, absolute paths and URLs are also accepted; the brief makes that + call, so the design follows it. Treat such includes as trusted input the same way + `azure.yaml` itself is, and surface a clear error when a path cannot be read. - Sibling keys overlay on the loaded file. 
Use a shallow overlay at the top level of the object. Scalars and arrays from the sibling replace the loaded value. This keeps the result easy to predict. @@ -477,19 +476,12 @@ items sit in the same list as the design items. notes that `azure.yaml` only represents the latest state of an agent, even though agents are versioned. Deploy posts a new version each run, so the intended meaning of the YAML, latest only or pinned, needs a decision. -4. **`$ref` resolution and overlay rules.** Core loader or extension. Shallow overlay or - deep merge. Arrays replaced or merged. 2.4 recommends extension owned with a shallow - overlay, but this should be ratified. -5. **Absolute and remote `$ref` paths.** `FileRef.json` accepts absolute paths and URLs. - Confirm whether to keep that, accepting reading arbitrary files and fetching remote - content, or restrict to project-local paths behind an opt-in. The design follows the - brief and accepts them for now (2.4). -6. **Split file validation.** The language server can follow a `$ref` to a local file for +4. **Split file validation.** The language server can follow a `$ref` to a local file for editor hints, but runtime validation of a loaded file against the per resource schema needs to be confirmed. -7. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm +5. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm there is no practical size limit, or define how to chunk it. -8. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai +6. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai project` surface, but an `azure.ai.projects` extension already exists. Confirm where the schema and the `add` commands live so the two do not collide. 
From ef2d119cfe5851be0746a1e676f41525c2ebccd4 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 20:13:17 +0800 Subject: [PATCH 10/11] docs: warmer, more collaborative tone for open questions --- docs/specs/unify-azure-yaml/spec.md | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index f1304148a6f..37058097b5e 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -461,29 +461,32 @@ resolved by `azure.ai.agents`, not by the meta-package. ## Open questions -Decisions this design surfaces that the brief does not already settle. Provision and scope -items sit in the same list as the design items. +A few things this design surfaces that the brief does not already settle. None of them +block the overall shape, but they are worth talking through together before we lock the +first version. Provision and scope items sit alongside the design ones. 1. **Provision layers in multi service projects.** Issue [#8587](https://github.com/Azure/azure-dev/issues/8587) reports `azd provision ` failing with "no layers defined in azure.yaml", which left a toolbox unprovisioned. The - single service shape and the in memory Bicep both interact with how provision layers are - built, so confirm the intended behavior. + single service shape and the in memory Bicep both touch how provision layers get built, + so it would help to agree on the intended behavior here. 2. **Reusing an existing project.** The `endpoint:` field and the private network case in issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an - account it did not create. Confirm whether reuse is in scope for the first version. + account it did not create. It would be good to confirm whether reuse is in scope for the + first version. 3. 
**Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) notes that `azure.yaml` only represents the latest state of an agent, even though agents - are versioned. Deploy posts a new version each run, so the intended meaning of the YAML, - latest only or pinned, needs a decision. + are versioned. Deploy posts a new version each run, so let's decide together what the + YAML should mean: latest only, or pinned. 4. **Split file validation.** The language server can follow a `$ref` to a local file for - editor hints, but runtime validation of a loaded file against the per resource schema - needs to be confirmed. -5. **Inline config size over gRPC.** A big project becomes a large protobuf struct. Confirm - there is no practical size limit, or define how to chunk it. + editor hints, but it would be good to confirm how a loaded file gets validated against + the per resource schema at runtime. +5. **Inline config size over gRPC.** A big project becomes a large protobuf struct. This is + probably fine, but it is worth confirming there is no practical size limit, or deciding + how we would chunk it if one shows up. 6. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai - project` surface, but an `azure.ai.projects` extension already exists. Confirm where the - schema and the `add` commands live so the two do not collide. + project` surface, but an `azure.ai.projects` extension already exists. Let's settle where + the schema and the `add` commands live so the two do not collide. ## Summary of required changes From cdb2aa364e865511d9d0aa1d3717a616c609bca1 Mon Sep 17 00:00:00 2001 From: huimiu Date: Wed, 10 Jun 2026 20:24:42 +0800 Subject: [PATCH 11/11] docs: drop open questions the issues and design already resolve Issue 8587's preferred fix is deploy-time reconciliation of toolboxes and connections, which the design already does (folded into 2.8). 
Issue 8165's stated minimum, reusing an existing account, is the endpoint design (folded into 1.4). Split-file validation is already decided in 2.4. Remove all three from open questions, leaving only items with no settled direction: agent versioning (8066), gRPC config size, and composition surface naming (8049). --- docs/specs/unify-azure-yaml/spec.md | 56 ++++++++++++++--------------- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/docs/specs/unify-azure-yaml/spec.md b/docs/specs/unify-azure-yaml/spec.md index 37058097b5e..17e287cc8e5 100644 --- a/docs/specs/unify-azure-yaml/spec.md +++ b/docs/specs/unify-azure-yaml/spec.md @@ -116,7 +116,10 @@ No new ordering logic is needed. `uses:` already exists on every service. When `endpoint:` is set on the Foundry service, azd connects to that project instead of creating one. This is the path for teams that provision infrastructure on their own, and -for reusing a shared or private network bound account. +for reusing a shared or private network bound account, which is the minimum issue +[#8165](https://github.com/Azure/azure-dev/issues/8165) asks for. Provisioning a new +network bound account from scratch is part of the separate built-in Bicep work, not this +field. ```yaml services: @@ -425,10 +428,14 @@ upsert is written: - Whether a re run after a partial failure can safely repeat the calls that already succeeded. -This area overlaps with existing reports [#8349](https://github.com/Azure/azure-dev/issues/8349) -and [#8350](https://github.com/Azure/azure-dev/issues/8350); the upsert model above and the -brief's "drop from config stops management, does not destroy" semantics are meant to cover -them. 
+This area overlaps with existing reports [#8349](https://github.com/Azure/azure-dev/issues/8349), +[#8350](https://github.com/Azure/azure-dev/issues/8350), and +[#8587](https://github.com/Azure/azure-dev/issues/8587); moving data-plane resources such as +toolboxes and connections to deploy-time reconciliation, together with the upsert model above +and the brief's "drop from config stops management, does not destroy" semantics, is meant to +cover them. #8587 in particular is the multi service provision failure that the single +service shape removes, because toolboxes are now created at deploy instead of through +provision layers. ### 2.9 Telemetry and errors @@ -461,32 +468,23 @@ resolved by `azure.ai.agents`, not by the meta-package. ## Open questions -A few things this design surfaces that the brief does not already settle. None of them -block the overall shape, but they are worth talking through together before we lock the -first version. Provision and scope items sit alongside the design ones. - -1. **Provision layers in multi service projects.** Issue - [#8587](https://github.com/Azure/azure-dev/issues/8587) reports `azd provision ` - failing with "no layers defined in azure.yaml", which left a toolbox unprovisioned. The - single service shape and the in memory Bicep both touch how provision layers get built, - so it would help to agree on the intended behavior here. -2. **Reusing an existing project.** The `endpoint:` field and the private network case in - issue [#8165](https://github.com/Azure/azure-dev/issues/8165) both ask azd to use an - account it did not create. It would be good to confirm whether reuse is in scope for the - first version. -3. **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) - notes that `azure.yaml` only represents the latest state of an agent, even though agents - are versioned. Deploy posts a new version each run, so let's decide together what the - YAML should mean: latest only, or pinned. -4. 
**Split file validation.** The language server can follow a `$ref` to a local file for - editor hints, but it would be good to confirm how a loaded file gets validated against - the per resource schema at runtime. -5. **Inline config size over gRPC.** A big project becomes a large protobuf struct. This is +A few things this design surfaces that the brief and the related issues do not already +settle. None of them block the overall shape, but they are worth talking through together +before we lock the first version. + +1. **Agent versioning.** Issue [#8066](https://github.com/Azure/azure-dev/issues/8066) notes + that `azure.yaml` only represents the latest state of an agent, even though agents are + versioned, and Foundry will add traffic splitting across versions. Deploy posts a new + version each run, so let's decide together what the YAML should mean: latest only, pinned, + or something that can express more than one version. +2. **Inline config size over gRPC.** A big project becomes a large protobuf struct. This is probably fine, but it is worth confirming there is no practical size limit, or deciding how we would chunk it if one shows up. -6. **Composition surface naming.** Issue #8049 places the `add` commands in an `azd ai - project` surface, but an `azure.ai.projects` extension already exists. Let's settle where - the schema and the `add` commands live so the two do not collide. +3. **Composition surface naming.** Issue [#8049](https://github.com/Azure/azure-dev/issues/8049) + puts the `add` commands in an `azd ai project` surface, written against the earlier two + host shape, while this design unifies on `host: microsoft.foundry` and an + `azure.ai.projects` extension already exists. Let's settle where the schema and the `add` + commands live so they do not collide. ## Summary of required changes