Skip to content

Tag the deploy step span with the service host kind for per-service deploy telemetry #8608

@therealjohn

Description

@therealjohn

Summary

The deploy step span emitted by the up/deploy execution graph does not carry the service host kind, so telemetry cannot reliably attribute a specific deploy step to its host (e.g. azure.ai.agent) in multi-service projects. Adding the host to the deploy step's tags (one line) makes per-host deploy success/failure metrics queryable across single- and multi-service projects.

Background

For each service, the up/deploy graph creates a deploy step:

// cli/azd/internal/cmd/service_graph.go (~line 517)
deployStepName := "deploy-" + svc.Name
...
g.AddStep(&exegraph.Step{
    Name: deployStepName,        // "deploy-<svcName>"  (service name, not host)
    Tags: []string{"deploy"},    // no host kind
    ...
})

The scheduler emits an exegraph.step span per step with exegraph.step.name, exegraph.step.tags, and per-step Success/ResultCode (cli/azd/pkg/exegraph/scheduler.go).

Neither serviceManager.Deploy (cli/azd/pkg/project/service_manager.go) nor the extension bridge ExternalServiceTarget.Deploy (cli/azd/pkg/project/service_target_external.go) starts a span or sets a host attribute, and the root command span's project.service.hosts / project.service.targets are independently sorted sets with no service-name -> host mapping (cli/azd/pkg/project/project.go, cli/azd/pkg/project/project_manager.go).

Impact

Today telemetry/dashboards can determine that a project contains an azure.ai.agent service and that a deploy-* step succeeded or failed, but cannot tie a specific deploy step to its host kind. Per-host deploy reliability metrics (for example, the Foundry hosted-agent deploy-step reliability metric in the agents telemetry data contract) are therefore only clean for single-service projects; multi-service projects conflate the agent's deploy step with other services' deploy steps.

Proposed fix (one line)

Include the host kind in the deploy step tags:

// cli/azd/internal/cmd/service_graph.go
Tags: []string{"deploy", string(svc.Host)},

This makes exegraph.step.tags equal to e.g. ["deploy","azure.ai.agent"], so a query like
Properties['exegraph.step.tags'] has "azure.ai.agent" identifies the agent's deploy step directly, in single- and multi-service projects.

Alternative: add a dedicated singular project.service.host attribute on the deploy step span for cleaner grouping (instead of overloading tags).

Notes

  • Host kinds are low-cardinality and non-PII (e.g. containerapp, appservice, azure.ai.agent), so they are safe to emit as telemetry.
  • Consider doing the same for the package-<svc> / publish-<svc> steps for symmetry.
  • The new tag value / attribute may need classification in the Data Catalog portal before it appears in dashboards.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions