
Identify most common hosted-agent (azure.ai.agent) deployment errors to bootstrap validation #8617

@hemarina

Related Teams

Summary

Analyze azd telemetry to identify the most common errors that cause hosted-agent deployments to fail, scoped to the azure.ai.agent service host/target. These top errors become the first candidates to bootstrap validation (e.g., preflight checks) for the hosted-agent flow.

Background

We've done error-analysis investigations before (e.g., provision/deploy error reports broken down by ResultCode, error.code, error.category, and service.* fields). This is the same kind of analysis, but narrowed to the azure.ai.agent surface so we can prioritize the failures that actually block users from shipping a hosted agent.

Scope

  • Query telemetry for failed deployments where the project targets azure.ai.agent.
  • Rank the failure reasons by frequency (and, where possible, by number of distinct users/devices affected, not just raw event count).
  • Break errors down by the existing classification fields: error.category, error.code, error.type/error.chain.types, and service.{host, statusCode, errorCode} where applicable.
  • Distinguish user-fixable errors (config, auth, quota, missing prerequisites) from transient/service-side failures.
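
To make the ranking step concrete, here is a minimal Python sketch of counting failures per error code while also tracking distinct devices. The event shape, the field names (`service_host`, `success`, `error_code`, `device_id`), and the sample error codes are all hypothetical stand-ins; the real telemetry columns and the existing Kusto functions may look different.

```python
from collections import defaultdict

# Hypothetical, simplified events; real telemetry rows will have other columns.
SAMPLE_EVENTS = [
    {"service_host": "azure.ai.agent", "success": False, "error_code": "auth.failed", "device_id": "d1"},
    {"service_host": "azure.ai.agent", "success": False, "error_code": "auth.failed", "device_id": "d1"},
    {"service_host": "azure.ai.agent", "success": False, "error_code": "auth.failed", "device_id": "d1"},
    {"service_host": "azure.ai.agent", "success": False, "error_code": "quota.exceeded", "device_id": "d2"},
    {"service_host": "azure.ai.agent", "success": False, "error_code": "quota.exceeded", "device_id": "d3"},
    {"service_host": "appservice", "success": False, "error_code": "auth.failed", "device_id": "d4"},
    {"service_host": "azure.ai.agent", "success": True, "error_code": None, "device_id": "d5"},
]


def rank_failures(events, host="azure.ai.agent"):
    """Return (error_code, event_count, distinct_devices) rows for failed
    deployments on `host`, most widely affecting errors first."""
    counts = defaultdict(int)
    devices = defaultdict(set)
    for event in events:
        # Keep only failed deployments for the requested service host.
        if event["service_host"] != host or event["success"]:
            continue
        code = event["error_code"]
        counts[code] += 1
        devices[code].add(event["device_id"])
    rows = [(code, counts[code], len(devices[code])) for code in counts]
    # Sort by distinct devices first, then raw event count, then code for stability.
    rows.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return rows


if __name__ == "__main__":
    for code, events, distinct in rank_failures(SAMPLE_EVENTS):
        print(f"{code}: {events} events, {distinct} devices")
```

In this sample, one device retrying the same failure three times ranks below an error that hit two separate devices twice in total, which is the reason to rank on distinct users/devices rather than raw event count alone.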

Deliverables

  • A short ranked list of the top errors hit when deploying a hosted agent, with frequency and affected-user counts.
  • For each top error: a brief classification (user-fixable vs. transient) and a note on whether it's a good candidate for a preflight/validation check.
  • A recommendation of which errors to tackle first to bootstrap hosted-agent validation.
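
The classification and shortlist deliverables could be prototyped along these lines. This is only a sketch: the category names, the mapping from category to "user-fixable" or "transient", and the rule that user-fixable errors are the preflight candidates are assumptions for illustration, not the actual error.category values or an agreed policy.

```python
# Hypothetical category buckets; real error.category values may differ.
USER_FIXABLE = {"config", "auth", "quota", "prerequisite"}
TRANSIENT = {"transient", "service"}


def classify(category):
    """Map an error category to a coarse class."""
    if category in USER_FIXABLE:
        return "user-fixable"
    if category in TRANSIENT:
        return "transient"
    return "unclassified"


def shortlist(ranked, top_n=3):
    """Pick the first `top_n` user-fixable errors from `ranked`.

    `ranked` is a list of (error_code, category, distinct_users) tuples,
    already sorted with the most impactful error first. Treating
    user-fixable errors as the validation candidates is an assumption.
    """
    picks = []
    for code, category, _users in ranked:
        if len(picks) == top_n:
            break
        if classify(category) == "user-fixable":
            picks.append(code)
    return picks


if __name__ == "__main__":
    ranked = [
        ("a", "auth", 10),
        ("b", "transient", 8),
        ("c", "quota", 5),
        ("d", "unknown", 4),
    ]
    print(shortlist(ranked, top_n=2))
```

Errors that land in "unclassified" are left out of the shortlist here on purpose, so they show up as items needing manual review instead of being silently treated as candidates.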

Acceptance Criteria

  • Telemetry analysis is scoped to azure.ai.agent deployments.
  • Top failure modes are ranked by frequency and distinct-user impact.
  • Each top error is classified and assessed as a validation candidate.
  • A prioritized shortlist of "first errors to validate" is produced.

Notes

  • Reuse the existing Kusto functions / cooked tables and prior error-report patterns where possible rather than building analysis from scratch.
