# Revamp Observability section #640
**Open:** abhijaisrivastava15 wants to merge 7 commits into `dev` from `docs/observe-two-rewrite`.
## Commits (7)
All seven commits are by abhijaisrivastava15:

- `4c8b6da` docs(observe-two): rewrite pages 1-7 with verified screenshots
- `623d9ad` docs(observe-two): rewrite voice.mdx with verified screenshots
- `852fa89` docs(observe-two): note LiveKit dashboard attribute limitation (TH-4660)
- `626e963` docs(observe): replace old Observability docs with the rewritten ones
- `defb68d` docs(observe): document LiveKit attribute support in Dashboards (TH-4…
- `a00706c` docs(observe): swap light-mode hero image on Overview for dark-mode
- `0aeaa44` docs(observe): address Suhani's review — zoom modal screenshots and a…
## Files changed
*(The binary image files in this PR, presumably the new screenshots, cannot be rendered in this view.)*
**File: Alerts and monitors** (diff hunk `@@ -1,84 +1,96 @@`; shown below as the rewritten version)

---
title: "Alerts and monitors"
description: "Set up monitors that watch your AI app's metrics and notify you by email or Slack when something crosses a threshold."
---

## About

An **alert** keeps watch on one of your metrics. You pick what to watch (error rate, latency, cost, an eval score), set a threshold, and choose where to be notified — email or Slack. The monitor checks the metric on a schedule. When the threshold is crossed, you get notified.

The Alerts page shows every monitor in your workspace with its current health, last triggered time, and total trigger count.

<img src="/images/docs/observe/alerts-overview.png" alt="Alerts list — Issues, Status, Alert Type, Last Triggered, No. of triggers, Updated at" style={{ borderRadius: '5px' }} />

---
## When to use

- **Catch errors after a deployment** — Get pinged the moment error rate spikes.
- **Stay within latency limits** — Alert when response time crosses your target.
- **Cost guardrails** — Get a warning before you blow through your token budget.
- **Quality regressions** — Know when a pass/fail eval (toxicity, faithfulness, etc.) starts failing more often.
- **Stay informed without watching dashboards** — Push alerts to email, Slack, or both.

---
## The alerts list

The Alerts page shows one row per monitor. Columns:

| Column | What it shows |
|---|---|
| **Issues** | The monitor's name. |
| **Status** | Whether the monitor is currently `Healthy` (threshold not crossed) or in alarm. |
| **Alert Type** | The metric being watched (e.g. LLM API failure rates, LLM response time, Count of errors). |
| **Last Triggered** | When this monitor last fired an alert. |
| **No. of triggers** | Total number of times the alert has fired. |
| **Updated at** | When the monitor was last modified. |

At the top of the page:

- **Search** — find a monitor by name.
- **View Docs** — link to documentation.
- **+ New Alert** — create a new monitor.
---

## Creating a monitor

Click **+ New Alert**. A dialog asks you to **Choose a project** to scope the monitor to.

<img src="/images/docs/observe/alerts-create.png" alt="Choose a project dialog when creating a new alert" style={{ borderRadius: '5px' }} />

Pick the project, click **Next**, and the configuration form opens. There you set the following (a sketch of a complete configuration follows the list):

- **Metric** — what to watch:
  - **System metrics**: count of errors, error-free session rates, LLM API failure rates, span response time, LLM response time, token usage, daily/monthly tokens spent.
  - **Evaluation metrics**: pick an eval task running on the project. For pass/fail or choice evals, you can target a specific value (e.g. fail rate or a label).
- **Threshold operator** — `Greater than` or `Less than`.
- **Threshold type**:
  - **Static** — a fixed critical (and optional warning) value.
  - **Percentage change** — based on % change from a baseline computed over a time window (by default the past week).
- **Alert frequency** — how often the monitor runs (in minutes; minimum 5, default 60).
- **Notifications**:
  - **Email** — up to five addresses.
  - **Slack** — paste an incoming webhook URL.
  - You can use email only, Slack only, or both.
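Put together, the form corresponds to a handful of configuration fields. The field names below come from the previous revision of this page; the payload shape itself is an illustrative assumption, not a documented API:

```python
# Illustrative monitor configuration. Field names are taken from the
# earlier revision of this page; the exact payload shape is an assumption.
monitor = {
    "name": "LLM API failure rate too high",
    "metric": "LLM API failure rates",            # a system metric
    "threshold_operator": "Greater than",          # or "Less than"
    "threshold_type": "Static",                    # or "Percentage change"
    "critical_threshold_value": 5.0,               # alert when the metric > 5
    "warning_threshold_value": 2.0,                # optional early warning
    "alert_frequency": 60,                         # minutes; minimum 5, default 60
    "notification_emails": ["oncall@example.com"], # up to five addresses
    "slack_webhook_url": "https://hooks.slack.com/services/T000/B000/XXXX",
    "is_mute": False,                              # mute to pause without deleting
}
```

For a **Percentage change** monitor, the two threshold values are percentages relative to a baseline, and **auto_threshold_time_window** (in minutes; default one week) sets the window the baseline is computed over. For pass/fail or choice evals, **threshold_metric_value** selects which value to monitor (e.g. the fail rate or a specific label). Optional **slack_notes** are included in the Slack message, which goes to a standard incoming webhook, so you can verify the URL with a plain JSON POST before wiring it in:

```python
import requests

# Slack incoming webhooks accept a simple {"text": ...} JSON payload.
requests.post(
    "https://hooks.slack.com/services/T000/B000/XXXX",  # your webhook URL
    json={"text": "Test alert: LLM API failure rate above 5%"},
    timeout=10,
)
```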
<Note>
Monitors are scoped to a single project. Optional filters (same as eval-task filters) can narrow which spans count toward the metric.
</Note>
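To make the two threshold types concrete, here is a minimal sketch of the comparison a monitor performs on each run. It assumes the percentage-change baseline is the historical mean over the configured window, as the earlier revision of this page describes:

```python
from statistics import mean

def check_threshold(current: float, history: list[float], operator: str,
                    threshold_type: str, critical: float,
                    warning: float | None = None) -> str | None:
    """Return "critical", "warning", or None for a single monitor run."""
    if threshold_type == "Percentage change":
        # Baseline assumed to be the historical mean over the window.
        baseline = mean(history)
        current = 100.0 * (current - baseline) / baseline

    def breached(limit: float) -> bool:
        return current > limit if operator == "Greater than" else current < limit

    if breached(critical):
        return "critical"
    if warning is not None and breached(warning):
        return "warning"
    return None

# Error rate jumped from a ~2% baseline to 5%: a +150% change.
print(check_threshold(5.0, [2.0, 1.8, 2.2], "Greater than",
                      "Percentage change", critical=100.0, warning=50.0))
# -> critical
```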
---

## Reviewing alerts

Each time a monitor's threshold is crossed, an alert log is recorded with the type (critical / warning), message, and time window. Open a monitor's detail view to see its trigger history, trend chart, and unresolved count. Mark alerts resolved as you handle them.

To pause a monitor without deleting it, mute it from its detail view.
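The earlier revision of this page describes each trigger as a **UserAlertMonitorLog** record carrying the alert type, message, time window, and a link. A rough sketch of that shape (the exact field names beyond those listed are assumptions):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class UserAlertMonitorLog:
    """One alert trigger, as described in the earlier revision of this page.

    The record name and the critical/warning, message, time-window, and link
    fields come from that text; the exact field names are assumptions.
    """
    alert_type: str         # "critical" or "warning"
    message: str            # human-readable description of the breach
    window_start: datetime  # time window the metric was evaluated over
    window_end: datetime
    link: str               # deep link to the monitor / traces
    resolved: bool = False  # flip to True once you handle the alert
```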
---

## Next Steps

<CardGroup cols={2}>
  <Card title="Charts" icon="chart-bar" href="/docs/observe/features/charts">
    See the same metrics visualized over time.
  </Card>
  <Card title="Run Evals on Traces" icon="chart-line" href="/docs/observe/features/evals">
    Run evaluations so you can alert on their scores.
  </Card>
  <Card title="Sessions" icon="table-rows" href="/docs/observe/features/session">
    Group traces into sessions for multi-turn analysis.
  </Card>
  <Card title="LLM Tracing" icon="list" href="/docs/observe/features/llm-tracing">
    Inspect the traces behind your alerts.
  </Card>
</CardGroup>
**File: Charts** (new file; diff hunk `@@ -0,0 +1,81 @@`)
---
title: "Charts"
description: "Pre-built time-series charts for the key metrics of your AI app — latency, tokens, traffic, cost, and evaluation results — with no setup required."
---

## About

The **Charts** view is a set of ready-made time-series charts that show how your AI app is performing over time. Unlike [Dashboards](/docs/observe/features/dashboard), where you build custom widgets, Charts are calculated automatically from your traces — open the page and you immediately see how latency, tokens, traffic, and cost have moved.

Charts is most useful for at-a-glance health checks, deployment monitoring, and cost/usage tracking.

<img src="/images/docs/observe/charts-overview.png" alt="Charts overview — System Metrics with Latency, Tokens, Traffic, Cost charts" style={{ borderRadius: '5px' }} />

---

## When to use

- **After a deployment** — Glance at Latency and Traffic to confirm production is stable.
- **Investigating a cost spike** — Open the Cost chart and compare it against the Traffic chart to see whether usage volume explains it.
- **Token budget tracking** — Watch the Tokens chart to stay within monthly limits.
- **Quality drift** — Check the Evaluation Metrics charts to see whether eval pass rates are improving or declining.

---

## What's on the page

The page is split into two sections.

### System Metrics

Four time-series charts, all rendered automatically from your traces (a sketch of the Cost calculation follows the table):

| Chart | What it shows |
|---|---|
| **Latency** | Average response time (in ms) over time. Use this to detect slowdowns. |
| **Tokens** | Total token consumption (prompt + completion) over time. |
| **Traffic** | Number of spans processed over time — your request volume. |
| **Cost** | Estimated cost in USD over time, calculated from token usage and model pricing. |
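The platform's model pricing table is not documented here, so the rates and model name below are made-up assumptions; the sketch only illustrates how token usage and per-token pricing combine into the Cost estimate for one time bucket:

```python
# Illustrative cost estimate for one time bucket. The pricing table and
# model name are assumptions, not the platform's actual rates.
PRICE_PER_1K = {"example-model": {"prompt": 0.0005, "completion": 0.0015}}  # USD

def bucket_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    rates = PRICE_PER_1K[model]
    return (prompt_tokens / 1000) * rates["prompt"] + \
           (completion_tokens / 1000) * rates["completion"]

# 120k prompt + 40k completion tokens in this bucket:
print(round(bucket_cost("example-model", 120_000, 40_000), 4))  # 0.12
```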
### Evaluation Metrics

If you have eval tasks running on this project, each eval gets its own chart here showing its score over time.

<img src="/images/docs/observe/charts-evals.png" alt="Evaluation Metrics section showing per-eval charts" style={{ borderRadius: '5px' }} />

---

## Choosing a time window

The date range is a row of inline buttons in the page header — not a dropdown like on the other pages. Click one to set the window for all charts on the page.

Options: **Custom · Today · Yesterday · 7D · 30D · 3M · 6M · 12M**.

A **granularity** selector on the right (default "Day") controls the bucket size on the time axis and auto-adjusts based on the selected range.
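Granularity determines how timestamps are grouped before aggregation. A minimal sketch of what a "Day" bucket implies (the backend's actual aggregation is not documented here):

```python
from collections import defaultdict
from datetime import datetime

def bucket_by_day(points: list[tuple[datetime, float]]) -> dict[str, float]:
    """Group (timestamp, value) points into day buckets and average them."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts.strftime("%Y-%m-%d")].append(value)
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}

points = [(datetime(2024, 5, 1, 9), 120.0), (datetime(2024, 5, 1, 17), 180.0),
          (datetime(2024, 5, 2, 11), 150.0)]
print(bucket_by_day(points))  # {'2024-05-01': 150.0, '2024-05-02': 150.0}
```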
---

## Other controls

- **Refresh** — pull the latest data immediately.
- **View Traces** — jump straight to the Tracing page, filtered by the current time window.
- **Auto refresh (10s)** — toggle in the header to poll for new data automatically.

---

## Next Steps

<CardGroup cols={2}>
  <Card title="Dashboards" icon="layout-dashboard" href="/docs/observe/features/dashboard">
    Build custom dashboards with configurable widgets.
  </Card>
  <Card title="Alerts & Monitors" icon="zap" href="/docs/observe/features/alerts">
    Get notified when these metrics cross a threshold.
  </Card>
  <Card title="Run Evals on Traces" icon="chart-line" href="/docs/observe/features/evals">
    Run evaluations to populate the Evaluation Metrics charts.
  </Card>
  <Card title="LLM Tracing" icon="list" href="/docs/observe/features/llm-tracing">
    Drill into the individual traces behind these metrics.
  </Card>
</CardGroup>
**Review comment:** these images are not from the latest revamp?