
feat: add webperf-core-web-vitals skill #1198

Open
nucliweb wants to merge 10 commits into ChromeDevTools:main from nucliweb:feat/webperf-core-web-vitals-skill
nucliweb commented Mar 17, 2026

Motivation

Addresses the non-deterministic performance measurement problem raised in #1114. Instead of letting the LLM generate measurement code on the fly, this skill uses curated, deterministic JavaScript snippets from nucliweb/webperf-snippets executed via evaluate_script.

Benefits:

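  • Deterministic measurement code: the same script runs every time for a given metric, so differences between runs reflect the page rather than the generated code
  • Token efficiency: only the structured JSON results are passed back to the model
  • No hallucinated measurement logic: scripts read the browser's real Performance APIs instead of relying on LLM-generated guesses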

What's included

SKILL.md

Describes the complete skill with:

  • Common workflows (full CWV audit, LCP deep dive, CLS investigation, INP debugging)
  • Decision trees for automated follow-up based on metric results
  • Cross-skill integration with webperf-loading, webperf-interaction, webperf-media
  • Explicit INP interaction workflow (agent informs user to interact, then collects results)

Scripts (7 total)

Core CWV metrics:

  • LCP.js — Measures LCP, highlights element, returns structured JSON synchronously
  • CLS.js — Measures CLS from buffered entries, exposes getCLS() for ongoing tracking
  • INP.js — Tracking script; returns { status: "tracking" }, then getINP() after user interaction
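
To make "measures CLS from buffered entries" concrete, here is a minimal sketch of the session-window calculation that CLS is defined by (shifts less than 1 s apart, each window capped at 5 s, largest window wins). `computeCLS` and `rateCLS` are hypothetical names used for illustration; the actual CLS.js may be structured differently.

```javascript
// Hypothetical helpers, not the actual CLS.js code.
// Computes CLS from layout-shift entries (sorted by startTime) using
// session windows: shifts < 1 s apart, each window capped at 5 s.
function computeCLS(entries) {
  let maxWindow = 0;
  let currentWindow = 0;
  let windowStart = 0;
  let lastShift = 0;
  let first = true;

  for (const entry of entries) {
    if (entry.hadRecentInput) continue; // shifts right after user input are excluded

    const startsNewWindow =
      first ||
      entry.startTime - lastShift >= 1000 ||
      entry.startTime - windowStart >= 5000;

    if (startsNewWindow) {
      currentWindow = entry.value;
      windowStart = entry.startTime;
      first = false;
    } else {
      currentWindow += entry.value;
    }
    lastShift = entry.startTime;
    maxWindow = Math.max(maxWindow, currentWindow);
  }
  return maxWindow;
}

// Published CLS thresholds: good <= 0.1, poor > 0.25.
function rateCLS(value) {
  if (value <= 0.1) return "good";
  if (value <= 0.25) return "needs-improvement";
  return "poor";
}

// In a browser, buffered entries could be collected with:
// new PerformanceObserver((list) => { /* computeCLS(list.getEntries()) */ })
//   .observe({ type: "layout-shift", buffered: true });
```

Keeping the calculation as a pure function over entries is one way to make it testable outside the browser; the observer wiring stays a thin wrapper.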

LCP diagnostics:

  • LCP-Sub-Parts.js — Breaks down LCP into TTFB / Resource Load Delay / Resource Load Time / Element Render Delay
  • LCP-Trail.js — Tracks all LCP candidate elements during load with color-coded outlines
  • LCP-Image-Entropy.js — Detects low-entropy images ineligible for LCP (Chrome 112+)
  • LCP-Video-Candidate.js — Audits video LCP: poster presence, preload, fetchpriority, format
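
For reviewers unfamiliar with the sub-part breakdown, a rough sketch of the arithmetic follows. It assumes timing objects shaped like `PerformanceNavigationTiming`, the LCP entry, and the `PerformanceResourceTiming` entry for the LCP resource; `lcpSubParts` is a hypothetical name and not necessarily how LCP-Sub-Parts.js is written.

```javascript
// Hypothetical helper illustrating the four LCP sub-parts.
// nav: navigation timing-like object, lcp: LCP entry-like object,
// resource: resource timing-like object for the LCP resource, or null.
function lcpSubParts(nav, lcp, resource) {
  const activationStart = nav.activationStart || 0;
  const clamp = (t) => Math.max(0, t - activationStart);

  const ttfb = clamp(nav.responseStart);
  const lcpTime = clamp(lcp.startTime);

  // Without a resource (for example a text LCP), delay and load time are zero.
  const requestStart = resource
    ? Math.max(ttfb, clamp(resource.requestStart || resource.startTime))
    : ttfb;
  const responseEnd = resource
    ? Math.max(requestStart, clamp(resource.responseEnd))
    : ttfb;
  const end = Math.max(responseEnd, lcpTime);

  return {
    ttfb,
    resourceLoadDelay: requestStart - ttfb,
    resourceLoadTime: responseEnd - requestStart,
    elementRenderDelay: end - responseEnd,
    total: end,
  };
}
```

The four parts always sum to `total`, which gives the agent a cheap sanity check on the returned JSON.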

References

  • references/snippets.md — Human-readable descriptions and thresholds for each script
  • references/schema.md — Structured JSON return schema for agent consumption

Script source and minification

Scripts are sourced from nucliweb/webperf-snippets — specifically from the /snippets/CoreWebVitals/ directory, which is the readable source.

The skills/ directory in that repo contains Terser-minified build output (generated via scripts/generate-skills.js). The upstream rationale for minification is token optimization: smaller scripts mean fewer tokens consumed when the skill is loaded into context.

For this MCP integration, we're using the unminified source for maintainability and reviewability. If token optimization becomes a concern, minification can be added as a build step later.

INP interaction workflow

INP requires real user interactions to measure. When INP.js returns { status: "tracking" }, the skill instructs the agent to:

  1. Explicitly tell the user that INP tracking is active
  2. Ask the user to interact with the page (click, type, navigate menus)
  3. Wait for user confirmation
  4. Call getINP() to collect results

The agent cannot simulate interactions on behalf of the user for this metric.
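
The four steps above can be sketched from the agent's side as follows. This is an illustration only: `evaluateScript`, `tellUser`, and `waitForConfirmation` are hypothetical callbacks standing in for the evaluate_script tool call and the chat channel, and the `"() => getINP()"` string assumes the tool accepts a function expression.

```javascript
// Sketch of the INP workflow from the agent's side.
// All three callbacks and inpSource are hypothetical, injected by the caller.
async function runInpWorkflow({ evaluateScript, tellUser, waitForConfirmation, inpSource }) {
  const started = await evaluateScript(inpSource);
  if (!started || started.status !== "tracking") {
    return { status: "error", error: "INP tracking did not start" };
  }

  // Steps 1 and 2: announce that tracking is active and ask for real interactions.
  await tellUser(
    "INP tracking is active. Please interact with the page (click, type, open menus) and tell me when you are done."
  );

  // Step 3: do nothing until the user confirms.
  await waitForConfirmation();

  // Step 4: collect the result through the function the script exposed.
  return evaluateScript("() => getINP()");
}
```

Injecting the callbacks keeps the ordering explicit: the result is never requested before the user has confirmed.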

Relationship to existing debug-optimize-lcp skill

The existing skill is a workflow guide that orchestrates MCP tools (trace recording, performance insights). This skill is script-based — it executes curated snippets and returns structured JSON. They are complementary: use debug-optimize-lcp for trace-based LCP analysis, webperf-core-web-vitals for script-based CWV measurement.

Decision: no !`command` injection in SKILL.md

Claude Code supports embedding shell commands in SKILL.md using the !`command` syntax: the output is injected inline when the skill loads, so the agent receives the content without needing a Read tool call (reference).

This was considered for injecting script contents directly into the prompt (e.g. !`cat scripts/LCP.js`), which would eliminate the file-read step during execution. However, this feature is Claude Code-specific: other agents (Cursor, Windsurf, generic MCP clients) would receive the literal command string instead of its output, breaking the skill entirely.

Since these skills target any MCP-compatible agent, we keep scripts as separate files read on demand. Cross-agent compatibility takes priority over saving a tool call.

Next skills

This is the first in the webperf skills series. Planned follow-ups (tracked in issue #1114):

  • webperf-loading (28 scripts: TTFB, FCP, render-blocking, fonts, resource hints…)
  • webperf-interaction (8 scripts: INP deep-dive, Long Animation Frames, scroll jank…)
  • webperf-media (3 scripts: image audit, video audit, SVG bitmap detection)
  • webperf-resources (1 script: network bandwidth / connection quality)
  • webperf (meta-skill)

@nucliweb
Author

Design decision: agent-first scripts

The scripts have been rewritten to be agent-first: all console.log, console.table, and console.group calls and the related helper code (RATING objects, formatMs, logINP, logCLS…) have been removed. The only output is the structured JSON return value consumed by the agent.

This reduced the total script code from ~1481 lines to ~623 lines.

What was kept: element highlighting via element.style.outline — this gives the user a visual cue in the browser while the agent works (e.g., the LCP element gets a dashed lime outline, LCP Trail candidates get distinct colors).

Open question for reviewers: is the visual highlighting worth keeping, or should the scripts be pure data with zero side effects? Arguments either way:

  • Keep: useful UX when the user is watching the browser while the agent runs
  • Remove: pure functions are easier to reason about and test; visual state leaks between script runs

Runs the skill in an isolated subagent so the main context only sees
the final analysis result, not the intermediate script execution calls.
@nucliweb
Author

Added context: fork to the skill frontmatter so it runs in an isolated subagent — the main context only receives the final analysis result, not all the intermediate evaluate_script / get_console_message calls.

Tip via @lydiahallie.

Scripts:
- fix(LCP-Video-Candidate): apply activationStart correction to LCP value
  (prerendered pages returned inflated values without this)
- fix(LCP): use getComputedStyle for background image detection
  (inline style check missed CSS-driven backgrounds)
- fix(CLS): add message field to synchronous return so agent knows to call getCLS()
- fix(INP): remove misleading rating:"good" from no-interactions error case;
  preserve getDataFn so agent can retry after user interacts
- feat(LCP, LCP-Sub-Parts, LCP-Trail): add window.__cwvHighlight flag
  to allow disabling visual element outlines without changing scripts

Documentation:
- docs(schema): document getINPDetails() with return schema and example
- docs(schema): add INP error case (no interactions) to schema examples
- docs(SKILL): add Error Recovery section with per-script guidance
- docs(SKILL): add cross-skill fork note — cross-skill triggers are
  recommendations to report, not direct calls from the forked subagent
- docs(SKILL): add Visual Highlighting section documenting __cwvHighlight
- docs(SKILL): prefix cross-skill script references with skill name
- docs(SKILL): document getINPDetails() step in INP debugging workflow
@nucliweb
Author

Review fixes — based on Anthropic skill-creator recommendations

This commit applies improvements recommended by Anthropic's skill-creator after a full review of the skill scripts and documentation.

Bug fixes

LCP-Video-Candidate.js — missing activationStart correction
All other LCP scripts subtract activationStart from lcp.startTime so that prerendered pages are measured from activation rather than from the start of the prerender. This was the only one that didn't, so the reported LCP value could be inflated when the page had been prerendered.
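
A minimal sketch of the correction, for reference. `correctedLcp` is a hypothetical name; `activationStart` is read from a `PerformanceNavigationTiming`-like object and is non-zero only when the page was prerendered before being activated.

```javascript
// Hypothetical helper showing the activationStart correction.
// activationStart is 0 (or undefined in browsers without support)
// when the page was not prerendered.
function correctedLcp(lcpEntry, navEntry) {
  const activationStart = (navEntry && navEntry.activationStart) || 0;
  // Clamp at zero: content painted before activation counts as instant.
  return Math.max(0, lcpEntry.startTime - activationStart);
}

// In a browser the navigation entry could come from:
// const [navEntry] = performance.getEntriesByType("navigation");
```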

LCP.js — background image detection via getComputedStyle
el.style?.backgroundImage only checks inline styles. CSS-driven backgrounds were always missed, causing the element type to fall through to the tag name. Now uses window.getComputedStyle(el).backgroundImage.
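
Roughly, the check now looks like the sketch below. `detectElementType` and the injectable `getStyle` parameter are hypothetical and exist only so the logic can be exercised outside a browser; the real LCP.js may order its checks differently.

```javascript
// Hypothetical sketch of the element-type check, not the actual LCP.js code.
// getStyle defaults to window.getComputedStyle and is only evaluated when
// the caller does not supply a replacement.
function detectElementType(el, getStyle = (node) => window.getComputedStyle(node)) {
  const tag = el.tagName.toLowerCase();
  if (tag === "img" || tag === "video") return tag;

  // Computed style covers inline styles and stylesheet rules alike,
  // whereas el.style only reflects the inline style attribute.
  const background = getStyle(el).backgroundImage;
  if (background && background !== "none") return "background-image";

  return tag; // fall back to the tag name
}
```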

CLS.js — missing message in synchronous return
The schema.md documented a message field telling the agent to call getCLS() for updated values, but the script wasn't including it. The agent had no signal to follow up.

INP.js — misleading rating: "good" on no-interactions error
When getINP() was called with no recorded interactions, it returned status: "error" alongside rating: "good" — contradictory and potentially confusing for the agent. Now returns a clean error with getDataFn preserved so the agent knows it can retry.
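
As an illustration of the intended shape, the sketch below builds a getINP()-style result from interaction entries. The field values (the error text, `getDataFn: "getINP"`) and the name `buildInpResult` are assumptions for this example, not the documented schema; the percentile step is the common approximation of skipping one worst interaction per 50 recorded.

```javascript
// Hypothetical sketch of a getINP()-style result builder.
// entries: array of { interactionId, duration } taken from "event" entries.
function buildInpResult(entries) {
  // One interaction can emit several events; keep the longest per interaction.
  const longest = new Map();
  for (const { interactionId, duration } of entries) {
    if (!interactionId) continue; // entries without an id are not discrete interactions
    longest.set(interactionId, Math.max(longest.get(interactionId) || 0, duration));
  }

  if (longest.size === 0) {
    // No rating field: there is nothing to rate yet, but the agent can retry.
    return { status: "error", error: "No interactions recorded yet", getDataFn: "getINP" };
  }

  // Approximate the 98th percentile of interaction durations.
  const sorted = [...longest.values()].sort((a, b) => b - a);
  const value = sorted[Math.min(sorted.length - 1, Math.floor(sorted.length / 50))];
  // Published INP thresholds: good <= 200 ms, poor > 500 ms.
  const rating = value <= 200 ? "good" : value <= 500 ? "needs-improvement" : "poor";
  return { status: "ok", value, rating };
}
```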

New feature: window.__cwvHighlight flag

Scripts that highlight LCP elements (LCP.js, LCP-Sub-Parts.js, LCP-Trail.js) now check window.__cwvHighlight !== false before applying outlines. Highlighting stays on by default — the UX value when the user is watching the browser is real — but it can now be disabled cleanly:

window.__cwvHighlight = false;
// then run any LCP script

This also resolves the open question raised in the earlier "Design decision: agent-first scripts" comment: keep highlighting as the default, and opt out when needed.
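
Inside the scripts, the guard amounts to something like the following sketch. `highlightElement` and the `scope` parameter are hypothetical, and the outline width is an assumed value; only the `!== false` semantics come from the change itself.

```javascript
// Hypothetical sketch of the opt-out check, not the actual script code.
// scope defaults to globalThis, which is window in a browser.
function highlightElement(el, scope = globalThis) {
  // Only an explicit false disables outlines; undefined keeps the default on.
  if (scope.__cwvHighlight === false) return false;
  el.style.outline = "3px dashed lime"; // width is an assumed value
  return true;
}
```

Comparing against `false` rather than testing truthiness means pages that never set the flag keep the current behavior.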

Documentation

  • getINPDetails() was an undocumented window function in INP.js. Added to schema.md with return schema and example, and to the INP debugging workflow in SKILL.md.
  • Error Recovery section added to SKILL.md — per-script guidance on what to do when a script returns status: "error" (page not loaded yet, no interactions, data-URI-only pages, etc.).
  • Cross-skill fork note added — clarifies that cross-skill triggers (webperf-loading, webperf-interaction, webperf-media) are recommendations to report back to the user, not direct calls the forked subagent can make.
  • Visual Highlighting section added — documents the __cwvHighlight flag and which scripts support it.
  • Cross-skill script references now consistently prefixed with the skill name (e.g., webperf-interaction skill) to make it clear these require a separate skill activation.
