Skip to content

feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh#337

Merged
eliteprox merged 15 commits into
mainfrom
feat/discovery-plan-billing-provider
May 29, 2026
Merged

feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh#337
eliteprox merged 15 commits into
mainfrom
feat/discovery-plan-billing-provider

Conversation

@eliteprox
Copy link
Copy Markdown
Contributor

@eliteprox eliteprox commented May 23, 2026

Summary

Narrows the orchestrator-leaderboard discovery-plan work from #307 into a Daydream-only foundation for PR #337.

  • Adds generic billingProviderSlug support on DiscoveryPlan, with PR feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh #337 validating and defaulting to daydream only.
  • Adds plan CRUD/list/get filtering, plan-scoped results, capability catalog, and default/plan-scoped python-gateway discovery endpoints.
  • Preserves the generic discovery cache path: startup/cron global dataset refresh, plan lazy-eval cache, query caching, and plan-cache invalidation.
  • Updates the plugin Plans UI and docs to present Daydream as the only provider in this PR.

Out of scope / stacked follow-up

PymtHouse-specific behavior has been removed from this PR and belongs in #338 (feat/pymthouse-integration-followups), including manifest sync/gating, PymtHouse env/docs, provider UI toggle/copy, OIDC/billing APIs, and billing provider seeding.

Test plan

  • npm --workspace @naap/web-next test
  • npm --workspace @naap/plugin-orchestrator-leaderboard-frontend run build
  • Push hook quick checks passed for @naap/plugin-build, @naap/plugin-sdk, and Vercel runtime safety
  • Vercel preview smoke: plan CRUD, capability catalog, python-gateway discovery

Related: #307, #338

Summary by CodeRabbit

  • New Features

    • Python Gateway discovery endpoints for orchestrator address lookup
    • Provider-scoped Capability Catalog and a Plan creation page (/plans/new)
    • Billing-provider scoping added across plan APIs and UI
  • Bug Fixes

    • Hardened CRON auth with constant-time checks
    • Corrected 401 vs 403 authorization behavior
    • Simplified plan-results response envelope for clients
  • Improvements

    • Provider-aware caching and invalidation for discovery results
    • Redesigned plan/discovery UI and capability selection components
    • Startup dataset refresh and deploy-time seeding workflow updated

Add billingProviderSlug on DiscoveryPlan with API filtering, CRUD
support, and default plan seeding for pymthouse. Plans list/get accept
an optional billingProviderSlug query param; pymthouse requests also
match legacy rows with a null provider slug.

Co-authored-by: Cursor <cursoragent@cursor.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented May 23, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
naap-platform Ready Ready Preview, Comment May 27, 2026 6:52pm

Request Review

@github-actions github-actions Bot added has-migration Includes database migration size/L Large PR (201-500 lines) labels May 23, 2026
@github-actions
Copy link
Copy Markdown

🗃️ Database Migration Detected

This PR includes changes to Prisma schema files. Please ensure:

  • Migration is backward-compatible or a rollback plan exists
  • Data migration scripts are included if needed
  • Schema changes will be auto-applied to the preview database (Neon preview branch) during the Vercel preview deployment
  • Verify the preview deployment works correctly with the new schema
  • On merge to main, schema changes will auto-promote to production via prisma db push

Preview DB: This PR's Vercel preview deployment uses an isolated Neon database branch. Schema changes are applied automatically via prisma db push during the preview build. The preview branch is reset after each production deploy.

Requesting review from the core team: @livepeer/core

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 23, 2026

⚠️ This PR is very large (4936 lines changed). Please split it into smaller, focused PRs if possible.

@github-actions github-actions Bot added scope/shell Shell app changes scope/packages Shared package changes scope/infra Infrastructure changes labels May 23, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

Adds billing-provider slug support across types/DB/CRUD and seeds; implements provider-scoped capability catalog; adds plan-scoped and stateless python-gateway discovery with tiered shuffling and composite evaluation cache keys; refactors ClickHouse query routing and adapters; hardens cron/admin auth; updates frontend components, hooks, API client, tests, docs, and startup refresh.

Changes

Billing Provider Plan Scoping

Layer / File(s) Summary
DB schema & npm seed script
packages/database/prisma/schema.prisma, package.json, bin/seed-discovery-plans.ts, bin/db-setup.sh
Adds DiscoveryPlan.billingProviderSlug + index, backfills legacy nulls to 'daydream', seeds default discovery plans, and adds seed:discovery-plans npm script.
Types & schemas
apps/web-next/src/lib/orchestrator-leaderboard/types.ts
Exports BILLING_PROVIDER_SLUGS, BillingProviderSlug type and BillingProviderSlugSchema; adjusts plan schemas to accept nullish filters/SLA weights and include billingProviderSlug.
Plan CRUD logic
apps/web-next/src/lib/orchestrator-leaderboard/plans.ts
Adds billing-provider where helpers and list/get signatures, persists default 'daydream' on create, tightens mutation scoping, maps null JSON inputs to Prisma.JsonNull, and normalizes DB runtime shape.
Plans API parsing & results
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans*.ts, .../plans/[id]/results/route.ts
Parses/validates optional billingProviderSlug query param (400 on invalid), threads parsed slug into list/get/getPlanResults, flattens results response envelope, and sets DISCOVERY_RESPONSE_CACHE_CONTROL.
Tests & validation
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/*
Adds/updates tests for billingProviderSlug scoping, schema defaults/null handling, and cache/invalidate behaviors.

Python Gateway Discovery

Layer / File(s) Summary
Tiered discovery ordering
apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.ts, .../discovery-order.test.ts
Implements tier partitioning, effective tier count, and tier-local shuffling while preserving first-seen deduplication.
Plan-scoped gateway
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts, tests
New GET returning bare JSON array of addresses: authorize, optional slug validation, resolve provider-allowed capabilities, evaluate/cached plan, dedupe/truncate/shuffle addresses, respond with cache/refresh headers.
Stateless gateway
apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts, tests
New GET supporting repeated caps/topN/provider scoping: parse params, provider-filter capabilities, fetch leaderboards, dedupe/trim/tier-shuffle, return bare JSON array with headers and error text handling.

Capability Catalog and ClickHouse

Layer / File(s) Summary
Provider restrictions
apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts, tests
Normalizes billing provider slugs (Daydream-only), returns constant revision, and currently treats Daydream as pass-through for capability filtering.
Capability catalog route
apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts
GET /capability-catalog: provider-scoped pipeline/model filtering, manifestOnly branch, optional cache bypass, and DISCOVERY_RESPONSE_CACHE_CONTROL.
ClickHouse query helpers
apps/web-next/src/lib/orchestrator-leaderboard/query.ts, tests
Adds resolveClickhouseQueryTarget and buildOrchestratorClickhouseFetchParams, refactors fetchFromClickHouse to use them with timeout/AbortSignal, and switches to replaceAll for capability placeholder.
Source adapters
apps/web-next/src/lib/orchestrator-leaderboard/sources/clickhouse.ts, .../naap-pricing.ts, tests
Adds parseCapabilityRows helper, direct-target branch, and normalizes NaaP rows; adds integration/unit tests for direct mode.

Caching, Discovery Policy, and Startup

Layer / File(s) Summary
Discovery policy
apps/web-next/src/lib/orchestrator-leaderboard/discovery-policy.ts
Adds DiscoveryPolicy types and applyDiscoveryPolicyToOrchestrators to filter/sort/truncate orchestrator rows.
Discovery constants
apps/web-next/src/lib/orchestrator-leaderboard/discovery-constants.ts
Adds DISCOVERY_RESPONSE_CACHE_CONTROL and fingerprintCapabilityList for cache keys.
Plan evaluation cache keying
apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts, tests
Introduces composite cache key (plan id + billingProviderSlug + providerRevision + fingerprint), resolves provider capabilities before cache lookup, writes under composite keys, and updates prefix-based cache scan/invalidate.
Startup refresh
apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts, apps/web-next/src/instrumentation.ts
Adds refreshGlobalDatasetOnStartup with in-flight dedupe and freshness check; invokes on server startup and logs outcome.

Security and Misc

Layer / File(s) Summary
Cron and admin auth hardening
apps/web-next/src/app/api/v1/orchestrator-leaderboard/filters/route.ts, .../dataset/refresh/route.ts, .../sources/route.ts
Uses timingSafeEqual for cron secret comparison, sets Authorization header only when token present, and splits missing-user (401) vs missing-admin-role (403) checks.
Docs, OpenAPI, tests config, scripts
plugins/.../docs/*, plugins/.../docs/openapi.yaml, vitest.config.ts, bin/*
Updates OpenAPI (python-gateway endpoints, BillingProviderSlug param and schemas), docs/how-to/for-ai, Vitest inline deps, db-setup and vercel build scripts, and example env for PymtHouse.

Possibly related PRs

  • livepeer/naap#227: Related pipeline-catalog/resolver refactor and caching work.
  • livepeer/naap#213: Prior changes touching plan CRUD and billingProviderSlug scaffolding.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • seanhanca
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/discovery-plan-billing-provider

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts (1)

164-187: ⚡ Quick win

Add a getPlan(..., 'pymthouse') compatibility test for legacy null rows.

You already cover this fallback in listPlans; adding the same assertion for getPlan would protect the special-case behavior from regressions on the detail endpoint.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts`
around lines 164 - 187, Add a new test that calls getPlan('pub-1', {
ownerUserId: 'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with
a where clause that applies the provider compatibility fallback (i.e., the
billingProviderSlug condition allows legacy null rows). Reuse publicPlan and
mockFindFirst from the existing test, mirroring the provider-scoped assertion
but expect the provider predicate to include billingProviderSlug: null (or an OR
that allows null) for the 'pymthouse' case so the special-case behavior is
covered on the detail endpoint.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/route.ts:
- Around line 41-45: The check `if (!raw)` treats an empty query
`?billingProviderSlug=` as missing and bypasses validation; change the check to
`raw === null` so empty strings still reach
`BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())`, allowing the
route handler to return a 400 on invalid/empty values; update the logic around
the `raw` variable in route.ts (the block that calls
BillingProviderSlugSchema.safeParse) accordingly to ensure only a truly absent
param is treated as missing.

In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts`:
- Around line 24-29: The current check uses if (!raw) which treats an empty
query value as absent and bypasses validation; change that to a strict null
check (if (raw === null)) so only missing params return { value: null, error:
null }, and let BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())
handle empty-string validation; update the conditional around raw and keep
references to the same variables (raw) and schema
(BillingProviderSlugSchema.safeParse) so empty ?billingProviderSlug= is treated
as invalid.

In `@apps/web-next/src/lib/orchestrator-leaderboard/plans.ts`:
- Around line 162-167: The early return when scopeWhere is falsy blocks admin
actions on public plans; in updatePlan and deletePlan adjust the logic around
writeScopeWhere(scope) so that if the existing plan's visibility === 'public'
you do not immediately return 'forbidden' but still construct mutationWhere to
allow admin public-plan mutations (i.e., compute scopeWhere =
writeScopeWhere(scope) only for non-public plans, and for public plans let
mutationWhere be { id } so the public-plan admin branch can apply); apply the
same pattern to both updatePlan and deletePlan, referencing writeScopeWhere,
existing.visibility, mutationWhere, updatePlan and deletePlan to locate the
changes.

---

Nitpick comments:
In
`@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts`:
- Around line 164-187: Add a new test that calls getPlan('pub-1', { ownerUserId:
'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with a where
clause that applies the provider compatibility fallback (i.e., the
billingProviderSlug condition allows legacy null rows). Reuse publicPlan and
mockFindFirst from the existing test, mirroring the provider-scoped assertion
but expect the provider predicate to include billingProviderSlug: null (or an OR
that allows null) for the 'pymthouse' case so the special-case behavior is
covered on the detail endpoint.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f87ff80-6fab-45d6-b0ec-65d538bd0093

📥 Commits

Reviewing files that changed from the base of the PR and between 84fa791 and bf645e6.

📒 Files selected for processing (9)
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/types-validation.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/plans.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/types.ts
  • bin/seed-discovery-plans.ts
  • package.json
  • packages/database/prisma/schema.prisma

Comment thread apps/web-next/src/lib/orchestrator-leaderboard/plans.ts Outdated
…tions

Treat only absent billingProviderSlug query params as unset so empty
values return 400. Allow admin public-plan updates/deletes without a
personal write scope. Add getPlan pymthouse null-slug compatibility test.

Co-authored-by: Cursor <cursoragent@cursor.com>
Move orchestrator-leaderboard backend, manifest gating, python-gateway
routes, discovery-service client, plugin UI, and docs from #307 into
this PR. Excludes developer-api and billing OIDC/auth changes.

Includes capability catalog, tiered discovery shuffle, default/plan-scoped
python-gateway, PymtHouse manifest sync, and removal of demo seed API.

Co-authored-by: Cursor <cursoragent@cursor.com>
@eliteprox eliteprox changed the title feat(discovery): link plans to billing providers feat(orchestrator-leaderboard): discovery plans and PymtHouse manifest gating May 23, 2026
@eliteprox eliteprox removed the size/L Large PR (201-500 lines) label May 23, 2026
@github-actions github-actions Bot added the size/XL Extra large PR (500+ lines) label May 23, 2026
@eliteprox
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@eliteprox eliteprox requested a review from seanhanca May 23, 2026 06:35
Copy link
Copy Markdown
Contributor

@seanhanca seanhanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — feat(orchestrator-leaderboard): discovery plans and PymtHouse manifest gating

Summary

The three CodeRabbit-flagged fixes all land correctly — empty ?billingProviderSlug= is rejected by zod, and updatePlan / deletePlan now branch on existing.visibility === 'public' so admins no longer fail when their personal scope is empty. The remaining concerns are a non-obvious fail-closed gap for legacy billingProviderSlug = null rows and a chunk of dead code in discovery-service/client.ts. With those two resolved this is a solid feature.


Blocking issues (must fix)

1. Legacy billingProviderSlug = null plans bypass PymtHouse manifest gating end-to-end

plans.ts:74-78 deliberately maps ?billingProviderSlug=pymthouse to OR: [{ slug: 'pymthouse' }, { slug: null }] so legacy rows are queryable as pymthouse-compatible. But the runtime gating treats them as "no provider":

// apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts:34-43
if (normalizeBillingProviderSlug(billingProviderSlug) !== 'pymthouse') {
  return capabilities;  // bypasses manifest filter for null
}
// apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts:111-120
if (normalizeBillingProviderSlug(plan.billingProviderSlug) === 'pymthouse') {
  await ensurePymthouseManifestFresh(...);
}

Net effect: a personal plan created before this PR (slug null) returned via /plans/:id/python-gateway will (a) skip the manifest refresh and (b) pass its capabilities through verbatim — i.e. excluded pipelines/models will still be discovered. That contradicts the documented invariant in docs/pymthouse-integration.md:90 ("missing manifest denies by default").

Fix: either treat null as pymthouse in normalizeBillingProviderSlug (so the default-pymthouse semantic applies everywhere), or backfill billingProviderSlug = 'pymthouse' for existing rows during prisma db push (a one-shot UPDATE in bin/seed-discovery-plans.ts would be enough — createPlan already defaults new rows to 'pymthouse').

2. apps/web-next/src/lib/discovery-service/client.ts is dead code

No callers anywhere in the repo (verified queryPlanFromDiscoveryService, refreshDiscoveryServiceDataset, isDiscoveryServiceEnabled, and DISCOVERY_SERVICE_URL). It exports a 158-line HTTP client with no tests. Either wire it into refresh.ts / global-refresh.ts and add a vitest, or delete it from this PR. Shipping it unused increases the surface that has to be kept in sync with a service this codebase doesn't actually call.


Important (should fix before merge)

1. Stale doc reference to the removed POST /plans/seed

// plugins/orchestrator-leaderboard/docs/for-ai.md:65
| `POST` | `/plans/seed` | JWT or `gw_` | Seed 4 demo plans (idempotent). Onboarding. |

The route was deleted in this PR; replace this row with a pointer to npm run seed:discovery-plans / bin/seed-discovery-plans.ts.

2. openapi.yaml is not updated for any of the new surface

Searches in plugins/orchestrator-leaderboard/docs/openapi.yaml for billingProviderSlug, python-gateway, and capability-catalog all return 0 hits. The new query param on list/get and the three new routes (/plans/:id/python-gateway, /python-gateway, /capability-catalog) need OpenAPI entries — this is the contract the PR description points external consumers at.

3. capability-catalog/route.ts:89 always forces bypassCache: true

const catalog = await getDashboardPipelineCatalog({ bypassCache: true });

Combined with Cache-Control: private, no-store, must-revalidate, every plan-edit page open recomputes the entire pipeline catalog. Drop bypassCache (or move the bypass behind ?refresh=1) so the facade-level cache helps the hot UI path.

4. Inconsistent empty-slug handling on the python-gateway plan route

// apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts:32-35
const raw = request.nextUrl.searchParams.get('billingProviderSlug');
if (!raw) {
  return { value: null, error: null };
}

The fixed CRUD/list routes use if (raw === null) so empty strings 400. This route uses if (!raw) so ?billingProviderSlug= is silently treated as missing. Match the strict 400 behavior (preferred) or document the asymmetry — otherwise python-gateway clients sending the param will silently fall back to "no provider override" instead of getting a clear validation error.


Migration & rollout

Schema change is safe (billingProviderSlug String? + nullable index, prisma db push --accept-data-loss will ALTER TABLE without nuking existing rows). Personal plans created before this PR will land with null — see Blocking #1; consider adding a one-time UPDATE "DiscoveryPlan" SET "billingProviderSlug" = 'pymthouse' WHERE "billingProviderSlug" IS NULL to the seed script so the runtime invariant holds without code changes to the gating helpers.

.env.local.example additions reference (commented-out) PYMTHOUSE_* knobs only — no secrets, and PYMTHOUSE_ALLOW_MISSING_MANIFEST_FAIL_OPEN is correctly documented as audited and opt-in.


Verdict

REQUEST_CHANGES — primarily for the legacy-null PymtHouse-gating gap (blocking #1) and the unused discovery-service/client.ts (blocking #2). Once those are resolved and the docs/openapi update lands, the rest are happy "important" cleanups that can ride along.

…points

- Updated the capability catalog route to support a `refresh` query parameter for bypassing cache.
- Added a test to validate 400 responses when `billingProviderSlug` is empty.
- Modified the `parseBillingProviderSlugParam` function to treat null values correctly.
- Implemented a backfill function for legacy billing provider slugs in the seeding script.
- Introduced new API endpoints for plan-scoped and default python-gateway discovery, along with capability catalog retrieval.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/orchestrator-leaderboard/docs/openapi.yaml (1)

69-78: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the new 400 path for invalid billingProviderSlug.

These operations now accept billingProviderSlug, and the shared parameter explicitly says an empty value returns 400. The spec still only advertises 200/401(/404), so generated clients and external integrators will miss a real validation branch.

Also applies to: 136-154

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/orchestrator-leaderboard/docs/openapi.yaml` around lines 69 - 78, The
OpenAPI operations that reference the shared parameter BillingProviderSlugQuery
must include a "400" response to document the empty/invalid billingProviderSlug
validation branch; update the responses block for the affected operations (the
one shown and the other occurrence around lines 136-154) to add a "400" response
that references the existing bad-request response/schema (e.g.,
components/responses/BadRequest or a suitable error schema) so generated clients
and integrators will surface the validation error for BillingProviderSlugQuery.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts`:
- Around line 62-67: When handling requestedBillingProviderSlug === 'pymthouse'
wrap the call to ensurePymthouseManifestFresh() in a try/catch so that any
exception is swallowed (and optionally logged) and manifestAvailable is set to
false instead of letting the error propagate; after the try/catch still call
getPymthouseManifestSnapshot() only when no error occurred to determine
manifestAvailable, referencing ensurePymthouseManifestFresh(),
getPymthouseManifestSnapshot(), and the manifestAvailable variable to locate
where to add the try/catch.

In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/results/route.ts:
- Around line 29-31: The current check uses `if (!raw)` which treats an empty
query like `?billingProviderSlug=` as missing; change the validation so you
distinguish null vs empty string: keep the null check (`raw === null`) for
missing param, and add an explicit empty-string check (e.g., `raw.trim() ===
''`) to reject empty values and return a 400/validation error. Update the logic
around the `raw` variable in route.ts (the `billingProviderSlug` handling) so
empty strings are rejected rather than treated as missing.

In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts`:
- Around line 145-148: The handler currently returns the raw variable message in
the NextResponse for all errors; change it so that when invalidCapability is
true you still return message with status 400, but when status is 500 return a
generic non-sensitive message (e.g., "Internal server error") instead of the raw
message; keep the original message available for server-side logging (e.g., log
the error before returning) and update the NextResponse creation (referencing
NextResponse, message, invalidCapability) accordingly.
- Around line 87-89: Call to ensurePymthouseManifestFresh() is currently outside
the try/catch so any rejection bypasses your structured API error handling; move
the ensurePymthouseManifestFresh() invocation into the existing try block that
handles the 'pymthouse' billingProvider (the same guarded path that creates/uses
the Python gateway client in route.ts) so its errors are caught and translated
into your API error response rather than escaping the handler.

In `@apps/web-next/src/lib/orchestrator-leaderboard/query.ts`:
- Around line 56-58: The current resolveDirectClickhouseUrl function drops any
configured path/query by doing new URL('/', rawUrl); change it to preserve the
original path and query from CLICKHOUSE_URL: parse rawUrl with new URL(rawUrl),
ensure the pathname ends with a trailing slash if needed (so endpoints like
base/path become base/path/), and return that URL.toString(); update the
function resolveDirectClickhouseUrl to use this logic so direct mode calls the
correct path-prefixed endpoint.

In `@apps/web-next/src/lib/pymthouse-discovery-plans.test.ts`:
- Around line 35-37: The test sets inconsistent env vars using PMTHOUSE_*;
update the vi.stubEnv calls so all keys match the rest of the integration (use
PYMTHOUSE_*). Specifically, change vi.stubEnv('PMTHOUSE_CLIENT_ID', 'app_x') to
vi.stubEnv('PYMTHOUSE_CLIENT_ID', 'app_x') and
vi.stubEnv('PMTHOUSE_M2M_CLIENT_SECRET', 'secret') to
vi.stubEnv('PYMTHOUSE_M2M_CLIENT_SECRET', 'secret'), and then run the test to
ensure no other references still use the old PMTHOUSE_* names.

In `@apps/web-next/src/lib/pymthouse-manifest.test.ts`:
- Around line 114-121: The test "uses server manifestVersion when provided" is
asserting the opposite; update the assertion in the test that calls
computeManifestRevision so it expects the returned revision to equal the
provided server manifestVersion ('server-rev-abc') and, if helpful, rename the
test or adjust its description to remain consistent with the behavior being
validated; target the computeManifestRevision call and its expect(...) assertion
for this change.

In `@apps/web-next/src/lib/pymthouse-manifest.ts`:
- Around line 245-265: The current fetch catch block clears any existing
snapshot by setting body and etag to null before calling applyManifestSnapshot,
which causes transient failures to wipe the last good manifest; change the logic
to preload the last known snapshot into body/etag before attempting the network
call (e.g. call a helper like getLastManifestSnapshot() to populate body and
etag), then only overwrite body/etag when the fetch succeeds and
parseManifestJson returns a valid manifest, and in the catch block do not set
body/etag to null so applyManifestSnapshot(body, etag) will preserve the
previous snapshot; reference parseManifestJson, applyManifestSnapshot and the
fetch block around creds.url when making the change.

In `@plugins/orchestrator-leaderboard/docs/how-to-guide.md`:
- Around line 451-452: The key-takeaway line uses the wrong path
`results.data.capabilities[...]`; update it to match the examples by referencing
`results.capabilities[...]` (or `env.data.capabilities[...]` where the doc uses
`env`) so clients parse the same structure; keep the note about `.orchUri` as
the dialer field.

In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 311-315: Change the query parameters "caps" and "capability" from
a scalar string to an array schema so generators emit repeatable query params;
update each param (e.g., the param named caps and the one named capability) to
use schema: { type: array, items: { type: string } } and include query
serialization hints (style: form, explode: true) so repeated usages like
?caps=a&caps=b are represented correctly.

In `@plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts`:
- Around line 260-262: The test asserts properties on resp.data.plan (e.g.,
expect(resp.data.plan.name)...), but the stubbed response still nests the
payload under resp.data.data.plan; update the mocked/stubbed response object
used in the spec so it returns the plan at resp.data.plan (remove the extra
"data" wrapper) and ensure capabilities are at resp.data.capabilities (so checks
like expect(resp.data.capabilities.noop).toHaveLength(1) pass); locate the stub
in the same spec where the fake response body is constructed and adjust the
shape accordingly.

In `@plugins/orchestrator-leaderboard/frontend/src/globals.css`:
- Around line 109-112: Remove the extra blank line between the background and
color declarations to satisfy the Stylelint rule declaration-empty-line-before:
inside the CSS rule that contains "background: var(--bg-tertiary);" and "color:
`#34d399`;", delete the empty line so the two declarations are adjacent (no blank
line before the color declaration).

In `@plugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsx`:
- Around line 125-134: bulkToggleCapabilities currently only compares exact
strings (using current.includes and removeSet) so legacy aliases like "sdxl"
remain selected when their canonical counterparts are removed; update
bulkToggleCapabilities to normalize/expand capabilities during comparison:
create or use a mapping of legacy aliases to canonical capability names (e.g.,
aliasMap) and when deselecting expand removeSet to include both the supplied
capability keys and their known aliases/canonical equivalents, then filter
current by checking against this expanded set (referencing
bulkToggleCapabilities, current, next, and removeSet to locate changes).

---

Outside diff comments:
In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 69-78: The OpenAPI operations that reference the shared parameter
BillingProviderSlugQuery must include a "400" response to document the
empty/invalid billingProviderSlug validation branch; update the responses block
for the affected operations (the one shown and the other occurrence around lines
136-154) to add a "400" response that references the existing bad-request
response/schema (e.g., components/responses/BadRequest or a suitable error
schema) so generated clients and integrators will surface the validation error
for BillingProviderSlugQuery.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3ceef9c9-0bbb-4e57-82b5-f15ae5ae6e10

📥 Commits

Reviewing files that changed from the base of the PR and between bf645e6 and 09d0fa7.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json, !package-lock.json
📒 Files selected for processing (68)
  • apps/web-next/.env.local.example
  • apps/web-next/src/app/api/v1/dashboard/orchestrators/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/dataset/refresh/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/filters/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.test.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.test.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/sources/route.ts
  • apps/web-next/src/instrumentation.ts
  • apps/web-next/src/lib/facade/index.ts
  • apps/web-next/src/lib/facade/resolvers/pipeline-catalog.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/provider-restrictions.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/query.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/refresh.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/__tests__/sources/integration.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.test.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/plans.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/query.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/sources/clickhouse.ts
  • apps/web-next/src/lib/orchestrator-leaderboard/sources/naap-pricing.ts
  • apps/web-next/src/lib/orchestrators-discovery-policy.test.ts
  • apps/web-next/src/lib/orchestrators-discovery-policy.ts
  • apps/web-next/src/lib/pymthouse-discovery-plans.test.ts
  • apps/web-next/src/lib/pymthouse-discovery-plans.ts
  • apps/web-next/src/lib/pymthouse-manifest.test.ts
  • apps/web-next/src/lib/pymthouse-manifest.ts
  • apps/web-next/tests/leaderboard-posture.spec.ts
  • apps/web-next/tests/orchestrator-leaderboard.spec.ts
  • apps/web-next/vitest.config.ts
  • bin/db-setup.sh
  • bin/seed-discovery-plans.ts
  • bin/vercel-build.sh
  • docs/pymthouse-integration.md
  • plugins/orchestrator-leaderboard/docs/data-sources.md
  • plugins/orchestrator-leaderboard/docs/for-ai.md
  • plugins/orchestrator-leaderboard/docs/how-to-guide.md
  • plugins/orchestrator-leaderboard/docs/openapi.yaml
  • plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts
  • plugins/orchestrator-leaderboard/frontend/src/App.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/AdminSettings.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/CapabilityGroupPicker.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/CapabilityTag.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/CollapsibleTagList.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/EndpointGuide.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/FormLabel.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/SectionLabel.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsx
  • plugins/orchestrator-leaderboard/frontend/src/globals.css
  • plugins/orchestrator-leaderboard/frontend/src/hooks/useCapabilityCatalog.ts
  • plugins/orchestrator-leaderboard/frontend/src/hooks/usePlanDetail.ts
  • plugins/orchestrator-leaderboard/frontend/src/hooks/usePlans.ts
  • plugins/orchestrator-leaderboard/frontend/src/lib/api.ts
  • plugins/orchestrator-leaderboard/frontend/src/pages/LeaderboardPage.tsx
  • plugins/orchestrator-leaderboard/frontend/src/pages/PlanCreatePage.tsx
  • plugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsx
  • plugins/orchestrator-leaderboard/frontend/src/pages/PlansOverviewPage.tsx
💤 Files with no reviewable changes (1)
  • apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.ts
✅ Files skipped from review due to trivial changes (6)
  • plugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsx
  • plugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsx
  • plugins/orchestrator-leaderboard/docs/data-sources.md
  • apps/web-next/tests/orchestrator-leaderboard.spec.ts
  • bin/vercel-build.sh
  • docs/pymthouse-integration.md

Comment thread apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts Outdated
Comment thread apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts Outdated
Comment thread apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts Outdated
Comment thread apps/web-next/src/lib/orchestrator-leaderboard/query.ts
Comment thread plugins/orchestrator-leaderboard/docs/openapi.yaml
Comment thread plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts
Comment thread plugins/orchestrator-leaderboard/frontend/src/globals.css
Comment thread plugins/orchestrator-leaderboard/frontend/src/lib/api.ts
eliteprox and others added 2 commits May 26, 2026 14:03
… routes

- Added error handling for PymtHouse manifest refresh failures in the capability catalog route.
- Improved validation logic for `billingProviderSlug` in the plans results route.
- Updated the Python gateway route to ensure manifest refresh is only attempted when necessary.
- Enhanced tests to cover scenarios where manifest fetch fails, preserving the last known good state.
- Refactored utility functions to improve clarity and maintainability.

Co-authored-by: Cursor <cursoragent@cursor.com>
…rator discovery plans

- Replaced PymtHouse manifest dependencies with Daydream-specific logic in the capability catalog and Python gateway routes.
- Updated billing provider handling to default to 'daydream' instead of 'pymthouse'.
- Introduced new discovery constants and policies to support Daydream-only functionality.
- Removed legacy PymtHouse manifest and discovery plan files, streamlining the codebase.
- Enhanced the orchestrator discovery policy to apply filters without relying on PymtHouse exclusions.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

@seanhanca seanhanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after Daydream-only pivot

Picking this back up after the 8 follow-up commits. Scope-pivot to Daydream-only is clean — PymtHouse manifest/discovery code is gone and BILLING_PROVIDER_SLUGS = ['daydream'] is enforced by Zod end-to-end. 5 of 6 prior blockers are fully resolved. Remaining issues below.


Prior-review blocker status

  1. Legacy billingProviderSlug = null plans bypassed PymtHouse manifest gating — RESOLVED. provider-restrictions.ts no longer does manifest gating; normalizeBillingProviderSlug only treats 'daydream' as valid; bin/seed-discovery-plans.ts:78-91 backfills legacy null rows at deploy time.
  2. Dead apps/web-next/src/lib/discovery-service/client.ts — RESOLVED. Directory deleted.
  3. Stale POST /plans/seed doc reference — RESOLVED. Tables in for-ai.md and how-to-guide.md:636-640 now point at npm run seed:discovery-plans. Route file deleted; no callers remain.
  4. openapi.yaml not updated for new endpoints/params — PARTIALLY RESOLVED. billingProviderSlug (openapi.yaml:796-803), python-gateway (openapi.yaml:252-375), and capability-catalog (openapi.yaml:377-417) are documented. Still wrong: the /plans/{id}/results response schema (openapi.yaml:237-243) keeps the old double-data wrapper — see Blocking #3.
  5. capability-catalog/route.ts:89 always forced bypassCache: true — RESOLVED. Now only bypasses when ?refresh=1 is present.
  6. Inconsistent empty-slug handling on /plans/[id]/python-gateway — RESOLVED. All four route handlers consistently 400 on empty value, and the python-gateway test asserts it (plans/[id]/python-gateway/route.test.ts:84-93).

Blocking issues (must fix)

1. Vitest is red — two tests fail in plans-visibility.test.ts

apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts:111-115 and :167-172 expect listPlans / getPlan to throw 'Invalid billingProviderSlug' when called with a blank slug ' '. The implementation in apps/web-next/src/lib/orchestrator-leaderboard/plans.ts:67-75 does no validation — it constructs { billingProviderSlug } from whatever string it receives and passes it to Prisma. Local run: Tests 2 failed | 137 passed | 1 skipped.

Fix: add a trim/empty-string guard inside billingProviderWhere (or at the call sites in listPlans / getPlan / listPlansWhere) that throws Error('Invalid billingProviderSlug') when a non-null/undefined value normalizes to ''. The tests cast ' ' as never precisely because the type alone doesn't enforce the runtime contract — fix the implementation, not the tests.

2. apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/refresh/route.ts:14-17 CRON auth still uses === string compare

Release notes claim "Hardened CRON auth with constant-time checks," but only dataset/refresh/route.ts:21-31 and filters/route.ts:27-37 use timingSafeEqual. The plans-refresh route is still:

function authorized(request: NextRequest): boolean {
  const auth = request.headers.get('authorization');
  return Boolean(process.env.CRON_SECRET) && auth === `Bearer ${process.env.CRON_SECRET}`;
}

Fix: replicate the timingSafeEqual pattern from dataset/refresh/route.ts:21-31 (length short-circuit + buffered compare). Extract into a shared verifyCronAuth(request) helper so all CRON entry points share the same hardened check and the next addition can't drift.

3. /plans/{id}/results response-shape change is undocumented and contradicts for-ai.md + openapi.yaml

apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.ts:83 now returns success(withPlanMeta) — single-wrapped { success, data: PlanResults }. Pre-PR it returned success({ data: withPlanMeta }) — double-wrapped. how-to-guide.md was updated correctly, but:

  • plugins/orchestrator-leaderboard/docs/for-ai.md:36-37 still says "the payload is double-nested: body.data.data.capabilities[<cap>][]. This is intentional, not a bug."
  • for-ai.md:265-269 reference impl reads out.data from a { data: PlanResults } wrapper.
  • for-ai.md:309 pitfalls table still tells readers "Path is body.data.data.capabilities — double .data."
  • plugins/orchestrator-leaderboard/docs/openapi.yaml:237-243 still defines the response as data: { data: PlanResults }.

This is a backward-incompatible break for any client following the docs. Fix: either revert to the double-wrap on the server (low-risk option) or update all four doc sites + the openapi schema in this PR, add a "Breaking change" note to the PR description, and pin the new shape with a regression test (body.data.capabilities defined, body.data.data undefined).


Important (should fix before merge)

  1. plans/[id]/python-gateway/route.ts:129-135 leaks raw error messages in 500 responses. The default /python-gateway route correctly returns 'Internal server error' and logs server-side (python-gateway/route.ts:131-148). The plan-scoped sibling returns err instanceof Error ? err.message : 'Failed to evaluate plan' verbatim. Match the default route's pattern.

  2. dataset/refresh/route.ts:87-93 and sources/route.ts:112-118 also surface err.message in 500 envelopes. Lower severity (admin-only), but the generic-message + server-side console.error pattern should apply across the plugin.

  3. useCapabilityCatalog.ts:30-48 is dead code in this PR. The hook type is BillingProviderSlug = 'daydream' (line 10), so the guard if (billingProviderSlug !== 'pymthouse' || …) is always true and loadManifest always returns early. Drop the manifest… state + first effect (it's all dead until PR #338), or restore the manifest plumbing now. Leaving dead branches makes the next PR riskier.

  4. provider-restrictions.ts:17-22 providerRestrictionRevision accepts a parameter it ignores. Footgun for the cache-key contract in refresh.ts:31-36 (rev will always be 'na'). Remove the parameter or drop the cache-key field.

  5. packages/database/prisma/schema.prisma:2250 doc comment is stale. Still says "pymthouse | daydream — PymtHouse Builder allowlist intersection applies when pymthouse." Update to "daydream" (or note that other slugs land in #338).

  6. openapi.yaml:858 Capability pattern is ^[a-zA-Z0-9_-]+$. Actual Zod regex in types.ts:96 is /^[a-zA-Z0-9_.:\-/]+$/ and types-validation.test.ts:39-50 confirms dots, colons, and / are accepted. OpenAPI under-specifies; generated clients will reject valid plans.

  7. openapi.yaml:882-884 documents SLAWeights defaults 0.4 / 0.3 / 0.3. The Zod schema (types.ts:106-110) sets no defaults. Either remove the defaults from the spec or add them in Zod.


Nits / suggestions

  • plans-visibility.test.ts:111 casts ' ' as never for a reason — once the implementation throws, the cast can stay but documents the runtime contract.
  • Slug validation drifts between routes: plans/[id]/results/route.ts:35-38 tests normalized === '' explicitly, while plans/route.ts:29-33 and friends rely on BillingProviderSlugSchema.safeParse('') failing. Consolidate into a single parseBillingProviderSlugParam utility.
  • provider-restrictions.test.ts:11-13 has an empty afterEach with only a comment. Delete the hook.
  • python-gateway/route.ts:75-78 accepts both billingProviderSlug and billingProvider aliases via normalizeBillingProviderSlug, but the plan-scoped python-gateway only accepts billingProviderSlug and 400s otherwise. Document the asymmetry inline.
  • plans/[id]/python-gateway/route.ts:89 always sets X-Pymthouse-Manifest: empty on the empty-allowlist branch — there is no PymtHouse manifest in this PR. Drop or rename the header.
  • discovery-policy.ts:49-64: the case 'latency': / case 'price': arms fall back to slaScore. Add an inline comment or TODO so the silent fallback isn't surprising once those fields exist.
  • for-ai.md:73-77 lists POST /plans/refresh as "Cron-only bulk refresh. Never call." — accurate, but cross-link the openapi entry (openapi.yaml:419-452) so it doesn't drift again.

Test coverage gaps

  • No test pins the new single-wrapped shape of /plans/{id}/results (would have caught the doc/code drift in Blocking #3).
  • No test verifies that pre-existing rows with billingProviderSlug = null become invisible when the UI requests ?billingProviderSlug=daydream, or that the seed backfill rewrites them.
  • No test for /plans/refresh/route.ts: missing CRON_SECRET → 401, wrong secret → 401, valid secret → 200 (would have caught Blocking #2).
  • No test asserts the capability-catalog route only bypasses cache when ?refresh=1 is present.
  • No test for capability-catalog/route.ts error paths (Facade throws). The route currently lacks a try/catch around getDashboardPipelineCatalog — an upstream failure surfaces as Next's default error page, not a JSON envelope.
  • discovery-order.test.ts is solid for partitioning but doesn't assert all orchestrators appear exactly once when duplicates are trimmed mid-input (discovery-order.ts:65-74 dedupe).
  • bin/seed-discovery-plans.ts has no test for backfillLegacyBillingProviderSlugs — given it mutates production rows at build time, an idempotency test (run twice → no change after first) is warranted.

Migration & rollout notes

  • DiscoveryPlan.billingProviderSlug is nullable + indexed (schema.prisma:2247-2272), so prisma db push is non-destructive.
  • Backfill runs at deploy in bin/seed-discovery-plans.ts:78-99 via npx tsx, scheduled in bin/vercel-build.sh:108-111. Runs after prisma db push, so legacy null rows are immediately rewritten to 'daydream'. If the seed crashes mid-loop, partial state is possible — but the next deploy is idempotent.
  • During the deploy window (schema push → seed → new pods take traffic), old pods still query without the new column and don't care about NULL slugs. No read path in this PR breaks on NULL after the backfill completes.
  • CreatePlanSchema defaults to 'daydream' and the UI hardcodes 'daydream' (PlanCreatePage.tsx:25); post-deploy plans start with the correct slug.
  • For PR #338 (PymtHouse): the seed script will need to learn that null'daydream' is no longer always correct, otherwise future PymtHouse imports get silently coerced.
  • The /plans/{id}/results envelope change is breaking for external SDKs reading body.data.data.capabilities. Surface in the PR description and release notes; consider a one-release deprecation window if external consumers exist.

Verdict

REQUEST_CHANGES — two failing tests in plans-visibility.test.ts, plain === CRON auth in /plans/refresh/route.ts, and the undocumented breaking change to the /plans/{id}/results envelope (with for-ai.md and openapi.yaml still describing the old shape) all need to land before merge. The Daydream-only pivot is the right call and resolves the structural concerns from the previous round.

- Introduced a check to throw an error if billingProviderSlug is an empty string, ensuring better error handling and input validation in the billing provider logic.
eliteprox added 2 commits May 26, 2026 19:57
- Replaced inline CRON authorization checks with a dedicated `verifyCronAuth` function for improved code clarity and maintainability.
- Updated affected routes to utilize the new authorization method, ensuring consistent authentication handling across the application.
- Enhanced error logging for better debugging in case of authorization failures.
…d update data retrieval

- Removed references to the 'pymthouse' billing provider, ensuring the capability catalog is now exclusively focused on 'daydream'.
- Simplified the logic for filtering capabilities, allowing all requested capabilities to pass through unfiltered when manifestOnly is true.
- Updated the data retrieval process to utilize the new `getDatasetCapabilities` function, enhancing the clarity and maintainability of the code.
- Adjusted the seeding script to backfill billing provider slugs, ensuring legacy plans are correctly updated to 'daydream'.
Copy link
Copy Markdown
Contributor

@seanhanca seanhanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — Critical & Major Findings

Reviewed the full diff across all 72 changed files. CI is green and the architecture is solid — Daydream-only scoping, DB-persisted dataset, lazy plan cache, and provider-restriction abstraction are all clean decisions. Below are the issues that should be addressed before merge.

Summary

# Severity File Issue
1 Critical global-dataset.ts writeGlobalDataset deletes all rows before inserting — concurrent reads during refresh window get empty results
2 Major refresh.ts evaluateAndCache / refreshAllPlans accept authToken, requestUrl, cookieHeader params that are never forwarded — dead parameters create a false contract
3 Major plans/[id]/results/route.ts errors.internal(message) on line 90 leaks raw error text from evaluateAndCache to the client on 500
4 Major global-refresh.ts startupRefreshPromise cached-but-cleared-on-reject creates a duplicate refresh race
5 Major cron-auth.ts Length check before timing-safe compare leaks CRON_SECRET length

Detailed inline comments follow.

Comment thread apps/web-next/src/lib/orchestrator-leaderboard/cron-auth.ts Outdated
Comment thread apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts Outdated
Comment thread apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts
Comment thread apps/web-next/src/lib/orchestrator-leaderboard/plans.ts Outdated
Comment thread apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts
…r dataset updates

- Implemented a new approach for updating the global dataset that uses upsert operations followed by pruning stale rows, ensuring concurrent readers always see a populated table.
- Introduced a dedicated `upsertDatasetBatch` function to handle batch upserts, improving performance and maintainability.
- Updated related tests to reflect changes in the dataset writing logic and ensure proper validation of upsert behavior.
- Removed legacy delete-then-insert logic to prevent empty table states during refresh operations.
@eliteprox
Copy link
Copy Markdown
Contributor Author

All 5 findings from this review (plus the inline pymthouse dead-branch note) are addressed in the latest push. Summary:

# Issue Resolution
1 writeGlobalDataset delete-then-insert window Switched to upsert-then-prune: every row is upserted via INSERT … ON CONFLICT (capability, orchUri) DO UPDATE with the new refreshedAt, then a single deleteMany({ refreshedAt: { lt: now } }) prunes stragglers. Table is never empty mid-refresh.
2 Dead authToken/requestUrl/cookieHeader on evaluateAndCache/refreshAllPlans Removed the unused parameters and updated all 4 callers (plans/[id]/results, plans/[id]/python-gateway, plans/refresh, startLocalRefreshLoop) plus tests. refreshGlobalDataset keeps authToken because it actually does forward it to source adapters.
3 Raw error text leaked to client on 500 Log err server-side, return generic Failed to evaluate plan from errors.internal(...).
4 startupRefreshPromise cleared-on-reject duplicate-refresh race Rejected promise is held for a 30 s STARTUP_REJECT_COOLDOWN_MS cooldown via setTimeout(...).unref?.() so concurrent cold-start callers share the failure; only then can a fresh attempt run. inflight.catch(() => undefined) suppresses spurious unhandled-rejection warnings while leaving the original rejection observable to awaiters.
5 cron-auth length-leak before timing-safe compare Both auth and expected are HMAC-SHA256'd before timingSafeEqual — length is no longer observable via timing.
(bonus) Dead pymthouse branch in plans.ts billingProviderWhere Removed — BillingProviderSlug is ['daydream'] only in this PR. Will return in #338 with test coverage.

Verification: npm --workspace @naap/web-next test705 passed, 1 skipped, 0 failed. Tests for writeGlobalDataset updated to assert the new $executeRaw upsert + deleteMany({ refreshedAt: { lt: ... } }) contract.

@eliteprox
Copy link
Copy Markdown
Contributor Author

@qianghan @seanhanca This is ready for re-review

@eliteprox eliteprox removed the request for review from qianghan May 29, 2026 18:05
@eliteprox eliteprox disabled auto-merge May 29, 2026 18:06
@eliteprox eliteprox enabled auto-merge (squash) May 29, 2026 18:06
@eliteprox eliteprox merged commit 7b84f1d into main May 29, 2026
27 checks passed
@eliteprox eliteprox deleted the feat/discovery-plan-billing-provider branch May 29, 2026 18:07
eliteprox added a commit that referenced this pull request May 29, 2026
Rebased onto main (post #337 "Daydream discovery plans and cache refresh").
Re-applies PymtHouse integration on top of the canonical Daydream discovery
work that landed in #337:

- PymtHouse OIDC device-flow login + device approval routes (consent screen,
  CSRF, constant-time cookie verify, per-instance rate limiting)
- Billing token/usage APIs and usage helpers via @pymthouse/builder-sdk
- Developer API key expiry, billing key prefix masking, safe key serialization
- developer-api plugin UI (usage, keys, discovery plan tooltip) + backend proxy
- PymtHouse manifest gating restored across discovery: provider-restrictions,
  capability-catalog, python-gateway, plan refresh, global refresh
- billingProviderSlug supports 'pymthouse' alongside 'daydream' (OR-null
  backfill for legacy rows); defaults to pymthouse
- UMD plugin stylesheet lifecycle, Modal escape-to-close, plugin visibleToUsers

Conflict resolution preserved #337's Daydream discovery baseline and layered
the branch's PymtHouse functionality on top. The plugin frontend (identical to
main except 5 files) takes the branch versions of the PymtHouse-aware pages.

Co-authored-by: Cursor <cursoragent@cursor.com>
eliteprox added a commit that referenced this pull request May 29, 2026
…#338)

feat(pymthouse): add OIDC, billing APIs, and manifest-gated discovery

Re-apply the PymtHouse integration on top of the canonical Daydream discovery
baseline from #337. Add OIDC device-flow login and approval routes, billing
token/usage APIs, developer API key handling, developer-api plugin UI/backend
proxy, and PymtHouse manifest gating across discovery routes and refresh flows.

Restore billingProviderSlug support for both pymthouse and daydream, including
legacy OR-null handling and pymthouse defaults. Also harden sensitive API
responses with safer error handling and no-store caching, align API key status
serialization to uppercase, update jsdom-safe manifest tests, sync the lockfile,
and clean up UMD stylesheet lifecycle handling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

has-migration Includes database migration scope/infra Infrastructure changes scope/packages Shared package changes scope/shell Shell app changes size/XL Extra large PR (500+ lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants