Skip to content

fix: enforce subpath match for staging domain validation#2590

Open
dhavkuma wants to merge 2 commits into
fix/LLMO-4579-subpath-guard-cdn-routingfrom
fix/subpath-staging-domain-validation
Open

fix: enforce subpath match for staging domain validation#2590
dhavkuma wants to merge 2 commits into
fix/LLMO-4579-subpath-guard-cdn-routingfrom
fix/subpath-staging-domain-validation

Conversation

@dhavkuma

Copy link
Copy Markdown

Summary

  • areDomainsSameAsBase previously only checked the registrable domain (e.g. nba.com), so a staging URL like staging.nba.com was accepted for a prod site at nba.com/kings
  • Added getPathname helper that extracts and normalises the pathname from both URLs (handles bare hostnames, trailing slashes)
  • areDomainsSameAsBase now also requires pathname(stage) === pathname(prod) — root-domain prod sites are unaffected (both sides normalise to /)

Before: staging.nba.com accepted for prod nba.com/kings
After: only staging.nba.com/kings accepted for prod nba.com/kings

Test plan

  • 4 new unit tests added covering: no-subpath stage vs subpath prod (400), wrong subpath (400), correct subpath (200), root vs root (200)
  • All 23 createOrUpdateStageEdgeConfig tests pass
  • Full npm test passes (exit 0)

Related

Companion UI PR: adobe/project-elmo-ui (same branch name)

🤖 Generated with Claude Code

…h prod sites

`areDomainsSameAsBase` previously only checked the registrable domain
(e.g. nba.com), so a staging domain with the wrong subpath—or no
subpath at all—was accepted for a subpath prod site.

Now `getPathname` extracts the pathname from both the prod URL and each
staging URL (normalising trailing slashes) and requires them to match.
Root-domain prod sites are unaffected: both sides resolve to `/`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/controllers/llmo/llmo.js 89.47% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@dhavkuma dhavkuma requested a review from MysticatBot June 11, 2026 06:02

@MysticatBot MysticatBot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @dhavkuma,

Verdict: Request changes - one robustness gap in the new getPathname helper needs addressing.
Changes: Adds subpath matching to staging domain validation and a CDN routing guard that rejects subpath sites from host-level auto-routing (2 files).

Must fix before merge

  1. [Important] getPathname returns '/' on unparseable input, making parse failure indistinguishable from a legitimate root path - src/controllers/llmo/llmo.js:1865 (details inline)
Non-blocking (3): minor issues and suggestions
  • nit: Add a test exercising a malformed baseURL reaching the CDN guard catch block - fixes the codecov/patch failure and documents the intended fall-through behavior - src/controllers/llmo/llmo.js:1495
  • nit: The trailing-slash test asserts status !== 400 but the guard returns 200 when it rejects - assert the response body does NOT include "subpath sites" instead, which directly proves the guard did not fire - test/controllers/llmo/llmo.test.js:6281
  • suggestion: The CDN guard duplicates the "prepend https:// if missing then new URL()" pattern that getPathname already encapsulates - consider calling getPathname(baseURL) !== '/' instead of re-implementing inline - src/controllers/llmo/llmo.js:1486

Comment thread src/controllers/llmo/llmo.js Outdated
try {
return new URL(`https://${urlString}`).pathname.replace(/\/$/, '') || '/';
} catch {
return '/';

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (blocking): getPathname returns "/" on unparseable input, making the return value ambiguous.

What: The inner catch { return '/'; } means any garbage string (empty, whitespace, protocol-only like "https://") normalizes to "/". When both prod and stage URLs are malformed, getPathname returns "/" for both and the pathname comparison passes vacuously. The only remaining guard is getDomain, which may or may not reject the same garbage.

Why it matters: The function has an unclear contract - callers cannot distinguish "this is a root path" from "I could not parse this at all." If getDomain ever gains a similarly lenient fallback, validation would pass for two broken URLs.

Fix: Return a sentinel (e.g. null) from the final catch and treat it as a non-match in areDomainsSameAsBase:

function getPathname(urlString) {
  try {
    return new URL(urlString).pathname.replace(/\/$/, '') || '/';
  } catch {
    try {
      return new URL(`https://${urlString}`).pathname.replace(/\/$/, '') || '/';
    } catch {
      return null;
    }
  }
}

// In areDomainsSameAsBase:
return urlList.every(
  (stageURL) => {
    const stagePath = getPathname(stageURL);
    return stagePath !== null && getDomain(stageURL) === prodDomain && stagePath === prodPath;
  },
);

This also makes the CDN guard clearer if you reuse it there.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b145ff1getPathname now returns null in the final catch. areDomainsSameAsBase guards against null before comparing paths, so two malformed URLs can no longer match vacuously. The new malformed-URL test covers the CDN guard catch block (fall-through behavior) and the trailing-slash test now asserts the response body excludes "subpath sites" rather than checking the status code.

@MysticatBot MysticatBot added the ai-reviewed Reviewed by AI label Jun 11, 2026
@dhavkuma dhavkuma marked this pull request as draft June 11, 2026 17:04
@dhavkuma dhavkuma changed the base branch from main to fix/LLMO-4579-subpath-guard-cdn-routing June 11, 2026 17:07
…ard tests

- getPathname now returns null instead of '/' for malformed URLs,
  making the parse-failure case distinguishable from a genuine root path
- areDomainsSameAsBase treats a null pathname as a non-match
- Trailing-slash test now asserts response body excludes 'subpath sites'
  instead of checking status (which the guard also returns as 200)
- New test covers the CDN guard catch block with a malformed baseURL,
  exercising the fall-through behaviour to the probe step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown

This PR will trigger a patch release when merged.

@dhavkuma dhavkuma marked this pull request as ready for review June 11, 2026 17:31
@dhavkuma dhavkuma requested a review from MysticatBot June 11, 2026 17:31
@dhavkuma dhavkuma requested review from MysticatBot and jindaliiita and removed request for MysticatBot June 15, 2026 06:05
* @param {string} prodBaseURL the production base URL to match against.
* @returns {boolean} true if all URLs share the same domain as prodBaseURL
*/
function getPathname(urlString) {

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to look for reusable function

return urlList.every(
(stageURL) => {
const stagePath = getPathname(stageURL);
return stagePath !== null && getDomain(stageURL) === prodDomain && stagePath === prodPath;

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider the cases

  1. override urls

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also onboard stagin domain command should not enforce this check yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-reviewed Reviewed by AI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants