🐛 Fixed stored XSS and mangled structured data in JSON-LD output by 9larsons · Pull Request #29013 · TryGhost/Ghost

9larsons · 2026-07-01T15:09:39Z

Summary

Ghost builds a schema.org object in core/frontend/meta/schema.js and injects it, JSON-serialised, into an inline <script type="application/ld+json"> block via ghost_head. Escaping was done per-field with escapeExpression (HTML-entity encoding), which was wrong for this context in two ways:

Stored XSS (the bug 🐛 Fixed stored XSS in structured data via unescaped tag names #28957 targets). Three Editor-controlled fields were missed entirely — tag name, post keywords, and site title. A tag named foo</script><script>alert(document.domain)</script> broke out of the inline script and executed for any anonymous visitor on tag/post/home pages.
Corrupted structured data. escapeExpression HTML-entity-encodes the JSON value, so legitimate content was mangled for the search engines that consume JSON-LD: a tag Tom & Jerry was serialised as Tom & Jerry, and "quoted" titles/URLs/sameAs links came out as " / = etc. This has been the behaviour on main for every already-"escaped" field (headline, description, author name, social URLs), not just the three missed ones.

Fix

Escape once, at the render sink, on the serialised JSON string — mirroring Ghost's existing json helper:

JSON.stringify(meta.schema, null, '    ')
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');

Neutralises the </script> breakout for every field, current and future — no more field-by-field allow-listing that the next new field can silently miss.
Uses JS-string unicode escapes, which JSON parsers decode back to the original characters, so structured data is correct again (Tom & Jerry round-trips to Tom & Jerry).
The now-redundant per-field escapeExpression calls are removed from schema.js, so names, URLs and keywords are no longer double/entity-encoded.

This supersedes #28957, which fixed the XSS at the field level but left (and slightly extended) the data-corruption problem — see the snapshot change there flipping a benign Tom & Jerry to Tom & Jerry.

Testing

Added a ghost_head regression test proving a </script> payload in a tag name produces no literal </script> in the JSON-LD block, yet still parses back to the original tag name.
Added an assertion that the benign Tom & Jerry keywords value round-trips through the rendered JSON-LD, guarding against re-introducing entity encoding.
Updated schema.test.js (now asserts raw values — escaping is the sink's job) and the ghost-head snapshot (" → \", Tom & Jerry → Tom & Jerry).
pnpm test:single for both files: 117 passing; lint clean.

coderabbitai · 2026-07-01T15:10:16Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 371a2d00-930a-42e3-9416-8eb92fed44e3

📥 Commits

Reviewing files that changed from the base of the PR and between de1fcb7 and c1d2b62.

⛔ Files ignored due to path filters (1)

ghost/core/test/unit/frontend/helpers/__snapshots__/ghost-head.test.js.snap is excluded by !**/*.snap

📒 Files selected for processing (4)

ghost/core/core/frontend/helpers/ghost_head.js
ghost/core/core/frontend/meta/schema.js
ghost/core/test/unit/frontend/helpers/ghost-head-jsonld-escaping.test.js
ghost/core/test/unit/frontend/meta/schema.test.js

Walkthrough

This change moves JSON-LD escaping from field-level construction in schema.js to a single escaping point in ghost_head.js. A new escapeJsonLd function escapes angle brackets and line/paragraph separators in the serialized schema string before it is embedded in the inline application/ld+json script tag, preventing script/comment breakout. schema.js no longer applies escapeExpression to individual fields (names, descriptions, sameAs URLs, headline), relying on raw values that are HTML-escaped once at render time. Corresponding unit tests in schema.test.js are updated to expect raw unescaped values, and a new test file validates escapeJsonLd's breakout-prevention and round-trip JSON parsing behavior.

Changes

Cohort / File(s)	Summary
ghost_head.js	Added `escapeJsonLd` helper, applied to serialized schema before script injection, exported for testing
schema.js	Removed `escapeExpression` usage; fields now pass through raw values with explanatory comment
schema.test.js	Updated `sameAs` and related test expectations to raw unescaped values; renamed tests
ghost-head-jsonld-escaping.test.js (new)	Added unit tests for `escapeJsonLd` covering breakout prevention and round-trip parsing

Sequence Diagram(s)

sequenceDiagram
  participant SchemaJS as schema.js getSchema
  participant GhostHead as ghost_head helper
  participant EscapeJsonLd as escapeJsonLd
  participant Script as inline script tag

  SchemaJS->>GhostHead: raw schema object (unescaped fields)
  GhostHead->>GhostHead: JSON.stringify(schema, null, 4 spaces)
  GhostHead->>EscapeJsonLd: serialized JSON string
  EscapeJsonLd->>EscapeJsonLd: replace <, >, \u2028, \u2029
  EscapeJsonLd-->>GhostHead: escaped JSON string
  GhostHead->>Script: inject escaped JSON string

Related issues: None specified.

Related PRs: None specified.

Suggested labels: security, frontend, needs review

Suggested reviewers: None specified.

Poem:
A rabbit hopped through JSON's maze,
Found angle brackets in a script-tag haze,
"Escape me once, at the very end,"
It said, "not field by field, my friend!"
Now schema.js runs raw and free,
While ghost_head guards the boundary. 🐰✨

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: fixing stored XSS and JSON-LD data corruption.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The description accurately matches the code and test changes around sink-level JSON-LD escaping and removal of per-field escaping.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch slars/jsonld-sink-escaping

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

nx-cloud · 2026-07-01T15:10:57Z

🤖 Nx Cloud AI Fix

Ensure the fix-ci command is configured to always run in your CI pipeline to get automatic fixes in future runs. For more information, please see https://nx.dev/ci/features/self-healing-ci

View your CI Pipeline Execution ↗ for commit c1d2b62

Command	Status	Duration	Result
`nx run ghost:test:ci:integration`	✅ Succeeded	2m 34s	View ↗
`nx run ghost:test:integration`	✅ Succeeded	2m 49s	View ↗
`nx run ghost:test:legacy`	✅ Succeeded	2m 58s	View ↗
`nx run ghost:test:e2e`	✅ Succeeded	2m 29s	View ↗
`nx run-many --target=build --projects=tag:publi...`	✅ Succeeded	<1s	View ↗
`nx run-many -t lint -p ghost`	✅ Succeeded	35s	View ↗
`nx run-many -t test:unit -p ghost`	✅ Succeeded	29s	View ↗
`nx run @tryghost/admin:build`	✅ Succeeded	9s	View ↗
`Additional runs (2)`	✅ Succeeded	...	View ↗

💡 Verify your cache is correct by running tasks in a sandbox. Read docs ↗

☁️ Nx Cloud last updated this comment at 2026-07-01 17:50:34 UTC

9larsons · 2026-07-01T16:23:17Z

This is a regression - see snapshots. Needs reworking.

ref #28957 - tag names, post keywords, and the site title were serialized raw into the inline <script type="application/ld+json"> block, letting an Editor-controlled value like `foo</script><script>...` break out of the script element and run arbitrary JS for anonymous visitors on tag and post pages - escapes the breakout-relevant characters (< > U+2028 U+2029) as JSON \u escapes at the single serialization boundary in ghost_head, so every field is covered at once instead of relying on per-field escaping that is easy to forget - removed the per-field escapeExpression calls from schema.js: HTML-entity escaping is the wrong layer here — JSON-LD consumers (Google et al.) parse the block as JSON and never HTML-decode, so it silently corrupted structured data (e.g. `Tom & Jerry` was indexed as `Tom & Jerry`). JSON \u escapes are both safe and lossless, so legitimate `& ' "` now round-trip correctly - added a regression test proving breakout is neutralised while data round-trips, and updated the snapshot/assertions that were capturing the old corruption

9larsons mentioned this pull request Jul 1, 2026

🐛 Fixed stored XSS in structured data via unescaped tag names #28957

Closed

3 tasks

9larsons marked this pull request as draft July 1, 2026 16:23

9larsons force-pushed the slars/jsonld-sink-escaping branch from de1fcb7 to c1d2b62 Compare July 1, 2026 17:39

9larsons marked this pull request as ready for review July 1, 2026 18:09

9larsons merged commit 895d95c into main Jul 1, 2026
42 of 43 checks passed

9larsons deleted the slars/jsonld-sink-escaping branch July 1, 2026 18:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

🐛 Fixed stored XSS and mangled structured data in JSON-LD output#29013

🐛 Fixed stored XSS and mangled structured data in JSON-LD output#29013
9larsons merged 1 commit into
mainfrom
slars/jsonld-sink-escaping

9larsons commented Jul 1, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

Review failed

Uh oh!

nx-cloud Bot commented Jul 1, 2026 •

edited

Loading

Uh oh!

9larsons commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

9larsons commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Testing

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Uh oh!

nx-cloud Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤖 Nx Cloud AI Fix

Uh oh!

9larsons commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

9larsons commented Jul 1, 2026 •

edited

Loading

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

nx-cloud Bot commented Jul 1, 2026 •

edited

Loading