@DanielCliftonGuardian
Contributor

@DanielCliftonGuardian DanielCliftonGuardian commented Jan 28, 2026

What does this change?

TextBlockComponent currently returns null when it encounters a tag it does not recognise, such as <div> (common in 2013/14 legacy content) or <h1>. Returning null stops the recursion, so every child element of that tag is discarded and the content is lost. This PR updates sanitiserOptions to transform div to p and h1 to h2 before they reach the renderer.
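
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the real TextBlockComponent: the `HtmlNode` type, the `KNOWN_TAGS` set, and the `render` and `renameTags` functions are hypothetical stand-ins, assumed only for illustration. It shows how returning null for an unknown tag drops the whole subtree, and how renaming tags before rendering keeps the children.

```typescript
// Hypothetical, simplified model of the behaviour described in this PR.
// None of these names come from the real codebase.
type HtmlNode =
  | { kind: 'text'; value: string }
  | { kind: 'element'; tag: string; children: HtmlNode[] };

// Tags the sketch renderer knows how to handle (an assumption).
const KNOWN_TAGS = new Set(['p', 'h2', 'strong', 'em']);

// An unknown tag yields null, so its children are never visited.
function render(node: HtmlNode): string | null {
  if (node.kind === 'text') return node.value;
  if (!KNOWN_TAGS.has(node.tag)) return null;
  const inner = node.children
    .map(render)
    .filter((s): s is string => s !== null)
    .join('');
  return `<${node.tag}>${inner}</${node.tag}>`;
}

// The renames proposed in this PR, applied before rendering.
const TAG_RENAMES: Record<string, string> = { div: 'p', h1: 'h2' };

function renameTags(node: HtmlNode): HtmlNode {
  if (node.kind === 'text') return node;
  return {
    kind: 'element',
    tag: TAG_RENAMES[node.tag] ?? node.tag,
    children: node.children.map(renameTags),
  };
}

const legacy: HtmlNode = {
  kind: 'element',
  tag: 'div',
  children: [
    {
      kind: 'element',
      tag: 'strong',
      children: [{ kind: 'text', value: 'Team lineup' }],
    },
  ],
};

const before = render(legacy); // null: the div and everything inside it is lost
const after = render(renameTags(legacy)); // '<p><strong>Team lineup</strong></p>'
console.log(before, after);
```

In this sketch the rename happens in a separate pass, so the renderer itself needs no extra cases for legacy tags.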

Why?

  • Restores missing content
  • Fixes semantic errors (legacy h1 tags become h2, keeping one h1 per page)
  • Reduces log noise

Part of guardian/frontend#28496

@DanielCliftonGuardian DanielCliftonGuardian changed the title from "Fix content loss in TextBlockComponent" to "Fix content loss by transforming legacy div and h1 tags" Jan 28, 2026
@DanielCliftonGuardian DanielCliftonGuardian self-assigned this Jan 28, 2026
@DanielCliftonGuardian DanielCliftonGuardian added this to the Health milestone Jan 28, 2026
@DanielCliftonGuardian DanielCliftonGuardian added the fix Departmental tracking: fix label Jan 28, 2026
@DanielCliftonGuardian DanielCliftonGuardian marked this pull request as ready for review January 28, 2026 15:25
@DanielCliftonGuardian DanielCliftonGuardian added the run_chromatic Runs chromatic when label is applied label Jan 28, 2026
@github-actions github-actions bot removed the run_chromatic Runs chromatic when label is applied label Jan 28, 2026
};
},
div: 'p',
h1: 'h2',
Contributor

Interesting. What error did you find in the logs that this change would fix? And why transform div to p and h1 to h2? Why couldn't we keep them as div and h1?

Contributor Author

@DanielCliftonGuardian DanielCliftonGuardian Jan 29, 2026

  1. This generates around 2,000 warnings a day (see logs). Because we return null on unknown tags, all nested children are discarded. In old sports blogs, div wrappers cause entire team lineups to disappear from the page.
  2. We transform h1 to h2 to maintain the best practice of only one h1 per page.
  3. Transforming div to p lets the content inherit standard styling. This is much cleaner than adding redundant switch cases and CSS to handle legacy elements.
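
As a rough sketch of points 2 and 3, the renames can live in a single mapping inside the sanitiser options instead of in extra renderer cases. This assumes sanitiserOptions is passed to the sanitize-html library, whose `transformTags` option accepts tag-name to tag-name mappings; the thread does not confirm this. The `SanitiserOptionsSketch` type, the `allowedTags` list, and the `unreachableTargets` check are hypothetical and kept local so the example runs without the library.

```typescript
// Local stand-in for the relevant slice of the options object (an assumption;
// the real sanitiserOptions is not shown in full in this thread).
interface SanitiserOptionsSketch {
  allowedTags: string[];
  transformTags: Record<string, string>;
}

const sanitiserOptionsSketch: SanitiserOptionsSketch = {
  // Hypothetical allow-list for illustration only.
  allowedTags: ['p', 'h2', 'strong', 'em', 'a'],
  transformTags: {
    div: 'p', // legacy wrappers inherit standard paragraph styling
    h1: 'h2', // keeps a single h1 per page
  },
};

// Sanity check for the sketch: every rename target should be a tag the
// renderer already handles, otherwise the rename would not restore content.
const unreachableTargets = Object.values(
  sanitiserOptionsSketch.transformTags,
).filter((target) => !sanitiserOptionsSketch.allowedTags.includes(target));

console.log(unreachableTargets.length === 0 ? 'ok' : unreachableTargets);
```

Keeping the mapping in the options means supporting another legacy tag would be a one-line change, provided its target tag is already handled downstream.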
