fix(seo): correct sitemap locale prefix and hreflang for pages #58

Merged
JohnRDOrazio merged 1 commit into main from fix/sitemap-double-locale-and-hreflang on May 1, 2026

@JohnRDOrazio JohnRDOrazio commented May 1, 2026

Closes #54.

Summary

Two bugs in app/api/sitemap/[lang]/route.ts for non-EN page entries:

  1. Doubled locale prefix on <loc> — Polylang's uri already includes the language directory (/it/...), and buildUrl was prepending the locale again, producing https://catholicdigitalcommons.org/it/it/governance-2/....
  2. Broken hreflang alternates — every <xhtml:link rel="alternate"> reused that same locale-specific URI, so the English alternate (and all others) pointed at a non-existent IT URL.
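The first bug can be reproduced in a few lines. This is a minimal sketch, not the project's code: the `buildUrl` below is a hypothetical stand-in with an assumed signature, and the short Italian URI is illustrative only.

```typescript
// Hypothetical stand-in for the route's URL helper (signature assumed).
const SITE = "https://catholicdigitalcommons.org";

function buildUrl(locale: string, path: string): string {
  // EN lives at the site root; every other locale gets a directory prefix.
  const prefix = locale === "en" ? "" : `/${locale}`;
  return `${SITE}${prefix}${path}`;
}

// Polylang's `uri` for an Italian page already starts with /it/ (illustrative value).
const polylangUri = "/it/governance-2/";

// Bug: the locale is prepended to a URI that already carries it.
const doubled = buildUrl("it", polylangUri);

// Fix: pass the EN canonical path, which has no locale prefix.
const fixed = buildUrl(
  "it",
  "/governance/research/governance-as-code-catholic-technology/"
);

console.log(doubled); // https://catholicdigitalcommons.org/it/it/governance-2/
console.log(fixed);
```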

getAllPages now requests translations { language { code } uri } from WPGraphQL, resolves to the EN canonical URI (no locale prefix) regardless of which locale is being fetched, and exposes each page's availableLocales so the sitemap emits hreflang alternates only for languages that actually have a translation.
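A rough sketch of that transformation follows. The `WPSitemapPage` field names come from this PR; the `RawPageNode` shape, the `toSitemapPage` helper name, the lowercasing of language codes, and the fallback to the page's own URI when no EN translation exists are all assumptions made for illustration.

```typescript
// Assumed shape of one page node as returned by WPGraphQL (not confirmed by the PR).
interface RawPageNode {
  uri: string;
  modified: string;
  language: { code: string } | null;
  translations: Array<{ language: { code: string } | null; uri: string }> | null;
}

// Field names as listed in this PR.
interface WPSitemapPage {
  enUri: string;
  modified: string;
  availableLocales: string[];
}

// Hypothetical helper: derive the EN canonical URI and the locale list for one page.
function toSitemapPage(node: RawPageNode): WPSitemapPage {
  // The page itself plus each translation, as (locale code, uri) pairs.
  const variants = [
    { code: node.language ? node.language.code : null, uri: node.uri },
    ...(node.translations || []).map((t) => ({
      code: t.language ? t.language.code : null,
      uri: t.uri,
    })),
  ];

  const uriByLocale = new Map<string, string>();
  for (const v of variants) {
    if (v.code) uriByLocale.set(v.code.toLowerCase(), v.uri);
  }

  // Assumption: fall back to the page's own URI when there is no EN translation.
  const enUri = uriByLocale.get("en") || node.uri;

  return {
    enUri,
    modified: node.modified,
    availableLocales: Array.from(uriByLocale.keys()),
  };
}
```

With this shape, an Italian page whose translations include EN and ES yields the EN URI plus the locale list `it`, `en`, `es`, so the sitemap can skip locales that have no translation.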

Verification

Spun up the dev server and fetched both sitemaps. Spot-check from the Italian (it) sitemap:

<url>
  <loc>https://catholicdigitalcommons.org/it/governance/research/governance-as-code-catholic-technology/</loc>
  <lastmod>2026-04-29T18:25:04</lastmod>
  ...
  <xhtml:link rel="alternate" hreflang="it" href="https://catholicdigitalcommons.org/it/governance/research/governance-as-code-catholic-technology/" />
  <xhtml:link rel="alternate" hreflang="en" href="https://catholicdigitalcommons.org/governance/research/governance-as-code-catholic-technology/" />
  <xhtml:link rel="alternate" hreflang="es" href="https://catholicdigitalcommons.org/es/governance/research/governance-as-code-catholic-technology/" />
  <xhtml:link rel="alternate" hreflang="fr" href="..." />
  <xhtml:link rel="alternate" hreflang="pt" href="..." />
  <xhtml:link rel="alternate" hreflang="de" href="..." />
</url>

Single locale prefix; every hreflang resolves to the correct per-locale URL on the same EN canonical path.

Files changed

  • lib/wordpress/queries.ts: GET_ALL_PAGES now includes translations { language { code } uri }.
  • lib/wordpress/api.ts: new WPSitemapPage shape (enUri, modified, availableLocales); getAllPages derives the EN canonical URI and the locale availability list.
  • app/api/sitemap/[lang]/route.ts: uses page.enUri for the path and passes page.availableLocales so hreflang only includes locales with translations.
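For orientation, the extended query might look roughly like the sketch below. Only the `translations { language { code } uri }` selection is taken from this PR; the operation name, the `first: 100` argument, and the other selected fields are assumptions and may differ from the real `GET_ALL_PAGES`.

```typescript
// Sketch only: the `translations` selection comes from the PR description;
// the operation name, arguments, and remaining fields are assumptions.
const GET_ALL_PAGES = /* GraphQL */ `
  query GetAllPages {
    pages(first: 100) {
      nodes {
        uri
        modified
        language {
          code
        }
        translations {
          language {
            code
          }
          uri
        }
      }
    }
  }
`;
```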

Out of scope (deliberately)

Posts and projects still use their own locale-specific slugs in <loc> URLs (e.g. /it/projects/ontokit-2, /it/projects/api-della-bibbia) and emit hreflang alternates for all locales using that same locale-specific slug. That's a separate pre-existing bug — the EN alternate for an IT project would point to /projects/api-della-bibbia even though EN's slug is bible-api. Fixing it requires querying CPT translations the same way pages now do, but the data flow is different enough (CPTs vs pages) that it warrants a separate PR if desired before #55 lands. Not made worse by this PR.

The -2 collision-suffix problem on the IT project slug /projects/ontokit-2 is the same systemic Polylang-free issue documented in discussion #56.

Test plan

  • Hit /api/sitemap/en: <loc> has no locale prefix, and hreflang alternates use the correct per-locale prefixes.
  • Hit /api/sitemap/it: <loc> has a single /it/ prefix, no doubling.
  • Repeat for es/fr/pt/de.
  • Validate XML against Google Search Console's sitemap test (or any XML sitemap validator).
  • Confirm pages without translations in some locales only emit hreflang for available locales.
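The doubled-prefix checks above could also be automated. The helper below is a hypothetical sketch (`findDoubledLocalePrefixes` is not part of this PR): it scans a sitemap's `<loc>` and `href` URLs with a regular expression and returns any whose path starts with a repeated locale segment such as `/it/it/`.

```typescript
// Hypothetical test helper, not part of the PR: returns every <loc> or href URL
// whose path begins with the same locale segment twice (e.g. /it/it/...).
function findDoubledLocalePrefixes(xml: string, locales: string[]): string[] {
  const urls: string[] = [];
  const pattern = /<loc>([^<]+)<\/loc>|href="([^"]+)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    // Exactly one of the two capture groups is set for each match.
    const url = match[1] || match[2];
    if (url) urls.push(url);
  }
  return urls.filter((url) =>
    locales.some((l) =>
      new RegExp("^https?://[^/]+/" + l + "/" + l + "(/|$)").test(url)
    )
  );
}
```

An empty result for each of /api/sitemap/en, /api/sitemap/it, and the remaining locales would indicate that no entry carries a doubled prefix.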

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Enhanced multi-language sitemap generation with improved hreflang alternate language links that reflect actual per-page language availability instead of global defaults
    • Refined translation metadata handling to provide more accurate language variant detection across content pages
    • Updated data structures to better track available languages for each page to support international SEO

The sitemap had two bugs for non-EN page entries:

1. `<loc>` doubled the locale prefix because Polylang's `uri` field
   already includes `/it/...`, and `buildUrl` prepended the locale
   again — Italian entries became `/it/it/governance-2/...`.
2. `<xhtml:link rel="alternate">` reused that same locale-specific
   URI for every hreflang, so the English alternate (and all others)
   pointed at a non-existent IT URL.

Change `getAllPages` to query `translations` and resolve to the EN
canonical URI (no locale prefix) regardless of which locale is being
fetched. Also expose each page's `availableLocales` so the sitemap
only emits hreflang alternates for languages that actually have a
translation.

Posts and projects continue to use their own locale-specific slugs
in URLs and emit hreflang alternates for all locales — that's a
separate pre-existing issue (e.g. `/projects/ontokit-2` IT slug
gets used for the EN alternate despite EN having a different slug)
and is out of scope here. Will track separately if it warrants a
fix before #55 lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented May 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e5f7a6b9-f896-4217-ab8d-506c2ec78644

📥 Commits

Reviewing files that changed from the base of the PR and between 92646d8 and 10dbdb3.

📒 Files selected for processing (3)
  • app/api/sitemap/[lang]/route.ts
  • lib/wordpress/api.ts
  • lib/wordpress/queries.ts

Walkthrough

The changes update the sitemap generation to use per-page language availability and English URIs instead of global locales and locale-prefixed URIs. The GraphQL query now fetches translation metadata, the API transforms the response into a new WPSitemapPage type with canonical URIs and available locales, and the sitemap handler uses these page-specific values for hreflang link generation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Sitemap Alternate Links**<br/>app/api/sitemap/[lang]/route.ts | Generalized buildAlternateLinks to accept optional alternateLocales parameter; updated urlEntry to use per-page alternateLocales instead of global locales; switched page URL from page.uri to page.enUri for canonical representation. |
| **Page Data Transformation**<br/>lib/wordpress/api.ts | Added exported WPSitemapPage interface with enUri, modified, and availableLocales fields; updated getAllPages return type and implementation to construct page-specific canonical URIs (English fallback) and derive available locales from translations metadata. |
| **GraphQL Query Metadata**<br/>lib/wordpress/queries.ts | Extended GET_ALL_PAGES query to fetch translations selection containing language codes and translated URIs for each page. |
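
The generalized buildAlternateLinks described in the first row might look roughly like this. It is a sketch under assumptions: the real signature and the `localizedUrl` helper are not shown in this PR, and the global locale list is inferred from the hreflang values in the verification output. It relies on translated pages sharing the EN path under a locale prefix, as the PR states.

```typescript
const SITE = "https://catholicdigitalcommons.org";
// Global locale list, inferred from the hreflang values in the PR's sample output.
const ALL_LOCALES = ["en", "it", "es", "fr", "pt", "de"];

// Hypothetical helper: EN lives at the root, other locales under /<locale>/.
function localizedUrl(locale: string, enUri: string): string {
  const prefix = locale === "en" ? "" : `/${locale}`;
  return `${SITE}${prefix}${enUri}`;
}

// Sketch of buildAlternateLinks: emits one hreflang link per locale, using the
// per-page list when provided and the global list otherwise.
function buildAlternateLinks(
  enUri: string,
  alternateLocales: string[] = ALL_LOCALES
): string[] {
  return alternateLocales.map(
    (locale) =>
      `<xhtml:link rel="alternate" hreflang="${locale}" href="${localizedUrl(locale, enUri)}" />`
  );
}

// A page translated only into IT and EN gets exactly two alternates.
const links = buildAlternateLinks("/governance/", ["it", "en"]);
console.log(links.join("\n"));
```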

Sequence Diagram

sequenceDiagram
    participant GQL as GraphQL Query
    participant API as getAllPages()
    participant Sitemap as Sitemap Handler

    GQL->>GQL: Fetch pages with translations metadata
    GQL-->>API: Return pages + translations
    API->>API: Transform per-page data:<br/>select enUri, collect locales
    API-->>Sitemap: Return WPSitemapPage[]
    Sitemap->>Sitemap: For each page entry:<br/>buildAlternateLinks(page.enUri,<br/>page.availableLocales)
    Sitemap-->>Sitemap: Generate per-page hreflang links

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hops through locales, no more confusion,
Per-page translations, a clearer fusion!
English URIs guide the way,
Alternate links work—hooray, hooray! 🌍

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main fix: correcting sitemap locale prefix (doubled locale bug) and hreflang alternates for pages. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 9 complexity · 0 duplication

Metric Results
Complexity 9
Duplication 0


@JohnRDOrazio
Member Author

@coderabbitai review

coderabbitai Bot commented May 1, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@JohnRDOrazio JohnRDOrazio merged commit ea84d7a into main May 1, 2026
10 checks passed
@JohnRDOrazio JohnRDOrazio deleted the fix/sitemap-double-locale-and-hreflang branch May 1, 2026 18:21
Development

Successfully merging this pull request may close these issues.

fix(seo): sitemap-[lang].xml emits doubled locale prefix and broken hreflang

1 participant