fix: update robots.txt and sitemap.xml for improved crawler management #8

Merged
chengjiahao1234 merged 1 commit into main from fix/sitemap-robots-2026-02-07
Feb 7, 2026

@chengjiahao1234
Collaborator

Type of Change

Please select the type of change (uncomment one):

✅ Bug Fix

✅ Documentation

Summary

Briefly describe what this PR changes:

Adds/validates a Google Search Console–friendly sitemap and updates robots.txt to match our current site structure and crawling goals (public pages allowed, internal files restricted). Also removes Crawl-delay from the Googlebot section to avoid Search Console warnings while still rate-limiting other bots.
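For reviewers who want to sanity-check the crawl-delay behavior described above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is an illustrative assumption modeled on this description, not the exact file in this PR, and `SomeOtherBot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt modeled on the PR description (not the real file):
# the Googlebot group carries no Crawl-delay, the catch-all group keeps it.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /_site/
Disallow: /bin/

User-agent: *
Crawl-delay: 10
Disallow: /_site/
Disallow: /bin/
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Googlebot matches its own group, which has no Crawl-delay.
print(rules.crawl_delay("Googlebot"))      # None
# Any other bot falls through to the "*" group and is rate-limited.
print(rules.crawl_delay("SomeOtherBot"))   # 10
# Public pages stay crawlable; internal build output does not.
print(rules.can_fetch("Googlebot", "/about/"))            # True
print(rules.can_fetch("Googlebot", "/_site/index.html"))  # False
```

To check the real file instead, the same `load_rules` helper can be fed the contents of the repository's robots.txt.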

Changes Made

List specific files or sections changed:

  • sitemap.xml: switched to a cleaner pages_list loop that emits <lastmod> only when page.last_modified_at exists. bundle exec jekyll build produces a valid sitemap.xml, and I confirmed it parses as XML.
  • robots.txt: removed Crawl-delay from the Googlebot group (Googlebot ignores the directive and Search Console warns about it), kept Crawl-delay: 10 for other crawlers, and applied the “block internal files” rules consistently (disallow /images/, /assets/, /_site/, plus /bin/, /CNAME, README.md, DEVELOPMENT.md, /.htaccess). The AI-crawler blocks remain commented out, so those crawlers are still allowed.
  • DEVELOPMENT.md: updated the URL checklist item to note that the Sitemap line is templated and should be verified after a build.
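The "parses as XML" check mentioned for sitemap.xml can be reproduced with a short script. This is a sketch under assumptions: the `SAMPLE_SITEMAP` content and the example.com URLs are placeholders rather than the generated file, and `sitemap_entries` is a hypothetical helper name. It assumes the standard sitemap protocol namespace and an optional `<lastmod>` child per `<url>`.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Placeholder sitemap: one entry with <lastmod>, one without.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-02-07</lastmod>
  </url>
  <url>
    <loc>https://example.com/about/</loc>
  </url>
</urlset>
"""


def sitemap_entries(xml_text):
    """Return (loc, lastmod) pairs; lastmod is None when the element is absent.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML,
    and ValueError if the root element is not a sitemap <urlset>.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        raise ValueError(f"unexpected root element: {root.tag}")
    entries = []
    for url in root.findall(f"{{{SITEMAP_NS}}}url"):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc")
        lastmod = url.findtext(f"{{{SITEMAP_NS}}}lastmod")
        entries.append((loc, lastmod))
    return entries


entries = sitemap_entries(SAMPLE_SITEMAP)
for loc, lastmod in entries:
    print(loc, lastmod)
```

After `bundle exec jekyll build`, the same function can be pointed at the generated file, for example by reading `_site/sitemap.xml` (Jekyll's default output directory) and passing its text in.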

Testing Checklist

Please confirm you've tested the following:

  • Changes display correctly on desktop (1920x1080 or similar)
  • Changes display correctly on tablet (768x1024 or similar)
  • Changes display correctly on mobile (375x667 or similar)
  • Tested in light theme
  • Tested in dark theme
  • No broken links or images
  • No console errors
  • Ran bundle exec jekyll serve locally without errors

Screenshots (if applicable)

Related Issues

Additional Notes


By submitting this PR, I confirm:

  • I have tested these changes locally
  • I have followed the code standards in DEVELOPMENT.md
  • I have updated documentation if needed
  • I am ready for this to be reviewed and merged

@chengjiahao1234 merged commit d06fe66 into main on Feb 7, 2026
1 check passed
@chengjiahao1234 deleted the fix/sitemap-robots-2026-02-07 branch on February 7, 2026 04:48