@marcarl commented on Jan 1, 2026

Summary

This PR adds 117 new law names to data/law-names.json by automatically matching missing law names from HTML generation warnings against the SFS document database.

Changes

  • ✅ Added 117 new law entries to data/law-names.json (236 → 353 entries, +50%)
  • ✅ Created scripts/find_missing_law_names.py to search for matching laws
  • ✅ Created scripts/add_law_names.py to filter and select best matches
  • ✅ Updated .gitignore to exclude generated log files

Impact

  • 📉 Reduced "Okänt lagnamn" ("unknown law name") warnings by 67% (4,013 → 1,304 warnings)
  • 🔗 Improved automatic cross-referencing between laws in HTML output
  • 📊 Better data quality for law name recognition
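
The percentages quoted in this PR can be rechecked from the raw counts. A minimal sketch; the helper name `percent_change` is made up for illustration and is not part of the scripts in this PR:

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Counts taken from this PR description.
warnings_change = percent_change(4013, 1304)  # warnings before/after
entries_change = percent_change(236, 353)     # law-names.json entries before/after

print(f"warnings: {warnings_change:+.1f}%")
print(f"entries:  {entries_change:+.1f}%")
```

The warning count drops by roughly two thirds and the entry count grows by roughly half, which matches the rounded 67% and +50% figures.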

How it works

The matching algorithm:

  1. Extracts law names from warning messages in HTML conversion logs
  2. Searches through 10,935 SFS JSON files for potential matches
  3. Filters out introduction laws, announcements, and amendment documents
  4. Scores matches based on:
    • Exact law name matching in document title
    • Law type (Lag vs Förordning) alignment
    • Document recency (newer laws preferred)
    • Title simplicity (main laws vs application regulations)
  5. Selects the best-scoring match for each missing law name
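
Steps 3 to 5 could look roughly like the sketch below. This is not the code in scripts/add_law_names.py: the `Candidate` type, the excluded title fragments, and all score weights are assumptions chosen for illustration, and only the four scoring criteria themselves come from the list above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumed title fragments for introduction laws, announcements and amendments.
EXCLUDED_MARKERS = ("om införande av", "tillkännagivande", "om ändring i")


@dataclass
class Candidate:
    sfs_id: str  # SFS designation, e.g. "2011:203"
    title: str   # document title, e.g. "Budgetlag (2011:203)"


def base_form(law_name: str) -> str:
    """Drop the definite suffix '-en': 'budgetlagen' -> 'budgetlag'."""
    name = law_name.lower()
    return name[:-2] if name.endswith("en") else name


def is_excluded(title: str) -> bool:
    """Step 3: skip documents whose title marks them as non-main laws."""
    lowered = title.lower()
    return any(marker in lowered for marker in EXCLUDED_MARKERS)


def score(law_name: str, candidate: Candidate) -> float:
    """Step 4: combine the four criteria. All weights are hypothetical."""
    base = base_form(law_name)
    title = candidate.title.lower()
    points = 0.0
    # Exact law name at the start of the title, directly before "(year:number)".
    if title.startswith(base + " ("):
        points += 100
    elif base in title:
        points += 20
    # Law type alignment: a "...förordningen" name should match a Förordning.
    wants_regulation = base.endswith("förordning")
    is_regulation = title.split(" ", 1)[0].endswith("förordning")
    if wants_regulation == is_regulation:
        points += 10
    # Recency: newer documents get a small bonus.
    year = int(candidate.sfs_id.split(":")[0])
    points += (year - 1900) / 10
    # Simplicity: shorter titles are more likely to be the main law.
    points -= len(title.split())
    return points


def best_match(law_name: str, candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Step 5: highest-scoring candidate that survives the filter, or None."""
    eligible = [c for c in candidates if not is_excluded(c.title)]
    return max(eligible, key=lambda c: score(law_name, c), default=None)


candidates = [
    Candidate("2011:203", "Budgetlag (2011:203)"),
    # Made-up amendment title, included only to exercise the filter.
    Candidate("2099:1", "Lag (2099:1) om ändring i budgetlagen (2011:203)"),
]
match = best_match("budgetlagen", candidates)
print(match.title if match else "no match")
```

Giving the exact-name criterion a much larger weight than recency and simplicity keeps the latter two as tie-breakers, so a newer but loosely related document cannot outrank the main law.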

Examples of added laws

  • brottsdatalagen → Lag (2018:1696) om Skatteverkets behandling av personuppgifter inom brottsdata
  • djurskyddsförordningen → Djurskyddsförordning (2019:66)
  • budgetlagen → Budgetlag (2011:203)
  • patientsäkerhetslagen → Patientsäkerhetslag (2010:659)
  • socialförsäkringsbalken → Socialförsäkringsbalk (2010:110)

Test plan

  • Run HTML generation with updated law-names.json
  • Verify warning count reduction (4,013 → 1,304)
  • Validate law name matches are appropriate
  • Ensure no duplicate entries in law-names.json
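
The duplicate check in the last item can be automated. The sketch below assumes data/law-names.json is a JSON object keyed by law name, which this PR does not confirm; if the file is a list of entries, the check would need to compare a field instead. Python's `json` module silently keeps the last value when a key repeats, so the sketch uses `object_pairs_hook` to see every key before it is collapsed. The function name `find_duplicate_keys` is made up for illustration.

```python
import json
from collections import Counter


def find_duplicate_keys(json_text: str) -> list[str]:
    """Return keys that occur more than once within the same JSON object."""
    duplicates: list[str] = []

    def record_pairs(pairs):
        # `pairs` holds every (key, value) of one object, repeats included.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(json_text, object_pairs_hook=record_pairs)
    return sorted(duplicates)


sample = '{"budgetlagen": "2011:203", "budgetlagen": "2011:203"}'
print(find_duplicate_keys(sample))
```

To run it against the real file, read the text first, for example with `pathlib.Path("data/law-names.json").read_text(encoding="utf-8")`, and treat a non-empty result as a test failure.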

Files changed

  • data/law-names.json - Added 117 new law entries
  • scripts/find_missing_law_names.py - New script for finding matching laws
  • scripts/add_law_names.py - New script for filtering and selecting best matches
  • .gitignore - Exclude generated log files

🤖 Generated with Claude Code

This commit adds 117 new law names to data/law-names.json by automatically
matching missing law names from HTML generation warnings against the SFS
document database.

Changes:
- Added 117 new law entries to data/law-names.json (236 → 353 entries)
- Created scripts/find_missing_law_names.py to search for matching laws
- Created scripts/add_law_names.py to filter and select best matches
- Updated .gitignore to exclude generated log files

Impact:
- Reduced "Okänt lagnamn" warnings by 67% (4,013 → 1,304 warnings)
- Improved automatic cross-referencing between laws in HTML output

The matching algorithm:
1. Extracts law names from warning messages
2. Searches through 10,935 SFS JSON files for matches
3. Filters out introduction laws, announcements, and amendments
4. Scores matches based on exact name matching, law type, and recency
5. Selects the best match for each missing law name

Examples of added laws:
- brottsdatalagen (2018:1696)
- djurskyddsförordningen (2019:66)
- budgetlagen (2011:203)
- patientsäkerhetslagen (2010:659)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@marcarl changed the title from "Add 117 missing law names to reduce HTML generation warnings" to "Add 117 missing law names" on Jan 3, 2026
@marcarl changed the title from "Add 117 missing law names" to "Add missing law names" on Jan 3, 2026