validate unscIds#158

Open
nvmbrasserie wants to merge 1 commit into main from validate-unsc-id
@nvmbrasserie (Contributor) commented Jan 20, 2026

Use UNSC ID validation to distinguish genuine UN IDs from Argentine national IDs

Problem

In ar_repet, we can't distinguish between:

  • Genuine UN SC IDs: QDi.002, CDi.030 (7 chars, format: [REGIME][i/e].[NUM])
  • Argentine national IDs: ArP.00234 (7-9 chars, so length alone does not separate them)

This causes invalid data in the unscId field.

Solution

Implement proper UNSC ID validation using the pattern ^[A-Z]{2,3}[ie]\.\d{3,}$ to:

  • Accept only valid UN SC permanent reference numbers
  • Reject ArP.XXXXX and other non-UN identifiers
  • Prevent storing invalid data
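
The check described above can be sketched in a few lines of Python. This is only an illustration of the quoted pattern, not the code in this PR: the function name normalize_unsc_id and the constant UNSC_ID_RE are hypothetical, and the sketch assumes that a non-matching value should yield None, as the test assertions in the review suggest.

```python
import re
from typing import Optional

# Pattern quoted in the PR description: 2-3 uppercase regime letters,
# a lowercase 'i' or 'e', a literal dot, then at least three digits.
UNSC_ID_RE = re.compile(r"^[A-Z]{2,3}[ie]\.\d{3,}$")


def normalize_unsc_id(value: str) -> Optional[str]:
    """Hypothetical helper: return the ID if it matches strictly, else None."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    if UNSC_ID_RE.fullmatch(value):
        return value
    return None


if __name__ == "__main__":
    for sample in ["QDi.002", "CDi.030", "ArP.00234", "QDx.002", "qdi.002"]:
        print(sample, "->", normalize_unsc_id(sample))
```

With this pattern, ArP.00234 is rejected twice over: the lowercase "r" falls outside [A-Z], and "P" is neither "i" nor "e".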

@nvmbrasserie changed the title from "valudate unscIds" to "validate unscIds" on Jan 20, 2026
@nvmbrasserie added the enhancement (New feature or request) label on Jan 20, 2026

# Invalid - wrong pattern
assert UNSC.normalize("QDx.002") is None # 'x' not 'i' or 'e'
assert UNSC.normalize("qdi.002") is None # lowercase
Contributor

Could/should the class tolerate lowercase or missing dot or even normalize the result?

Contributor Author

I'd rather start on the stricter end and then soften it a bit if necessary. We'll have a good grasp of the data as soon as we roll it out. From what I've seen, it's either perfectly clean or completely broken.
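
For comparison, a softened variant of the kind asked about above (tolerating lowercase and a missing dot, then normalizing the result) might look like the sketch below. It is not part of this PR: the name normalize_unsc_id_lenient and the pattern LENIENT_RE are hypothetical, and the canonical casing (uppercase regime, lowercase i/e) is assumed from the examples QDi.002 and CDi.030.

```python
import re
from typing import Optional

# Hypothetical lenient pattern: any letter case, optional dot.
LENIENT_RE = re.compile(r"^([A-Za-z]{2,3})([iIeE])\.?(\d{3,})$")


def normalize_unsc_id_lenient(value: str) -> Optional[str]:
    """Hypothetical helper: accept sloppy input and return the canonical form."""
    match = LENIENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    regime, kind, number = match.groups()
    # Rebuild as [REGIME][i/e].[NUM] with the casing seen in the PR examples.
    return f"{regime.upper()}{kind.lower()}.{number}"


if __name__ == "__main__":
    for sample in ["qdi.002", "QDi002", "QDI.002", "ArP.00234"]:
        print(sample, "->", normalize_unsc_id_lenient(sample))
```

The trade-off is that a case-insensitive match no longer uses letter case as a signal. A made-up input such as ArE.001 would pass this variant and come out as ARe.001, whereas the strict pattern rejects it. That is one argument for starting strict.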

@leonhandreke
Contributor

Nice! I guess the alternative here would be to add a quick regex to ar_repet - but I feel the field is maybe important enough to build this into the infra. I don't have a strong opinion whether it should be in rigour or FtM, I guess it could be in the latter as well since it's not a well-specced official format but more of a convention?

@nvmbrasserie
Contributor Author

Since we have other validators in rigour too, I think it's the most suitable place for it.
