feat: re-add ElementId as UUID v5 across primitives #16

Merged: vieiralucas merged 9 commits into main from re-add-ids on Jun 14, 2026
@vieiralucas vieiralucas commented Jun 14, 2026

What

  • New: sha1.h/c (Steve Reid's public-domain SHA-1), uuid.h/c (canonical format + parse, RFC 4122 v5 derivation, version/variant bit helper, UuidV5 typed output, streaming UuidV5Ctx), elementid.h/c (ElementId wraps a UuidV5; convergent derivation via v5 over (parent.uuid, key, kind)).
  • Counter / Register / Map each carry an ElementId stamped at create. *_create(arena, id, ...), *_id accessor, clone preserves source id.
  • map_counter / map_register / map_map derive their child's id internally. Two replicas independently calling the same helper at the same slot land on matching ids.
  • element_id(Element) dispatches per kind; aborts on SCALAR (no id concept for scalars).
  • map_merge recursive guard checks ids alongside kind: matching ids recurse in place; mismatched ids at the same slot are two distinct logical elements, resolved via LWW + element_clone with the loser orphaned.
  • New test suites: test_sha1.c (NIST FIPS 180-4 vectors), test_uuid.c (format / parse / v5 conformance / streaming), test_elementid.c (derive determinism + sensitivity + kind-in-derive distinctness + v5 bits). Plus merge_same_kind_different_id_uses_lww_not_recurse regression in test_map.c that locks in the merge id-check.
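For readers less familiar with RFC 4122 name-based UUIDs, the v5 derivation mentioned in the first bullet can be sketched as follows. This is a minimal sketch, not the code in uuid.c or sha1.c: the SHA-1 here is a compact textbook version rather than Steve Reid's implementation, and `sketch_sha1`, `sketch_uuid_v5`, and the 256-byte name limit are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol32(uint32_t x, unsigned n) { return (x << n) | (x >> (32u - n)); }

/* Compress one 64-byte block into the running state (FIPS 180-4). */
static void sketch_sha1_block(uint32_t h[5], const uint8_t blk[64]) {
    uint32_t w[80];
    for (int i = 0; i < 16; i++) {
        w[i] = ((uint32_t)blk[4 * i] << 24) | ((uint32_t)blk[4 * i + 1] << 16) |
               ((uint32_t)blk[4 * i + 2] << 8) | (uint32_t)blk[4 * i + 3];
    }
    for (int i = 16; i < 80; i++) {
        w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
        uint32_t f, k;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
        uint32_t t = rol32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* One-shot SHA-1 over msg[0..len). */
static void sketch_sha1(const uint8_t *msg, size_t len, uint8_t out[20]) {
    uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};
    size_t off = 0;
    while (len - off >= 64) { sketch_sha1_block(h, msg + off); off += 64; }

    /* Padding: 0x80, zeros, then the 64-bit big-endian bit length. */
    uint8_t tail[128] = {0};
    size_t rem = len - off;
    memcpy(tail, msg + off, rem);
    tail[rem] = 0x80;
    size_t tail_len = (rem < 56) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8u;
    for (int i = 0; i < 8; i++) {
        tail[tail_len - 1 - (size_t)i] = (uint8_t)(bits >> (8 * i));
    }
    sketch_sha1_block(h, tail);
    if (tail_len == 128) sketch_sha1_block(h, tail + 64);

    for (int i = 0; i < 5; i++) {
        out[4 * i]     = (uint8_t)(h[i] >> 24);
        out[4 * i + 1] = (uint8_t)(h[i] >> 16);
        out[4 * i + 2] = (uint8_t)(h[i] >> 8);
        out[4 * i + 3] = (uint8_t)h[i];
    }
}

/* UUID v5: SHA-1(namespace || name), truncated to 16 bytes, with the
 * version and variant bits stamped. The 256-byte cap is a sketch-only
 * simplification so that no allocation is needed. */
static void sketch_uuid_v5(const uint8_t ns[16], const void *name,
                           size_t name_len, uint8_t out[16]) {
    uint8_t buf[16 + 256];
    uint8_t digest[20];
    assert(name_len <= 256);
    memcpy(buf, ns, 16);
    memcpy(buf + 16, name, name_len);
    sketch_sha1(buf, 16 + name_len, digest);
    memcpy(out, digest, 16);
    out[6] = (uint8_t)((out[6] & 0x0F) | 0x50); /* version 5 */
    out[8] = (uint8_t)((out[8] & 0x3F) | 0x80); /* RFC 4122 variant */
}
```

Because the output depends only on the namespace and name bytes, two callers that feed in the same inputs obtain the same 16 bytes, which is the property the convergent ElementId derivation relies on. How elementid_derive actually encodes (parent.uuid, key, kind) into the name is not shown here.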

Why

PR #15 dropped ElementId entirely on the rationale that recursive merge could use a positional witness (key, kind). That worked for the in-process Map but doesn't compose with the architecture (lines 208–210), where every Element gets a stable id for ops, moves, anchors, and cross-references. This PR re-adds ids as RFC 4122 UUID v5 (standard format, deterministic from (parent_id, key, kind), friendly to cross-language interop) and avoids the cascade trap of the original by stamping the id exactly once at create.

Lays the foundation for the op model (next PR): ops can target elements by id rather than by tree path.


Summary by cubic

Reintroduces stable ElementId as RFC 4122 UUID v5 across composite elements, derived deterministically from (parent_id, key, kind). Adds SHA‑1/UUID utilities and updates merge to recurse only when kind+id match; otherwise LWW.

  • New Features

    • UUID/crypto: sha‑1 and uuid utilities with canonical format/parse, version/variant bit helper, UUID v5 (one‑shot + streaming). Endian detection hardened with a runtime fallback; uuid_v5_update chunks large inputs; uuid_parse leaves out untouched on failure. Tests and Makefile targets added.
    • Element identity: ElementId (UUID v5) with elementid_derive(parent, key, kind) and elementid_root(). Counter/Register/Map now carry an id (*_create(arena, id) + *_id), clones preserve ids, helpers derive child ids, element_id(Element) dispatches by kind. map_merge recurses only when kind+id match; otherwise LWW with clone. New tests cover SHA‑1, UUID, ElementId, and merge behavior.
    • Docs: corrected elementid_derive signature references in headers; clarified/cleaned UUID version notes; added project‑scoped include guard in sha1.h.
  • Migration

    • Update counter_create, register_create, and map_create to pass an ElementId (elementid_root() for top‑level, or elementid_derive(parent_id, key, kind); map helpers derive ids for you).
    • Use counter_id / register_id / map_id or element_id(Element) where identity is needed (scalars have no id).

Written for commit 7670973. Summary will update on new commits.

- sha1.h/c: Steve Reid's public-domain SHA-1 (FIPS 180-4). Used internally
  by UUID v5 derivation.
- uuid.h/c: Format/parse for the canonical RFC 4122 8-4-4-4-12 string
  form, version/variant bit helper, UUID v5 generation (one-shot +
  streaming via UuidV5Ctx). Typed UuidV5 output for v5 specifically.
- elementid.h/c: ElementId wraps a UuidV5. Convergent derivation via
  (parent.uuid, key bytes, kind tag) through UUID v5.
- test_sha1.c (6): NIST FIPS 180-4 vectors.
- test_uuid.c (17): format / parse / round-trip / v5 determinism /
  version+variant bits / RFC DNS namespace vector / streaming-vs-one-shot.
- test_elementid.c (22): construction / root sentinel / equality / cmp /
  derive determinism + sensitivity + kind-in-derive distinctness +
  v5 conformance.
- Makefile + .gitignore wired.

Counter / Register / Map id wiring + helpers + recursive merge guard
to land in a follow-up commit.
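As an illustration of the canonical 8-4-4-4-12 string form, a format/parse pair might look like the sketch below. The names `sketch_uuid_format` and `sketch_uuid_parse` are hypothetical, and details such as accepting uppercase hex are assumptions; the real uuid.h signatures may differ. The parser writes into a local buffer and copies to `out` only on success, so a failed parse leaves the caller's buffer untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write 36 characters plus a terminating NUL into out. */
static void sketch_uuid_format(const uint8_t bytes[16], char out[37]) {
    static const char hex[] = "0123456789abcdef";
    size_t pos = 0;
    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < 16; i++) {
        /* Dashes precede bytes 4, 6, 8 and 10: 8-4-4-4-12 hex digits. */
        if (i == 4 || i == 6 || i == 8 || i == 10) out[pos++] = '-';
        out[pos++] = hex[bytes[i] >> 4];
        out[pos++] = hex[bytes[i] & 0x0F];
    }
    out[pos] = '\0';
}

static int sketch_hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Returns true and fills out on success; on failure out is not written. */
static bool sketch_uuid_parse(const char *s, uint8_t out[16]) {
    uint8_t tmp[16]; /* parse here first, publish only on success */
    size_t pos = 0;
    if (strlen(s) != 36) return false;
    for (size_t i = 0; i < 16; i++) {
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            if (s[pos++] != '-') return false;
        }
        int hi = sketch_hex_val(s[pos]);
        int lo = sketch_hex_val(s[pos + 1]);
        if (hi < 0 || lo < 0) return false;
        tmp[i] = (uint8_t)((hi << 4) | lo);
        pos += 2;
    }
    memcpy(out, tmp, sizeof tmp);
    return true;
}
```

Checking the length up front keeps every later index inside the string, so the loop needs no further bounds checks.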
… checks id

- Counter / Register / Map carry an ElementId stamped at create; *_create
  takes the id and *_id accessors return it. Clone preserves source id.
- map_counter / map_register / map_map derive their child's id via
  elementid_derive(map_id(parent), key, kind), so two replicas independently
  calling the same helper at the same slot land on identical ids.
- element_id(Element) added — dispatches to register_id / counter_id /
  map_id by kind; host_aborts on SCALAR (programmer error).
- map_merge recursive guard now checks ids: same-key + same-kind +
  matching id recurses; same-key + same-kind + mismatched id is two
  distinct logical elements sharing a slot, resolved via LWW + clone
  with the loser orphaned.
- Regression test merge_same_kind_different_id_uses_lww_not_recurse pins
  the id-check behavior — fails without the check, passes with it.
- Header docs (counter.h / register.h / map.h) refreshed: drop "positional
  identity / no identifier" wording from the PR 15 era, describe the
  actual model (id stamped at create, derived convergently, merge keys
  on (kind, id)).
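The merge guard described above reduces to a small decision: recurse only when both sides at a slot have the same kind and the same id, otherwise fall back to LWW with a clone. The sketch below models only that decision. All types and names (`SketchSlot`, `sketch_merge_action`, and so on) are hypothetical stand-ins, and treating any scalar as "never recurse" is an assumption based on scalars having no id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { SK_SCALAR, SK_REGISTER, SK_COUNTER, SK_MAP } SketchKind;
typedef struct { uint8_t bytes[16]; } SketchId;
typedef struct { SketchKind kind; SketchId id; } SketchSlot;
typedef enum { SK_MERGE_RECURSE, SK_MERGE_LWW_CLONE } SketchMergeAction;

static bool sketch_id_eq(const SketchId *a, const SketchId *b) {
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}

/* Decide how two values found under the same key should be merged. */
static SketchMergeAction sketch_merge_action(const SketchSlot *ours,
                                             const SketchSlot *theirs) {
    assert(ours != NULL && theirs != NULL);
    /* Scalars carry no id, so they can never match on identity. */
    bool both_have_ids = ours->kind != SK_SCALAR && theirs->kind != SK_SCALAR;
    if (both_have_ids && ours->kind == theirs->kind &&
        sketch_id_eq(&ours->id, &theirs->id)) {
        return SK_MERGE_RECURSE;   /* same logical element: merge in place */
    }
    return SK_MERGE_LWW_CLONE;     /* distinct elements sharing a slot */
}
```

Two counters at the same key with different ids therefore take the LWW path, which is the case the regression test in the commit message pins down.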

… caveat

- uuid_parse parses into a local 16-byte buffer first, memcpys to caller's
  `out` only on success. Restores the header contract that `out` is
  untouched on failure. Regression test parse_failure_leaves_out_untouched
  pins it.
- sha1.h includes <stdint.h> (system) instead of "stdint.h" (local-first
  search). Verbatim-from-Reid nit.
- Comment on uuid_v5_update documents the size_t -> uint32_t narrowing
  that SHA1Update inherits; our caller (elementid_derive) is well below
  4 GiB so it never bites in practice.
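The size_t to uint32_t narrowing noted above can be sidestepped by feeding a 32-bit update function in bounded chunks. The following is a minimal sketch of that idea, not the actual uuid_v5_update: the callback type, `sketch_update_chunked`, and the tally sink are hypothetical, and `max_chunk` is a parameter only so the behavior can be demonstrated with small inputs (a real wrapper would presumably use UINT32_MAX).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape of a hash update function whose length parameter is 32-bit. */
typedef void (*SketchUpdate32)(void *ctx, const uint8_t *data, uint32_t len);

/* Forward len bytes to update() in pieces no larger than max_chunk,
 * so the size_t length is never truncated by a narrowing cast. */
static void sketch_update_chunked(SketchUpdate32 update, void *ctx,
                                  const uint8_t *data, size_t len,
                                  uint32_t max_chunk) {
    assert(max_chunk > 0);
    while (len > 0) {
        uint32_t n = (len > max_chunk) ? max_chunk : (uint32_t)len;
        update(ctx, data, n);
        data += n;
        len -= n;
    }
}

/* Demonstration sink: counts calls and bytes instead of hashing. */
typedef struct { size_t calls; size_t bytes; } SketchTally;

static void sketch_tally_update(void *ctx, const uint8_t *data, uint32_t len) {
    SketchTally *t = ctx;
    (void)data;
    t->calls++;
    t->bytes += len;
}
```

With `max_chunk` set to 4, a 10-byte input reaches the sink as three calls (4, 4, and 2 bytes), and a zero-length input produces no calls at all.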

Declined: Copilot flagged that element_id() can fall off the end of the
function via the ELEMENT_SCALAR + host_abort + break path. Verified by
adding a probe enum value: clang fires -Wswitch in 5 dispatching switches
across element.c and map.c, plus cascading -Wreturn-type warnings, which is
exactly the behavior we want when a new ElementKind is introduced. Keeping
the current shape so the compiler stays on duty.
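The compiler behavior being relied on here is that a switch over an enum with no default label triggers -Wswitch (enabled by -Wall in clang and GCC) as soon as an enumerator is left unhandled. A small sketch of the pattern follows; `ProbeKind` and `probe_kind_name` are hypothetical, and unlike the shape described in the commit message, this version ends with an explicit abort() after the switch so it compiles warning-free on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PROBE_SCALAR, PROBE_REGISTER, PROBE_COUNTER, PROBE_MAP } ProbeKind;

static const char *probe_kind_name(ProbeKind kind) {
    /* Deliberately no default: adding a new enumerator to ProbeKind
     * without a matching case makes -Wswitch flag this switch. */
    switch (kind) {
    case PROBE_SCALAR:   return "scalar";
    case PROBE_REGISTER: return "register";
    case PROBE_COUNTER:  return "counter";
    case PROBE_MAP:      return "map";
    }
    /* Only reachable if kind holds a value outside the enum. */
    abort();
}
```

Adding a default label would silence the warning and hide the missing case, which is why the dispatching switches are kept without one.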

…nd map.h

- sha1.c #if guards now use defined() checks so undefined macros never
  collapse to 0 == 0 (which would force little-endian on a big-endian
  host with no system headers defining BYTE_ORDER / LITTLE_ENDIAN). Falls
  through to the runtime endian-detection union when macros are missing.
- New SHA1_USE_RUNTIME_ENDIAN build override forces the runtime path even
  on systems where the compile-time macros are present. The
  test-sha1-runtime-endian Makefile target exercises this path against the
  NIST vectors.
- map.h and element.h now include elementid.h explicitly instead of
  relying on transitive includes through counter.h / register.h.
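The pitfall behind the first bullet is that the preprocessor treats undefined identifiers in `#if` as 0, so a bare `#if BYTE_ORDER == LITTLE_ENDIAN` evaluates to `0 == 0` when neither macro exists. A sketch of the guarded pattern with a runtime fallback is shown below. It is not the code in sha1.c: it uses the `__BYTE_ORDER__` / `__ORDER_LITTLE_ENDIAN__` macros predefined by GCC and clang rather than the system-header macros the commit refers to, and `SKETCH_USE_RUNTIME_ENDIAN` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Runtime probe: store 1 in a 32-bit word and inspect its first byte. */
static int sketch_runtime_is_little_endian(void) {
    const union { uint32_t word; uint8_t bytes[4]; } probe = { 1u };
    return probe.bytes[0] == 1u;
}

/* Use the compile-time answer only when every macro involved is
 * actually defined and no override was requested; otherwise fall
 * through to the runtime probe. */
#if !defined(SKETCH_USE_RUNTIME_ENDIAN) && \
    defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
#  define SKETCH_IS_LITTLE_ENDIAN() (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#else
#  define SKETCH_IS_LITTLE_ENDIAN() sketch_runtime_is_little_endian()
#endif

/* Sanity check: both detection paths must agree on this host. */
static void sketch_endian_selfcheck(void) {
    assert(SKETCH_IS_LITTLE_ENDIAN() == sketch_runtime_is_little_endian());
}
```

Building once normally and once with `-DSKETCH_USE_RUNTIME_ENDIAN` exercises both paths, mirroring the idea behind the dedicated Makefile target.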

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated no new comments.

@vieiralucas vieiralucas merged commit 1f9d6be into main Jun 14, 2026
3 checks passed
@vieiralucas vieiralucas deleted the re-add-ids branch June 14, 2026 17:29