Skip to content

Add sign-subtree wire format to tlog_witness crate#228

Open
lukevalenta wants to merge 4 commits intomainfrom
lvalenta/tlog-witness-sign-subtree
Open

Add sign-subtree wire format to tlog_witness crate#228
lukevalenta wants to merge 4 commits intomainfrom
lvalenta/tlog-witness-sign-subtree

Conversation

@lukevalenta
Copy link
Copy Markdown
Contributor

Summary

Adds wire-format parsers and serializers for the optional sign-subtree endpoint described in c2sp.org/tlog-witness#sign-subtree (currently in draft at C2SP PR #245). The endpoint lets a client ask a witness to cosign a subtree of a previously-cosigned tree, scoped to a [start, end) range, accompanied by a consistency proof to a reference checkpoint and (optionally) DoS-protection cosignatures from other known witnesses.

This PR is wire-format only — no crypto and no witness_worker handler. The handler that wires these parsers up to a SubtreeV1CheckpointSigner (PR #227) and switches the worker default to ML-DSA-44 will follow in a separate PR.

Commits

  1. signed_note: add Note::signatures() accessor — small symmetric addition next to text(). Lets callers count or inspect signatures without going through Note::verify() (which requires verifiers and may error). Needed by the sign-subtree serializer to decide whether to emit the checkpoint signature block and to enforce the spec's MAX_CHECKPOINT_SIGNATURES = 8 cap.
  2. tlog_witness: extract shared error and helpers into common module — pure refactor. Moves TlogWitnessError, MAX_CONSISTENCY_PROOF_LINES, CONTENT_TYPE_TLOG_SIZE, and parse_proof_line out of add_checkpoint.rs into a new private common.rs. No behavior change.
  3. tlog_witness: add sign-subtree wire format — the new sign_subtree module with parsers, serializers, and 13 unit tests (request roundtrips, response roundtrip, spec-example shape pin, all bound and ordering rejection paths).

Wire format covered

  • Request body: subtree <start> <end>\n range line, base64 subtree hash, up to 8 subtree-cosignature lines (em-dash-prefixed), up to 63 consistency-proof hashes, blank-line separator, then a reference checkpoint that is either a full signed note or a bare body when no checkpoint signatures are supplied.
  • Response body: identical shape to add-checkpoint's response (one or more signed-note signature lines).

The parser enforces the spec's ordering rule (cosignature lines MUST appear before proof lines) and all spec-mandated upper bounds.

What's NOT in this PR

  • No /sign-subtree HTTP handler in witness_worker — that comes in a follow-up.
  • No verification strategy (the spec lists three; the handler PR will hardcode the stateless one to start).
  • No change to the witness's default cosignature algorithm — add-checkpoint still produces cosignature/v1 Ed25519 signatures. The follow-up PR will add ML-DSA-44 / subtree/v1 support and switch the default, deriving the algorithm from the OID in the configured PEM signing key.

Pre-push checks

All four pass on each commit:

  • cargo clippy --workspace --all-targets -- -Dwarnings -Dclippy::pedantic
  • cargo test (27 tlog_witness tests: 14 add_checkpoint + 13 sign_subtree, plus the rest of the workspace)
  • cargo fmt --all --check
  • cargo machete

Symmetric with the existing Note::text() accessor: returns the
signatures slice in the order they appear on the wire. Useful for
callers that need to inspect or count the signatures without going
through Note::verify() (which requires verifiers and may error out
when no known key is present).

Note::from_bytes() always returns a note with at least one
signature, since the signed-note wire format requires a non-empty
signature block; the slice is empty only for notes constructed via
Note::new() with no existing_sigs (e.g. a bare checkpoint body).
Move TlogWitnessError, MAX_CONSISTENCY_PROOF_LINES,
CONTENT_TYPE_TLOG_SIZE, and the parse_proof_line helper out of
add_checkpoint.rs into a new private common.rs module. The helper
becomes pub(crate) so additional endpoint modules can reuse it; the
public types continue to be re-exported from the crate root.

Pure refactor; no behavior change. Prepares for a sibling
sign_subtree module that needs the same error type, proof-line cap,
and base64 hash-line parser.
@lukevalenta
Copy link
Copy Markdown
Contributor Author

/bonk review

@lukevalenta lukevalenta self-assigned this Apr 30, 2026
Comment thread crates/tlog_witness/src/sign_subtree.rs
@lukevalenta
Copy link
Copy Markdown
Contributor Author

/bonk review

@ask-bonk
Copy link
Copy Markdown

ask-bonk Bot commented Apr 30, 2026

@lukevalenta Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

Copy link
Copy Markdown

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wire-format-only PR is well-scoped, the refactor in commit 2 is clean, and the test set is thorough for happy-path roundtrips and bound rejections. A few correctness concerns to flag before this gets wired up to a handler.

Main concerns (inline):

  1. Bare-checkpoint fallback silently accepts malformed signed notes. parse_reference_checkpoint falls back to Note::new(bytes, &[]) whenever Note::from_bytes fails, but Note::new only validates UTF-8 / control chars / trailing \n. A payload like origin\nsize\nhash\n\nGARBAGE-NOT-A-SIG\n will fail from_bytes (bad sig line) and then succeed Note::new as a "bare body" whose text() includes the bogus signature line. The handler would then see a checkpoint with no signatures and route it to the stateful path, parsing what was meant to be a signed checkpoint as if the client had asked for stateful verification.

  2. Per the self-review on the subtree line: yes, the request body needs an explicit upper bound. Note::from_bytes caps the embedded checkpoint at MAX_NOTE_SIZE (1 MiB), but the header preceding the \n\n has no size cap, and individual subtree-cosignature lines aren't bounded either (NoteSignature::from_bytes will base64-decode arbitrarily large blobs). 8 cosignature lines × N bytes each, plus 63 proof lines (small), plus header — if body reaches the parser as &[u8] already in memory the worker has already paid the allocation cost, but the parser itself shouldn't compound it. Either reject body.len() > SOME_MAX up front or bound each line.

  3. u64 parse accepts +0, +5, etc. parse_decimal only rejects leading 0s; Rust's u64::from_str accepts a leading +. The spec's tree-size encoding rules are stricter than that. Same issue exists in add_checkpoint::parse_old_line, so this is a pre-existing wart you may or may not want to fix here.

Minor:

  • parse_signature_line is a one-line wrapper around NoteSignature::from_bytes(line.as_bytes()).map_err(TlogWitnessError::Note); could be inlined at the call site for symmetry with how parse_proof_line is used (it is non-trivial; this one isn't).
  • The request_roundtrip_bare_checkpoint test asserts !body.ends_with(b"\n\n"), which is the right invariant, but the assertion only fires because the base64-encoded subtree hash never ends in \n. A more direct test would assert the exact suffix is <root_b64>\n rather than relying on the negative.
  • Doc comment on SignSubtreeRequest::checkpoint (lines 87–91) says the bare-body case has empty signatures() and bare text() — accurate, but worth explicitly noting that callers must inspect signatures().is_empty() to decide whether the client requested stateful verification, since otherwise the two cases are indistinguishable post-parse.

Pre-push checks confirmed locally: cargo test -p tlog_witness → 27 passed.

Comment thread crates/tlog_witness/src/sign_subtree.rs
Comment thread crates/tlog_witness/src/sign_subtree.rs
Comment thread crates/tlog_witness/src/sign_subtree.rs Outdated
@ask-bonk
Copy link
Copy Markdown

ask-bonk Bot commented Apr 30, 2026

Review posted. Summary of what I left on PR #228:

Top-level comment highlighting the wire-format-only scope is well-handled, the refactor is clean, tests are thorough — then three concrete concerns.

Inline comments:

  1. sign_subtree.rs:357 (bare-checkpoint fallback ambiguity)parse_reference_checkpoint falls through Note::from_bytes failure into Note::new(bytes, &[]), and Note::new only validates UTF-8/control chars/trailing \n. A malformed signed checkpoint (e.g. origin\nsize\nhash\n\nGARBAGE-NOT-A-SIG\n) silently succeeds as a "bare body" whose text() includes the bogus signature line. Suggested gating the fallback on !bytes.windows(2).any(|w| w == b"\n\n").

  2. sign_subtree.rs:104 (size limits, picking up the author's own self-review) — embedded checkpoint is capped via MAX_NOTE_SIZE, line counts are bounded, but the header before \n\n and each individual cosignature line size are not. NoteSignature::from_bytes will decode arbitrarily large base64 blobs. Proposed an explicit MAX_SIGN_SUBTREE_REQUEST_SIZE shared via common.rs.

  3. sign_subtree.rs:333 (u64 parser accepts +N)parse_decimal only rejects leading 0s; u64::from_str happily accepts +0/+5. Spec is stricter. Same gap in add_checkpoint::parse_old_line.

Plus minor nits in the top-level comment: trivial parse_signature_line wrapper, the request_roundtrip_bare_checkpoint negative assertion could be a positive suffix check, and the SignSubtreeRequest::checkpoint doc could spell out that callers must inspect signatures().is_empty() to distinguish stateful vs. stateless verification requests.

github run

Parsers and serializers for the optional sign-subtree endpoint
described in c2sp.org/tlog-witness#sign-subtree (currently in draft
at C2SP PR #245). The endpoint lets a client ask a witness to cosign
a subtree of a previously-cosigned tree, scoped to a [start, end)
range, accompanied by a consistency proof to a reference checkpoint
and (optionally) DoS-protection cosignatures from other known
witnesses.

Wire format covered:
- request body: subtree range line, base64 subtree hash, up to 8
  subtree-cosignature lines (em-dash-prefixed), up to 63
  consistency-proof hashes, blank-line separator, then a reference
  checkpoint that is either a full signed note or a bare body when
  no checkpoint signatures are supplied.
- response body: identical shape to add-checkpoint's response
  (one or more signed-note signature lines).

Bound and ordering rejection paths are exercised by tests, including
a pin against the spec's literal example shape. No crypto, no
witness_worker handler — those land in a follow-up that wires the
parsers up to a SubtreeV1 cosigner.
Both `add-checkpoint` and `sign-subtree` request parsers gain two
defenses against hostile input:

1. **Request-body size cap.** Add `MAX_REQUEST_BODY_SIZE`
   (1 MiB + 64 KiB) to `common.rs` and reject any body above that
   cap up front. The line-count caps (≤63 proof lines, ≤8 cosig
   lines) bound how many lines we'll process, but they don't bound
   line *sizes* — `NoteSignature::from_bytes` will base64-decode any
   blob the cosignature line contains, and an attacker could feed
   the parser an 8-cosig × 100 MB payload. The cap covers a 1 MiB
   embedded checkpoint (per `signed_note::MAX_NOTE_SIZE`) plus 64
   KiB of envelope headroom for the worst-case sign-subtree
   header.

2. **Stricter ASCII-decimal parsing for tree sizes.** Lift the
   shared validation into a new `parse_tree_size_decimal` helper in
   `common.rs` that:

   - rejects empty strings
   - rejects leading zeros (except for "0" itself)
   - rejects any non-ASCII-digit character — notably the leading
     `+` that Rust's `u64::from_str` would otherwise accept,
     turning `old +0` and `subtree +0 +1` into legitimately spec-
     conformant-looking but actually-rejected requests.

   Both `parse_old_line` (`add_checkpoint`) and
   `parse_subtree_range_line` (`sign_subtree`) now delegate to the
   shared helper.

Tests: 4 new (`request_rejects_oversized_body` and
`request_rejects_leading_plus_in_*` for both endpoints).

Addresses review feedback on PR #228.
@lukevalenta lukevalenta force-pushed the lvalenta/tlog-witness-sign-subtree branch from c23cde5 to 46ccb5f Compare April 30, 2026 23:44
@lukevalenta
Copy link
Copy Markdown
Contributor Author

/bonk review

Copy link
Copy Markdown

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewing 46ccb5f. All three concerns from the previous round are addressed cleanly:

  1. Bare-checkpoint fallbackparse_reference_checkpoint (sign_subtree.rs:363) now discriminates on bytes.windows(2).any(|w| w == b"\n\n") rather than falling through on a Note::from_bytes error. The new request_rejects_malformed_signed_checkpoint test pins the previously-silently-accepted origin\nsize\nhash\n\nGARBAGE-NOT-A-SIG\n payload to a Note(_) error.
  2. Body-size capMAX_REQUEST_BODY_SIZE (1 MiB + 64 KiB) lives in common.rs and is checked up front in both parsers, with request_rejects_oversized_body covering each.
  3. u64::from_str +N accepted — lifted into parse_tree_size_decimal (common.rs:78) with strict ASCII-digit check, and parse_old_line / parse_subtree_range_line both delegate. Both sides have a request_rejects_leading_plus_in_* test.

cargo test -p tlog_witness → 32 passed locally; clippy -Dpedantic clean; cargo fmt --all --check clean.

One small docstring nit on MAX_REQUEST_BODY_SIZE (common.rs:46): the comment says "1 MiB embedded checkpoint (per signed_note::MAX_NOTE_SIZE)" but signed_note::MAX_NOTE_SIZE is 1_000_000 bytes (decimal MB, not 1 MiB ≈ 1,048,576), and it's pub(crate)-private so the rustdoc link [\signed_note::MAX_NOTE_SIZE`]aimed at docs.rs won't resolve. Doesn't affect correctness — the cap (1,114,112 bytes) is a strict superset of any well-formed checkpoint plus envelope — but the prose and the broken intra-doc link are easy to tighten in a followup. Could either drop the link, change "1 MiB" to "1 MB" (matching the actual value), or exposeMAX_NOTE_SIZEaspub` if external crates are expected to be aware of it.

Approving as-is; nothing here blocks the wire-format-only scope. Looking forward to the handler PR.

@ask-bonk
Copy link
Copy Markdown

ask-bonk Bot commented May 1, 2026

Review posted to PR #228.

Summary of what I left:

All three concerns from my prior review are addressed cleanly in commit 46ccb5f:

  1. Bare-checkpoint fallbackparse_reference_checkpoint now discriminates on \n\n presence (sign_subtree.rs:363) rather than falling through on a Note::from_bytes error; the new request_rejects_malformed_signed_checkpoint test pins the regression.
  2. Body-size capMAX_REQUEST_BODY_SIZE (1 MiB + 64 KiB) added to common.rs, checked up front in both parse_add_checkpoint_request and parse_sign_subtree_request.
  3. u64::from_str accepts +N — fixed via the new shared parse_tree_size_decimal helper with a strict ASCII-digit check; both parse_old_line and parse_subtree_range_line now delegate. Both endpoints have request_rejects_leading_plus_in_* tests.

Local pre-push checks pass: cargo test -p tlog_witness → 32 passed, clippy with -Dpedantic clean, cargo fmt clean.

One minor docstring nit: the comment on MAX_REQUEST_BODY_SIZE says "1 MiB embedded checkpoint (per signed_note::MAX_NOTE_SIZE)", but MAX_NOTE_SIZE is 1_000_000 bytes (1 MB, not 1 MiB), and it's pub(crate)-private so the rustdoc link won't resolve on docs.rs. Doesn't affect correctness — easy followup. Approved as-is.

github run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mtc Merkle Tree Certificates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant