feat!: compile-time CEL expansion; drop runtime interpreter #10
Merged
Transpile every protovalidate CEL rule to native Rust at codegen time.
The runtime no longer hosts an interpreter — `validate()` is a direct
struct field walk with zero per-call `Value` / `HashMap` allocations.
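To make the shape of the generated code concrete, here is a hand-written sketch of what a transpiled message could look like. Everything in it is hypothetical: the `Order` message, the `Violation` struct, the rule IDs, and the two CEL rules (`this.quantity > 0` and `this.item_ids.all(id, id != '')`). The real plugin emits its own types and `FieldPath` plumbing.

```rust
/// Hypothetical stand-in for the runtime's violation type.
#[derive(Debug, PartialEq)]
pub struct Violation {
    pub field: &'static str,
    pub rule_id: &'static str,
}

/// Hypothetical generated message struct.
pub struct Order {
    pub quantity: i64,
    pub item_ids: Vec<String>,
}

impl Order {
    /// Sketch of emitted code: each CEL rule becomes a plain Rust
    /// expression over the struct's fields, with no interpreter,
    /// no dynamic `Value`, and no variable map.
    pub fn validate(&self) -> Result<(), Vec<Violation>> {
        let mut violations = Vec::new();

        // CEL: this.quantity > 0
        if !(self.quantity > 0) {
            violations.push(Violation {
                field: "quantity",
                rule_id: "order.quantity.positive",
            });
        }

        // CEL: this.item_ids.all(id, id != '')
        if !self.item_ids.iter().all(|id| !id.is_empty()) {
            violations.push(Violation {
                field: "item_ids",
                rule_id: "order.item_ids.non_empty",
            });
        }

        if violations.is_empty() {
            Ok(())
        } else {
            Err(violations)
        }
    }
}
```

On the happy path this sketch allocates nothing: `Vec::new()` only reserves memory once the first violation is pushed.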
## Plugin (`protoc-gen-protovalidate-buffa`)
- New `emit/cel_compile.rs`: a typed AST visitor over the upstream `cel`
crate's parser. Each CEL expression compiles to a `CompileOutput`
carrying a `TokenStream`, its `CelType` (`Int` / `UInt` / `Double` /
`Bool` / `Str { owned }` / `Bytes { owned }` / `List<T>` /
`Map<K, V>` / `Duration` / `Timestamp` / `Message(schema)` /
`MessageRef(fqn)` / `Dyn`), and a `needs_now` flag. The visitor
handles literals, idents, selects, calls, comprehensions
(`all` / `exists` / `map` / `filter` / `.map(filter, expr)`),
string/bytes/list/map ops, `has()`, `size()`, `is_ip` / `is_ip_prefix`
(incl. dynamic `(ver, strict)` arg dispatch), `is_hostname` /
`is_email` / `is_uri` / `is_uri_ref`, regex match, `dyn()`, and the
Duration/Timestamp constructors. Const-fold compile-time literals
(e.g. `rule.gte` in predefined CEL).
- `emit/cel.rs`: protovalidate-specific integration glue. Binds `this` /
`rule` / `now`, wraps tokens in violation-pushing code with the right
`FieldPath` / `FieldPathElement`, dispatches per `FieldKind`, handles
WKT first-class cases. Adds `SchemaIndex` so `(field).cel` /
`(message).cel` on sub-message-typed fields can resolve fields
through a `SchemaLookup` trait without embedding cyclic schemas.
- Two-kind fallback model: when a construct isn't transpilable the
compiler returns `FallbackKind::Unsupported`; when it proves the
expression would always raise a CEL runtime error (e.g.
`dyn(this).<unknown_field>`) it returns `FallbackKind::RuntimeError`.
Both paths emit a `__cel_runtime_error__` violation marker in place
of the rule — distinguishing them just sharpens the diagnostic
message.
- Wire native paths through `repeated.rs` (items / keys / values CEL,
items.predefined) and the message-level emit pipeline.
- `scan.rs`: add `RuleConst` enum + `decode_*_rule_const` so the
transpiler can fold `rule.foo` references to Rust literals from the
raw extension bytes; delete `rule_value_expr`-style stringification.
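The pieces above (typed output, constant folding, and the two fallback kinds) can be sketched with a toy compiler. This is not the code in `emit/cel_compile.rs`: the `Expr` mini-AST is invented for illustration, `tokens` is a `String` rather than a `TokenStream`, `CelType` is cut down to two variants, and the payloads on `FallbackKind` are an assumption.

```rust
/// Two of the CEL types named above; the real enum has many more.
#[derive(Debug, Clone, PartialEq)]
pub enum CelType {
    Int,
    Bool,
}

/// Why a rule could not be lowered to Rust (payload shape assumed).
#[derive(Debug, PartialEq)]
pub enum FallbackKind {
    /// The construct is valid CEL but the compiler cannot lower it.
    Unsupported(String),
    /// The expression is proven to always raise a CEL runtime error.
    RuntimeError(String),
}

/// Result of compiling one expression. `tokens` stands in for a
/// `TokenStream` so the sketch needs no external crates.
#[derive(Debug, PartialEq)]
pub struct CompileOutput {
    pub tokens: String,
    pub ty: CelType,
    pub needs_now: bool,
}

/// Hypothetical mini-AST, far smaller than the `cel` crate's.
pub enum Expr {
    IntLit(i64),
    /// `this`, assumed here to be bound to an int field.
    This,
    /// `rule.<name>`, already decoded to a constant by the scanner.
    RuleConst(i64),
    Ge(Box<Expr>, Box<Expr>),
    /// `dyn(this).<field>` where the field cannot exist.
    DynSelect(String),
    /// Any construct this toy compiler does not know.
    Other(&'static str),
}

pub fn compile(expr: &Expr) -> Result<CompileOutput, FallbackKind> {
    match expr {
        // Literals and folded rule constants both become Rust literals.
        Expr::IntLit(v) | Expr::RuleConst(v) => Ok(CompileOutput {
            tokens: format!("{v}i64"),
            ty: CelType::Int,
            needs_now: false,
        }),
        Expr::This => Ok(CompileOutput {
            tokens: "*this".to_string(),
            ty: CelType::Int,
            needs_now: false,
        }),
        Expr::Ge(lhs, rhs) => {
            let (l, r) = (compile(lhs)?, compile(rhs)?);
            if l.ty != CelType::Int || r.ty != CelType::Int {
                return Err(FallbackKind::Unsupported(
                    "`>=` on non-int operands".to_string(),
                ));
            }
            Ok(CompileOutput {
                tokens: format!("({} >= {})", l.tokens, r.tokens),
                ty: CelType::Bool,
                needs_now: l.needs_now || r.needs_now,
            })
        }
        Expr::DynSelect(field) => Err(FallbackKind::RuntimeError(format!(
            "no such field `{field}`"
        ))),
        Expr::Other(what) => Err(FallbackKind::Unsupported((*what).to_string())),
    }
}
```

With this sketch, `this >= rule.gte` with `rule.gte = 18` compiles to the string `(*this >= 18i64)` typed as `Bool`, while `dyn(this).nope` yields `FallbackKind::RuntimeError`.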
Conformance: 2872 / 2872 against the upstream
`protovalidate-conformance` harness (proto2, proto3, editions 2023).
## Transpiler unit tests
The conformance suite exercises rules end-to-end but barely exercises
CEL itself — most cases reduce to "predefined rule X on field Y" with
fixed expressions that constant-fold. To pin down transpiler behavior
on its own, the test module in `emit/cel_compile.rs` carries 64 unit
tests grouped by risk bucket:
1. Type-system edge cases — cross-type comparison, int/uint/double
casts, negation, arithmetic, modulo.
2. Comprehensions — `all` / `exists` / `map` / `filter` /
`.map(filter, expr)`, nesting, predicate captures, short-list
literals.
3. `has()` semantics — one test per `SchemaFieldKind` (Scalar variants,
StringLike, Optional, Wrapper, Message, Repeated), plus
unknown-field and `dyn` paths.
4. Sub-message field resolution — `MessageRef` not in index, deep
chains, self-referential schemas.
5. String semantics — `size()` on unicode strings (chars not bytes),
concat, contains, ends-with, regex match (literal + dynamic
pattern), lower/upperAscii, indexOf, substring.
6. Map indexing — string/int/bool keys, `size()`, `.all()` over keys.
7. `Compiled.constant` folding — Int/UInt/Double/Bool/Str RuleConst
inlines as a Rust literal in the emitted tokens; List form.
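The string bucket is the easiest one to get subtly wrong in a transpiler, because Rust's `str::len` counts UTF-8 bytes while CEL's `size()` counts code points. A minimal sketch of the distinction follows; the helper name `cel_size_str` is made up for this example and is not taken from the plugin.

```rust
/// CEL `size()` on a string counts Unicode code points, so the Rust
/// lowering has to count `char`s rather than use the byte length.
/// (Hypothetical helper name.)
pub fn cel_size_str(s: &str) -> i64 {
    s.chars().count() as i64
}

/// What a naive lowering would compute instead: the UTF-8 byte length.
pub fn byte_len(s: &str) -> i64 {
    s.len() as i64
}
```

For ASCII input the two agree, which is why a test on plain ASCII strings would not catch the difference; `"héllo"` is 5 code points but 6 bytes.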
Three tests document gaps the suite never trips:
- `cross_type_int_uint_eq_currently_unsupported` — `int == uint` needs
an explicit cast today (`op_cmp` rejects the mix).
- `empty_list_literal_comprehension_currently_unsupported` —
`[].all(x, ...)` fails type-check (empty literal → Dyn element).
- `in_operator_on_map_currently_unsupported` — CEL's
`key in mapValue` doesn't work; `op_in` requires a list rhs.
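For the first gap, a plain `as` cast is not a safe lowering, since `-1i64 as u64` wraps to `u64::MAX`. The following is one possible way a mixed `int` / `uint` comparison could be lowered without losing sign information. It is a sketch of the idea only, not what `op_cmp` does today, and both function names are hypothetical.

```rust
/// `int == uint`: a negative int can never equal an unsigned value,
/// so check the sign before casting.
pub fn int_eq_uint(a: i64, b: u64) -> bool {
    a >= 0 && (a as u64) == b
}

/// `int < uint`: every negative int is smaller than any uint;
/// otherwise the cast is lossless and the values compare directly.
pub fn int_lt_uint(a: i64, b: u64) -> bool {
    a < 0 || (a as u64) < b
}
```

Until something along these lines is emitted, rule authors need the explicit cast in the CEL expression itself.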
All 86 unit tests run in milliseconds.
## Runtime (`protovalidate-buffa`)
- Drop the `cel` interpreter dependency entirely.
- `src/cel.rs` collapses to a 135-line helper module: `CelScalar`
trait (scalar widening to `i64` / `u64` / `f64`, including
`EnumValue<E>`), `duration_from_secs_nanos`,
`timestamp_from_secs_nanos`, and `now_local`. That's it — no
`Value`, no `Context`, no `Program`.
- Delete the `wkt_cel.rs` test (covered the removed interpreter path).
- `ValidationError::runtime_error` doc updated to reflect that
always-runtime-error CEL is now diagnosed at codegen time.
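As an illustration of what "scalar widening" means here, the sketch below shows one way such a trait could be shaped. The trait name `CelScalarSketch`, its associated type, and the `gte` helper are assumptions made for this example; the actual `CelScalar` trait in `src/cel.rs` may look different, and the `EnumValue<E>` case is not modeled.

```rust
/// Hypothetical widening trait: every proto scalar maps to the single
/// wide Rust type that CEL arithmetic uses for its family.
pub trait CelScalarSketch: Copy {
    type Wide;
    fn widen(self) -> Self::Wide;
}

/// Implement the trait for each narrow/wide pair via lossless `From`.
macro_rules! impl_widen {
    ($($narrow:ty => $wide:ty),* $(,)?) => {
        $(impl CelScalarSketch for $narrow {
            type Wide = $wide;
            fn widen(self) -> $wide {
                <$wide>::from(self)
            }
        })*
    };
}

impl_widen!(i32 => i64, i64 => i64, u32 => u64, u64 => u64, f32 => f64, f64 => f64);

/// Emitted comparisons can then be written once per wide type,
/// regardless of whether the field is `int32` or `int64`.
pub fn gte<T>(value: T, min: i64) -> bool
where
    T: CelScalarSketch<Wide = i64>,
{
    value.widen() >= min
}
```

The point of the design is that generated code never needs a boxed dynamic value to compare an `int32` field against an `i64` constant; the widening is resolved statically.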
## Plugin Cargo.toml
- `cel = { version = "0.13", default-features = false }` — parse-only;
drops the runtime evaluator, regex matching, and chrono bindings.
## Cleanups along the way
- Drop unused dependencies: `buffa-types` in `protovalidate-buffa`,
`protovalidate-buffa-protos`, and `protovalidate-buffa-conformance`;
`connectrpc` in `protovalidate-buffa-macros` (proc-macro emits paths
that resolve through the user's transitive deps). Caught by
`cargo +nightly udeps`.
- Delete dead `ctx::CodeGenContext` placeholder module (the work it
was reserved for shipped via `SchemaIndex` instead).
- Gate the `connect_impl` re-export and its dedicated test behind
`#[cfg(feature = "connect")]` so `cargo test --no-default-features`
compiles cleanly.
- Fix `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps`:
backtick `Option<T>` so rustdoc doesn't read it as an unclosed HTML
tag; allow `rustdoc::broken_intra_doc_links` /
`rustdoc::invalid_html_tags` on the conformance crate's
buffa-build-generated submodule.
- Tighten lint surface: convert `#[allow]` → `#[expect]` where the
lint reliably fires; drop a handful of inline allows that were
redundant under the crate's module-level allows; remove a stale
`let _ = pi` interpreter-era suppression; replace nested
`if let / if let` chains with let-chains; mark const-eligible
helpers `const fn`; use `Self` in self-referential enum variants;
derive `Default` on `Compiler`.
- Refresh stale doc comments that still mentioned "the interpreter"
or named symbols (`CelConstraint`, `AsCelValue`, `ToCelValue`,
`Compiled`) that no longer exist.
- Apply rustfmt across the plugin crate to restore stable-fmt
cleanliness.
## Summary
Transpile every protovalidate CEL rule to native Rust at codegen time. The runtime no longer hosts an interpreter: `validate()` is a direct struct-field walk with zero per-call `Value` / `HashMap` allocations.
- `emit/cel_compile.rs` typed AST visitor over the upstream `cel` parser. Handles literals, idents, selects, calls, comprehensions (`all` / `exists` / `exists_one` / `map` / `filter`, single AND two-variable), string/bytes/list/map ops, `has()`, `size()`, `type()`, `is_ip` / `is_email` / etc., regex match (literal + dynamic), `dyn()`, duration/timestamp constructors (literal + dynamic args, plus timezone arg via opt-in `tz` feature), all 10 timestamp accessors, all 4 duration accessors, full `math.*` extension, list `reverse` / `distinct`, string `reverse`, `isFinite`, optional types (`optional.of` / `.none` / `.ofNonZeroValue`, `.hasValue` / `.orValue` / `.value`, `m[?k]` optional indexing), format directives `%s` / `%d` / `%f` / `%e` / `%x` / `%X` / `%o` / `%b`, list indexing, map literals.
- `FallbackKind::Unsupported` vs `FallbackKind::RuntimeError`: both emit `__cel_runtime_error__` violation markers; distinguishing them sharpens diagnostics.
- `mod.rs` + per-package `<pkg>.mod.rs` files that mirror the proto package hierarchy as `pub mod` nesting. Removes the need for downstream "generate mod tree" scripts. All packaging files run through `prettyplease` for consistent formatting. Configurable via `opt: proto_module=crate::proto`.
- Dropped the `cel` interpreter dep entirely; `src/cel.rs` collapses to a 135-line helper module (`CelScalar`, `duration_from_secs_nanos`, `timestamp_from_secs_nanos`, `now_local`, `parse_duration`, `parse_timestamp`). Optional `tz` feature gates `chrono-tz` for timezone-aware timestamp accessors.
- Ops (including `reverse` / `distinct` / `filter`) return borrowed `&str` / `&[u8]` / `&T` views instead of cloning, so a 20MB bytes payload doesn't get duplicated when a CEL rule reads or filters it.
- `tests/emit_compiles.rs` actually compiles every emitted token stream against the runtime crate, which catches type errors and missing imports, not just syntax. Each test's emitted body is ascribed to the expected Rust return type, so a mismatch breaks the build.
- Workspace: 164 tests; conformance: 2872 / 2872 end-to-end.
- Dropped unused deps (`buffa-types` × 3, `connectrpc` from the macros crate); deleted dead `ctx::CodeGenContext` placeholder; gated `connect_impl` re-export on the `connect` feature; fixed `cargo doc -D warnings`; fixed `cargo test --no-default-features`; refreshed stale interpreter-era doc comments throughout.
- Real-world tested on the sloper-demo repo: 4128-line / +450-line reduction in generated code; downstream `cargo check --workspace` passes.
## Test plan
- `cargo clippy --workspace --all-targets -- -D warnings`
- `cargo fmt --all -- --check` (stable rustfmt, which is what CI runs)
- `cargo test --workspace`: 164 / 164 passing
- `cargo test -p protovalidate-buffa --no-default-features` clean
- `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` clean
- `cargo +1.95 build --workspace` (MSRV) clean
- `cargo audit`: no advisories
- `protovalidate-conformance` end-to-end: 2872 / 2872 against proto2 + proto3 + editions 2023
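The borrowed-view behavior mentioned in the summary can be illustrated with a small sketch. Both helpers below are hypothetical and written only for this example; they are not the emitted code. They show the general technique of returning references into the message instead of cloned values.

```rust
/// Reading a bytes field as a view: no copy of the payload is made.
/// (Hypothetical helper.)
pub fn bytes_view(payload: &Vec<u8>) -> &[u8] {
    payload.as_slice()
}

/// `filter` over a repeated field: the result holds references into
/// the original slice, so only pointers are collected, never elements.
/// (Hypothetical helper.)
pub fn filter_borrowed<'a, T>(items: &'a [T], pred: impl Fn(&T) -> bool) -> Vec<&'a T> {
    items.iter().filter(|item| pred(*item)).collect()
}
```

The filtered result still needs a `Vec` for the kept references, but its size is one pointer per element regardless of how large each element is.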