feat(cluster): Terraform-shaped query declaration; drop ./ path noise#183
Conversation
cluster.yaml's graphs.<id>.queries previously accepted only an explicit name->file map, forcing configs to re-enumerate every `query <name>` that the .gq files already declare (the SPIKE cookbook needed 66 entries for 6 files). The files ARE the declaration now: `queries: queries/` discovers every declaration in a directory's top-level *.gq (sorted), a list form takes explicit files, and the map stays for fine-grained control. Discovery is loud — unreadable/unparseable files and duplicate query names fail validation (query_parse_error, duplicate_query_name). Downstream is untouched: each discovered query is still an individually addressed resource with the containing file's digest. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
| let mut registry: BTreeMap<String, QueryConfig> = BTreeMap::new(); | ||
| let mut origin: BTreeMap<String, PathBuf> = BTreeMap::new(); | ||
| for (declared, resolved) in files { | ||
| let source = match fs::read_to_string(&resolved) { | ||
| Ok(source) => source, | ||
| Err(err) => { | ||
| diagnostics.push(Diagnostic::error( | ||
| "query_file_missing", | ||
| format!("graphs.{graph_id}.queries"), | ||
| format!("could not read query file '{}': {err}", resolved.display()), | ||
| )); | ||
| continue; | ||
| } | ||
| }; | ||
| let parsed = match parse_query(&source) { | ||
| Ok(parsed) => parsed, | ||
| Err(err) => { | ||
| diagnostics.push(Diagnostic::error( | ||
| "query_parse_error", | ||
| format!("graphs.{graph_id}.queries"), | ||
| format!("'{}' does not parse: {err}", resolved.display()), | ||
| )); | ||
| continue; | ||
| } | ||
| }; | ||
| for query_decl in &parsed.queries { | ||
| let name = query_decl.name.clone(); | ||
| if let Some(previous) = origin.get(&name) { | ||
| diagnostics.push(Diagnostic::error( | ||
| "duplicate_query_name", | ||
| format!("graphs.{graph_id}.queries.{name}"), | ||
| format!( | ||
| "query '{name}' is declared in both '{}' and '{}'", | ||
| previous.display(), | ||
| declared.display() | ||
| ), | ||
| )); | ||
| continue; | ||
| } | ||
| origin.insert(name.clone(), declared.clone()); | ||
| registry.insert(name, QueryConfig { file: declared.clone() }); | ||
| } | ||
| } | ||
| registry |
There was a problem hiding this comment.
Discovery mode reads each multi-query file N+1 times
resolve_query_decls reads and parses each .gq file once to enumerate query names. Then in load_desired (line 3747), fs::read_to_string is called once per discovered query — so a file containing N queries is read N+1 times total and parsed N times in validate_query_source. This creates a small TOCTOU window: if the file changes between the discovery read and the per-query reads, validate_query_source can emit query_key_mismatch for a query that was just discovered, which is surprising since the caller already verified the name exists in that file. The loud failure is safe, but the error attribution is confusing.
Returning (String, Vec<String>) (content + names) from the file-processing loop and threading the content through to the downstream read would eliminate the redundancy and close the window entirely.
There was a problem hiding this comment.
Fixed in 4558454: resolve_query_decls returns its file contents and the per-query pass reuses them — one read per file, no enumeration/validation window. (The link-check failure was unrelated: untracked scratch notes swept into the docs commit by a broad git add; stripped.)
Paths in cluster.yaml and command examples are relative to one explicit config folder (Terraform-shaped) — the ./ prefixes were noise and are gone across the user docs (109 instances; ../ links and ./scripts executables untouched). The cluster docs now present directory discovery as the primary queries form with the list and map forms documented alongside. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cebf5bf to
44b5866
Compare
There was a problem hiding this comment.
aaltshuler has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
resolve_query_decls hands its file contents to the caller; the per-query digest/typecheck pass reuses them instead of re-reading (a file with N queries was read N+1 times), which also closes the window where a file changing between enumeration and validation produced a confusing query_key_mismatch for a just-discovered name. Explicit-map declarations read as before. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
There was a problem hiding this comment.
aaltshuler has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
Two ergonomics fixes from cookbook-alignment feedback — the cluster config was fighting its own model:
The
.gqfiles are the declaration.graphs.<id>.queriespreviously accepted only an explicitname → filemap, forcing configs to re-enumerate everyquery <name>the files already declare (the SPIKE cookbook needed 66 entries for 6 files). Now:Discovery is loud — unreadable/unparseable files and cross-file duplicate names fail validation (
query_parse_error,duplicate_query_name). Downstream is untouched: each discovered query remains an individually addressed resource (query.<graph>.<name>) with the containing file's digest../path prefixes removed from the docs (109 instances across user docs). Paths incluster.yamland command examples resolve against one explicit config folder — the prefix was stylistic noise. (../links and./scriptsexecutable invocations untouched; verified no link damage +check-agents-md.shgreen.)Test plan
.gqfiles ignored and deterministic ordering; list + single-file forms; duplicate-name and parse-error refusals); 95 total passcargo test --workspace --locked— green (exit 0 in-log)cluster.yaml(schema: people.pg,queries: queries/) validates with all declarations discoveredThe cookbooks PR (ModernRelay/omnigraph-cookbooks#17) collapses to the one-line form on top of this.
🤖 Generated with Claude Code
Greptile Summary
This PR makes two ergonomics improvements to the cluster config: a Terraform-style query declaration (
QueriesDeclenum accepting a directory path, file list, or explicit name→file map) so.gqfiles self-describe their queries instead of forcing operators to re-enumerate every name, and a 109-instance removal of./path-prefix noise from user docs.QueriesDecl(#[serde(untagged)]) replaces the hardBTreeMap<String, QueryConfig>onGraphConfig.queries. A newresolve_query_declsfunction reads and parses each discovered.gqviaparse_query, emitting loud errors for unreadable files, parse failures, and cross-file duplicate names before handing the canonical name→file map back to the existingload_desiredmachinery. Three tests cover directory discovery, list and single-file forms, and error paths../prefix removal is cosmetic/docs-only: paths incluster.yamlalways resolve against the config folder, soschema: people.pgandschema: ./people.pgare equivalent; the prefix was stylistic noise in examples and reference docs.../relative links and./scriptsexecutable calls were intentionally left untouched.Confidence Score: 4/5
Safe to merge. The new query-discovery path is additive and backward-compatible; the Explicit map form is untouched. The only behavioural changes are new error/warning codes for the new forms and the removal of cosmetic ./ prefixes from docs.
The core logic in resolve_query_decls is correct: directory enumeration, sorting, duplicate detection, and loud parse-failure handling all work as intended and are covered by the new tests. Two minor design observations remain — discovered files are read a second time per-query in load_desired (N+1 reads for N queries), and the diagnostic field path emitted during that second read uses an explicit-map–style locator that doesn't match the discovery syntax the user actually wrote — but neither affects correctness or safety.
crates/omnigraph-cluster/src/lib.rs — specifically the resolve_query_decls function and the per-query re-read block in load_desired (lines ~464–543 and ~3747–3778).
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["cluster.yaml\n`queries:` field"] -->|string| B["QueriesDecl::Discover(PathBuf)"] A -->|array| C["QueriesDecl::DiscoverMany(Vec<PathBuf>)"] A -->|map| D["QueriesDecl::Explicit(BTreeMap)"] B --> E["resolve_query_decls()"] C --> E D -->|early return| F["name→file map"] E --> G{is_dir?} G -->|yes| H["fs::read_dir\nfilter *.gq\nsort"] G -->|no| I["single file"] H --> J["For each .gq file:\nfs::read_to_string\nparse_query"] I --> J J -->|parse error| K["query_parse_error ❌"] J -->|ok| L["extract query names\ncheck duplicates"] L -->|duplicate| M["duplicate_query_name ❌"] L -->|ok| F F --> N["load_desired:\nper-query fs::read_to_string\ncompute digest\nvalidate_query_source"] N --> O["resource_digests &\ngraph_query_digests"]Comments Outside Diff (1)
crates/omnigraph-cluster/src/lib.rs, line 3747-3778 (link)query_file_missingfield path is misleading for discovered queriesWhen discovery mode is used (
queries: queries/orqueries: [a.gq, b.gq]), this second read can emitquery_file_missingwith fieldgraphs.{graph_id}.queries.{query_name}.file— a JSON-pointer–style path that implies the user set an explicit{ find_person: { file: … } }map. In reality the user never wrote that field; the name was auto-discovered. If a file becomes unreadable between discovery and digest-computation, the operator will see a field path that points to something that doesn't exist in theircluster.yaml.Emitting
graphs.{graph_id}.queries(same level used byresolve_query_decls) and including the file path in the message (already present) would give accurate attribution.Reviews (1): Last reviewed commit: "docs: drop ./ path prefixes; document qu..." | Re-trigger Greptile