Skip to content

feat(cluster): Terraform-shaped query declaration; drop ./ path noise#183

Merged
aaltshuler merged 3 commits into
mainfrom
feat/cluster-query-discovery
Jun 10, 2026
Merged

feat(cluster): Terraform-shaped query declaration; drop ./ path noise#183
aaltshuler merged 3 commits into
mainfrom
feat/cluster-query-discovery

Conversation

@aaltshuler

@aaltshuler aaltshuler commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Two ergonomics fixes from cookbook-alignment feedback — the cluster config was fighting its own model:

  1. The .gq files are the declaration. graphs.<id>.queries previously accepted only an explicit name → file map, forcing configs to re-enumerate every query <name> the files already declare (the SPIKE cookbook needed 66 entries for 6 files). Now:

    queries: queries/                  # directory: every declaration in top-level *.gq (sorted)
    queries: [people.gq, extra/a.gq]   # explicit files; every declaration in each
    queries: { find_experts: { file: knowledge.gq } }   # fine-grained map, unchanged

    Discovery is loud — unreadable/unparseable files and cross-file duplicate names fail validation (query_parse_error, duplicate_query_name). Downstream is untouched: each discovered query remains an individually addressed resource (query.<graph>.<name>) with the containing file's digest.

  2. ./ path prefixes removed from the docs (109 instances across user docs). Paths in cluster.yaml and command examples resolve against one explicit config folder — the prefix was stylistic noise. (../ links and ./scripts executable invocations untouched; verified no link damage + check-agents-md.sh green.)

Test plan

  • 3 new in-crate tests (directory discovery incl. non-.gq files ignored and deterministic ordering; list + single-file forms; duplicate-name and parse-error refusals); 95 total pass
  • cargo test --workspace --locked — green (exit 0 in-log)
  • Manual: a bare-paths cluster.yaml (schema: people.pg, queries: queries/) validates with all declarations discovered

The cookbooks PR (ModernRelay/omnigraph-cookbooks#17) collapses to the one-line form on top of this.

🤖 Generated with Claude Code

Greptile Summary

This PR makes two ergonomics improvements to the cluster config: a Terraform-style query declaration (QueriesDecl enum accepting a directory path, file list, or explicit name→file map) so .gq files self-describe their queries instead of forcing operators to re-enumerate every name, and a 109-instance removal of ./ path-prefix noise from user docs.

  • QueriesDecl (#[serde(untagged)]) replaces the hard BTreeMap<String, QueryConfig> on GraphConfig.queries. A new resolve_query_decls function reads and parses each discovered .gq via parse_query, emitting loud errors for unreadable files, parse failures, and cross-file duplicate names before handing the canonical name→file map back to the existing load_desired machinery. Three tests cover directory discovery, list and single-file forms, and error paths.
  • ./ prefix removal is cosmetic/docs-only: paths in cluster.yaml always resolve against the config folder, so schema: people.pg and schema: ./people.pg are equivalent; the prefix was stylistic noise in examples and reference docs. ../ relative links and ./scripts executable calls were intentionally left untouched.

Confidence Score: 4/5

Safe to merge. The new query-discovery path is additive and backward-compatible; the Explicit map form is untouched. The only behavioural changes are new error/warning codes for the new forms and the removal of cosmetic ./ prefixes from docs.

The core logic in resolve_query_decls is correct: directory enumeration, sorting, duplicate detection, and loud parse-failure handling all work as intended and are covered by the new tests. Two minor design observations remain — discovered files are read a second time per-query in load_desired (N+1 reads for N queries), and the diagnostic field path emitted during that second read uses an explicit-map–style locator that doesn't match the discovery syntax the user actually wrote — but neither affects correctness or safety.

crates/omnigraph-cluster/src/lib.rs — specifically the resolve_query_decls function and the per-query re-read block in load_desired (lines ~464–543 and ~3747–3778).

Important Files Changed

Filename Overview
crates/omnigraph-cluster/src/lib.rs Core change: introduces QueriesDecl enum (string/list/map) so .gq files are self-describing; adds resolve_query_decls to discover queries from directories and files. Logic is correct but discovery mode reads each multi-query file N+1 times and emits misleading field paths in the second-read error path.
docs/user/cluster-config.md Updated cluster.yaml reference example and added Terraform-style queries section with all three forms; ./ prefix removed from path examples throughout.
docs/user/cluster.md ./ path-prefix noise stripped from user-doc instances; no functional changes.
docs/user/cli-reference.md ./ prefixes removed from command examples; content unchanged.
docs/user/cli.md ./ prefixes removed from path examples; no functional changes.
docs/user/deployment.md ./ prefixes removed from deployment examples; no functional changes.
docs/user/policy.md ./ prefixes removed from policy file path examples; no functional changes.
docs/user/transactions.md ./ prefixes removed from path examples; no functional changes.
docs/dev/cluster-pr-issues.md Dev-facing notes; ./ prefixes cleaned up, no structural changes.
docs/dev/cluster-state-of-affairs.md Dev-facing notes; ./ prefixes cleaned up, no structural changes.
docs/dev/omnigraph-article-long.md ./ prefixes removed from path examples; no functional changes.
docs/dev/omnigraph-article.md ./ prefixes removed from path examples; no functional changes.
docs/dev/omnigraph-visuals.md ./ prefixes removed from path examples; no functional changes.
docs/dev/pr-139-issues.md ./ prefixes removed from path examples; no functional changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["cluster.yaml\n`queries:` field"] -->|string| B["QueriesDecl::Discover(PathBuf)"]
    A -->|array| C["QueriesDecl::DiscoverMany(Vec<PathBuf>)"]
    A -->|map| D["QueriesDecl::Explicit(BTreeMap)"]

    B --> E["resolve_query_decls()"]
    C --> E
    D -->|early return| F["name→file map"]

    E --> G{is_dir?}
    G -->|yes| H["fs::read_dir\nfilter *.gq\nsort"]
    G -->|no| I["single file"]
    H --> J["For each .gq file:\nfs::read_to_string\nparse_query"]
    I --> J
    J -->|parse error| K["query_parse_error ❌"]
    J -->|ok| L["extract query names\ncheck duplicates"]
    L -->|duplicate| M["duplicate_query_name ❌"]
    L -->|ok| F

    F --> N["load_desired:\nper-query fs::read_to_string\ncompute digest\nvalidate_query_source"]
    N --> O["resource_digests &\ngraph_query_digests"]
Loading

Comments Outside Diff (1)

  1. crates/omnigraph-cluster/src/lib.rs, line 3747-3778 (link)

    P2 query_file_missing field path is misleading for discovered queries

    When discovery mode is used (queries: queries/ or queries: [a.gq, b.gq]), this second read can emit query_file_missing with field graphs.{graph_id}.queries.{query_name}.file — a JSON-pointer–style path that implies the user set an explicit { find_person: { file: … } } map. In reality the user never wrote that field; the name was auto-discovered. If a file becomes unreadable between discovery and digest-computation, the operator will see a field path that points to something that doesn't exist in their cluster.yaml.

    Emitting graphs.{graph_id}.queries (same level used by resolve_query_decls) and including the file path in the message (already present) would give accurate attribution.

    Fix in Claude Code

Fix All in Claude Code

Reviews (1): Last reviewed commit: "docs: drop ./ path prefixes; document qu..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

cluster.yaml's graphs.<id>.queries previously accepted only an explicit
name->file map, forcing configs to re-enumerate every `query <name>` that
the .gq files already declare (the SPIKE cookbook needed 66 entries for 6
files). The files ARE the declaration now: `queries: queries/` discovers
every declaration in a directory's top-level *.gq (sorted), a list form
takes explicit files, and the map stays for fine-grained control.
Discovery is loud — unreadable/unparseable files and duplicate query names
fail validation (query_parse_error, duplicate_query_name). Downstream is
untouched: each discovered query is still an individually addressed
resource with the containing file's digest.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Comment thread crates/omnigraph-cluster/src/lib.rs Outdated
Comment on lines +500 to +543
let mut registry: BTreeMap<String, QueryConfig> = BTreeMap::new();
let mut origin: BTreeMap<String, PathBuf> = BTreeMap::new();
for (declared, resolved) in files {
let source = match fs::read_to_string(&resolved) {
Ok(source) => source,
Err(err) => {
diagnostics.push(Diagnostic::error(
"query_file_missing",
format!("graphs.{graph_id}.queries"),
format!("could not read query file '{}': {err}", resolved.display()),
));
continue;
}
};
let parsed = match parse_query(&source) {
Ok(parsed) => parsed,
Err(err) => {
diagnostics.push(Diagnostic::error(
"query_parse_error",
format!("graphs.{graph_id}.queries"),
format!("'{}' does not parse: {err}", resolved.display()),
));
continue;
}
};
for query_decl in &parsed.queries {
let name = query_decl.name.clone();
if let Some(previous) = origin.get(&name) {
diagnostics.push(Diagnostic::error(
"duplicate_query_name",
format!("graphs.{graph_id}.queries.{name}"),
format!(
"query '{name}' is declared in both '{}' and '{}'",
previous.display(),
declared.display()
),
));
continue;
}
origin.insert(name.clone(), declared.clone());
registry.insert(name, QueryConfig { file: declared.clone() });
}
}
registry

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Discovery mode reads each multi-query file N+1 times

resolve_query_decls reads and parses each .gq file once to enumerate query names. Then in load_desired (line 3747), fs::read_to_string is called once per discovered query — so a file containing N queries is read N+1 times total and parsed N times in validate_query_source. This creates a small TOCTOU window: if the file changes between the discovery read and the per-query reads, validate_query_source can emit query_key_mismatch for a query that was just discovered, which is surprising since the caller already verified the name exists in that file. The loud failure is safe, but the error attribution is confusing.

Returning (String, Vec<String>) (content + names) from the file-processing loop and threading the content through to the downstream read would eliminate the redundancy and close the window entirely.

Fix in Claude Code

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4558454: resolve_query_decls returns its file contents and the per-query pass reuses them — one read per file, no enumeration/validation window. (The link-check failure was unrelated: untracked scratch notes swept into the docs commit by a broad git add; stripped.)

Paths in cluster.yaml and command examples are relative to one explicit
config folder (Terraform-shaped) — the ./ prefixes were noise and are gone
across the user docs (109 instances; ../ links and ./scripts executables
untouched). The cluster docs now present directory discovery as the primary
queries form with the list and map forms documented alongside.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@aaltshuler aaltshuler force-pushed the feat/cluster-query-discovery branch from cebf5bf to 44b5866 Compare June 10, 2026 22:33

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aaltshuler has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

resolve_query_decls hands its file contents to the caller; the per-query
digest/typecheck pass reuses them instead of re-reading (a file with N
queries was read N+1 times), which also closes the window where a file
changing between enumeration and validation produced a confusing
query_key_mismatch for a just-discovered name. Explicit-map declarations
read as before.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aaltshuler has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@aaltshuler aaltshuler merged commit 4bd763f into main Jun 10, 2026
8 checks passed
@aaltshuler aaltshuler deleted the feat/cluster-query-discovery branch June 10, 2026 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant