diff --git a/skill/SKILL.md b/skill/SKILL.md index f882bbb..6968d67 100644 --- a/skill/SKILL.md +++ b/skill/SKILL.md @@ -5,8 +5,9 @@ description: >- schema, writing or applying migrations, loading seed/test data, or diagnosing health and performance ("why is the DB slow", locks, bloat, sequence exhaustion, slow queries). pgcrate is a single binary that wraps - psql-class work with read-only-by-default guardrails, timeouts, structured - JSON, and semantic exit codes so the work is safe to run unsupervised. + psql-class work with guardrails — reads are read-only and capped, writes + preview themselves (dry-run + rollback) before they commit, plus timeouts, + structured JSON, and semantic exit codes so the work is safe to run unsupervised. --- # pgcrate: the Postgres interface for agents @@ -18,8 +19,10 @@ can't hang forever, and output is dense and machine-readable. ## Why pgcrate over raw psql -- **Read-only by default.** `sql` refuses writes unless you pass `--allow-write`; - diagnostics never write. You can't `DROP` something by accident. +- **Writes preview by default.** `sql` blocks writes; `--allow-write` *dry-runs* + one (transaction → report → rollback), and only `--commit` actually applies it. + Diagnostics never write. You can't `DROP` something by accident, and you can't + forget to preview a destructive change. - **Timeouts on everything.** Connect/statement/lock timeouts are enforced (defaults 5s / 30s / 500ms; override with `--connect-timeout`, `--statement-timeout`, `--lock-timeout`). A bad query fails fast instead of @@ -58,12 +61,54 @@ diagnostics below. (A one-shot `brief` summary command is coming; not available pgcrate sql -c "SELECT ..." # read-only; blocks writes echo "SELECT ..." | pgcrate sql # reads from stdin if no -c pgcrate sql -c "SELECT ..." --json # structured rows for downstream use -pgcrate sql -c "UPDATE ..." --allow-write # opt in to writes, explicitly +pgcrate sql -c "SELECT ..." --limit 50 # cap rows (default 1000; --limit 0 uncaps) ``` -Default is read-only: a write statement errors with a prompt to re-run with -`--allow-write`. Only escalate when the human asked for a write. `query` is an -alias for `sql`. +Reads are read-only and capped. `SELECT` output is truncated at 1000 rows by +default with a `… +N more rows` trailer so a stray `SELECT *` on a huge table +can't flood your context; raise it with `--limit N`, or `--limit 0` to uncap. +`query` is an alias for `sql`. Statement/lock/connect timeouts apply here just +like the diagnostics (defaults 30s / 500ms / 5s; override with the global +`--statement-timeout` etc.). + +### Writes preview themselves (dry-run by default) + +A write — `UPDATE`/`DELETE`/`INSERT` or DDL — never just happens. By default +it's blocked; the two ways to run one are: + +```bash +pgcrate sql -c "UPDATE ..." --allow-write # PREVIEW: runs in a txn, reports, ROLLS BACK +pgcrate sql -c "UPDATE ..." --commit # APPLY: actually commits (implies --allow-write) +``` + +`--allow-write` is a **dry run**: the statement runs inside a transaction, +reports how many rows it would affect (plus a sample of the affected rows for a +single DML statement), then **rolls back** — nothing changes. Preview first, +show the human the affected count/sample, then re-run with `--commit` to apply. +`--commit` is the only flag that writes; you can't forget to preview. + +Writes that hide inside a query are caught, not just bare `INSERT`/`UPDATE`/ +`DELETE`: writable CTEs (`WITH d AS (DELETE … RETURNING *) SELECT * FROM d`), +leading-CTE DML, `SELECT … INTO new_table`, and `EXPLAIN ANALYZE ` (which +*executes* the statement) all classify as writes and go through the same +preview-or-`--commit` gate. + +Before a write runs, pgcrate EXPLAINs it and warns (on stderr) if the estimated +cost is high — surface that warning to the human. With `--json` the cost +estimate is in the `write.cost` object (`estimated`, `threshold`, +`exceeds_threshold`) regardless of whether it crossed the threshold. +`--no-cost-check` skips the check. + +Statements that can't run in a transaction (`CREATE INDEX CONCURRENTLY`, +`VACUUM`, `REINDEX`) can't be previewed; pgcrate refuses `--allow-write` for +them and tells you to use `--commit` directly. + +**Known limitation:** side-effect functions called from a `SELECT` +(`SELECT setval(...)`, `SELECT nextval(...)`, `SELECT some_writing_func()`) +can't be detected statically — they read as queries. Without a write flag the +read-only connection still blocks them; but under `--allow-write` such a +`SELECT` runs in autocommit and its side effect is *not* rolled back. Don't rely +on dry-run to preview a `SELECT` that calls a writing function. ## Migration loop @@ -146,8 +191,12 @@ Every `fix` supports `--dry-run` (preview) and `--verify` (re-check after). Run - Destructive operations require `--yes` — never pass it speculatively; show the plan/`--dry-run` first and let the human confirm intent. -- Never add `--read-write` (global) or `sql --allow-write` unless the task is - explicitly a write. Default read-only is the guardrail; keep it on. +- For `sql` writes: `--allow-write` is safe to reach for — it only previews + (dry-run + rollback). Use it to show the human the affected count/sample, then + only pass `--commit` once they've confirmed. Treat `--commit` like `--yes`: + never speculative, always after a preview the human has seen. +- Never add `--read-write` (global) unless the task is explicitly a write. + Default read-only is the guardrail; keep it on. - Prefer `--json` whenever output is consumed by you for a next step; use human tables when reporting to the person. - Treat exit `10+` as "I couldn't check" (fix connectivity/perms), distinct from @@ -163,8 +212,9 @@ Every `fix` supports `--dry-run` (preview) and `--verify` (re-check after). Run | Table dependents/dependencies | `pgcrate inspect table --dependents` / `--dependencies` | | Roles / grants / extensions | `pgcrate inspect roles` / `grants` / `extensions` | | Compare two schemas | `pgcrate inspect diff --to ` | -| Run a read query | `pgcrate sql -c "SELECT ..."` (add `--json`) | -| Run a write | `pgcrate sql -c "..." --allow-write` | +| Run a read query | `pgcrate sql -c "SELECT ..."` (add `--json`, `--limit N`) | +| Preview a write (dry-run + rollback) | `pgcrate sql -c "..." --allow-write` | +| Apply a write (commit) | `pgcrate sql -c "..." --commit` | | New migration | `pgcrate migrate new ` | | Apply / status | `pgcrate migrate up` / `pgcrate migrate status` | | Roll back (dev) | `pgcrate migrate down --steps N --yes` | diff --git a/src/commands/mod.rs b/src/commands/mod.rs index 34fd513..9f943fb 100644 --- a/src/commands/mod.rs +++ b/src/commands/mod.rs @@ -64,7 +64,7 @@ pub use schema::{describe, diff, generate, init}; pub use seed::{seed_diff, seed_list, seed_run, seed_validate}; // Re-export sql/query command -pub use sql_cmd::sql; +pub use sql_cmd::{sql, SqlOptions, DEFAULT_COST_WARN_THRESHOLD}; // Re-export extension commands from new module pub use extension::extension_list; diff --git a/src/commands/sql_cmd.rs b/src/commands/sql_cmd.rs index 42dc76c..b38c38b 100644 --- a/src/commands/sql_cmd.rs +++ b/src/commands/sql_cmd.rs @@ -1,16 +1,75 @@ +//! `pgcrate sql` (alias `query`) — the guarded SQL gateway. +//! +//! Read-only by default. Writes preview themselves: with `--allow-write` a +//! statement runs inside a transaction, reports what it *would* do (affected +//! count + sample rows), then **rolls back**. `--commit` (which implies +//! `--allow-write`) is the only way to actually apply a change. Expensive +//! statements are flagged via EXPLAIN before they run, and runaway result sets +//! are capped. + use anyhow::{bail, Context, Result}; use serde::Serialize; use std::io::Read; -use tokio_postgres::SimpleQueryMessage; +use tokio_postgres::{Client, SimpleQueryMessage}; + +use crate::diagnostic::{DiagnosticSession, TimeoutConfig}; + +/// Default cap on rows printed/returned for a SELECT. `--limit 0` uncaps. +pub const DEFAULT_ROW_CAP: usize = 1000; -use super::connect; +/// How many sample rows a dry-run write previews. +const DRY_RUN_SAMPLE_LIMIT: usize = 5; + +/// Default EXPLAIN total-cost threshold above which a warning prints. +/// Postgres "cost" is in arbitrary planner units; ~50k corresponds roughly to +/// a sizable sequential scan. Tune via `[sql] cost_warn_threshold`. +pub const DEFAULT_COST_WARN_THRESHOLD: f64 = 50_000.0; + +/// Options resolved from CLI flags + config for a single `sql` invocation. +pub struct SqlOptions { + pub allow_write: bool, + pub commit: bool, + pub limit: Option, + pub no_cost_check: bool, + pub cost_warn_threshold: f64, + /// Connect/statement/lock timeouts (from global CLI flags, same as DBA). + pub timeouts: TimeoutConfig, + pub quiet: bool, + pub json: bool, +} #[derive(Serialize)] struct SqlResponse { ok: bool, + /// Present only on a write path; describes whether changes were committed. + #[serde(skip_serializing_if = "Option::is_none")] + write: Option, results: Vec, } +#[derive(Serialize)] +struct WriteOutcome { + /// `true` when changes were committed, `false` when rolled back (dry run). + committed: bool, + /// Total rows affected across all write statements. + rows_affected: u64, + /// EXPLAIN cost estimate for the statement, when one was computed. Present + /// so `--json` consumers get the same cost signal humans see on stderr. + #[serde(skip_serializing_if = "Option::is_none")] + cost: Option, +} + +/// Structured EXPLAIN cost signal surfaced in `--json` output. +#[derive(Serialize, Clone)] +struct CostEstimate { + /// Estimated total plan cost (Postgres planner units). + estimated: f64, + /// The configured warning threshold this was checked against. + threshold: f64, + /// `true` when `estimated` exceeds `threshold` (a warning was emitted). + exceeds_threshold: bool, +} + #[derive(Serialize)] #[serde(tag = "type")] enum SqlResult { @@ -18,19 +77,74 @@ enum SqlResult { Query { columns: Vec, rows: Vec>>, + /// Rows withheld by the row cap (0 when nothing was truncated). + #[serde(skip_serializing_if = "is_zero")] + truncated: usize, }, #[serde(rename = "command")] CommandComplete { rows: u64 }, + /// Sample rows previewed for a dry-run write (the wrapped RETURNING result). + #[serde(rename = "sample")] + Sample { + columns: Vec, + rows: Vec>>, + }, } -pub async fn sql( - database_url: &str, - command: Option<&str>, - allow_write: bool, - quiet: bool, - json: bool, -) -> Result<()> { - let sql = match command { +fn is_zero(n: &usize) -> bool { + *n == 0 +} + +/// Classification of a single parsed statement. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +enum StmtKind { + /// SELECT, SET, EXPLAIN, transaction control — never mutates. + Read, + /// INSERT/UPDATE/DELETE that the planner can EXPLAIN and that can be + /// previewed with a RETURNING wrapper. + Dml, + /// Transactional DDL (CREATE TABLE/VIEW, ALTER, DROP, TRUNCATE, …). Runs + /// inside a transaction so it can be dry-run, but EXPLAIN doesn't apply. + Ddl, + /// Statement that cannot run inside a transaction block (CREATE INDEX + /// CONCURRENTLY, VACUUM). Dry-run is impossible; requires `--commit`. + NonTransactional, +} + +impl StmtKind { + fn is_write(self) -> bool { + !matches!(self, StmtKind::Read) + } +} + +/// Statement-level facts gathered from a single parse of the input. +struct ParsedSql { + kinds: Vec, + /// `true` when the input is exactly one DML statement with no RETURNING + /// clause — the only case where we wrap with RETURNING to sample rows. + single_dml_no_returning: bool, +} + +impl ParsedSql { + fn has_write(&self) -> bool { + self.kinds.iter().any(|k| k.is_write()) + } + + fn non_transactional(&self) -> bool { + self.kinds + .iter() + .any(|k| matches!(k, StmtKind::NonTransactional)) + } + + /// Statements eligible for EXPLAIN cost gating (only DML; DDL/utility + /// statements aren't plannable). + fn explainable(&self) -> bool { + self.kinds.iter().any(|k| matches!(k, StmtKind::Dml)) + } +} + +pub async fn sql(database_url: &str, command: Option<&str>, opts: SqlOptions) -> Result<()> { + let raw = match command { Some(c) => c.to_string(), None => { let mut buf = String::new(); @@ -41,23 +155,290 @@ pub async fn sql( } }; - let sql = sql.trim(); + let sql = raw.trim(); if sql.is_empty() { bail!( "No SQL provided. Use: pgcrate sql -c \"SELECT 1\" or echo \"SELECT 1\" | pgcrate sql" ); } - if !allow_write && looks_like_write(sql)? { - bail!("SQL appears to write. Re-run with --allow-write to proceed."); + let parsed = classify(sql)?; + let is_write = parsed.has_write(); + + // Gate: a write with neither --allow-write nor --commit is refused, and the + // message teaches the new preview-by-default model. + if is_write && !opts.allow_write { + bail!( + "This SQL writes. By default writes are blocked.\n \ + Preview it (runs in a transaction, then rolls back):\n \ + pgcrate sql -c \"…\" --allow-write\n \ + Apply it for real:\n \ + pgcrate sql -c \"…\" --commit" + ); + } + + // Non-transactional writes can't be dry-run. Under --allow-write (preview) + // we refuse and point at --commit; under --commit we run them directly. + if parsed.non_transactional() && !opts.commit { + bail!( + "This SQL contains a statement that cannot run inside a transaction \ + (e.g. CREATE INDEX CONCURRENTLY, VACUUM), so it cannot be previewed.\n \ + Re-run with --commit to execute it directly:\n \ + pgcrate sql -c \"…\" --commit" + ); + } + + let session = DiagnosticSession::connect(database_url, opts.timeouts.clone()).await?; + crate::diagnostic::setup_ctrlc_handler(session.cancel_token()); + let client = session.client(); + + // Cost gating runs before any execution. EXPLAIN only applies to DML. + let cost = if is_write && !opts.no_cost_check && parsed.explainable() { + cost_gate(client, sql, opts.cost_warn_threshold).await? + } else { + None + }; + + if is_write { + run_write(client, sql, &parsed, &opts, cost).await + } else { + run_read(client, sql, &opts).await + } +} + +/// Read-only path: execute and print/return rows, capping the result set. +async fn run_read(client: &Client, sql: &str, opts: &SqlOptions) -> Result<()> { + let messages = client.simple_query(sql).await.context("execute SQL")?; + let cap = row_cap(opts.limit); + let results = collect_results(messages, cap); + + if opts.json { + emit_json(opts, None, results); + return Ok(()); + } + if opts.quiet { + return Ok(()); + } + print_results(&results); + Ok(()) +} + +/// Write path: dry-run (transaction → report → ROLLBACK) unless `--commit`. +async fn run_write( + client: &Client, + sql: &str, + parsed: &ParsedSql, + opts: &SqlOptions, + cost: Option, +) -> Result<()> { + // Non-transactional + --commit: execute directly, no transaction wrapper. + if parsed.non_transactional() { + let messages = client.simple_query(sql).await.context("execute SQL")?; + let results = collect_results(messages, usize::MAX); + let rows_affected = total_affected(&results); + finish_write(opts, true, rows_affected, results, cost); + return Ok(()); + } + + client + .batch_execute("BEGIN") + .await + .context("begin transaction")?; + + // From here on, ensure we always close the transaction even on error. + let outcome = execute_in_tx(client, sql, parsed, opts).await; + + let (results, rows_affected) = match outcome { + Ok(v) => v, + Err(e) => { + let _ = client.batch_execute("ROLLBACK").await; + return Err(e); + } + }; + + if opts.commit { + client + .batch_execute("COMMIT") + .await + .context("commit transaction")?; + finish_write(opts, true, rows_affected, results, cost); + } else { + client + .batch_execute("ROLLBACK") + .await + .context("rollback transaction")?; + finish_write(opts, false, rows_affected, results, cost); } - let client = connect(database_url).await?; + Ok(()) +} + +/// Run the write inside the open transaction, gathering command tags and (when +/// safe) a sample of affected rows via a RETURNING wrapper. +async fn execute_in_tx( + client: &Client, + sql: &str, + parsed: &ParsedSql, + opts: &SqlOptions, +) -> Result<(Vec, u64)> { let messages = client.simple_query(sql).await.context("execute SQL")?; + let mut results = collect_results(messages, usize::MAX); + let rows_affected = total_affected(&results); + + // Sample rows only when it's safe and worthwhile: a single DML statement + // without its own RETURNING. We wrap it so the rows would-be-affected can + // be shown. The wrapper still rolls back with the outer transaction. + if parsed.single_dml_no_returning && rows_affected > 0 { + if let Some(sample) = sample_affected(client, sql, opts).await? { + results.push(sample); + } + } + + Ok((results, rows_affected)) +} + +/// Wrap a single DML statement to capture a few affected rows for preview. +/// Returns `None` (silently) if wrapping fails — sampling is best-effort. +async fn sample_affected( + client: &Client, + sql: &str, + opts: &SqlOptions, +) -> Result> { + let stmt = sql.trim().trim_end_matches(';'); + let limit = match opts.limit { + Some(0) => DRY_RUN_SAMPLE_LIMIT, // uncapped reads still bound the preview + Some(n) => n.min(DRY_RUN_SAMPLE_LIMIT), + None => DRY_RUN_SAMPLE_LIMIT, + }; + let wrapped = format!( + "WITH __pgcrate_preview AS ({stmt} RETURNING *) \ + SELECT * FROM __pgcrate_preview LIMIT {limit}" + ); + + match client.simple_query(&wrapped).await { + Ok(messages) => { + let collected = collect_results(messages, limit); + // The wrapped statement produces exactly one query result. + let found = collected.into_iter().find_map(|r| match r { + SqlResult::Query { columns, rows, .. } => Some((columns, rows)), + _ => None, + }); + Ok(found.map(|(columns, rows)| SqlResult::Sample { columns, rows })) + } + // Wrapping isn't always valid (e.g. statement already errors, or a + // construct RETURNING can't express). Fall back to count-only. + Err(_) => Ok(None), + } +} + +/// EXPLAIN the input and return the estimated total cost (relative to the +/// configured threshold). Returns `None` only when no cost could be computed +/// (multi-statement, non-plannable, or EXPLAIN failed) — never blocks. +async fn cost_gate(client: &Client, sql: &str, threshold: f64) -> Result> { + let stmt = sql.trim().trim_end_matches(';'); + // Only a single statement is reliably EXPLAIN-able here; multi-statement + // writes skip the gate (the parser already flagged them as a write). + if stmt.contains(';') { + return Ok(None); + } + + let explain_sql = format!("EXPLAIN (FORMAT JSON) {stmt}"); + let row = match client.query_one(&explain_sql, &[]).await { + Ok(r) => r, + // EXPLAIN can fail (non-plannable statement, permissions). Don't block. + Err(_) => return Ok(None), + }; + + let plan_json: serde_json::Value = row.get(0); + let total_cost = plan_json + .get(0) + .and_then(|p| p.get("Plan")) + .and_then(|p| p.get("Total Cost")) + .and_then(|c| c.as_f64()); + Ok(total_cost.map(|estimated| CostEstimate { + estimated, + threshold, + exceeds_threshold: estimated > threshold, + })) +} + +/// Resolve the effective row cap. `Some(0)` means uncapped. +fn row_cap(limit: Option) -> usize { + match limit { + Some(0) => usize::MAX, + Some(n) => n, + None => DEFAULT_ROW_CAP, + } +} + +/// Sum affected-row counts across command-complete results. +fn total_affected(results: &[SqlResult]) -> u64 { + results + .iter() + .filter_map(|r| match r { + SqlResult::CommandComplete { rows } => Some(*rows), + _ => None, + }) + .sum() +} + +/// Drive output for a completed write (human or JSON). +fn finish_write( + opts: &SqlOptions, + committed: bool, + rows_affected: u64, + results: Vec, + cost: Option, +) { + // Humans get a stderr warning only when the estimate exceeds the threshold; + // JSON consumers get the full structured cost inside the write object below. + if !opts.json { + if let Some(c) = &cost { + if c.exceeds_threshold { + eprintln!( + "pgcrate: warning: estimated query cost {:.0} exceeds threshold {:.0} \ + — this may be expensive. Pass --no-cost-check to skip this check.", + c.estimated, c.threshold + ); + } + } + } + + if opts.json { + let write = Some(WriteOutcome { + committed, + rows_affected, + cost, + }); + emit_json(opts, write, results); + return; + } + + if opts.quiet { + return; + } + + // Print any sample rows first, then the summary banner. + print_results(&results); + + if committed { + println!("\nCOMMITTED — {rows_affected} row(s) affected."); + } else { + println!( + "\nDRY RUN — {rows_affected} row(s) would be affected. \ + Nothing was changed (rolled back)." + ); + println!("Re-run with --commit to apply."); + } +} + +/// Convert simple-query messages into structured results, capping query rows. +fn collect_results(messages: Vec, cap: usize) -> Vec { let mut results: Vec = Vec::new(); let mut current_columns: Option> = None; let mut current_rows: Vec>> = Vec::new(); + let mut current_total: usize = 0; for msg in messages { match msg { @@ -69,17 +450,23 @@ pub async fn sql( current_columns = Some(row.columns().iter().map(|c| c.name().to_string()).collect()); } - let values: Vec> = (0..row.len()) - .map(|i| row.get(i).map(|s| s.to_string())) - .collect(); - current_rows.push(values); + current_total += 1; + if current_rows.len() < cap { + let values: Vec> = (0..row.len()) + .map(|i| row.get(i).map(|s| s.to_string())) + .collect(); + current_rows.push(values); + } } SimpleQueryMessage::CommandComplete(rows) => { if let Some(cols) = current_columns.take() { + let truncated = current_total.saturating_sub(current_rows.len()); results.push(SqlResult::Query { columns: cols, rows: std::mem::take(&mut current_rows), + truncated, }); + current_total = 0; } results.push(SqlResult::CommandComplete { rows }); } @@ -88,54 +475,185 @@ pub async fn sql( } if let Some(cols) = current_columns.take() { + let truncated = current_total.saturating_sub(current_rows.len()); results.push(SqlResult::Query { columns: cols, rows: std::mem::take(&mut current_rows), + truncated, }); } - if json { - let payload = SqlResponse { ok: true, results }; - println!("{}", serde_json::to_string_pretty(&payload)?); - return Ok(()); - } + results +} - if quiet { - return Ok(()); +fn emit_json(opts: &SqlOptions, write: Option, results: Vec) { + let payload = SqlResponse { + ok: true, + write, + results, + }; + match serde_json::to_string_pretty(&payload) { + Ok(s) => println!("{s}"), + Err(e) => eprintln!("pgcrate: failed to serialize JSON: {e}"), } + let _ = opts; +} +fn print_results(results: &[SqlResult]) { for result in results { match result { - SqlResult::Query { columns, rows } => { - print_table(&columns, &rows); + SqlResult::Query { + columns, + rows, + truncated, + } => { + print_table(columns, rows); + if *truncated > 0 { + println!("… +{truncated} more row(s) — use --limit N (or --limit 0 to uncap)"); + } + } + SqlResult::Sample { columns, rows } => { + println!("Sample of affected rows:"); + print_table(columns, rows); } SqlResult::CommandComplete { rows } => { println!("OK ({rows} rows)"); } } } +} - Ok(()) +/// Statements that cannot run inside a transaction *and* that sqlparser 0.58 +/// does not parse into a dedicated `Statement` variant. We detect these by +/// leading keyword so the user gets the "needs --commit" guidance rather than +/// a raw parse error. +fn leading_non_transactional(sql: &str) -> bool { + let head = sql.trim_start().to_ascii_uppercase(); + head.starts_with("VACUUM") || head.starts_with("REINDEX") } -fn looks_like_write(sql: &str) -> Result { +/// Does a parsed `Query` (a `SELECT`-shaped statement) actually mutate data? +/// +/// sqlparser routes several write constructs through `Statement::Query`, so a +/// naive `Query => Read` rule lets real writes execute unguarded. At any +/// nesting depth (CTEs can themselves contain `WITH`, set operations nest, +/// subqueries nest), this catches: +/// +/// - writable CTEs (`WITH d AS (DELETE … RETURNING *) SELECT *`), +/// - leading-CTE DML (`WITH x AS (…) UPDATE …`, whose top-level body is the +/// `UPDATE`/`INSERT`/`DELETE`), +/// - `SELECT … INTO new_table` (creates a table). +fn query_writes(query: &sqlparser::ast::Query) -> bool { + if let Some(with) = &query.with { + if with.cte_tables.iter().any(|cte| query_writes(&cte.query)) { + return true; + } + } + set_expr_writes(&query.body) +} + +fn set_expr_writes(body: &sqlparser::ast::SetExpr) -> bool { + use sqlparser::ast::SetExpr; + match body { + // A DML body inside a query is a write (writable CTE / leading-CTE DML). + SetExpr::Insert(_) | SetExpr::Update(_) | SetExpr::Delete(_) => true, + // `SELECT … INTO t` materializes a table — that's a write. + SetExpr::Select(select) => select.into.is_some(), + SetExpr::Query(inner) => query_writes(inner), + SetExpr::SetOperation { left, right, .. } => { + set_expr_writes(left) || set_expr_writes(right) + } + SetExpr::Values(_) | SetExpr::Table(_) => false, + } +} + +/// Classify a single parsed statement. `Query` and `Explain` recurse because +/// either can hide a write (writable CTE; `EXPLAIN ANALYZE `, which +/// *executes* the inner statement). +fn classify_statement(stmt: &sqlparser::ast::Statement) -> (StmtKind, bool) { + use sqlparser::ast::Statement; + match stmt { + // A query is a write iff it embeds DML / SELECT INTO; otherwise read. + Statement::Query(q) => { + if query_writes(q) { + // Embedded writes can't be safely RETURNING-wrapped, and the + // planner can't EXPLAIN them as plain DML — treat as DDL-class + // (transactional dry-run, no sample, no cost gate). + (StmtKind::Ddl, false) + } else { + (StmtKind::Read, false) + } + } + + // EXPLAIN ANALYZE *runs* the inner statement; the parenthesized + // `EXPLAIN (ANALYZE)` form parses with `analyze=false` in sqlparser yet + // still executes, so we also classify the inner statement and treat any + // inner write as a write regardless of the flag. + Statement::Explain { + statement, analyze, .. + } => { + let (inner_kind, _) = classify_statement(statement); + if *analyze || inner_kind.is_write() { + // EXPLAIN can't run inside our RETURNING wrapper or be cost-gated + // meaningfully; route it through the transactional DDL path so a + // dry-run still rolls back the side effects of ANALYZE. + (StmtKind::Ddl, false) + } else { + (StmtKind::Read, false) + } + } + + Statement::Set(_) + | Statement::StartTransaction { .. } + | Statement::Commit { .. } + | Statement::Rollback { .. } + | Statement::ExplainTable { .. } => (StmtKind::Read, false), + + Statement::Insert(insert) => (StmtKind::Dml, insert.returning.is_some()), + Statement::Update { returning, .. } => (StmtKind::Dml, returning.is_some()), + Statement::Delete(delete) => (StmtKind::Dml, delete.returning.is_some()), + + // CREATE INDEX CONCURRENTLY cannot run in a transaction. + Statement::CreateIndex(idx) if idx.concurrently => (StmtKind::NonTransactional, false), + + // Everything else that mutates: transactional DDL/utility. + _ => (StmtKind::Ddl, false), + } +} + +/// Parse the input once and classify each statement. +fn classify(sql: &str) -> Result { + // VACUUM / REINDEX aren't parseable by sqlparser; short-circuit so the + // caller can route them to the "requires --commit" path. + if leading_non_transactional(sql) { + return Ok(ParsedSql { + kinds: vec![StmtKind::NonTransactional], + single_dml_no_returning: false, + }); + } + let dialect = sqlparser::dialect::PostgreSqlDialect {}; let statements = sqlparser::parser::Parser::parse_sql(&dialect, sql).context("parse SQL")?; - for stmt in statements { - use sqlparser::ast::Statement; - match stmt { - Statement::Query(_) => {} - Statement::Set(_) => {} - Statement::StartTransaction { .. } - | Statement::Commit { .. } - | Statement::Rollback { .. } - | Statement::Explain { .. } => {} - _ => return Ok(true), + let mut kinds = Vec::with_capacity(statements.len()); + let mut dml_returnings: Vec = Vec::new(); + + for stmt in &statements { + let (kind, has_returning) = classify_statement(stmt); + if kind == StmtKind::Dml { + dml_returnings.push(has_returning); } + kinds.push(kind); } - Ok(false) + let single_dml_no_returning = statements.len() == 1 + && kinds.first() == Some(&StmtKind::Dml) + && dml_returnings.first() == Some(&false); + + Ok(ParsedSql { + kinds, + single_dml_no_returning, + }) } fn print_table(columns: &[String], rows: &[Vec>]) { @@ -176,3 +694,189 @@ fn print_table(columns: &[String], rows: &[Vec>]) { println!("{}", line.join(" | ")); } } + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn select_is_read() { + let p = classify("SELECT 1").unwrap(); + assert!(!p.has_write()); + assert!(!p.non_transactional()); + } + + #[test] + fn update_is_write_and_samplable() { + let p = classify("UPDATE users SET name = 'x' WHERE id = 1").unwrap(); + assert!(p.has_write()); + assert!(p.explainable()); + assert!(p.single_dml_no_returning); + } + + #[test] + fn update_with_returning_not_wrapped() { + let p = classify("UPDATE users SET name = 'x' WHERE id = 1 RETURNING id").unwrap(); + assert!(p.has_write()); + assert!(!p.single_dml_no_returning); + } + + #[test] + fn insert_delete_are_dml() { + assert!( + classify("INSERT INTO t (a) VALUES (1)") + .unwrap() + .single_dml_no_returning + ); + assert!( + classify("DELETE FROM t WHERE a = 1") + .unwrap() + .single_dml_no_returning + ); + } + + #[test] + fn multi_statement_write_not_single() { + let p = classify("UPDATE t SET a = 1; UPDATE t SET a = 2").unwrap(); + assert!(p.has_write()); + assert!(!p.single_dml_no_returning); + } + + #[test] + fn ddl_is_write_not_explainable() { + let p = classify("CREATE TABLE t (id int)").unwrap(); + assert!(p.has_write()); + assert!(!p.explainable()); + assert!(!p.non_transactional()); + } + + #[test] + fn drop_and_truncate_are_writes() { + assert!(classify("DROP TABLE t").unwrap().has_write()); + assert!(classify("TRUNCATE t").unwrap().has_write()); + assert!(classify("ALTER TABLE t ADD COLUMN c int") + .unwrap() + .has_write()); + } + + #[test] + fn create_index_concurrently_is_non_transactional() { + let p = classify("CREATE INDEX CONCURRENTLY idx ON t (a)").unwrap(); + assert!(p.has_write()); + assert!(p.non_transactional()); + } + + #[test] + fn plain_create_index_is_transactional() { + let p = classify("CREATE INDEX idx ON t (a)").unwrap(); + assert!(p.has_write()); + assert!(!p.non_transactional()); + } + + #[test] + fn vacuum_is_non_transactional() { + assert!(classify("VACUUM ANALYZE t").unwrap().non_transactional()); + } + + #[test] + fn row_cap_resolution() { + assert_eq!(row_cap(None), DEFAULT_ROW_CAP); + assert_eq!(row_cap(Some(0)), usize::MAX); + assert_eq!(row_cap(Some(10)), 10); + } + + // --- Bypass regression tests: writes that hide inside a Query/Explain. --- + // Each of these previously classified as Read and, under --allow-write, + // executed in autocommit with no transaction wrapper and no rollback. + + #[test] + fn writable_cte_delete_is_write() { + let p = classify("WITH d AS (DELETE FROM t RETURNING *) SELECT * FROM d").unwrap(); + assert!( + p.has_write(), + "writable-CTE DELETE must classify as a write" + ); + // Not wrappable/explainable — the embedded write goes the DDL dry-run path. + assert!(!p.single_dml_no_returning); + assert!(!p.explainable()); + } + + #[test] + fn writable_cte_update_and_insert_are_writes() { + assert!( + classify("WITH u AS (UPDATE t SET a = 1 RETURNING *) SELECT * FROM u") + .unwrap() + .has_write() + ); + assert!( + classify("WITH i AS (INSERT INTO t VALUES (1) RETURNING *) SELECT * FROM i") + .unwrap() + .has_write() + ); + } + + #[test] + fn leading_cte_dml_is_write() { + // Top-level body is the UPDATE; sqlparser still wraps it in Statement::Query. + let p = classify("WITH x AS (SELECT 1) UPDATE t SET a = 1").unwrap(); + assert!(p.has_write(), "leading-CTE UPDATE must classify as a write"); + } + + #[test] + fn nested_writable_cte_is_write() { + // A DELETE buried inside a CTE's own WITH clause. + let sql = "WITH outer1 AS (\ + WITH inner1 AS (DELETE FROM t RETURNING *) SELECT * FROM inner1\ + ) SELECT * FROM outer1"; + assert!( + classify(sql).unwrap().has_write(), + "DELETE nested in a CTE's WITH must classify as a write" + ); + } + + #[test] + fn select_into_is_write() { + let p = classify("SELECT id INTO new_table FROM t").unwrap(); + assert!( + p.has_write(), + "SELECT … INTO materializes a table — a write" + ); + } + + #[test] + fn explain_analyze_write_is_write() { + // EXPLAIN ANALYZE executes the inner statement. + assert!(classify("EXPLAIN ANALYZE DELETE FROM t") + .unwrap() + .has_write()); + assert!(classify("EXPLAIN ANALYZE UPDATE t SET a = 1") + .unwrap() + .has_write()); + // Parenthesized form parses analyze=false in sqlparser but still executes; + // the inner-statement check catches it. + assert!(classify("EXPLAIN (ANALYZE) DELETE FROM t") + .unwrap() + .has_write()); + // Plain EXPLAIN of a write doesn't execute, but classifying it as a write + // is the conservative, correct call. + assert!(classify("EXPLAIN DELETE FROM t").unwrap().has_write()); + } + + #[test] + fn plain_reads_stay_reads() { + // Guard against over-eager classification breaking ordinary queries. + assert!(!classify("SELECT 1").unwrap().has_write()); + assert!(!classify("EXPLAIN SELECT 1").unwrap().has_write()); + assert!(!classify("EXPLAIN (FORMAT JSON) SELECT 1") + .unwrap() + .has_write()); + assert!(!classify("WITH q AS (SELECT 1) SELECT * FROM q") + .unwrap() + .has_write()); + assert!(!classify("SELECT * FROM (SELECT 1) sub") + .unwrap() + .has_write()); + assert!(!classify("(SELECT 1) UNION (SELECT 2)").unwrap().has_write()); + assert!(!classify("VALUES (1), (2)").unwrap().has_write()); + } +} diff --git a/src/config.rs b/src/config.rs index 6dd8418..be47496 100644 --- a/src/config.rs +++ b/src/config.rs @@ -18,6 +18,7 @@ pub struct Config { pub model: Option, pub seeds: Option, pub tools: Option, + pub sql: Option, /// Named database connections #[serde(default)] pub connections: HashMap, @@ -92,6 +93,13 @@ pub struct ToolsConfig { pub psql: Option, } +/// `sql` / `query` command configuration +#[derive(Deserialize, Debug, Default)] +pub struct SqlConfig { + /// EXPLAIN total-cost threshold above which a write warns before running. + pub cost_warn_threshold: Option, +} + /// Anonymization configuration (pgcrate.anonymize.toml) #[derive(Deserialize, Default, Debug, Clone)] pub struct AnonymizeConfig { @@ -378,6 +386,14 @@ impl Config { .unwrap_or_default() } + /// Get the EXPLAIN cost-warning threshold for `sql` writes. + pub fn sql_cost_warn_threshold(&self) -> f64 { + self.sql + .as_ref() + .and_then(|s| s.cost_warn_threshold) + .unwrap_or(crate::commands::DEFAULT_COST_WARN_THRESHOLD) + } + /// Get path for a PostgreSQL tool (pg_dump, pg_restore, psql) /// Returns configured path if set, otherwise returns the tool name (for PATH lookup) pub fn tool_path(&self, tool: &str) -> String { diff --git a/src/main.rs b/src/main.rs index 87a7a5a..de909f5 100644 --- a/src/main.rs +++ b/src/main.rs @@ -264,9 +264,19 @@ enum Commands { /// Use -c for psql compatibility: pgcrate sql -c "SELECT 1" #[arg(short = 'c', value_name = "SQL")] command: Option, - /// Allow write statements (INSERT/UPDATE/DELETE/DDL) + /// Preview a write: run it in a transaction, report the effect, then ROLL BACK. + /// Nothing is changed unless you also pass --commit. #[arg(long)] allow_write: bool, + /// Actually apply a write (implies --allow-write). + #[arg(long)] + commit: bool, + /// Cap rows printed/returned for a SELECT (default 1000; 0 = uncapped). + #[arg(long, value_name = "N")] + limit: Option, + /// Skip the EXPLAIN cost check before running a write. + #[arg(long)] + no_cost_check: bool, }, /// Save and restore database state Snapshot { @@ -2206,12 +2216,17 @@ async fn run(cli: Cli, output: &Output) -> Result<()> { } } Commands::Sql { - command, + ref command, allow_write, + commit, + limit, + no_cost_check, } => { let config = Config::load(cli.config_path.as_deref()).context("Failed to load configuration")?; - // --allow-write implies --read-write (otherwise writes fail due to read-only URL) + // --commit implies --allow-write; either one needs read-write mode + // (otherwise writes fail due to a read-only URL). + let allow_write = allow_write || commit; let effective_read_write = cli.read_write || allow_write; let conn_result = connection::resolve_and_validate( &config, @@ -2222,14 +2237,17 @@ async fn run(cli: Cli, output: &Output) -> Result<()> { effective_read_write, cli.quiet, )?; - commands::sql( - &conn_result.url, - command.as_deref(), + let opts = commands::SqlOptions { allow_write, - cli.quiet, - cli.json, - ) - .await?; + commit, + limit, + no_cost_check, + cost_warn_threshold: config.sql_cost_warn_threshold(), + timeouts: parse_timeout_config(&cli)?, + quiet: cli.quiet, + json: cli.json, + }; + commands::sql(&conn_result.url, command.as_deref(), opts).await?; } Commands::Db { command } => { // db commands need database URL but not config diff --git a/tests/commands/sql.rs b/tests/commands/sql.rs index f63694b..1442adc 100644 --- a/tests/commands/sql.rs +++ b/tests/commands/sql.rs @@ -149,27 +149,438 @@ fn test_sql_blocks_write_by_default() { ); } +// ============================================================================ +// Dry-run by default (--allow-write previews and rolls back) +// ============================================================================ + +/// The core safety guarantee: `--allow-write` previews a destructive statement +/// inside a transaction and ROLLS BACK, so nothing changes. `--commit` applies. +#[test] +fn test_sql_allow_write_is_a_dry_run() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('seed@test.com', 'Original')"); + + // Dry-run an UPDATE. It should report what it would do but change nothing. + let output = project.run_pgcrate_ok(&[ + "sql", + "-c", + "UPDATE users SET name = 'Changed' WHERE email = 'seed@test.com'", + "--allow-write", + ]); + let out = stdout(&output); + assert!( + out.to_uppercase().contains("DRY RUN"), + "Dry run should be announced: {}", + out + ); + + // Provable rollback: the row is untouched. + let name = db.query("SELECT name FROM users WHERE email = 'seed@test.com'"); + assert_eq!( + name, "Original", + "Dry run must not change data — row should still be 'Original'" + ); +} + +/// `--commit` (which implies --allow-write) actually applies the change. +#[test] +fn test_sql_commit_applies_the_write() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('seed@test.com', 'Original')"); + + let output = project.run_pgcrate_ok(&[ + "sql", + "-c", + "UPDATE users SET name = 'Changed' WHERE email = 'seed@test.com'", + "--commit", + ]); + let out = stdout(&output); + assert!( + out.to_uppercase().contains("COMMITTED"), + "Commit should be announced: {}", + out + ); + + let name = db.query("SELECT name FROM users WHERE email = 'seed@test.com'"); + assert_eq!(name, "Changed", "Commit must apply the write"); +} + +/// The full provable-rollback contract in one test: dry-run leaves data +/// unchanged, then --commit changes it. +#[test] +fn test_sql_dry_run_then_commit() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('rollback@test.com', 'Before')"); + + // 1. Dry run — data unchanged. + project.run_pgcrate_ok(&[ + "sql", + "-c", + "UPDATE users SET name = 'After' WHERE email = 'rollback@test.com'", + "--allow-write", + ]); + assert_eq!( + db.query("SELECT name FROM users WHERE email = 'rollback@test.com'"), + "Before", + "After dry run the data must be unchanged" + ); + + // 2. Commit — data changed. + project.run_pgcrate_ok(&[ + "sql", + "-c", + "UPDATE users SET name = 'After' WHERE email = 'rollback@test.com'", + "--commit", + ]); + assert_eq!( + db.query("SELECT name FROM users WHERE email = 'rollback@test.com'"), + "After", + "After commit the data must be changed" + ); +} + +/// Dry-run JSON output reports the affected count and committed=false; a sample +/// of affected rows is included for a single DML statement. +#[test] +fn test_sql_dry_run_json_shape() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('json@test.com', 'Orig')"); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "UPDATE users SET name = 'New' WHERE email = 'json@test.com'", + "--allow-write", + "--json", + ]); + assert!(output.status.success(), "dry-run JSON should succeed"); + + let json = parse_json(&output); + let write = json.get("write").expect("write outcome present"); + assert_eq!( + write.get("committed").and_then(|v| v.as_bool()), + Some(false), + "dry run must report committed=false: {}", + json + ); + assert_eq!( + write.get("rows_affected").and_then(|v| v.as_u64()), + Some(1), + "dry run must report rows_affected=1: {}", + json + ); + + // A sample result with the would-be value should be present. + let has_sample = json + .get("results") + .and_then(|r| r.as_array()) + .map(|arr| { + arr.iter().any(|r| { + r.get("type").and_then(|t| t.as_str()) == Some("sample") + && serde_json::to_string(r).unwrap().contains("New") + }) + }) + .unwrap_or(false); + assert!(has_sample, "dry run should include a sample row: {}", json); + + // And the data is still unchanged. + assert_eq!( + db.query("SELECT name FROM users WHERE email = 'json@test.com'"), + "Orig" + ); +} + +/// The EXPLAIN cost estimate is surfaced in `--json` output (the `write.cost` +/// object) so JSON consumers get the same signal humans see on stderr. +#[test] +fn test_sql_dry_run_json_includes_cost() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('cost@test.com', 'Orig')"); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "UPDATE users SET name = 'New' WHERE email = 'cost@test.com'", + "--allow-write", + "--json", + ]); + assert!(output.status.success(), "dry-run JSON should succeed"); + + let json = parse_json(&output); + let cost = json + .get("write") + .and_then(|w| w.get("cost")) + .expect("write.cost present in JSON output"); + assert!( + cost.get("estimated").and_then(|v| v.as_f64()).is_some(), + "cost.estimated should be a number: {}", + json + ); + assert!( + cost.get("threshold").and_then(|v| v.as_f64()).is_some(), + "cost.threshold should be a number: {}", + json + ); + assert!( + cost.get("exceeds_threshold") + .and_then(|v| v.as_bool()) + .is_some(), + "cost.exceeds_threshold should be a bool: {}", + json + ); +} + +/// Non-transactional statements (CREATE INDEX CONCURRENTLY) cannot be previewed; +/// --allow-write alone must refuse with guidance to use --commit. +#[test] +fn test_sql_concurrent_index_requires_commit() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "CREATE INDEX CONCURRENTLY users_name_idx ON users (name)", + "--allow-write", + ]); + assert!( + !output.status.success(), + "CREATE INDEX CONCURRENTLY under --allow-write should be refused" + ); + let err = stderr(&output); + assert!( + err.contains("--commit"), + "Refusal should point at --commit: {}", + err + ); +} + +// ============================================================================ +// Bypass regression: writes hidden inside a Query/Explain must not slip through +// the dry-run guard and commit in autocommit. +// ============================================================================ + +/// A writable CTE (`WITH d AS (DELETE … RETURNING *) SELECT …`) parses as a +/// plain query. Under `--allow-write` it must still dry-run and roll back — +/// previously it executed in autocommit and committed the DELETE. #[test] -fn test_sql_allows_write_with_flag() { +fn test_sql_writable_cte_delete_dry_runs() { skip_if_no_db!(); let db = TestDatabase::new(); let project = TestProject::from_fixture("with_migrations", &db); project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('cte@test.com', 'Keep')"); - // INSERT with --allow-write should work - let _output = project.run_pgcrate_ok(&[ + let output = project.run_pgcrate(&[ "sql", "-c", - "INSERT INTO users (email, name) VALUES ('allowed@test.com', 'Allowed')", + "WITH d AS (DELETE FROM users WHERE email = 'cte@test.com' RETURNING *) SELECT * FROM d", "--allow-write", ]); + assert!( + output.status.success(), + "writable-CTE delete dry-run should succeed: {}", + stderr(&output) + ); + assert!( + stdout(&output).to_uppercase().contains("DRY RUN"), + "should be announced as a dry run: {}", + stdout(&output) + ); + + // The row must survive — the embedded DELETE was rolled back. + let count = db.query("SELECT count(*) FROM users WHERE email = 'cte@test.com'"); + assert_eq!( + count, "1", + "writable-CTE DELETE must roll back under --allow-write (row must survive)" + ); +} + +/// A writable CTE under `--commit` actually applies the embedded write. +#[test] +fn test_sql_writable_cte_delete_commits() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('cte2@test.com', 'Gone')"); + + project.run_pgcrate_ok(&[ + "sql", + "-c", + "WITH d AS (DELETE FROM users WHERE email = 'cte2@test.com' RETURNING *) SELECT * FROM d", + "--commit", + ]); + + let count = db.query("SELECT count(*) FROM users WHERE email = 'cte2@test.com'"); + assert_eq!(count, "0", "writable-CTE DELETE under --commit must apply"); +} - // Verify data was inserted - let check = db.query("SELECT email FROM users WHERE email = 'allowed@test.com'"); +/// `EXPLAIN ANALYZE ` executes the statement. Under `--allow-write` it +/// must dry-run and roll back rather than committing the side effect. +#[test] +fn test_sql_explain_analyze_write_dry_runs() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + db.run_sql_ok("INSERT INTO users (email, name) VALUES ('ea@test.com', 'Keep')"); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "EXPLAIN ANALYZE DELETE FROM users WHERE email = 'ea@test.com'", + "--allow-write", + ]); assert!( - check.contains("allowed@test.com"), - "Data should be inserted with --allow-write" + output.status.success(), + "EXPLAIN ANALYZE write dry-run should succeed: {}", + stderr(&output) + ); + + let count = db.query("SELECT count(*) FROM users WHERE email = 'ea@test.com'"); + assert_eq!( + count, "1", + "EXPLAIN ANALYZE DELETE must roll back under --allow-write (row must survive)" + ); +} + +/// `SELECT … INTO new_table` materializes a table — a write. Under +/// `--allow-write` the table must not persist (rolled back). +#[test] +fn test_sql_select_into_dry_runs() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + project.run_pgcrate_ok(&["migrate", "up"]); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "SELECT * INTO zz_select_into FROM users", + "--allow-write", + ]); + assert!( + output.status.success(), + "SELECT INTO dry-run should succeed: {}", + stderr(&output) + ); + + let exists = db.query("SELECT to_regclass('zz_select_into') IS NOT NULL"); + assert_eq!( + exists, "f", + "SELECT INTO must roll back under --allow-write (table must not persist)" + ); +} + +// ============================================================================ +// Row caps +// ============================================================================ + +/// SELECT output is capped and a trailer announces the withheld rows. +#[test] +fn test_sql_select_is_capped() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + let output = project.run_pgcrate_ok(&[ + "sql", + "-c", + "SELECT g FROM generate_series(1, 20) g", + "--limit", + "5", + ]); + let out = stdout(&output); + assert!( + out.contains("more row"), + "Capped output should show a '+N more rows' trailer: {}", + out + ); +} + +/// `--limit 0` uncaps the result set (no trailer). +#[test] +fn test_sql_limit_zero_uncaps() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + let output = project.run_pgcrate_ok(&[ + "sql", + "-c", + "SELECT g FROM generate_series(1, 20) g", + "--limit", + "0", + ]); + let out = stdout(&output); + assert!( + !out.contains("more row"), + "--limit 0 should not truncate: {}", + out + ); + // All 20 values present. + assert!(out.contains("20"), "All rows should be shown: {}", out); +} + +/// The row cap and its truncation count are reflected in JSON output too. +#[test] +fn test_sql_cap_in_json() { + skip_if_no_db!(); + let db = TestDatabase::new(); + let project = TestProject::from_fixture("with_migrations", &db); + + let output = project.run_pgcrate(&[ + "sql", + "-c", + "SELECT g FROM generate_series(1, 20) g", + "--limit", + "5", + "--json", + ]); + assert!(output.status.success()); + let json = parse_json(&output); + let truncated = json + .get("results") + .and_then(|r| r.as_array()) + .and_then(|arr| { + arr.iter() + .find(|r| r.get("type").and_then(|t| t.as_str()) == Some("query")) + }) + .and_then(|q| q.get("truncated")) + .and_then(|t| t.as_u64()); + assert_eq!( + truncated, + Some(15), + "JSON query result should report 15 truncated rows: {}", + json ); }