Skip to content

PostgreSQL: cleanup CommentObject, AlterType, AlterDefaultPrivileges#2323

Closed
fmguerreiro wants to merge 75 commits into
apache:mainfrom
fmguerreiro:cleanup/comment-altertype-altdef
Closed

PostgreSQL: cleanup CommentObject, AlterType, AlterDefaultPrivileges#2323
fmguerreiro wants to merge 75 commits into
apache:mainfrom
fmguerreiro:cleanup/comment-altertype-altdef

Conversation

@fmguerreiro

Copy link
Copy Markdown
Contributor

Summary

  • CommentObject: add keyword_str and from_keyword helpers, collapse the 18-arm parse_comment match into a single lookup. MATERIALIZED VIEW stays inline as the only multi-keyword case. Display now reuses keyword_str.
  • Statement::Comment Display: drop a stray ; after the IF EXISTS write.
  • parse_alter_type (ALTER ATTRIBUTE): annotate let _ = parse_keywords(&[SET, DATA]) so it reads as canonicalization rather than a swallowed result.
  • AlterDefaultPrivilegesAction Display: factor a write_privileges_body helper for the <privs> ON <obj_type> {TO|FROM} <grantees> body shared by Grant and Revoke.
  • AlterDefaultPrivileges: drop the redundant Spanned impl that returned Span::empty(); the Statement::AlterDefaultPrivileges arm in spans.rs already covers the same fall-back.
  • AlterTypeOperation Attribute Display arms: use the existing display_option_spaced for the trailing drop_behavior, replacing four near-identical if let Some(...) blocks.
  • tests: drop one duplicate empty-arg verified_stmt (COMMENT ON FUNCTION public.noop()); the no_args() line in parse_comment_on_aggregate already exercises the same Some(vec![]) path.

No behavior change. -23 LOC net.

Test plan

  • cargo test (CI)
  • cargo fmt --all --check (clean)

…R TABLE operations (#1)

PostgreSQL supports FORCE ROW LEVEL SECURITY and NO FORCE ROW LEVEL SECURITY
as ALTER TABLE operations. Add parsing support for both variants.
feat: parse EXCLUDE constraints for PostgreSQL
…m-main

# Conflicts:
#	Cargo.toml
#	src/ast/ddl.rs
#	src/ast/helpers/stmt_create_table.rs
#	src/ast/mod.rs
#	src/ast/spans.rs
#	src/ast/table_constraints.rs
#	src/parser/mod.rs
#	tests/sqlparser_postgres.rs
chore: merge upstream apache/datafusion-sqlparser-rs main
- Add HANDLER and VALIDATOR keywords to the keyword list
- Add FdwRoutineClause enum for HANDLER/NO HANDLER and VALIDATOR/NO VALIDATOR clauses
- Add CreateForeignDataWrapper struct for CREATE FOREIGN DATA WRAPPER
- Add CreateForeignTable struct for CREATE FOREIGN TABLE
- Export new types from ast::mod and add Statement variants
- Add spans.rs coverage returning Span::empty() for the new variants
- Dispatch on FOREIGN keyword in parse_create, branching on DATA WRAPPER or TABLE
- parse_create_foreign_data_wrapper: parses optional HANDLER/NO HANDLER,
  VALIDATOR/NO VALIDATOR, and OPTIONS clauses
- parse_create_foreign_table: parses IF NOT EXISTS, column list via parse_columns,
  required SERVER name, and optional OPTIONS clause
Round-trip tests via pg().verified_stmt for:
- FDW: name-only, HANDLER, NO HANDLER, NO VALIDATOR, combined HANDLER+VALIDATOR+OPTIONS
- FOREIGN TABLE: basic columns+SERVER, IF NOT EXISTS, table-level OPTIONS
Add Statement::CreateAggregate, CreateAggregate struct, CreateAggregateOption
enum, and AggregateModifyKind enum to represent PostgreSQL CREATE AGGREGATE
DDL. Options are stored as a typed enum covering all documented parameters
(SFUNC, STYPE, FINALFUNC, PARALLEL, moving-aggregate variants, etc.).
Wire AGGREGATE into the CREATE dispatch (before the or_replace error branch
so CREATE OR REPLACE AGGREGATE is accepted). parse_create_aggregate parses
the name, argument-type list, and the options block. Each recognised option
keyword dispatches to parse_create_aggregate_option which produces the typed
CreateAggregateOption variant.
Three tests covering: basic old-style aggregate (SFUNC/STYPE/FINALFUNC/INITCOND),
CREATE OR REPLACE with PARALLEL = SAFE, and moving-aggregate options
(MSFUNC/MINVFUNC/MSTYPE/MFINALFUNC_EXTRA/MFINALFUNC_MODIFY). All use
pg().verified_stmt() to assert parse-then-display round-trips identically.
feat: parse CREATE TEXT SEARCH CONFIGURATION/DICTIONARY/PARSER/TEMPLATE
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	src/parser/mod.rs
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	tests/sqlparser_postgres.rs
…le AST nodes

Add AST structs and Display impls for four PostgreSQL catalog DDL
statements. Each follows the established fork pattern: struct in ddl.rs
with Debug/Clone/PartialEq/etc derives, Statement enum variant, Display
arm, From impl, and Span::empty() arm in spans.rs.

New keywords: ALSO, ASSIGNMENT, CONVERSION, IMPLICIT, INLINE,
PROCEDURAL, TRUSTED.

Closes pgmold-89, pgmold-91, pgmold-92, pgmold-93.
feat: parse SECURITY LABEL, CREATE USER MAPPING, CREATE TABLESPACE
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	src/ast/spans.rs
#	src/parser/mod.rs
#	tests/sqlparser_postgres.rs
feat: parse CREATE STATISTICS, CREATE ACCESS METHOD, CREATE EVENT TRIGGER, CREATE TRANSFORM
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	src/ast/spans.rs
#	src/parser/mod.rs
#	tests/sqlparser_postgres.rs
feat: parse CREATE CAST, CREATE CONVERSION, CREATE LANGUAGE, CREATE RULE
PRs #14, #15, and #16 each added new AST types that collide in the
same places in src/ast/ddl.rs, src/ast/mod.rs, src/parser/mod.rs,
and tests/sqlparser_postgres.rs. Several mid-match-arm and
mid-function-body concat resolutions during the three rebase-and-merge
passes left fn / impl / match blocks missing their closing braces
and left the ddl re-export list with duplicate entries.

This restores compile:
- Adds missing closing braces for CastFunctionKind, StatisticsKind,
  CreateRule / CreateStatistics / CreateEventTrigger / CreateTransform /
  CreateLanguage Display impls and parse fns, plus the CreateTransform
  From-to-Statement impl.
- Dedupes and alphabetizes the `pub use self::ddl::{ ... }` list in
  src/ast/mod.rs.
- Adds missing doc comments on CastFunctionKind fields and RuleEvent
  variants that #[warn(missing_docs)] requires.

Unblocks 0.60.10 publish.
fix: close delimiter gaps left by concat merges
PostgreSQL's pg_dump emits WITH NO DATA on unpopulated materialized views.
Add a with_data: Option<bool> field to CreateView and parse the clause
after the query body. Bump to 0.60.11.
This was an accidentally-committed gitlink (160000 mode) with no
corresponding entry in .gitmodules. It does not break git itself
but causes any cargo git-dep consumer to fail cloning the repo:

  failed to update submodule `.worktrees/fix-pr-2307-ci`
  no URL configured for submodule '.worktrees/fix-pr-2307-ci'

Same class of residue that 837b5a0 on feat/exclude-constraint-upstream
cleaned up, but that commit never reached main so the gitlink still
ships on every descendant branch.
feat(parser): parse WITH [NO] DATA on CREATE MATERIALIZED VIEW
Add AlterTableOperation::AttachPartitionOf and DetachPartitionOf variants
for PostgreSQL declarative partition DDL:

  ALTER TABLE parent ATTACH PARTITION child { FOR VALUES <spec> | DEFAULT }
  ALTER TABLE parent DETACH PARTITION child [ CONCURRENTLY | FINALIZE ]

These are distinct from the existing ClickHouse AttachPartition/DetachPartition
variants which use the PART|PARTITION <expr> syntax.

- Add FINALIZE keyword to the keyword list
- Add span impls for the new variants
- Add parser tests covering range, list, hash, default attach and
  plain, concurrently, finalize detach
- Bump version to 0.60.12
…h-partition

feat: parse ALTER TABLE ATTACH/DETACH PARTITION for PostgreSQL
FULLTEXT and SPATIAL are MySQL-specific table constraint keywords. When
parse_optional_table_constraint encountered them in a PostgreSQL context
it would consume the token, fall through to the _ arm, and call
prev_token() to restore it — but the consume-then-backtrack round-trip
left room for a subtle parser-state bug that caused the next call to
parse_data_type to see 'fulltext' instead of the actual column type.

Fix: add an explicit peek-based guard at the top of
parse_optional_table_constraint so that for dialects other than MySQL /
Generic the FULLTEXT / SPATIAL tokens are never consumed at all. The
caller (parse_columns) then correctly falls through to parse_column_def
and treats the word as an ordinary column identifier.

Also add a parser unit test covering a tsvector column literally named
'fulltext' (as in the pagila DVD-rental schema) and a GiST index that
references the same column by name.

Bump version to 0.60.13.
fix(parser): guard MySQL FULLTEXT/SPATIAL table constraint on dialect
…TE FUNCTION

TriggerExecBody previously reused FunctionDesc (and parse_function_desc),
which parsed arguments as CREATE-FUNCTION-style parameter declarations
(argname argtype DEFAULT expr). Any trigger that passes literal arguments
such as tsvector_update_trigger('fulltext', 'pg_catalog.english', ...) would
fail with "Expected: a data type name, found: 'fulltext'".

Replace the func_desc: FunctionDesc field with func_name: ObjectName and
args: Option<Vec<Expr>>, and replace the parse_function_desc() call in
parse_trigger_exec_body with parse_object_name() + parse_comma_separated
(Parser::parse_expr). This treats args as call-site expressions, matching
PostgreSQL semantics. The None/Some(vec![]) distinction preserves whether
parentheses were written at all.

Bump version to 0.60.14.
fix(parser): accept call-site expression args in CREATE TRIGGER EXECUTE FUNCTION
…est) (#23)

* chore(tests): restore green main; fix clippy and keyword sort regressions

`cargo test --all-features` and `cargo clippy --all-targets --all-features
-- -D warnings` were both failing on `main`; CI never ran on the fork to
catch it. This commit makes them green so subsequent PRs can be
verified.

Test compile errors:
- `Statement::AlterTable` was refactored from struct to tuple form
  carrying `AlterTable`. Update 10 ATTACH/DETACH PARTITION test sites
  in `sqlparser_postgres.rs` to match the new shape.
- `CreateView` gained a `with_data` field (added with the
  `WITH [NO] DATA` parser support). Add `with_data: _,` to 6 destructure
  sites in `sqlparser_common.rs`.

Test runtime failures:
- `all_keywords_sorted`: `CONFIGURATION`, `INLINE`, and `TRANSFORM` were
  inserted out of ASCII order in `keywords.rs`; move each to its
  alphabetical slot.
- `alter_procedure_rename` / `alter_procedure_set_search_path` /
  `parse_fulltext_column_and_index_in_postgres`: `verified_stmt` round-trip
  fails because Display canonicalises type names and access methods to
  uppercase. Update the input SQL to match.

Lib clippy regressions (forbidden via `#![forbid(clippy::unreachable)]`
and `-D warnings`):
- `parse_create_aggregate_options` PARALLEL arm: replace `_ => unreachable!()`
  with an explicit `Internal parser error` so `clippy::unreachable` is
  satisfied.
- `parse_create_aggregate_args`: drop the let-and-return.
- `parse_create_rule` INSTEAD/ALSO: collapse the always-false else-if
  branches into a single optional-keyword consume.

Doctest:
- `Statement::CreateStatistics` doc block was missing its opening
  ```sql fence.

Spans:
- Drop the duplicate `ForceRowLevelSecurity` / `NoForceRowLevelSecurity`
  arms in `AlterTableOperation::span()` (warnings, but `-D warnings`
  blocks clippy).

* chore(ci): fix lint, no-std, RAT, and benchmark-lint regressions

CI on `main` was failing five jobs (test/lint/compile-no-std/RAT/
benchmark-lint). The first commit on this branch fixed test compile +
runtime + a few clippy and doctest issues found locally. CI surfaced
the rest. This commit covers them.

`lint` (clippy 1.95 — 8 collapsible_match errors)
- Fold inner `if` blocks into match guards in `parse_sql` (END token
  delimiter), `parse_compound_field` (period-qualified identifier), the
  DO/loop sequence terminator, and the five `HiveRowFormat::DELIMITED`
  arms (FIELDS / COLLECTION / MAP / LINES / NULL DEFINED AS).

`compile-no-std` (`thumbv6m-none-eabi`)
- `src/ast/table_constraints.rs` introduced an EXCLUDE-constraint type
  with `pub operator: String` but did not import `String` in the
  `#[cfg(not(feature = "std"))]` alloc block. Add `string::String`.

`Release Audit Tool (RAT)`
- Add `CLAUDE.md` to `dev/release/rat_exclude_files.txt` next to the
  existing `AGENTS.md` exclusion. Both are agent-guideline files that
  do not need an Apache header.

`benchmark-lint`
- `sqlparser_bench/Cargo.toml` referenced the path-dep as `sqlparser`
  but the fork's package was renamed to `pgmold-sqlparser`. Add the
  `package = "pgmold-sqlparser"` rename so the bench resolves.

Verified locally on rustc 1.95.0:
  cargo fmt --all -- --check
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features
  cargo check --no-default-features --target thumbv6m-none-eabi
  (cd sqlparser_bench && cargo clippy --all-targets --all-features -- -D warnings)
…AIN (#24)

`GRANT USAGE ON TYPE foo TO role` and `GRANT ON DOMAIN d TO role` are
valid PostgreSQL grammar but failed to parse — `parse_grant_target`
had no `TYPE` or `DOMAIN` keyword, and `GrantObjects` lacked variants
to carry them.

Add `GrantObjects::Types(Vec<ObjectName>)` and
`GrantObjects::Domains(Vec<ObjectName>)`, along with their `Display`
arms, the `Keyword::TYPE` / `Keyword::DOMAIN` arms in
`parse_grant_deny_revoke_privileges_objects`, and round-trip tests
in `parse_grant`.

Verified:
  cargo fmt --all -- --check
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features parse_grant
…-quoted bodies (#25)

* PostgreSQL: extend COMMENT ON for trigger/aggregate/policy and dollar-quoted bodies

Adds Trigger, Aggregate, and Policy variants to CommentObject and extends
Statement::Comment with optional `arguments: Option<Vec<DataType>>` (for
FUNCTION/PROCEDURE/AGGREGATE overload signatures) and `relation: Option<ObjectName>`
(for the `ON <table>` tail of TRIGGER and POLICY). Parser now accepts:

- COMMENT ON AGGREGATE name(arg, ...) IS '...'
- COMMENT ON TRIGGER name ON table IS '...'
- COMMENT ON POLICY name ON table IS '...'
- COMMENT ON FUNCTION/PROCEDURE name(arg, ...) IS '...'

parse_literal_string now accepts Token::DollarQuotedString in the
PostgreSQL/Generic dialects, enabling COMMENT ON ... IS $$body$$ and
$tag$body$tag$.

NOTE: CommentObject and Statement::Comment are not #[non_exhaustive], so
new variants and new struct fields are a breaking change for downstream
consumers that pattern-match exhaustively. Recommend bumping to 0.61.0
for the next release rather than a 0.60.x patch.

* PostgreSQL: require argument list for COMMENT ON AGGREGATE, escape quotes in Display

- Reject bare `COMMENT ON AGGREGATE foo IS '...'`; PostgreSQL requires an
  argument list as part of the aggregate's identity (distinguishes ordinary
  vs ordered-set forms). FUNCTION and PROCEDURE remain tolerant of the bare
  form for backward compatibility.
- Escape embedded single quotes in the Statement::Comment Display impl via
  value::escape_single_quote_string. This was latent but became reachable
  with dollar-quoted parsing (e.g. $$it's$$ previously round-tripped to
  invalid SQL `'it's'`).
- Add negative parser test for the AGGREGATE guard and a round-trip test
  for a dollar-quoted body containing a single quote.
…nt::AlterDefaultPrivileges (#26)

* PostgreSQL: AlterTypeOperation owner/schema/attribute ops and Statement::AlterDefaultPrivileges

Postgres rejects `ALTER TYPE ... OWNER TO ...`, `ALTER TYPE ... SET SCHEMA ...`,
`ALTER TYPE ... { ADD | DROP | ALTER | RENAME } ATTRIBUTE ...`, and every
`ALTER DEFAULT PRIVILEGES ...` form because the AST has no shape for them. This
forces downstreams (e.g. pgmold) to strip those statements before parsing.

`AlterTypeOperation` gains six variants (`OwnerTo`, `SetSchema`,
`AddAttribute`, `DropAttribute`, `AlterAttribute`, `RenameAttribute`). The
parser branches accept the matching grammar including optional `IF [NOT]
EXISTS`, `COLLATE`, and `CASCADE | RESTRICT`.

A new `Statement::AlterDefaultPrivileges` carries `for_roles`, `in_schemas`,
and an `AlterDefaultPrivilegesAction { Grant | Revoke }` body that reuses
`Privileges` and `Grantee`. The `ON <kind>` target uses a new
`AlterDefaultPrivilegesObjectType` enum (`TABLES | SEQUENCES | FUNCTIONS |
ROUTINES | TYPES | SCHEMAS`). New keywords: `ATTRIBUTE`, `ROUTINES`, `TYPES`.
The `parse_alter` dispatch now accepts `DEFAULT` as the leading word.

Note: both `AlterTypeOperation` and `Statement` are not `non_exhaustive`, so
adding variants is technically a breaking change for downstream consumers
that match exhaustively without `_`. Recommend bumping the next release to
0.61.0.

Verified:
  cargo fmt --all
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features
  cargo check --no-default-features --target thumbv6m-none-eabi
  (cd sqlparser_bench && cargo clippy --all-targets --all-features -- -D warnings)

* PostgreSQL: fix ALTER TYPE ADD ATTRIBUTE grammar, Grantee PUBLIC Display

Review fixes for PR #26:

1. Remove IF NOT EXISTS from AlterTypeOperation::AddAttribute. PostgreSQL's
   grammar for ADD ATTRIBUTE does not include IF NOT EXISTS; only ADD VALUE
   (for enum types) does. The parser was accepting SQL that PostgreSQL itself
   rejects.

2. Fix pre-existing trailing-space bug in Grantee Display: GranteesType::Public
   wrote "PUBLIC " (with trailing space) then fell through to the name-formatting
   path, yielding "PUBLIC " for bare PUBLIC grantees. Early-return with "PUBLIC"
   instead. Adds regression tests for GRANT ... TO PUBLIC and ALTER DEFAULT
   PRIVILEGES ... TO PUBLIC round-trips.

3. Document that AlterDefaultPrivileges::for_roles is keyword-agnostic: the
   parser accepts both FOR ROLE and FOR USER (PG synonyms) but Display
   canonicalises to FOR ROLE.
…rgname/VARIADIC (#28)

Postgres permits argmode (IN/OUT/INOUT/VARIADIC) and argname tokens
ahead of each argument type in COMMENT ON FUNCTION / PROCEDURE /
AGGREGATE; they are ignored for identity resolution but appear in
real-world dumps and consumer schema files. The previous parser used
bare parse_data_type for the argument list, which hard-failed on those
shapes.

Switch the function-like arms of parse_comment to parse_function_arg
and project to data_type, since Statement::Comment.arguments stores a
plain Vec<DataType> and the modes/names are intentionally discarded
for identity matching.

Add tests covering argname, IN/OUT/INOUT, VARIADIC, and AGGREGATE.
…Privileges

- CommentObject: add `keyword_str` and `from_keyword` helpers; collapse the
  18-arm `parse_comment` match into a single lookup, with `MATERIALIZED VIEW`
  preserved as the only multi-keyword case. Reuse `keyword_str` for `Display`.
- Statement::Comment Display: drop the stray `;` after `IF EXISTS` write.
- ALTER TYPE ... ALTER ATTRIBUTE: comment the `let _ = parse_keywords([SET,
  DATA])` so it reads as canonicalization rather than a swallowed result.
- AlterDefaultPrivilegesAction Display: extract a shared `write_privileges_body`
  helper for the GRANT/REVOKE body shape.
- AlterDefaultPrivileges: remove the redundant `Spanned` impl that returned
  `Span::empty()`; the `Statement::AlterDefaultPrivileges` arm in `spans.rs`
  already covers it.
- AlterTypeOperation attribute Display arms: use `display_option_spaced` for
  the trailing `drop_behavior`, dropping four near-identical `if let Some` blocks.
- tests: drop a duplicate empty-arg `verified_stmt` between
  `parse_comment_on_aggregate` and `parse_comment_on_function_with_arg_types`.

No behavior change.
@fmguerreiro

Copy link
Copy Markdown
Contributor Author

Apologies — opened against the wrong repository. This is a downstream cleanup for the pgmold-sqlparser fork (fmguerreiro/datafusion-sqlparser-rs); will reopen there. Sorry for the noise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant