Skip to content

PostgreSQL: COMMENT ON CONSTRAINT / OPERATOR / RULE#2324

Closed
fmguerreiro wants to merge 78 commits into
apache:mainfrom
fmguerreiro:pgmold-297/comment-constraint-operator-rule
Closed

PostgreSQL: COMMENT ON CONSTRAINT / OPERATOR / RULE#2324
fmguerreiro wants to merge 78 commits into
apache:mainfrom
fmguerreiro:pgmold-297/comment-constraint-operator-rule

Conversation

@fmguerreiro

Copy link
Copy Markdown
Contributor

Summary

Adds parser + AST coverage for the three remaining COMMENT ON … shapes that the fork had been silently rejecting:

  • COMMENT ON CONSTRAINT name ON [DOMAIN] target IS '…'
  • COMMENT ON OPERATOR name (left, right) IS '…' — each side may be NONE for unary operators
  • COMMENT ON RULE name ON target IS '…'

Changes

  • CommentObject gains Constraint, Operator, Rule variants in keyword_str and from_keyword.
  • New CommentOperatorArgs { left: Option<DataType>, right: Option<DataType> } carries the operand signature for OPERATOR, with Display rendering NONE for unary slots.
  • Statement::Comment gains two fields: operator_args: Option<CommentOperatorArgs> and on_domain: bool (only meaningful for Constraint). Existing destructures will need to add the two fields.
  • parse_comment dispatches the new variants:
    • OPERATOR uses parse_operator_name() (so +, myschema.@@, etc. round-trip) and a mandatory (slot, slot) signature.
    • CONSTRAINT and RULE require an ON <relation> tail; CONSTRAINT additionally accepts ON DOMAIN <domain>.
  • Refactors parse_drop_operator_signature to share a new parse_operator_arg_type_or_none() helper with parse_comment.
  • Bumps fork to 0.63.0; adds changelog/0.63.0.md.

Breaking changes

Statement::Comment adds operator_args and on_domain fields. Consumers that destructure all fields explicitly must add them; those using .. rest patterns are unaffected. Documented in changelog/0.63.0.md.

Test plan

  • cargo test --package pgmold-sqlparser (CI)
  • New tests in tests/sqlparser_postgres.rs:
    • parse_comment_on_constraint_on_table (binary form + IS NULL + IF EXISTS)
    • parse_comment_on_constraint_on_domain (unqualified + schema-qualified domain)
    • parse_comment_on_constraint_requires_relation_tail (error case)
    • parse_comment_on_operator_binary (public.+(INTEGER, INTEGER))
    • parse_comment_on_operator_prefix_left_none (unary prefix, NONE left)
    • parse_comment_on_operator_postfix_right_none (unary postfix, NONE right)
    • parse_comment_on_operator_requires_argument_list (error case)
    • parse_comment_on_rule + IS NULL
    • parse_comment_on_rule_requires_relation_tail (error case)
  • Updated all existing Statement::Comment destructures in tests/sqlparser_postgres.rs to include the two new fields.

…R TABLE operations (#1)

PostgreSQL supports FORCE ROW LEVEL SECURITY and NO FORCE ROW LEVEL SECURITY
as ALTER TABLE operations. Add parsing support for both variants.
feat: parse EXCLUDE constraints for PostgreSQL
…m-main

# Conflicts:
#	Cargo.toml
#	src/ast/ddl.rs
#	src/ast/helpers/stmt_create_table.rs
#	src/ast/mod.rs
#	src/ast/spans.rs
#	src/ast/table_constraints.rs
#	src/parser/mod.rs
#	tests/sqlparser_postgres.rs
chore: merge upstream apache/datafusion-sqlparser-rs main
- Add HANDLER and VALIDATOR keywords to the keyword list
- Add FdwRoutineClause enum for HANDLER/NO HANDLER and VALIDATOR/NO VALIDATOR clauses
- Add CreateForeignDataWrapper struct for CREATE FOREIGN DATA WRAPPER
- Add CreateForeignTable struct for CREATE FOREIGN TABLE
- Export new types from ast::mod and add Statement variants
- Add spans.rs coverage returning Span::empty() for the new variants
- Dispatch on FOREIGN keyword in parse_create, branching on DATA WRAPPER or TABLE
- parse_create_foreign_data_wrapper: parses optional HANDLER/NO HANDLER,
  VALIDATOR/NO VALIDATOR, and OPTIONS clauses
- parse_create_foreign_table: parses IF NOT EXISTS, column list via parse_columns,
  required SERVER name, and optional OPTIONS clause
Round-trip tests via pg().verified_stmt for:
- FDW: name-only, HANDLER, NO HANDLER, NO VALIDATOR, combined HANDLER+VALIDATOR+OPTIONS
- FOREIGN TABLE: basic columns+SERVER, IF NOT EXISTS, table-level OPTIONS
Add Statement::CreateAggregate, CreateAggregate struct, CreateAggregateOption
enum, and AggregateModifyKind enum to represent PostgreSQL CREATE AGGREGATE
DDL. Options are stored as a typed enum covering all documented parameters
(SFUNC, STYPE, FINALFUNC, PARALLEL, moving-aggregate variants, etc.).
Wire AGGREGATE into the CREATE dispatch (before the or_replace error branch
so CREATE OR REPLACE AGGREGATE is accepted). parse_create_aggregate parses
the name, argument-type list, and the options block. Each recognised option
keyword dispatches to parse_create_aggregate_option which produces the typed
CreateAggregateOption variant.
Three tests covering: basic old-style aggregate (SFUNC/STYPE/FINALFUNC/INITCOND),
CREATE OR REPLACE with PARALLEL = SAFE, and moving-aggregate options
(MSFUNC/MINVFUNC/MSTYPE/MFINALFUNC_EXTRA/MFINALFUNC_MODIFY). All use
pg().verified_stmt() to assert parse-then-display round-trips identically.
feat: parse CREATE TEXT SEARCH CONFIGURATION/DICTIONARY/PARSER/TEMPLATE
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	src/parser/mod.rs
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	tests/sqlparser_postgres.rs
feat: parse CREATE STATISTICS, CREATE ACCESS METHOD, CREATE EVENT TRIGGER, CREATE TRANSFORM
# Conflicts:
#	src/ast/ddl.rs
#	src/ast/mod.rs
#	src/ast/spans.rs
#	src/parser/mod.rs
#	tests/sqlparser_postgres.rs
feat: parse CREATE CAST, CREATE CONVERSION, CREATE LANGUAGE, CREATE RULE
PRs #14, #15, and #16 each added new AST types that collide in the
same places in src/ast/ddl.rs, src/ast/mod.rs, src/parser/mod.rs,
and tests/sqlparser_postgres.rs. Several mid-match-arm and
mid-function-body concat resolutions during the three rebase-and-merge
passes left fn / impl / match blocks missing their closing braces
and left the ddl re-export list with duplicate entries.

This restores compile:
- Adds missing closing braces for CastFunctionKind, StatisticsKind,
  CreateRule / CreateStatistics / CreateEventTrigger / CreateTransform /
  CreateLanguage Display impls and parse fns, plus the CreateTransform
  From-to-Statement impl.
- Dedupes and alphabetizes the `pub use self::ddl::{ ... }` list in
  src/ast/mod.rs.
- Adds missing doc comments on CastFunctionKind fields and RuleEvent
  variants that #[warn(missing_docs)] requires.

Unblocks 0.60.10 publish.
fix: close delimiter gaps left by concat merges
PostgreSQL's pg_dump emits WITH NO DATA on unpopulated materialized views.
Add a with_data: Option<bool> field to CreateView and parse the clause
after the query body. Bump to 0.60.11.
This was an accidentally-committed gitlink (160000 mode) with no
corresponding entry in .gitmodules. It does not break git itself
but causes any cargo git-dep consumer to fail cloning the repo:

  failed to update submodule `.worktrees/fix-pr-2307-ci`
  no URL configured for submodule '.worktrees/fix-pr-2307-ci'

Same class of residue that 837b5a0 on feat/exclude-constraint-upstream
cleaned up, but that commit never reached main so the gitlink still
ships on every descendant branch.
feat(parser): parse WITH [NO] DATA on CREATE MATERIALIZED VIEW
Add AlterTableOperation::AttachPartitionOf and DetachPartitionOf variants
for PostgreSQL declarative partition DDL:

  ALTER TABLE parent ATTACH PARTITION child { FOR VALUES <spec> | DEFAULT }
  ALTER TABLE parent DETACH PARTITION child [ CONCURRENTLY | FINALIZE ]

These are distinct from the existing ClickHouse AttachPartition/DetachPartition
variants which use the PART|PARTITION <expr> syntax.

- Add FINALIZE keyword to the keyword list
- Add span impls for the new variants
- Add parser tests covering range, list, hash, default attach and
  plain, concurrently, finalize detach
- Bump version to 0.60.12
…h-partition

feat: parse ALTER TABLE ATTACH/DETACH PARTITION for PostgreSQL
FULLTEXT and SPATIAL are MySQL-specific table constraint keywords. When
parse_optional_table_constraint encountered them in a PostgreSQL context
it would consume the token, fall through to the _ arm, and call
prev_token() to restore it — but the consume-then-backtrack round-trip
left room for a subtle parser-state bug that caused the next call to
parse_data_type to see 'fulltext' instead of the actual column type.

Fix: add an explicit peek-based guard at the top of
parse_optional_table_constraint so that for dialects other than MySQL /
Generic the FULLTEXT / SPATIAL tokens are never consumed at all. The
caller (parse_columns) then correctly falls through to parse_column_def
and treats the word as an ordinary column identifier.

Also add a parser unit test covering a tsvector column literally named
'fulltext' (as in the pagila DVD-rental schema) and a GiST index that
references the same column by name.

Bump version to 0.60.13.
fix(parser): guard MySQL FULLTEXT/SPATIAL table constraint on dialect
…TE FUNCTION

TriggerExecBody previously reused FunctionDesc (and parse_function_desc),
which parsed arguments as CREATE-FUNCTION-style parameter declarations
(argname argtype DEFAULT expr). Any trigger that passes literal arguments
such as tsvector_update_trigger('fulltext', 'pg_catalog.english', ...) would
fail with "Expected: a data type name, found: 'fulltext'".

Replace the func_desc: FunctionDesc field with func_name: ObjectName and
args: Option<Vec<Expr>>, and replace the parse_function_desc() call in
parse_trigger_exec_body with parse_object_name() + parse_comma_separated
(Parser::parse_expr). This treats args as call-site expressions, matching
PostgreSQL semantics. The None/Some(vec![]) distinction preserves whether
parentheses were written at all.

Bump version to 0.60.14.
fix(parser): accept call-site expression args in CREATE TRIGGER EXECUTE FUNCTION
…est) (#23)

* chore(tests): restore green main; fix clippy and keyword sort regressions

`cargo test --all-features` and `cargo clippy --all-targets --all-features
-- -D warnings` were both failing on `main`; CI never ran on the fork to
catch it. This commit makes them green so subsequent PRs can be
verified.

Test compile errors:
- `Statement::AlterTable` was refactored from struct to tuple form
  carrying `AlterTable`. Update 10 ATTACH/DETACH PARTITION test sites
  in `sqlparser_postgres.rs` to match the new shape.
- `CreateView` gained a `with_data` field (added with the
  `WITH [NO] DATA` parser support). Add `with_data: _,` to 6 destructure
  sites in `sqlparser_common.rs`.

Test runtime failures:
- `all_keywords_sorted`: `CONFIGURATION`, `INLINE`, and `TRANSFORM` were
  inserted out of ASCII order in `keywords.rs`; move each to its
  alphabetical slot.
- `alter_procedure_rename` / `alter_procedure_set_search_path` /
  `parse_fulltext_column_and_index_in_postgres`: `verified_stmt` round-trip
  fails because Display canonicalises type names and access methods to
  uppercase. Update the input SQL to match.

Lib clippy regressions (forbidden via `#![forbid(clippy::unreachable)]`
and `-D warnings`):
- `parse_create_aggregate_options` PARALLEL arm: replace `_ => unreachable!()`
  with an explicit `Internal parser error` so `clippy::unreachable` is
  satisfied.
- `parse_create_aggregate_args`: drop the let-and-return.
- `parse_create_rule` INSTEAD/ALSO: collapse the always-false else-if
  branches into a single optional-keyword consume.

Doctest:
- `Statement::CreateStatistics` doc block was missing its opening
  ```sql fence.

Spans:
- Drop the duplicate `ForceRowLevelSecurity` / `NoForceRowLevelSecurity`
  arms in `AlterTableOperation::span()` (warnings, but `-D warnings`
  blocks clippy).

* chore(ci): fix lint, no-std, RAT, and benchmark-lint regressions

CI on `main` was failing five jobs (test/lint/compile-no-std/RAT/
benchmark-lint). The first commit on this branch fixed test compile +
runtime + a few clippy and doctest issues found locally. CI surfaced
the rest. This commit covers them.

`lint` (clippy 1.95 — 8 collapsible_match errors)
- Fold inner `if` blocks into match guards in `parse_sql` (END token
  delimiter), `parse_compound_field` (period-qualified identifier), the
  DO/loop sequence terminator, and the five `HiveRowFormat::DELIMITED`
  arms (FIELDS / COLLECTION / MAP / LINES / NULL DEFINED AS).

`compile-no-std` (`thumbv6m-none-eabi`)
- `src/ast/table_constraints.rs` introduced an EXCLUDE-constraint type
  with `pub operator: String` but did not import `String` in the
  `#[cfg(not(feature = "std"))]` alloc block. Add `string::String`.

`Release Audit Tool (RAT)`
- Add `CLAUDE.md` to `dev/release/rat_exclude_files.txt` next to the
  existing `AGENTS.md` exclusion. Both are agent-guideline files that
  do not need an Apache header.

`benchmark-lint`
- `sqlparser_bench/Cargo.toml` referenced the path-dep as `sqlparser`
  but the fork's package was renamed to `pgmold-sqlparser`. Add the
  `package = "pgmold-sqlparser"` rename so the bench resolves.

Verified locally on rustc 1.95.0:
  cargo fmt --all -- --check
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features
  cargo check --no-default-features --target thumbv6m-none-eabi
  (cd sqlparser_bench && cargo clippy --all-targets --all-features -- -D warnings)
…AIN (#24)

`GRANT USAGE ON TYPE foo TO role` and `GRANT ON DOMAIN d TO role` are
valid PostgreSQL grammar but failed to parse — `parse_grant_target`
had no `TYPE` or `DOMAIN` keyword, and `GrantObjects` lacked variants
to carry them.

Add `GrantObjects::Types(Vec<ObjectName>)` and
`GrantObjects::Domains(Vec<ObjectName>)`, along with their `Display`
arms, the `Keyword::TYPE` / `Keyword::DOMAIN` arms in
`parse_grant_deny_revoke_privileges_objects`, and round-trip tests
in `parse_grant`.

Verified:
  cargo fmt --all -- --check
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features parse_grant
…-quoted bodies (#25)

* PostgreSQL: extend COMMENT ON for trigger/aggregate/policy and dollar-quoted bodies

Adds Trigger, Aggregate, and Policy variants to CommentObject and extends
Statement::Comment with optional `arguments: Option<Vec<DataType>>` (for
FUNCTION/PROCEDURE/AGGREGATE overload signatures) and `relation: Option<ObjectName>`
(for the `ON <table>` tail of TRIGGER and POLICY). Parser now accepts:

- COMMENT ON AGGREGATE name(arg, ...) IS '...'
- COMMENT ON TRIGGER name ON table IS '...'
- COMMENT ON POLICY name ON table IS '...'
- COMMENT ON FUNCTION/PROCEDURE name(arg, ...) IS '...'

parse_literal_string now accepts Token::DollarQuotedString in the
PostgreSQL/Generic dialects, enabling COMMENT ON ... IS $$body$$ and
$tag$body$tag$.

NOTE: CommentObject and Statement::Comment are not #[non_exhaustive], so
new variants and new struct fields are a breaking change for downstream
consumers that pattern-match exhaustively. Recommend bumping to 0.61.0
for the next release rather than a 0.60.x patch.

* PostgreSQL: require argument list for COMMENT ON AGGREGATE, escape quotes in Display

- Reject bare `COMMENT ON AGGREGATE foo IS '...'`; PostgreSQL requires an
  argument list as part of the aggregate's identity (distinguishes ordinary
  vs ordered-set forms). FUNCTION and PROCEDURE remain tolerant of the bare
  form for backward compatibility.
- Escape embedded single quotes in the Statement::Comment Display impl via
  value::escape_single_quote_string. This was latent but became reachable
  with dollar-quoted parsing (e.g. $$it's$$ previously round-tripped to
  invalid SQL `'it's'`).
- Add negative parser test for the AGGREGATE guard and a round-trip test
  for a dollar-quoted body containing a single quote.
…nt::AlterDefaultPrivileges (#26)

* PostgreSQL: AlterTypeOperation owner/schema/attribute ops and Statement::AlterDefaultPrivileges

Postgres rejects `ALTER TYPE ... OWNER TO ...`, `ALTER TYPE ... SET SCHEMA ...`,
`ALTER TYPE ... { ADD | DROP | ALTER | RENAME } ATTRIBUTE ...`, and every
`ALTER DEFAULT PRIVILEGES ...` form because the AST has no shape for them. This
forces downstreams (e.g. pgmold) to strip those statements before parsing.

`AlterTypeOperation` gains six variants (`OwnerTo`, `SetSchema`,
`AddAttribute`, `DropAttribute`, `AlterAttribute`, `RenameAttribute`). The
parser branches accept the matching grammar including optional `IF [NOT]
EXISTS`, `COLLATE`, and `CASCADE | RESTRICT`.

A new `Statement::AlterDefaultPrivileges` carries `for_roles`, `in_schemas`,
and an `AlterDefaultPrivilegesAction { Grant | Revoke }` body that reuses
`Privileges` and `Grantee`. The `ON <kind>` target uses a new
`AlterDefaultPrivilegesObjectType` enum (`TABLES | SEQUENCES | FUNCTIONS |
ROUTINES | TYPES | SCHEMAS`). New keywords: `ATTRIBUTE`, `ROUTINES`, `TYPES`.
The `parse_alter` dispatch now accepts `DEFAULT` as the leading word.

Note: both `AlterTypeOperation` and `Statement` are not `non_exhaustive`, so
adding variants is technically a breaking change for downstream consumers
that match exhaustively without `_`. Recommend bumping the next release to
0.61.0.

Verified:
  cargo fmt --all
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test --all-features
  cargo check --no-default-features --target thumbv6m-none-eabi
  (cd sqlparser_bench && cargo clippy --all-targets --all-features -- -D warnings)

* PostgreSQL: fix ALTER TYPE ADD ATTRIBUTE grammar, Grantee PUBLIC Display

Review fixes for PR #26:

1. Remove IF NOT EXISTS from AlterTypeOperation::AddAttribute. PostgreSQL's
   grammar for ADD ATTRIBUTE does not include IF NOT EXISTS; only ADD VALUE
   (for enum types) does. The parser was accepting SQL that PostgreSQL itself
   rejects.

2. Fix pre-existing trailing-space bug in Grantee Display: GranteesType::Public
   wrote "PUBLIC " (with trailing space) then fell through to the name-formatting
   path, yielding "PUBLIC " for bare PUBLIC grantees. Early-return with "PUBLIC"
   instead. Adds regression tests for GRANT ... TO PUBLIC and ALTER DEFAULT
   PRIVILEGES ... TO PUBLIC round-trips.

3. Document that AlterDefaultPrivileges::for_roles is keyword-agnostic: the
   parser accepts both FOR ROLE and FOR USER (PG synonyms) but Display
   canonicalises to FOR ROLE.
…rgname/VARIADIC (#28)

Postgres permits argmode (IN/OUT/INOUT/VARIADIC) and argname tokens
ahead of each argument type in COMMENT ON FUNCTION / PROCEDURE /
AGGREGATE; they are ignored for identity resolution but appear in
real-world dumps and consumer schema files. The previous parser used
bare parse_data_type for the argument list, which hard-failed on those
shapes.

Switch the function-like arms of parse_comment to parse_function_arg
and project to data_type, since Statement::Comment.arguments stores a
plain Vec<DataType> and the modes/names are intentionally discarded
for identity matching.

Add tests covering argname, IN/OUT/INOUT, VARIADIC, and AGGREGATE.
…30)

* PostgreSQL: code-quality cleanup for COMMENT, AlterType, AlterDefaultPrivileges

- CommentObject: add `keyword_str` and `from_keyword` helpers; collapse the
  18-arm `parse_comment` match into a single lookup, with `MATERIALIZED VIEW`
  preserved as the only multi-keyword case. Reuse `keyword_str` for `Display`.
- Statement::Comment Display: drop the stray `;` after `IF EXISTS` write.
- ALTER TYPE ... ALTER ATTRIBUTE: comment the `let _ = parse_keywords([SET,
  DATA])` so it reads as canonicalization rather than a swallowed result.
- AlterDefaultPrivilegesAction Display: extract a shared `write_privileges_body`
  helper for the GRANT/REVOKE body shape.
- AlterDefaultPrivileges: remove the redundant `Spanned` impl that returned
  `Span::empty()`; the `Statement::AlterDefaultPrivileges` arm in `spans.rs`
  already covers it.
- AlterTypeOperation attribute Display arms: use `display_option_spaced` for
  the trailing `drop_behavior`, dropping four near-identical `if let Some` blocks.
- tests: drop a duplicate empty-arg `verified_stmt` between
  `parse_comment_on_aggregate` and `parse_comment_on_function_with_arg_types`.

No behavior change.

* fix: drop spurious import; display_option_spaced lives in ddl.rs
Aligns the partner-table field name with sibling AST nodes
(AlterPolicy.table_name, CreatePolicy.table_name, DropPolicy.table_name,
AlterTrigger.table_name). Breaking change on the public AST.

Pure mechanical rename; no behavior change. Display impl, parser
constructor, and existing COMMENT ON TRIGGER / POLICY / AGGREGATE /
FUNCTION / COLLATION tests updated to match.
Minor bump per the fork convention that any AST change warrants 0.(N+1).
Includes the breaking rename of Statement::Comment.relation -> table_name
(#31) plus #28 and #30 as fork-side work since the 0.61.0 release.
Adds parser + AST coverage for the three remaining COMMENT ON shapes
that pgmold-sqlparser had been silently rejecting:

- `COMMENT ON CONSTRAINT name ON [DOMAIN] target IS '…'`
- `COMMENT ON OPERATOR name (left, right) IS '…'` (each side may be NONE)
- `COMMENT ON RULE name ON target IS '…'`

`Statement::Comment` gains two fields: `operator_args:
Option<CommentOperatorArgs>` (operand types for OPERATOR, modeled as
`Option<DataType>` per side so unary `NONE` is preserved), and
`on_domain: bool` (set when CONSTRAINT uses the `ON DOMAIN <domain>`
form). The existing `arguments` and `table_name` fields keep their
prior semantics for FUNCTION/PROCEDURE/AGGREGATE and TRIGGER/POLICY.

Refactors `parse_drop_operator_signature` to share the new
`parse_operator_arg_type_or_none` helper with `parse_comment`, since
both parse the `{ DataType | NONE }` slot shape.

Bumps fork to 0.63.0 and adds changelog/0.63.0.md.
@fmguerreiro

Copy link
Copy Markdown
Contributor Author

Created against the wrong remote — closing. This is fork-only work targeting fmguerreiro/datafusion-sqlparser-rs (the pgmold-sqlparser publishing fork).

@fmguerreiro fmguerreiro deleted the pgmold-297/comment-constraint-operator-rule branch April 27, 2026 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant