SQL: Query topics #574
✅ Deploy Preview for rp-cloud ready!
4d7b551 to 20ad041
20ad041 to ddefdad
Renames modules/sql/pages/query/ to modules/sql/pages/query-data/ and renames the streaming-topic how-to from query-redpanda-topics.adoc to query-streaming-topics.adoc to match the SQL GA IA.

Retitles the page "Query streaming topics" and reframes the description and learning objectives around live streaming data; bridge-query and Iceberg content stays out of this page (DOC-2006 owns the Iceberg-topics how-to). Adds a pointer to the Iceberg topics how-to under the intro and lists it under Next steps. Updates the enable-prereq xref to point to the Enable Redpanda SQL page. Drops the CREATE REDPANDA CATALOG link from Next steps to align with the v1 framing that users do not typically create their own Redpanda catalog.

Reframes the Query data index page description for v1 Iceberg scope (live and historical data in Redpanda topics; no external Iceberg lakehouse).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # modules/sql/pages/query-data/redpanda-catalogs.adoc
This reverts commit 522ad59.
75ae890 to 48ead8c
Feediver1
left a comment
PR Review: SQL: Query topics (#574)
Files reviewed: 4 .adoc files (109 additions / 94 deletions)
Overall assessment: Solid documentation structure and content. Same integration-branch xref challenges as #571 — six unresolved cross-PR xrefs. One nav-linked stub page with no body. No What's New entry. A couple of em dashes that violate the style guide.
What this PR does
Expands Redpanda SQL query documentation on the rp-sql integration branch:
- `modules/sql/pages/query-data/index.adoc` (new, 3 lines): section index for "Query Data".
- `modules/sql/pages/query-data/query-streaming-topics.adoc` (new, 80 lines): how-to for mapping a topic to a SQL table and running analytical queries.
- `modules/sql/pages/query-data/redpanda-catalogs.adoc` (1+ / 80−): heavily reduced from a full reference to a 1-line stub.
- `modules/reference/pages/sql/sql-statements/create-table.adoc` (25+ / 14−): updated reference. `schema_subject` now required, expanded `struct_mapping_policy` (with cyclic-type guidance), new `confluent_wire_protocol` option, three full examples.
Jira ticket alignment
Ticket: DOC-1990 — "Document feature query Redpanda topics" (extracted from branch name).
Status: The PR delivers the planned query how-to and refreshes the CREATE TABLE reference. The stubbed redpanda-catalogs.adoc is mentioned in the PR description as "likely to be reworked" — worth confirming what the eventual replacement plan is before the integration branch lands.
Critical issues (must fix)
1. Six broken xrefs to pages that aren't on `rp-sql` or in this branch:

   | File:line | xref target | Provided by |
   |---|---|---|
   | `query-streaming-topics.adoc:10` | `sql:query-data/query-iceberg-topics.adoc` | PR #575 (still OPEN) |
   | `query-streaming-topics.adoc:23` | `sql:get-started/deploy-sql-cluster.adoc` | PR #571 (still OPEN) |
   | `query-streaming-topics.adoc:24` | `sql:manage/manage-access.adoc` | PR #580 (still OPEN) |
   | `query-streaming-topics.adoc:25` | `sql:get-started/sql-quickstart.adoc` | PR #571 (still OPEN) |
   | `query-streaming-topics.adoc:50` | `sql:query-data/query-nested-fields.adoc` | No known PR provides this. Confirm it's planned, or remove the reference |
   | `query-streaming-topics.adoc:77` | `sql:query-data/query-iceberg-topics.adoc` (Next steps) | PR #575 (still OPEN) |

   - Fix: Coordinate merge ordering. All sibling PRs need to land on `rp-sql` before `rp-sql` lands on `main`, otherwise the build will surface six `target of xref not found` errors. Specifically check on `query-nested-fields.adoc`: if no PR is in flight for it, the inline reference at line 50 should be removed for now.
2. `redpanda-catalogs.adoc` is a 1-line stub but `nav.adoc:355` links to it as "Redpanda Catalogs". Users clicking that nav entry hit an empty page. The PR description acknowledges this is intentional ("likely to be reworked"), but a nav-linked empty page is bad UX.
   - Fix: Either (a) put a 2–3 sentence placeholder with a "Coming soon, see [other page]" pointer, (b) leave the original content until the replacement lands and gut it in a later PR, or (c) remove the line from `nav.adoc:355` and re-add it when the page has content.
3. Missing What's New entry. Same gap as #571: the May 2026 section of `whats-new-cloud.adoc` has no entry for the Redpanda SQL query workflow. Since this is GA documentation, a coordinated What's New entry should cover both PRs (and the broader SQL GA story across #571 / #575 / #580).
   - Fix: Add a single "Redpanda SQL: General availability" entry under `== May 2026` that covers the get-started + query + auth pages together, rather than fragmenting into per-PR entries.
4. Em dashes in `create-table.adoc` (style guide says no em dashes):
   - Line 7: "`CREATE TABLE` in Redpanda SQL maps Redpanda topics to SQL tables — it does not create standalone tables with user-defined schemas."
   - Line 56: "Cyclic types are not supported in `COMPOUND` mode — use `JSON` for recursive schemas."
   - Fix: Replace both em dashes with a period and a new sentence or a colon, or restructure the clause. Example for line 56: "Cyclic types are not supported in `COMPOUND` mode. Use `JSON` for recursive schemas."
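The xref audit in item 1 can be approximated locally before the sibling PRs land. The following is a minimal sketch, not the docs build's actual resolver: it assumes Antora-style `xref:<module>:<path>.adoc[...]` macros, assumes each module's pages live under `modules/<module>/pages/`, and ignores component or version prefixes. The function names `find_xrefs` and `missing_targets` are hypothetical.

```python
import re
from pathlib import Path

# Matches xref:<module>:<path>.adoc, an optional #anchor, then the opening "[".
XREF_RE = re.compile(r"xref:([A-Za-z0-9_-]+):([^\[\]#\s]+\.adoc)(?:#[^\[\]\s]*)?\[")


def find_xrefs(text):
    """Return (module, page_path) pairs for every module-qualified xref in text."""
    return XREF_RE.findall(text)


def missing_targets(repo_root, source_file):
    """List xref targets in source_file that do not exist under repo_root.

    Assumes the layout modules/<module>/pages/<page_path>.
    """
    root = Path(repo_root)
    text = Path(source_file).read_text(encoding="utf-8")
    missing = []
    for module, page in find_xrefs(text):
        if not (root / "modules" / module / "pages" / page).is_file():
            missing.append(f"{module}:{page}")
    return missing
```

Running `missing_targets` against a checkout of the integration branch would list targets that only exist in unmerged sibling branches, which is the situation described in the table above.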
Suggestions (should consider)
1. Page-title case mismatch on the index. `query-data/index.adoc:1` has `= Query data` (sentence case), but `nav.adoc:354` labels it as "Query Data" (title case). Per team convention, page titles use title case to match the nav label.
   - Current: `= Query data`
   - Suggested: `= Query Data`
2. Stub page comment. The 1-line `redpanda-catalogs.adoc` uses `// stub` as the only body marker. If you keep the stub approach, consider a more user-facing placeholder (e.g., a NOTE block or an xref to the related how-to) so the rendered page isn't blank.
3. Checkboxes in the PR body are all empty. Tick the relevant one ("New feature" or "Content gap") for tracking.
Impact on other files
- `modules/ROOT/nav.adoc` ✓: new pages already in nav at lines 354–357, including the (still-missing) `query-iceberg-topics.adoc` entry at line 357, consistent with the rp-sql integration plan.
- `modules/get-started/pages/whats-new-cloud.adoc` ❌: no SQL GA entry (Critical #3).
- Cross-component xrefs verified:
  - `xref:reference:sql/sql-statements/create-table.adoc` ✓
  - `xref:reference:sql/index.adoc` ✓
  - `xref:reference:sql/sql-data-types/row.adoc` (in `create-table.adoc:56`): exists in `rp-sql` ✓
  - `xref:reference:sql/sql-statements/create-redpanda-catalog.adoc` (in `create-table.adoc:7`): exists in `rp-sql` ✓
  - `xref:sql:connect-to-sql/index.adoc` ✓
  - All other `xref:sql:*` xrefs: listed as broken in Critical #1.
- Sibling PR dependencies: #571 (deploy + quickstart), #575 (query-iceberg), #580 (manage-access), plus the unknown source for `query-nested-fields.adoc`.
CodeRabbit findings worth considering
None. CodeRabbit's check passed with no review summary or actionable comments.
What works well
- Clean module layout: index + how-to + reference, all in the right places.
- Comprehensive prerequisites section lists exactly what a reader needs before they can succeed: SQL engine enabled, RBAC permission, psql connection, registered Schema Registry schema.
- Real-world SQL examples beyond toy `SELECT *`: aggregation with `GROUP BY`, `ORDER BY`, `WHERE` filters, `LIMIT`.
- CREATE TABLE reference is thorough: required/optional column in the options table, three full examples (basic, multi-message Protobuf, error handling) covering distinct use cases.
- Frontmatter compliance: `:page-topic-type: how-to` for the how-to, `:page-topic-type: reference` for the reference, learning objectives observable and measurable, personas correctly scoped (`app_developer`, `data_engineer`: query-side audience, not platform admins).
- Sentence case correct on every H2+ heading in the new content.
- Source-block syntax is consistent with the rest of the SQL module (long-form `[source,sql]`, matching the convention used in `get-started/*.adoc`).
- `schema_subject` is now correctly marked Required in the reference table, addressing the schema-required guidance that was unclear before.
- Helpful guidance on cyclic types in `struct_mapping_policy`: clearly tells users to switch to `JSON` mode for recursive schemas.
- `confluent_wire_protocol` option fully documented with defaults and when to use each value.
- CI is fully green and Netlify preview links cover the two main new pages.
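
For readers unfamiliar with the framing that the `confluent_wire_protocol` option name suggests, here is a minimal sketch of the Confluent Schema Registry wire format: one magic byte (`0`), a 4-byte big-endian schema ID, then the serialized payload. That the option refers to this framing is an assumption on my part, and the helper name `split_confluent_frame` is hypothetical; this is not how Redpanda SQL implements it.

```python
import struct

MAGIC_BYTE = 0  # first byte of every Confluent-framed record value


def split_confluent_frame(record_value):
    """Return (schema_id, payload) from a Confluent-framed record value."""
    if len(record_value) < 5:
        raise ValueError("record too short for a Confluent wire-format header")
    # ">BI" = big-endian, 1 unsigned byte (magic) + 4-byte unsigned int (schema ID).
    magic, schema_id = struct.unpack(">BI", record_value[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, record_value[5:]
```

For Protobuf values, the bytes after the schema ID also begin with message-index information before the message itself; the sketch leaves that part of the payload untouched.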
Final-pass review via /docs-team-standards:pr-review.
@kbatuigas Ping me again after you get your SME approvals and I can do a more thorough review.

Map a Redpanda topic to a SQL table to run analytical queries directly against live streaming data without building ETL pipelines. Redpanda SQL reads each record's fields from the topic's registered schema.

To extend queries past your Redpanda retention window by reading the Iceberg history of Iceberg-enabled topics, see xref:sql:query-data/query-iceberg-topics.adoc[Query Iceberg-enabled Topics].

|STRING
|No
|Schema Registry subject name to use for deserializing topic data.
|Yes
There was a problem hiding this comment.
Sorry, I may have misled you: this is not required in code for the GA. If not provided, we default to TopicNameStrategy, so `<topic>-value`.
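
As a small illustration of the fallback described in this comment, the sketch below resolves a subject name for a topic's record values. It only models the naming rule stated here (`<topic>-value` when no subject is given); the function names are hypothetical and not part of Redpanda SQL.

```python
def default_value_subject(topic):
    """Subject name for a topic's record values under TopicNameStrategy."""
    return f"{topic}-value"


def resolve_schema_subject(topic, schema_subject=None):
    """Use the explicit subject when given, else fall back to the default."""
    return schema_subject if schema_subject else default_value_subject(topic)
```

For a topic named `orders` with no explicit subject, the resolved subject is `orders-value`.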
There was a problem hiding this comment.
I wonder where we should mention the existence of the `redpanda` and `redpanda_raw` structs. The first is the Iceberg equivalent, so all the partition, offset, etc. properties are there. The second is the DLQ equivalent, filled when the FILL_NULL error policy is set.
There was a problem hiding this comment.
@JacekGalazka1 do those only exist for Iceberg topics? There is mention of them in this other doc specifically for querying Iceberg https://github.com/redpanda-data/cloud-docs/pull/575/changes#diff-3ab2a15f947f028cb3f75cdb5184029657557cac26b1c961cb27c72554ba3533R83 Is redpanda_raw populated only when FILL_NULL is set?
There was a problem hiding this comment.
They are always added to each Kafka reader, so both pure Kafka and Iceberg-backed ones will have them.
`redpanda_raw` is populated only when FILL_NULL is set, and only for records that failed to decode. In all other cases it's NULL.
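
The rule in this reply can be modeled in a few lines. This is a toy sketch only: JSON stands in for the real schema-based decoder, `redpanda_raw` is simplified to the raw bytes rather than a struct, and raising an error when FILL_NULL is not set is an assumption (the thread does not say what happens in that case). The function name `read_row` is hypothetical.

```python
import json


def read_row(payload, error_policy=None):
    """Model one decoded row; JSON stands in for the real schema-based decoder."""
    try:
        fields = json.loads(payload)
        if not isinstance(fields, dict):
            raise ValueError("record value is not an object")
    except ValueError:
        if error_policy != "FILL_NULL":
            raise  # assumption: without FILL_NULL the failure surfaces as an error
        # Decoding failed under FILL_NULL: keep the raw bytes for inspection.
        return {"fields": None, "redpanda_raw": payload}
    # Successful decode: redpanda_raw stays NULL (None here).
    return {"fields": fields, "redpanda_raw": None}
```

The point of the model is that `redpanda_raw` is non-NULL in exactly one case: FILL_NULL is set and the record failed to decode.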
There was a problem hiding this comment.
@JacekGalazka1 Added to CREATE TABLE: https://github.com/redpanda-data/cloud-docs/pull/574/changes#diff-4bce4f4f19fcf950f1034b3f3186de2e99bdfc3ba7d66276569d458e13dcec5eR76
and brief description on this page: https://github.com/redpanda-data/cloud-docs/pull/574/changes#diff-b3c7d2ada14afb120454cfcdba2a29f97af19440772908571bfc21ebd02f344fR55
There is a most unfortunate wrap on the Required column head in the table here: https://deploy-preview-574--rp-cloud.netlify.app/redpanda-cloud/reference/sql/sql-statements/create-table/#options. Any way you can fix this?
Feediver1
left a comment
PR Review: SQL: Query topics (#574) — re-review
Files reviewed: 4 .adoc files (184 additions / 95 deletions — significantly more content than yesterday, mostly in create-table.adoc)
Overall assessment: Substantial improvements since my earlier review on 2026-05-21. Em-dashes removed, page-title casing fixed, stub page now clearly tagged for replacement, and a new "Auto-added columns" reference section documents the redpanda / redpanda_raw metadata columns per @JacekGalazka1's request. Two engineer SMEs (@mattschumpert, @JacekGalazka1) have APPROVED. Critical xref-to-sibling-PR issues are unchanged; What's New entry still missing.
What's changed since my earlier review
| Commit | Date | Change |
|---|---|---|
| `39cae297` | 2026-05-22 18:12 | "Address review comments": applies @JacekGalazka1's correction that `schema_subject` is not required (defaults to topic-name strategy). |
| `b8d94b3b` | 2026-05-22 18:32 | "Add info on redpanda and redpanda_raw structs": adds a full "Auto-added columns" H2 + two H3 sub-sections in `create-table.adoc` (~63 new lines), plus a short note + xref in `query-streaming-topics.adoc`. |
| `ff560eb1` | 2026-05-22 18:45 | "Review pass": final polish across `create-table.adoc` and the stub `redpanda-catalogs.adoc` (which now has a clear TODO referencing the DOC-2049 / PR #573 future-home). |
Review state changes since yesterday:
- ✅ @mattschumpert APPROVED (00:35 UTC today)
- ✅ @JacekGalazka1 (Jacek, SME) APPROVED (14:29 UTC today)
- `mergeStateStatus`: CLEAN (was BLOCKED)
- `reviewDecision`: APPROVED
Jira ticket alignment
Ticket: DOC-1990 — "Document feature query Redpanda topics."
Status: Unchanged from earlier. ✅ Satisfies the ticket. New redpanda / redpanda_raw documentation enriches the page beyond the original ticket scope (in a good way).
Critical issues (must fix)
1. Six broken xrefs (carried over from my earlier review, most still unresolved):

   | File:line | xref target | Provided by |
   |---|---|---|
   | `query-streaming-topics.adoc:10`, `:85` | `sql:query-data/query-iceberg-topics.adoc` | PR #575 (still OPEN) |
   | `query-streaming-topics.adoc:21` | `sql:get-started/deploy-sql-cluster.adoc` | PR #571 (still OPEN) |
   | `query-streaming-topics.adoc:22` | `sql:manage/manage-access.adoc` | PR #580 (still OPEN) |
   | `query-streaming-topics.adoc:23` | `sql:get-started/sql-quickstart.adoc` | PR #571 (still OPEN) |
   | `query-streaming-topics.adoc:53` | `sql:query-data/query-nested-fields.adoc` | No known PR provides this; still no source identified |

   Five of six resolve once siblings #571 / #575 / #580 land. The sixth (`query-nested-fields.adoc`) remains orphaned and is the one item that warrants direct action: confirm the page is planned, or drop the inline xref.
2. Missing What's New entry: still missing. Same recommendation as before: a single coordinated "Redpanda SQL: General availability" entry in `modules/get-started/pages/whats-new-cloud.adoc` covering #571 / #574 / #575 / #580 / #584. None of the five SQL GA PRs add it; whichever lands last should also land the What's New entry.
Suggestions (should consider)
None new. Yesterday's suggestions are all resolved:
- ✅ Em-dashes in `create-table.adoc` (was lines 7, 56): removed.
- ✅ H1 case on `index.adoc`: now `= Query Data` (title case).
- ✅ Stub page comment: `redpanda-catalogs.adoc` now reads:

      = Redpanda Catalogs
      // TODO: Full content rewrite lives on the DOC-2049 branch (PR #573).
      // Replace this stub when DOC-2049 merges into rp-sql.

  Clear pointer to PR #573 instead of a bare `// stub`. Reader still hits an empty page if they click the nav entry, so it is worth deciding whether to keep the nav entry until PR #573 merges, or temporarily remove it. Either is defensible.
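
A quick way to re-verify the em-dash fix on future revisions is a line scan. This is a minimal sketch with a hypothetical helper name (`find_em_dashes`); it checks only for the em dash character (U+2014) and does not cover any other style-guide rule.

```python
EM_DASH = "\u2014"


def find_em_dashes(text):
    """Return (line_number, line) pairs for lines containing an em dash."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]
```

Feeding it the contents of `create-table.adoc` should return an empty list once the fixes from the earlier review are in place.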
Impact on other files
- `modules/ROOT/nav.adoc` ✓: entries unchanged from earlier review (still at lines 354–357).
- `modules/get-started/pages/whats-new-cloud.adoc` ❌: still no SQL GA entry (Critical #2).
- Cross-references inside the diff: the new xref from `query-streaming-topics.adoc:60` to `xref:reference:sql/sql-statements/create-table.adoc#auto-added-columns[Auto-added columns]` resolves. Verified that the `[#auto-added-columns]` anchor is correctly placed on the new H2 in `create-table.adoc:76`.
- Cross-page consistency: the `schema_subject` is-required-or-not story is now consistent across `create-table.adoc:32–35` (Required: No, defaults to topic-name strategy) and `query-streaming-topics.adoc` (no mention of explicitly setting it in the basic CREATE TABLE example). Matches what @JacekGalazka1 confirmed about the GA behavior.
- Sibling PR dependencies (unchanged): #571 / #575 / #580 / #573 / #584, same set as before.
CodeRabbit findings worth considering
None. CodeRabbit's check passed with no actionable findings on the current state.
Outstanding review activity — status
- @JacekGalazka1 APPROVED (2026-05-22 14:29): engineer SME, after his inline review on `redpanda` / `redpanda_raw` was answered by the `b8d94b3b` commit. His thread closed with "Perfect. Good to go".
- @mattschumpert APPROVED (2026-05-22 00:35).
- `reviewDecision`: APPROVED. No outstanding `CHANGES_REQUESTED`.
What works well
- New "Auto-added columns" reference section in `create-table.adoc` is well-structured: parent H2 explains the contract (always present on every row; names are reserved), then ``=== `redpanda` `` (always present, Kafka metadata struct) and ``=== `redpanda_raw` `` (populated only on FILL_NULL deserialization failures) each get their own H3 with description and field tables. The dead-letter pattern explanation for `redpanda_raw` ("rows whose value fails schema deserialization remain queryable, with the malformed payload preserved for inspection or reprocessing") is a real conceptual win for users.
- Schema-subject correction is accurate. Now matches what @JacekGalazka1 confirmed about the GA behavior (TopicNameStrategy default).
- Cross-page consistency between `query-streaming-topics.adoc` and `create-table.adoc` on the metadata columns: short summary on the how-to page with an anchored xref to the full reference.
- Stub page is now properly tagged. The TODO points at the specific PR (#573) and ticket (DOC-2049) that will replace the stub.
- Style holds up: all H2+ headings in the diff are sentence case; H3 code-identifier headings (``=== `redpanda` ``) are appropriately formatted; no em-dashes anywhere; source blocks all use the established `[source,sql]` convention.
- Two engineer SME approvals plus full CI green.
Re-review via /docs-team-standards:pr-review.
Feediver1
left a comment
Approving with the understanding that all the related tickets will resolve critical issues, and 597 adds What's New.
Description
This pull request updates and expands the Redpanda SQL documentation to clarify table mapping, schema requirements, and streaming topic queries. It refines the `CREATE TABLE` reference, introduces new how-to guides for querying topics, and streamlines catalog documentation.

Documentation improvements for querying and mapping topics:
- [...] (`query-streaming-topics.adoc`) that walks users through mapping a Redpanda topic to a SQL table and running analytical queries directly on live data. This guide covers prerequisites, table creation, querying, and links to further resources.
- [...] (`query-data/index.adoc`) to provide an entry point for users learning to query Redpanda topics with SQL.

Enhancements and clarifications in the SQL reference:
- [...] `CREATE TABLE` documentation to clarify that `schema_subject` is required and that Redpanda SQL needs a schema to query a topic. Improved the explanation of `struct_mapping_policy`, especially regarding handling of nested and recursive types, and added documentation for the `confluent_wire_protocol` option.

Catalog documentation simplification:
Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 21 May
Page previews
Checks