[Draft] Branch catalog spi #64304
Open
morningman wants to merge 7 commits into
run buildall
TPC-H: Total hot run time: 29382 ms
TPC-DS: Total hot run time: 169289 ms
FE UT Coverage Report: Increment line coverage
FE Regression Coverage Report: Increment line coverage
b8d6426 to
f09b6df
This multi-month refactor needs persistent state for progress, decisions, risks, and cross-session agent handoff.

Establishes a file-based tracking system consisting of a dashboard, an ADR decision log, a deviation log, a risk register, per-stage task files, per-connector tracking, and an agent collaboration playbook covering context budget / subagent usage / handoff norms.

Closes 18 design decisions (D-001..D-018) and registers 14 risks (R-001..R-014).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…T27) (#63582)

## Summary

Lands the P0 SPI baseline for the catalog-SPI migration (master plan §3.1 / RFC §2.1), with zero impact on the already-migrated JDBC + ES connectors.

- **Batch 0** (commits 1-2): SPI types + fe-core bridges: `ConnectorMetaInvalidator`, `ConnectorTransaction`, `ConnectorMvccSnapshot`, `ExternalMetaCacheInvalidator`, `ConnectorMvccSnapshotAdapter`, `PluginDrivenTransactionManager` generalization.
- **Batch 1** (commit 3): DDL + Partition SPI: `ConnectorCreateTableRequest` + 4 spec POJOs, 4 new defaults on `ConnectorTableOps`, 3 new fields on `ConnectorPartitionInfo`, fe-core converter, `PluginDrivenExternalCatalog.createTable` routing.
- **Batch 2** (commit 4): Import-gate + unit tests: `tools/check-connector-imports.sh` wired through exec-maven-plugin; `FakeConnectorPlugin` covering every default fall-through; routing tests for the invalidator; converter tests for all 4 partition styles + 2 bucket flavors.

## Commits

- `[feat](connector) add P0 batch 0 SPI baseline: MetaInvalidator / Transaction / MvccSnapshot` (T03-T08)
- `[feat](connector) wire P0 batch 0 SPI into fe-core` (T09-T12)
- `[feat](connector) add P0 batch 1 SPI: CreateTableRequest + listPartitions` (T13-T20)
- `[feat](connector) add P0 batch 2 gate + unit tests` (T21-T23, T26-T27)

## Test plan

- [x] `mvn -pl fe-connector/fe-connector-api,fe-connector/fe-connector-spi -am compile`: SPI modules compile
- [x] `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false`: fe-core compile
- [x] `mvn -pl fe-core checkstyle:check`: 0 violations
- [x] `mvn -pl fe-connector validate`: import gate runs and passes (baseline clean)
- [x] `mvn -pl fe-core -am test -Dtest='FakeConnectorPluginTest,ExternalMetaCacheInvalidatorTest,CreateTableInfoToConnectorRequestConverterTest,ConnectorPluginManagerTest,ConnectorSessionImplTest'`: 39/39 green
- [x] `mvn -pl fe-connector/fe-connector-jdbc,fe-connector/fe-connector-es -am compile`: downstream connectors compile unchanged
- [ ] JDBC regression-test suite (T24): to be exercised by this PR's CI pipeline
- [ ] ES regression-test suite (T25): to be exercised by this PR's CI pipeline

## Tracking

Full plan, decisions, and risk log live under `plan-doc/` in the repo (introduced by 6315983, already on the base branch). Per-task status: `plan-doc/tasks/P0-spi-foundation.md`.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
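
The "default fall-through" that `FakeConnectorPlugin` is described as covering can be illustrated with a minimal sketch. Everything here is hypothetical: `TableOps`, `FakeOps`, and the method signatures are stand-ins invented for illustration and are not the real `ConnectorTableOps` API. The idea is only that an SPI interface ships `default` methods, so a connector that overrides nothing still compiles and gets a well-defined behavior.

```java
import java.util.Collections;
import java.util.List;

public class SpiDefaultsSketch {
    /** Hypothetical SPI interface; NOT the real ConnectorTableOps signatures. */
    interface TableOps {
        // Assumed default: unsupported DDL fails loudly.
        default void createTable(String db, String table) {
            throw new UnsupportedOperationException("createTable is not supported");
        }

        // Assumed default: a connector without partition support reports none.
        default List<String> listPartitions(String db, String table) {
            return Collections.emptyList();
        }
    }

    /** A fake connector that overrides nothing, so every call hits a default. */
    static final class FakeOps implements TableOps { }

    /** Returns true when createTable falls through to the throwing default. */
    static boolean createTableFallsThrough(TableOps ops) {
        try {
            ops.createTable("db", "t");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TableOps ops = new FakeOps();
        System.out.println(createTableFallsThrough(ops));
        System.out.println(ops.listPartitions("db", "t").isEmpty());
    }
}
```

A test over such a fake pins down the default behavior, so connectors that already implement the interface keep compiling when new defaults are added.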
…63641)

## Summary

P1 batch A: close out scan-node SPI consolidation while keeping migration-period fallbacks in place. Three surgical changes route `PluginDrivenExternalTable` first in the nereids translator hot paths so already-migrated SPI connectors (JDBC, ES) take the SPI route, while the existing `instanceof XExternalTable` chains remain as fallbacks for connectors still pending migration (P3–P7).

- **T3** (`PhysicalPlanTranslator.visitPhysicalFileScan`): move the existing `PluginDrivenExternalTable` branch from position 8 to position 1; the 7 connector-specific branches (HMS / Iceberg / Paimon / Trino / MaxCompute / LakeSoul / RemoteDoris) stay in place as migration-period fallbacks.
- **T4** (`PhysicalPlanTranslator.visitPhysicalHudiScan`): add a `PluginDrivenExternalTable` branch routed to `PluginDrivenScanNode.create(...)`, threading `tableSnapshot` + `scanParams` through `FileQueryScanNode` setters; `incrementalRelation` is flagged as a P3 Hudi SPI extension TODO. The new branch is unreachable today (`PhysicalHudiScan` is only built for `HMSExternalTable + DLAType.HUDI`), so this is groundwork for P3 with zero current-day runtime impact.
- **T5** (`LogicalFileScan`): in `computeOutput()`, add a `PluginDrivenExternalTable` branch calling the new helper `computePluginDrivenOutput()`, which has the same shape as `computeIcebergOutput` and uses `getFullSchema()` + virtualColumns; in `supportPruneNestedColumn()`, add an explicit `PluginDrivenExternalTable → false` branch. Both are behaviorally equivalent for JDBC/ES today, since those connectors have no hidden columns and no virtualColumns.

P1 batch B (T1: delete 13 legacy `Jdbc*Client` + `JdbcFieldSchema`) is deferred to P8 because the 3 fe-core callers (`PostgresResourceValidator`, `StreamingJobUtils`, `CdcStreamTableValuedFunction`) are live CDC streaming code that requires SPI extension for `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables`, which is out of P1 surgical scope.

Background and tracking docs live in `plan-doc/` (Master Plan §3.2 P1, tasks/P1-scan-node-cleanup.md, decisions log).

## Test plan

- [x] `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` → BUILD SUCCESS
- [x] `mvn -pl fe-core checkstyle:check` → 0 violations
- [x] JDBC + ES regression-test passing; baseline established in P0 / PR #63582
- [ ] PR CI green on this PR
- [ ] Manual scan-node smoke for an SPI connector: JDBC `SELECT *` should fall into the new `PluginDrivenExternalTable` branch first

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
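
The "SPI branch first, legacy branches as fallbacks" routing described for T3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the table classes and the returned scan-node names are hypothetical stand-ins, not the actual types or logic in `PhysicalPlanTranslator`.

```java
public class ScanDispatchSketch {
    /** Hypothetical marker interface standing in for an external table. */
    interface Table { }

    /** Stand-in for a table served by an SPI connector (e.g. JDBC, ES). */
    static final class PluginDrivenTable implements Table { }

    /** Stand-ins for connectors that are still pending migration. */
    static final class LegacyHmsTable implements Table { }
    static final class LegacyIcebergTable implements Table { }

    /** Picks a scan-node kind; the SPI branch is checked before any legacy branch. */
    static String chooseScanNode(Table table) {
        if (table instanceof PluginDrivenTable) {
            return "PluginDrivenScanNode";      // position 1: SPI route
        } else if (table instanceof LegacyHmsTable) {
            return "LegacyHmsScanNode";         // migration-period fallback
        } else if (table instanceof LegacyIcebergTable) {
            return "LegacyIcebergScanNode";     // migration-period fallback
        }
        throw new IllegalArgumentException(
                "Unsupported table type: " + table.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        System.out.println(chooseScanNode(new PluginDrivenTable()));
        System.out.println(chooseScanNode(new LegacyHmsTable()));
    }
}
```

In this shape, migrating a connector means its tables become plugin-driven instances and its legacy branch can be deleted later without touching the first branch.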
…#64096)

### What problem does this PR solve?

Related PR: #63582 (P0, SPI baseline), #63641 (P1, nereids plugin-driven routing)

Problem Summary:

This is **P2** of the catalog SPI migration and targets the `branch-catalog-spi` feature branch (continuing P0 #63582 and P1 #63641). It fully migrates `trino-connector` off the legacy in-tree `fe-core/datasource/trinoconnector/` implementation and onto the connector SPI module `fe-connector-trino`, making `trino-connector` the first connector to complete the SPI consumption playbook that later connectors will reuse as a template. All five batches land together so there is no intermediate state where a newly-created trino catalog cannot be serialized.

**Batch A: complete the SPI surface (`fe-connector-trino` only, no fe-core changes)**

- `TrinoConnectorProvider.validateProperties`: enforce the required `trino.connector.name` property at `CREATE CATALOG` time (ported from the legacy `checkProperties`).
- `TrinoDorisConnector.preCreateValidation`: call `ensureInitialized()` so plugin loading + connector-factory resolution happen at catalog creation instead of being deferred to the first `SELECT`.
- `TrinoConnectorDorisMetadata.applyFilter` / `applyProjection`: bridge Trino native filter/projection pushdown, reusing `TrinoPredicateConverter` to translate a Doris `ConnectorExpression` into a Trino `TupleDomain`. `remainingFilter` is conservatively returned as the original expression to match legacy behavior (conjuncts are not stripped; BE re-evaluates them).

**Batch B: fe-core bridge for image compatibility**

- `GsonUtils`: atomically replace the three legacy `registerSubtype` entries (`TrinoConnectorExternalCatalog` / `Database` / `Table`) with `registerCompatibleSubtype` redirects onto the `PluginDrivenExternal*` hierarchy. This must be atomic: `RuntimeTypeAdapterFactory` rejects duplicate labels, so keeping both bindings would throw at static init. Mirrors what ES/JDBC already did.
- `PluginDrivenExternalCatalog.gsonPostProcess`: extract a `legacyLogTypeToCatalogType()` helper that maps `Type.TRINO_CONNECTOR` → `"trino-connector"`; the generic `name().toLowerCase()` would otherwise produce the wrong `"trino_connector"` (underscore) that `CatalogFactory` does not recognize.
- `PluginDrivenExternalTable.getEngine()` / `getEngineTableTypeName()`: add `trino-connector` branches that preserve the legacy engine-name / table-type display across `SHOW TABLE STATUS` and `information_schema`.

**Batch C: flip the switch**

- Add `"trino-connector"` to `CatalogFactory.SPI_READY_TYPES` so catalog creation routes through the SPI path.

**Batch D: remove legacy code**

- Drop the `instanceof TrinoConnectorExternalTable` scan branch in `PhysicalPlanTranslator` (the `PluginDrivenExternalTable` SPI branch already handles it).
- Drop `case "trino-connector"` in `CatalogFactory`.
- Delete `fe-core/datasource/trinoconnector/` (10 files) and the now-dead legacy `TrinoConnectorPredicateTest`.
- Route the `TRINO_CONNECTOR` db-build case in `ExternalCatalog` to `PluginDrivenExternalDatabase` (mirrors the migrated JDBC case).
- **Retained for image compatibility**: the `InitCatalogLog.Type.TRINO_CONNECTOR` and `TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` enums, the GsonUtils redirects, and the `MetastoreProperties` trino-connector entry.

**Batch E: tests + tracking docs**

- 29 JUnit 5 unit tests over the plugin-free converters:
  - `TrinoPredicateConverterTest`: `ConnectorExpression` pushdown trees → Trino `TupleDomain` (EQ / range / NE / IN / IS [NOT] NULL / AND / OR, Slice encoding), plus graceful degradation to `TupleDomain.all()` on null/unsupported input.
  - `TrinoTypeMappingTest`: Trino SPI type → Doris `ConnectorType` (scalars, decimal precision/scale, timestamp precision clamp, array/map/struct, unsupported-type failure).
  - `TrinoConnectorProviderTest`: `validateProperties` fast-fails when `trino.connector.name` is missing/empty.
  - No Trino plugin/cluster required; plugin-dependent paths remain covered by the existing `external_table_p0/p2` `trino_connector` regression suites.
- Sync the migration tracking docs under `plan-doc/` (already carried on this feature branch since P0).

**Net effect**: 28 files, +1025 / −2681 (~1656 LOC net removed). Old FE images holding legacy trino catalogs / databases / tables deserialize onto the `PluginDrivenExternal*` hierarchy through the GsonUtils string-name redirect, with engine-name display preserved.

**Deferred (follow-ups, not in this PR)**:

- `trino_connector_migration_compat` regression test (old-image deserialization): requires a running cluster + Trino plugin + docker, unavailable in this dev environment; tracked as a CI/cluster follow-up.
- The plugin-install documentation update lives in the `doris-website` repo and is handled separately.

### Release note

None

### Check List (For Author)

- Test
  - [x] Unit Test: 29 new tests in `fe-connector-trino` (predicate converter / type mapping / property validation).
  - [ ] Regression test: existing `trino_connector` suites cover plugin paths; the new old-image compat regression is deferred to a CI/cluster follow-up.
  - [ ] Manual test (add detailed scripts or steps below)
  - [ ] No need to test or manual test. Explain why:
    - [ ] This is a refactor/code format and no logic has been changed.
    - [ ] Previous test can cover this change.
    - [ ] No code files have been changed.
    - [ ] Other reason
- Behavior changed:
  - [x] No. Internal routing moves from the legacy fe-core path to the SPI path; image compatibility, engine-name display, and pushdown semantics all mirror the legacy behavior. All batches land together, so there is no serialization-gap window.
- Does this need documentation?
  - [x] Yes. The trino-connector plugin-install doc update is a follow-up in the `doris-website` repo.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
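
The naming pitfall behind `legacyLogTypeToCatalogType()` in Batch B can be shown with a small sketch. The enum below is a hypothetical stand-in for `InitCatalogLog.Type` (only a few illustrative constants), and the helper body is an assumption about its shape, not the code from the PR. It shows why the generic lowercase conversion yields an underscore name while the catalog type string uses a hyphen.

```java
import java.util.Locale;

public class LegacyTypeNameSketch {
    /** Hypothetical stand-in for the persisted log-type enum. */
    enum Type { JDBC, ES, TRINO_CONNECTOR }

    /** The generic conversion: lowercases the enum constant name as-is. */
    static String genericName(Type type) {
        return type.name().toLowerCase(Locale.ROOT);
    }

    /**
     * Assumed shape of the helper: special-case the one type whose catalog
     * type string uses a hyphen, and fall back to the generic conversion.
     */
    static String legacyLogTypeToCatalogType(Type type) {
        if (type == Type.TRINO_CONNECTOR) {
            return "trino-connector";
        }
        return genericName(type);
    }

    public static void main(String[] args) {
        System.out.println(genericName(Type.TRINO_CONNECTOR));                // trino_connector
        System.out.println(legacyLogTypeToCatalogType(Type.TRINO_CONNECTOR)); // trino-connector
        System.out.println(legacyLogTypeToCatalogType(Type.JDBC));            // jdbc
    }
}
```

Keeping the special case in one helper means the old enum constant can stay unchanged for image compatibility while lookups use the hyphenated catalog type.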
…tch design (hybrid, T02-T08) (#64143)

## Proposed changes

testing with #64146

P3 of the catalog-SPI migration (base: `branch-catalog-spi`). Migrates the **hudi** connector following the **hybrid** strategy (D-019): harden the dormant HMS-over-SPI hudi connector to correctness parity, build a test baseline, and write the per-table dispatch design, **all behind the closed gate** (`SPI_READY_TYPES` unchanged).

> ⚠️ **No user-visible behavior change.** The SPI hudi path stays dormant (gate closed); hudi queries continue to use the legacy `HMSExternalTable.dlaType=HUDI` path. This PR removes correctness blockers ahead of the live cutover (deferred to P7 / batch E).

### What's included

**Correctness fixes (hardening dormant code, behind gate):**

- **T02**: fix hudi JNI `column_types` double bug: emit full Hive type strings (was Doris bare type names, losing precision/scale/subtypes) and send `column_names`/`column_types`/`delta_logs` as typed lists end-to-end (was comma join/split, which shattered `decimal(10,2)` / `struct<...>`). Matches the BE `hudi_jni_reader.cpp` contract (names `,` / types `#` / delta `,`).
- **T04**: fail loud on time-travel / incremental read in the SPI `visitPhysicalHudiScan` branch (was silently returning the latest snapshot / silently full-scanning).
- **T05**: real EQ/IN partition pruning in `HudiConnectorMetadata.applyFilter` (was a placeholder that ignored predicates and unconditionally switched the partition source from Hudi-metadata to HMS); faithfully mirrors `HiveConnectorMetadata.applyFilter`.
- **T07**: column-name casing fix in `avroSchemaToColumns` (top-level lowercase, mirroring legacy `HMSExternalTable`).

**Test baseline (all three connector modules started P3 with 0 tests):**

- `fe-connector-hudi` (33): type-mapping / schema-parity (COW/MOR golden) / table-type / partition-pruning / scan-range.
- `fe-connector-hms` (12): shared Hive-type-string parser tests.
- `fe-connector-hive` (14): file-format / partition-pruning (mirrors T05).
- COW/MOR schema is **type-agnostic** (golden parity vs legacy `initHudiSchema`); table type only affects scan planning.

**Decisions / design (code-grounded, design-only):**

- **T03**: defer `schema_id`/`history_schema_info` field-id evolution to batch E (DV-006; not a model-agnostic SPI fix).
- **T06**: keep MVCC/snapshot SPI defaults (opt-out) + document (DV-007).
- **T08**: `tableFormatType` dispatch design memo + **D-020**: single `hms` catalog per-table routing via a new backward-compatible `ConnectorMetadata.getScanPlanProvider(handle)` (per-table provider seam); refines D-005. The keystone gap is split into M1 (identity consumption, fe-core reads `tableFormatType` as an opaque string) and M2 (scan routing).

### Deferred to batch E / P7 (not in this PR)

Gate flip (`SPI_READY_TYPES += hms/hudi`), fe-core `tableFormatType` consumption (M1+M2 implementation), live cutover, delete legacy `datasource/hudi/`, full incremental/time-travel/MVCC, Iceberg-on-hms via SPI (needs P6 `IcebergScanPlanProvider`), cluster/runtime validation.

### Verification

Per task tracking, each code batch landed with: per-module compile + checkstyle 0 (incl. test sources) + connector import-gate pass + new unit tests green. The two most recent commits are docs-only (`plan-doc/`); the code is unchanged since the last green batch. Gate stays closed → the dormant SPI path is unreachable at runtime → zero live-path risk. CI re-verifies.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
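
The T02 description says a comma join/split "shattered" types such as `decimal(10,2)` and `struct<...>`. A minimal sketch of that failure mode follows. The class and method names are hypothetical; the only detail taken from the description above is that the BE contract uses `,` for names and `#` for types. The sketch joins a list of Hive type strings with a delimiter, splits it back, and compares the piece count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ColumnTypesDelimiterSketch {
    /** Joins the values with the delimiter, then splits the result back into pieces. */
    static List<String> roundTrip(List<String> values, String delimiter) {
        String joined = String.join(delimiter, values);
        // Pattern.quote keeps the delimiter literal (split takes a regex).
        return Arrays.asList(joined.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList(
                "int", "decimal(10,2)", "struct<a:int,b:string>");

        // Comma delimiter: commas inside the type strings become split points.
        List<String> viaComma = roundTrip(types, ",");
        System.out.println(viaComma.size() + " pieces: " + viaComma); // 5 pieces

        // '#' delimiter: no type string contains '#', so the list survives.
        List<String> viaHash = roundTrip(types, "#");
        System.out.println(viaHash.size() + " pieces: " + viaHash);   // 3 pieces
    }
}
```

Passing typed lists end to end, as T02 describes, avoids the round trip entirely; a delimiter is then needed only at the final boundary where the reader expects a single string.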
…core + make fe-core odps-free (T07-T09) (#64300)

Follow-up to #64253 (the MaxCompute catalog-SPI cutover). After the cutover a `max_compute` catalog deserializes to `PluginDrivenExternalCatalog` and no legacy `MaxComputeExternal*` object is ever instantiated, so the legacy MaxCompute subsystem in fe-core is dead code. This removes it and makes fe-core's dependency tree fully odps-free.

**1. Remove legacy subsystem** (`7a4db351100`)

- Delete 20 fe-core files: `datasource/maxcompute/*` (incl. `MCTransaction`, `MaxComputeScanNode`/`Split`), the MaxCompute sink/insert/txn plumbing, and 2 legacy-only tests.
- Clean ~21 reverse-reference sites (imports + dead `instanceof`/visitor/rule branches), keeping every `PluginDriven`/connector sibling branch and the image/replay keep-set (GsonUtils compat strings; `TableType`/`TransactionType`/`TableFormatType`/`InitCatalogLog.Type` `MAX_COMPUTE` enums; block-id thrift).
- Rewire 3 tests; e.g. `FrontendServiceImplTest`'s block-id RPC test now mocks the generic `Transaction` SPI, since `getMaxComputeBlockIdRange` reads the PluginDriven connector transaction.

**2. Make fe-core odps-free** (`409300a75b8`)

- Drop the two odps deps from `fe-core/pom.xml`.
- Move `MCUtils` from fe-common into `be-java-extensions/max-compute-connector` (its only consumer after the removal); keep `MCProperties` (odps-free constants) in fe-common.
- Drop `odps-sdk-core` from fe-common; it was also leaking netty/protobuf transitively to fe-common's own `DorisHttpException`/`GsonUtilsBase`, so declare `netty-all` + `protobuf-java` directly (proper dependency hygiene).

**3. Doc-sync** (`f8c305765e8`): plan-doc PROGRESS/HANDOFF/deviations/design tracking notes.

- `mvn -pl :fe-core -am test-compile` (main+test) passes; checkstyle 0 violations; connector import-gate passes.
- `grep -rn com.aliyun.odps fe/fe-core/src` → empty.
- `mvn -pl :fe-core dependency:tree | grep odps` → empty (no odps, direct or transitive).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
f09b6df to
e9c5b3e
only for testing