[lake/hudi] Introduce fluss-lake-hudi module and HudiLakeStorage #3256

Open
fhan688 wants to merge 4 commits into apache:main from fhan688:Introduce-fluss-lake-hudi-module-and-HudiLakeStorage

@fhan688 fhan688 commented May 6, 2026

Purpose

Linked issue: #3258

Introduce an initial Hudi LakeStorage plugin module for Fluss and wire it into build/distribution, so Hudi can be recognized as a supported data lake format.

Brief change log

  • Add HUDI("hudi") to DataLakeFormat.
  • Introduce new module fluss-lake-hudi.
  • Add HudiLakeStorage and HudiLakeStoragePlugin as the initial Hudi LakeStorage implementation scaffold.
  • Register LakeStoragePlugin via service loader metadata.
  • Include the new module in lake parent modules and distribution/plugin assembly.
  • Update quickstart-flink build preparation to include hudi plugin artifact.
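The service loader registration in the change log above can be illustrated with a small, self-contained sketch. It is not the code from this PR: the `LakeStoragePlugin` interface below is a hypothetical one-method stand-in (the `identifier()` method is an assumption), and the class names only mirror those in the change log. In the real module, the file `META-INF/services/org.apache.fluss.lake.lakestorage.LakeStoragePlugin` would name the plugin class so that `java.util.ServiceLoader` can find it.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.ServiceLoader;

public class LakePluginDiscoverySketch {

    /** Hypothetical stand-in for the plugin SPI; the real interface may differ. */
    interface LakeStoragePlugin {
        String identifier();
    }

    /** Hypothetical plugin that identifies itself with the "hudi" format name. */
    static final class HudiLakeStoragePlugin implements LakeStoragePlugin {
        @Override
        public String identifier() {
            return "hudi";
        }
    }

    /** Returns the first candidate whose identifier equals the requested format. */
    static Optional<LakeStoragePlugin> find(
            Iterable<LakeStoragePlugin> candidates, String format) {
        for (LakeStoragePlugin plugin : candidates) {
            if (plugin.identifier().equals(format)) {
                return Optional.of(plugin);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // ServiceLoader only yields providers listed in a META-INF/services file
        // on the classpath; this sketch ships no such file for its nested interface.
        Iterable<LakeStoragePlugin> discovered = ServiceLoader.load(LakeStoragePlugin.class);
        System.out.println("via ServiceLoader: " + find(discovered, "hudi").isPresent());

        // Simulate what discovery would return once the provider is registered.
        List<LakeStoragePlugin> registered =
                Collections.<LakeStoragePlugin>singletonList(new HudiLakeStoragePlugin());
        System.out.println("with registration: " + find(registered, "hudi").isPresent());
    }
}
```

Taking an `Iterable` in `find` keeps the lookup independent of how the candidates were obtained, so the same method works with a real `ServiceLoader` or with a hand-built list in a test.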

Tests

  • This commit mainly introduces module/plugin scaffolding and build wiring.

  • Suggested verification:

mvn -pl fluss-lake/fluss-lake-hudi -am clean test
mvn -pl fluss-dist -am clean package -DskipTests

  • No dedicated Hudi functional UT/IT is included in this commit.

API and Format

  • API impact: additive change by introducing HUDI in DataLakeFormat.
  • No storage format migration or incompatible format change in this PR.
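To show why adding an enum constant is an additive change, here is a minimal sketch of an enum shaped like `DataLakeFormat`. Only `HUDI("hudi")` comes from this PR. The nested `Format` enum, the `PAIMON` constant used for contrast, and the `fromString` helper are assumptions made for illustration and may not match the real class.

```java
import java.util.Locale;

public class DataLakeFormatSketch {

    /** Hypothetical stand-in for DataLakeFormat; the real enum may have other members. */
    enum Format {
        PAIMON("paimon"), // assumed pre-existing constant, shown only for contrast
        HUDI("hudi");     // the constant this PR adds

        private final String value;

        Format(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    /** Hypothetical helper: resolves a configured format name, ignoring case. */
    static Format fromString(String name) {
        String normalized = name.toLowerCase(Locale.ROOT);
        for (Format format : Format.values()) {
            if (format.value.equals(normalized)) {
                return format;
            }
        }
        throw new IllegalArgumentException("Unsupported data lake format: " + name);
    }

    public static void main(String[] args) {
        // Existing names keep resolving as before; "hudi" is newly accepted.
        System.out.println(fromString("paimon"));
        System.out.println(fromString("hudi"));
    }
}
```

One caveat with this kind of change: code that switches exhaustively over the enum without a default branch would need to be updated to handle the new constant.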

Documentation

  • No user-facing documentation changes in this PR.
  • Hudi usage/configuration docs can be added in a follow-up PR.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an initial Apache Hudi lake-storage plugin module (fluss-lake-hudi) and wires it into the Fluss build and distribution so that “hudi” can be selected as a lake format and the plugin can be discovered via ServiceLoader.

Changes:

  • Adds HUDI("hudi") to DataLakeFormat and introduces a new fluss-lake-hudi module with stub HudiLakeStorage + HudiLakeStoragePlugin.
  • Wires the new module into Maven reactor (fluss-lake/pom.xml), dist plugin assembly (fluss-dist), and quickstart build preparation.
  • Adds a Hudi Flink bundle dependency (provided) and a new hudi.version Maven property.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| pom.xml | Adds hudi.version property for dependency management. |
| fluss-lake/pom.xml | Adds fluss-lake-hudi to the lake parent reactor modules. |
| fluss-lake/fluss-lake-hudi/pom.xml | New module POM for the Hudi lake plugin. |
| fluss-lake/fluss-lake-hudi/src/main/java/org/apache/fluss/lake/hudi/HudiLakeStorage.java | Adds initial (stub) LakeStorage implementation for Hudi. |
| fluss-lake/fluss-lake-hudi/src/main/java/org/apache/fluss/lake/hudi/HudiLakeStoragePlugin.java | Adds LakeStoragePlugin implementation for ServiceLoader discovery. |
| fluss-lake/fluss-lake-hudi/src/main/resources/META-INF/services/org.apache.fluss.lake.lakestorage.LakeStoragePlugin | Registers HudiLakeStoragePlugin via ServiceLoader metadata. |
| fluss-lake/fluss-lake-hudi/src/main/resources/META-INF/NOTICE | Adds NOTICE file for the new module. |
| fluss-lake/fluss-lake-hudi/src/test/resources/log4j2-test.properties | Adds test logging configuration for the new module. |
| fluss-lake/fluss-lake-hudi/src/test/resources/org.junit.jupiter.api.extension.Extension | Adds JUnit extension auto-registration for tests. |
| fluss-flink/fluss-flink-common/pom.xml | Adds provided Hudi Flink bundle dependency. |
| fluss-dist/src/main/assemblies/plugins.xml | Copies the built fluss-lake-hudi jar into plugins/hudi/ in the dist. |
| fluss-dist/pom.xml | Adds fluss-lake-hudi as a (provided) dependency for build ordering/wiring. |
| fluss-common/src/main/java/org/apache/fluss/metadata/DataLakeFormat.java | Adds HUDI to supported DataLakeFormat enum values. |
| docker/quickstart-flink/prepare_build.sh | Includes Hudi plugin build output directory in quickstart build preparation. |

Comment thread fluss-lake/fluss-lake-hudi/pom.xml Outdated
Comment thread fluss-lake/fluss-lake-hudi/pom.xml Outdated
Comment thread fluss-lake/fluss-lake-hudi/src/main/resources/META-INF/NOTICE Outdated
Comment thread fluss-lake/fluss-lake-hudi/src/test/resources/log4j2-test.properties Outdated
Comment thread fluss-flink/fluss-flink-common/pom.xml
Contributor

luoyuxia commented May 9, 2026

cc @XuQianJin-Stars

fhan688 and others added 3 commits May 9, 2026 10:52
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3 participants