feat: add file store commit and manifest merger support#95
Open
lucasfang wants to merge 1 commit into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces file-store commit functionality and supporting utilities/metrics in the Paimon C++ core, exposing new public APIs to build a commit context and perform commits/overwrites while adding manifest compaction/merging logic and accompanying tests.
Changes:
- Added public APIs for commit context construction and a
FileStoreCommitinterface. - Implemented file-store commit workflow (
FileStoreCommitImpl) including snapshot commit, manifest merging, and commit metrics emission. - Added manifest file merger utilities plus unit tests for commit metrics, manifest merging, and commit creation.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
src/paimon/core/operation/metrics/commit_metrics.h |
Defines commit metric keys used by commit implementation. |
src/paimon/core/operation/metrics/commit_metrics_test.cpp |
Unit test for counter setting/merging behavior in metrics. |
src/paimon/core/operation/manifest_file_merger.h |
Declares manifest merge/compaction helper API. |
src/paimon/core/operation/manifest_file_merger.cpp |
Implements full/minor manifest compaction and entry merging. |
src/paimon/core/operation/manifest_file_merger_test.cpp |
Unit tests for manifest merge/compaction behavior. |
src/paimon/core/operation/file_store_commit.cpp |
Factory for creating a commit implementation from CommitContext. |
src/paimon/core/operation/file_store_commit_test.cpp |
Smoke test for FileStoreCommit::Create. |
src/paimon/core/operation/file_store_commit_impl.h |
Declares the core commit/overwrite implementation class. |
src/paimon/core/operation/file_store_commit_impl.cpp |
Implements commit/overwrite, manifest handling, conflict checks, and metrics. |
src/paimon/core/operation/commit_context.cpp |
Implements CommitContext and CommitContextBuilder. |
include/paimon/file_store_commit.h |
Public API header for commit operations and metrics access. |
include/paimon/commit_context.h |
Public API header for commit context configuration/building. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+33
to
+36
| static constexpr char LAST_CHANGELOG_FILES_APPENDED[] = "lastChangelogFilesAppended"; | ||
| static constexpr char LAST_CHANGELOG_FILES_COMMIT_COMPACTED[] = | ||
| "lastChangelogFileCommitCompacted"; | ||
| static constexpr char LAST_GENERATED_SNAPSHOTS[] = "lastGeneratedSnapshots"; |
Comment on lines
+22
to
+28
| #include <algorithm> | ||
| #include <cstddef> | ||
| #include <future> | ||
| #include <list> | ||
| #include <set> | ||
| #include <unordered_map> | ||
| #include <utility> |
Comment on lines
+863
to
+870
| for (const auto& merge_after_manifest : merge_after_manifests) { | ||
| if (merge_before_manifest_set.find(merge_after_manifest.FileName()) == | ||
| merge_before_manifest_set.end()) { | ||
| manifest_list_->DeleteQuietly(merge_after_manifest.FileName()); | ||
| PAIMON_LOG_DEBUG(logger_, "delete new file %s", | ||
| merge_after_manifest.FileName().c_str()); | ||
| } | ||
| } |
Comment on lines
+969
to
+974
| int64_t FileStoreCommitImpl::RowCounts(const std::vector<ManifestEntry>& files) { | ||
| return std::accumulate(files.begin(), files.end(), 0L, | ||
| [](int64_t row_count, const ManifestEntry& entry) { | ||
| return row_count + entry.File()->row_count; | ||
| }); | ||
| } |
Comment on lines
+725
to
+727
| // commit exception, not sure about the situation and should not clean up the files. | ||
| PAIMON_LOG_WARN(logger_, "You need call FilterAndCommit to retry commit for exception. %s", | ||
| commit_result.status().ToString().c_str()); |
Comment on lines
+736
to
+738
| guard.Release(); | ||
| return Status::Invalid("You need call FilterAndCommit to retry commit for exception. ", | ||
| commit_result.status().ToString()); |
Comment on lines
+57
to
+62
| pool_ = GetDefaultPool(); | ||
| dir_ = UniqueTestDirectory::Create(); | ||
| ASSERT_TRUE(dir_); | ||
| test_root_ = dir_->Str(); | ||
| partition_type_ = arrow::int64(); | ||
| CreateManifestFile(test_root_); |
Comment on lines
+127
to
+133
| ASSERT_OK_AND_ASSIGN( | ||
| static std::shared_ptr<FileStorePathFactory> path_factory, | ||
| FileStorePathFactory::Create( | ||
| path_str, schema, /*partition_keys=*/{"f0"}, options.GetPartitionDefaultName(), | ||
| options.GetFileFormat()->Identifier(), options.DataFilePrefix(), | ||
| options.LegacyPartitionNameEnabled(), external_paths, global_index_external_path, | ||
| options.IndexFileInDataFileDir(), pool_)); |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Purpose
Linked issue: No linked issue
This change adds file store commit and manifest merger support.
Included changes:
Public API (
include/paimon/):commit_context.hfor commit context definitionsfile_store_commit.hfor file store commit interfaceCore Operation (
src/paimon/core/operation/):commit_context.cppimplementationfile_store_commit.cppandfile_store_commit_impl.h/.cppfor file store commit logicmanifest_file_merger.h/.cppfor manifest file merging utilitiesfile_store_commit_test.cpp,manifest_file_merger_test.cppMetrics (
src/paimon/core/operation/metrics/):commit_metrics.hfor commit metrics definitionscommit_metrics_test.cppfor metrics test coverageTests
Not run. Local compile, CMake, and gtest environment checks are not part of this PR description.
Test coverage included in this change:
FileStoreCommitTestManifestFileMergerTestCommitMetricsTestAPI and Format
This change adds public API in
include/paimon/commit_context.handinclude/paimon/file_store_commit.h.No storage format or protocol changes.
Documentation
No documentation changes required.
Generative AI tooling
Migrate-by: Aone Copilot (Qwen-3.7-Max)