Skip to content

feat: add file store commit and manifest merger support#95

Open
lucasfang wants to merge 1 commit into
apache:mainfrom
lucasfang:migrate_3
Open

feat: add file store commit and manifest merger support#95
lucasfang wants to merge 1 commit into
apache:mainfrom
lucasfang:migrate_3

Conversation

@lucasfang

Copy link
Copy Markdown
Contributor

Purpose

Linked issue: No linked issue

This change adds file store commit and manifest merger support.

Included changes:

  • Public API (include/paimon/):

    • Adds commit_context.h for commit context definitions
    • Adds file_store_commit.h for file store commit interface
  • Core Operation (src/paimon/core/operation/):

    • Adds commit_context.cpp implementation
    • Adds file_store_commit.cpp and file_store_commit_impl.h/.cpp for file store commit logic
    • Adds manifest_file_merger.h/.cpp for manifest file merging utilities
    • Adds test coverage: file_store_commit_test.cpp, manifest_file_merger_test.cpp
  • Metrics (src/paimon/core/operation/metrics/):

    • Adds commit_metrics.h for commit metrics definitions
    • Adds commit_metrics_test.cpp for metrics test coverage

Tests

Not run. Local compile, CMake, and gtest environment checks are not part of this PR description.

Test coverage included in this change:

  • FileStoreCommitTest
  • ManifestFileMergerTest
  • CommitMetricsTest

API and Format

This change adds public API in include/paimon/commit_context.h and include/paimon/file_store_commit.h.

No storage format or protocol changes.

Documentation

No documentation changes required.

Generative AI tooling

Migrate-by: Aone Copilot (Qwen-3.7-Max)

Copilot AI review requested due to automatic review settings June 18, 2026 02:48

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces file-store commit functionality and supporting utilities/metrics in the Paimon C++ core, exposing new public APIs to build a commit context and perform commits/overwrites while adding manifest compaction/merging logic and accompanying tests.

Changes:

  • Added public APIs for commit context construction and a FileStoreCommit interface.
  • Implemented file-store commit workflow (FileStoreCommitImpl) including snapshot commit, manifest merging, and commit metrics emission.
  • Added manifest file merger utilities plus unit tests for commit metrics, manifest merging, and commit creation.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/paimon/core/operation/metrics/commit_metrics.h Defines commit metric keys used by commit implementation.
src/paimon/core/operation/metrics/commit_metrics_test.cpp Unit test for counter setting/merging behavior in metrics.
src/paimon/core/operation/manifest_file_merger.h Declares manifest merge/compaction helper API.
src/paimon/core/operation/manifest_file_merger.cpp Implements full/minor manifest compaction and entry merging.
src/paimon/core/operation/manifest_file_merger_test.cpp Unit tests for manifest merge/compaction behavior.
src/paimon/core/operation/file_store_commit.cpp Factory for creating a commit implementation from CommitContext.
src/paimon/core/operation/file_store_commit_test.cpp Smoke test for FileStoreCommit::Create.
src/paimon/core/operation/file_store_commit_impl.h Declares the core commit/overwrite implementation class.
src/paimon/core/operation/file_store_commit_impl.cpp Implements commit/overwrite, manifest handling, conflict checks, and metrics.
src/paimon/core/operation/commit_context.cpp Implements CommitContext and CommitContextBuilder.
include/paimon/file_store_commit.h Public API header for commit operations and metrics access.
include/paimon/commit_context.h Public API header for commit context configuration/building.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +33 to +36
static constexpr char LAST_CHANGELOG_FILES_APPENDED[] = "lastChangelogFilesAppended";
static constexpr char LAST_CHANGELOG_FILES_COMMIT_COMPACTED[] =
"lastChangelogFileCommitCompacted";
static constexpr char LAST_GENERATED_SNAPSHOTS[] = "lastGeneratedSnapshots";
Comment on lines +22 to +28
#include <algorithm>
#include <cstddef>
#include <future>
#include <list>
#include <set>
#include <unordered_map>
#include <utility>
Comment on lines +863 to +870
for (const auto& merge_after_manifest : merge_after_manifests) {
if (merge_before_manifest_set.find(merge_after_manifest.FileName()) ==
merge_before_manifest_set.end()) {
manifest_list_->DeleteQuietly(merge_after_manifest.FileName());
PAIMON_LOG_DEBUG(logger_, "delete new file %s",
merge_after_manifest.FileName().c_str());
}
}
Comment on lines +969 to +974
int64_t FileStoreCommitImpl::RowCounts(const std::vector<ManifestEntry>& files) {
return std::accumulate(files.begin(), files.end(), 0L,
[](int64_t row_count, const ManifestEntry& entry) {
return row_count + entry.File()->row_count;
});
}
Comment on lines +725 to +727
// commit exception, not sure about the situation and should not clean up the files.
PAIMON_LOG_WARN(logger_, "You need call FilterAndCommit to retry commit for exception. %s",
commit_result.status().ToString().c_str());
Comment on lines +736 to +738
guard.Release();
return Status::Invalid("You need call FilterAndCommit to retry commit for exception. ",
commit_result.status().ToString());
Comment on lines +57 to +62
pool_ = GetDefaultPool();
dir_ = UniqueTestDirectory::Create();
ASSERT_TRUE(dir_);
test_root_ = dir_->Str();
partition_type_ = arrow::int64();
CreateManifestFile(test_root_);
Comment on lines +127 to +133
ASSERT_OK_AND_ASSIGN(
static std::shared_ptr<FileStorePathFactory> path_factory,
FileStorePathFactory::Create(
path_str, schema, /*partition_keys=*/{"f0"}, options.GetPartitionDefaultName(),
options.GetFileFormat()->Identifier(), options.DataFilePrefix(),
options.LegacyPartitionNameEnabled(), external_paths, global_index_external_path,
options.IndexFileInDataFileDir(), pool_));
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants