[Feature] Support Fluss as ActionStateStore backend#628

Open
beryllw wants to merge 9 commits into apache:main from beryllw:fluss-state

@beryllw commented Apr 14, 2026

Linked issue: #627

Purpose of change

Uses a Fluss log table plus an in-memory HashMap (similar to the Kafka approach):

  • Write: synchronous append to the Fluss log, then update the cache
  • Read: O(1) lookup from the in-memory map
  • Recovery: incremental log scan starting from the per-bucket offsets stored as recovery markers; last write wins
  • Prune: evict from the cache only; physical cleanup is left to Fluss retention
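
The four operations above can be sketched in plain Java as follows. This is a minimal illustration, not the PR's `FlussActionStateStore`: `StateRecord`, `InMemoryLog`, and `CachedStateStoreSketch` are hypothetical names, and the append-only list stands in for the Fluss log table so the example runs without a cluster. A single offset stands in for the per-bucket recovery markers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for one row of the log table (not the PR's class). */
final class StateRecord {
    final String stateKey;
    final byte[] payload;
    final String agentKey;

    StateRecord(String stateKey, byte[] payload, String agentKey) {
        this.stateKey = stateKey;
        this.payload = payload;
        this.agentKey = agentKey;
    }
}

/** Hypothetical append-only log; the real store appends to a Fluss log table. */
final class InMemoryLog {
    private final List<StateRecord> records = new ArrayList<>();

    /** Appends a record and returns the offset it was written at. */
    long append(StateRecord record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Returns a copy of all records from the given offset onward. */
    List<StateRecord> scanFrom(long offset) {
        return new ArrayList<>(records.subList((int) offset, records.size()));
    }

    /** Offset that the next appended record would receive. */
    long endOffset() {
        return records.size();
    }
}

final class CachedStateStoreSketch {
    private final InMemoryLog log;
    private final Map<String, byte[]> cache = new HashMap<>();

    CachedStateStoreSketch(InMemoryLog log) {
        this.log = log;
    }

    /** Write: append to the log first, then update the cache. */
    void put(String stateKey, String agentKey, byte[] payload) {
        log.append(new StateRecord(stateKey, payload, agentKey));
        cache.put(stateKey, payload);
    }

    /** Read: lookup in the local map only; the log is never consulted. */
    Optional<byte[]> get(String stateKey) {
        return Optional.ofNullable(cache.get(stateKey));
    }

    /** Recovery: replay from a saved offset; later records overwrite earlier ones. */
    void rebuildFrom(long recoveryOffset) {
        for (StateRecord record : log.scanFrom(recoveryOffset)) {
            cache.put(record.stateKey, record.payload);
        }
    }

    /** Prune: evict from the cache only; the log itself is left untouched. */
    void prune(String stateKey) {
        cache.remove(stateKey);
    }
}
```

In this sketch a pruned key disappears from `get()` but its records remain in the log, so a later `rebuildFrom` with an offset that precedes those records would bring it back. That is why the recovery offset matters.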

Table schema: (state_key STRING, state_payload BYTES, agent_key STRING), distributed by agent_key.
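
To illustrate what distributing by agent_key implies, here is a small sketch of key-based bucket routing. The hash-mod function is an assumption for illustration only and is not Fluss's actual bucketing function; `BucketRoutingSketch` and `bucketFor` are hypothetical names. The point is only that a deterministic function of agent_key sends every record for one agent to the same bucket.

```java
final class BucketRoutingSketch {
    /**
     * Illustrative routing: maps an agent key to a bucket index in [0, numBuckets).
     * This is NOT Fluss's real hashing; it only demonstrates deterministic routing.
     */
    static int bucketFor(String agentKey, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(agentKey.hashCode(), numBuckets);
    }
}
```

Because all records for an agent share a bucket under such a scheme, their relative order is the bucket's append order.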

  • FlussActionStateStore — core implementation with divergence detection and data-loss validation
  • ActionStateSerde — extracted backend-agnostic serializer (shared by Kafka & Fluss)
  • ActionStateStore.BackendType.FLUSS — new enum value
  • 9 new Fluss config options (bootstrap servers, database, table, buckets, SASL auth)
  • Integration tests against embedded Fluss cluster (plain + SASL)
  • Docs updated (configuration & deployment)

Tests

API

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

@beryllw marked this pull request as draft April 14, 2026 10:06
@github-actions bot added labels on Apr 14, 2026: doc-label-missing (the bot applies this label either because none or multiple labels were provided), fixVersion/0.3.0 (the feature or bug should be implemented/fixed in the 0.3.0 version), priority/major (default priority of the PR or issue)
@joeyutong (Contributor) commented:

Thanks for working on this. I’m also looking into Fluss as an ActionStateStore backend, so I’ve been comparing different designs here.

One question on the PK table approach: with synchronous upsert in put() and no-op rebuildState(), it seems backend upsert/lookup may sit on the hot path. For new seqNums, get() will usually need to hit Fluss, while the Kafka-style design keeps get() local by rebuilding post-checkpoint state into an in-memory cache.

Did you consider using a Log Table instead, plus local cache + recovery marker + changelog rebuild? My main concern is the latency tradeoff of putting PK-table lookup/upsert directly on the action execution path.

@beryllw marked this pull request as ready for review April 22, 2026 12:35

@joeyutong (Contributor) left a comment

Thanks for your work on this.

Comment thread docs/content/docs/operations/configuration.md
Comment thread runtime/pom.xml
Comment thread docs/content/docs/operations/configuration.md Outdated
@github-actions bot added doc-included (your PR already contains the necessary documentation updates) and removed doc-label-missing on Apr 29, 2026
@beryllw beryllw requested a review from joeyutong April 29, 2026 06:57
Comment thread runtime/src/test/java/org/apache/flink/agents/runtime/actionstate/TestAction.java Outdated
…d multi-bucket and SASL integration tests, enhance unit test coverage, and update security protocol docs.
@beryllw beryllw requested a review from joeyutong April 29, 2026 11:56
@github-actions bot added doc-included and removed doc-included on Apr 30, 2026

Labels

doc-included, fixVersion/0.3.0, priority/major

2 participants