[SPARK-57183][SS] Close LRUCache on RocksDB.close() in unbounded memory mode#56234
[SPARK-57183][SS] Close LRUCache on RocksDB.close() in unbounded memory mode#56234kete1987 wants to merge 1 commit into
Conversation
d4b935b to
02fd273
Compare
|
I'll ask other folks to review the change (I'm a bit away from recent improvement of RocksDB state store provider), but I'm going to give a general suggestion. Please do not remove the section of PR template and consider filling the section as one of the requirement/duty.
This doesn't replace the requirement the PR template asks about the usage of LLM model and the clarification of the model. The template guides how to describe the model - please check it out again. https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE |
|
@HeartSaVioR Thanks for the feedback! I've updated the PR description to include the AI tooling section. |
02fd273 to
67a7b8a
Compare
…ry mode In unbounded memory mode (the default, boundedMemoryUsage=false), RocksDBMemoryManager creates a new LRUCache per RocksDB instance but RocksDB.close() never calls lruCache.close(). The Java LRUCache wrapper holds a C++ shared_ptr<Cache>, so the native object is only freed when the JVM GC finalizes the wrapper -- which rarely happens under low heap pressure. Closing explicitly ensures native memory is reclaimed deterministically when the instance is released. In bounded mode the cache is a shared singleton managed by RocksDBMemoryManager and must not be closed per instance. Add a parameterized test in RocksDBSuite covering both bounded and unbounded modes, verifying correct LRUCache handle behavior after RocksDB.close() via LRUCache.isOwningHandle(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
67a7b8a to
6c4b046
Compare
… memory mode ### What changes were proposed in this pull request? Backport of #56234 to branch-4.2. ### Why are the changes needed? See #56234. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? See #56234. ### Was this patch authored or co-authored using generative AI tooling? Co-authored with Claude (Anthropic), used for analysis, code generation and review assistance. Generated-by: Claude Sonnet 4.6 Closes #56281 from kete1987/SPARK-57183-rocksdb-lrucache-leak-branch-4.2. Authored-by: Iván Morales <kete1987@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
|
why it's not merged in branch 4.x? this is not a breaking change. |
|
@kete1987 can you open a 4.x PR? |
|
Where are we? Do we not have a fix in 4.x? Shall we just cherry-pick and push as we would do it in merge script? |
|
@HeartSaVioR @cloud-fan Already done — #56281 was merged to branch-4.2. Thanks! |
|
@kete1987 branch-4.2 and branch-4.x are two different branches... |
|
@cloud-fan opened #56308 for branch-4.x. Sorry for the confusion! |
…ry mode ### What changes were proposed in this pull request? In unbounded memory mode (the default, `boundedMemoryUsage = false`), `RocksDBMemoryManager` creates a new `LRUCache` per instance: https://github.com/apache/spark/blob/d7df1920cfb8577c17a9b1555592d3772f061417/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBMemoryManager.scala#L185 but `RocksDB.close()` never calls `lruCache.close()`: https://github.com/apache/spark/blob/d7df1920cfb8577c17a9b1555592d3772f061417/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala#L2125-L2135 The Java `LRUCache` wrapper holds a C++ `shared_ptr<Cache>`, so the native object is only freed when the JVM GC finalizes the wrapper — which rarely happens under low heap pressure. This causes native memory to accumulate until GC eventually runs, leading to OOM kills in long-running processes or CI runs with many RocksDB-heavy test suites. The fix adds an explicit `lruCache.close()` call in `RocksDB.close()` for unbounded mode. In bounded mode the cache is a shared singleton managed by `RocksDBMemoryManager` and must not be closed per instance. This is a separate issue from SPARK-56523 (Statistics native memory leak), which was already fixed. ### Why are the changes needed? Without explicit `close()`, each `RocksDB` instance in unbounded mode leaks one `LRUCache` worth of native memory (`blockCacheSizeMB`, default 8 MB) for as long as GC does not run. The memory is never reclaimed deterministically. A standalone reproducer tool confirms ~8.5 MB of native memory growth per open/close cycle in leak mode vs flat memory in fixed mode: https://github.com/kete1987/rocksdb-leak-tool ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added a test in `RocksDBSuite` (`SPARK-57183: LRUCache is closed on RocksDB.close() in unbounded memory mode`) that verifies the native handle is released after `close()` via `LRUCache.isOwningHandle()`. ### Was this patch authored or co-authored using generative AI tooling? Co-authored with Claude (Anthropic), used for analysis, code generation and review assistance. Generated-by: Claude Sonnet 4.6 I affirm that the contribution is my original work and that I license the work to the project under the project's open source license. Closes apache#56234 from kete1987/SPARK-57183-rocksdb-lrucache-leak. Authored-by: Ivan Morales <kete1987@gmail.com> Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
What changes were proposed in this pull request?
In unbounded memory mode (the default,
boundedMemoryUsage = false),RocksDBMemoryManagercreates a newLRUCacheper instance:spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBMemoryManager.scala
Line 185 in d7df192
but
RocksDB.close()never callslruCache.close():spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
Lines 2125 to 2135 in d7df192
The Java
LRUCachewrapper holds a C++shared_ptr<Cache>, so the native object is only freed when the JVM GC finalizes the wrapper — which rarely happens under low heap pressure. This causes native memory to accumulate until GC eventually runs, leading to OOM kills in long-running processes or CI runs with many RocksDB-heavy test suites.The fix adds an explicit
lruCache.close()call inRocksDB.close()for unbounded mode. In bounded mode the cache is a shared singleton managed byRocksDBMemoryManagerand must not be closed per instance.This is a separate issue from SPARK-56523 (Statistics native memory leak), which was already fixed.
Why are the changes needed?
Without explicit
close(), eachRocksDBinstance in unbounded mode leaks oneLRUCacheworth of native memory (blockCacheSizeMB, default 8 MB) for as long as GC does not run. The memory is never reclaimed deterministically.A standalone reproducer tool confirms ~8.5 MB of native memory growth per open/close cycle in leak mode vs flat memory in fixed mode:
https://github.com/kete1987/rocksdb-leak-tool
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added a test in
RocksDBSuite(SPARK-57183: LRUCache is closed on RocksDB.close() in unbounded memory mode) that verifies the native handle is released afterclose()viaLRUCache.isOwningHandle().Was this patch authored or co-authored using generative AI tooling?
Co-authored with Claude (Anthropic), used for analysis, code generation and review assistance.
Generated-by: Claude Sonnet 4.6
I affirm that the contribution is my original work and that I license the work to the project under the project's open source license.