fix(core): retain buffered data in WriteGenerator on write failure#7400
fix(core): retain buffered data in WriteGenerator on write failure#7400TennyZhuang wants to merge 3 commits into
Conversation
dentiny
left a comment
There was a problem hiding this comment.
May I ask some dumb questions?
- It seems to me that the "issue" reproduces if users call
writeagain after a failedwrite. In production, why would users do that? - In terms of retry layer (which calls write repeatedly after failure), it seems to sit below the write generator. I'm curious if we could provide a reproducible example to show data loss with retry layer used together?
WriteGenerator's write() and close() methods called buffer.take() before attempting to write to the underlying writer. If the write failed, the buffered data was lost, making retry impossible without data loss. Fix by cloning the taken buffer before writing, and restoring it on failure. This matches the invariant maintained by other writer implementations (AppendWriter, BlockWriter, MultipartWriter, PositionWriter). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… duplication On write failure in inexact mode, only restore the previously buffered data (not the current bs). The caller will retry with the same bs, so including it in the restored buffer would cause data duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
46bfec9 to
1c10532
Compare
|
@dentiny Good question. A few clarifications:
I've added a test |
Which issue does this PR close?
N/A - discovered via deterministic testing of error-handling paths.
Rationale
WriteGenerator'swrite()andclose()methods calledbuffer.take()before attempting to write to the underlying writer. If the write failed, the buffered data was permanently lost, making retry impossible without data loss.This violated the invariant maintained by other writer implementations (
AppendWriter,BlockWriter,MultipartWriter,PositionWriter), which only clear state after a successful write.Changes
close()method: Same clone-and-restore pattern in the flush loop.FailOnceMockWriterthat fails the first write call, then succeeds — verifying no data loss in inexact mode, exact mode, and close() paths.Are there any user-facing changes?
Users who rely on retry (e.g., via
RetryLayer) will no longer silently lose data when the underlying storage temporarily fails during a write flush.This PR was generated with the assistance of AI (Claude).