Spring Batch with Spring Boot 3.5.4 failing with EOFException with YugabyteDB #5135
rahulpoddar-fyndna asked this question in Q&A
Spring Batch + YugabyteDB + HikariCP — EOFException when updating STEP_EXECUTION_CONTEXT after long-running batch
Environment
Spring Boot: 3.5.4
Spring Batch: Partitioned job with multiple threads
Database: YugabyteDB (version 2024.2.3-b1)
Connection Pool: HikariCP
Runtime Environment: Google Kubernetes Engine (GKE)
Job Scheduling: io.kubernetes:client-java (Kubernetes client)
Alternative Tested: Fabric8 Kubernetes client
Issue Summary
We run a long-running Spring Batch job (multiple partitions, multiple threads).
After ~2 hours, the job consistently fails with:
```
PreparedStatementCallback; SQL [UPDATE batch_STEP_EXECUTION_CONTEXT SET ...];
An I/O error occurred while sending to the backend.
Caused by: java.io.EOFException
```
This corresponds to:
```sql
UPDATE batch_STEP_EXECUTION_CONTEXT
SET SHORT_CONTEXT = ?, SERIALIZED_CONTEXT = ?
WHERE STEP_EXECUTION_ID = ?
```
HikariCP settings used
```yaml
spring.datasource.hikari:
  max-lifetime: 1500000   # 25 minutes
  keepalive-time: 120000  # 2 minutes
```
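One detail worth noting: HikariCP's keepalive-time only pings connections that are sitting idle in the pool; a connection checked out for a long-running transaction receives no keepalive traffic. For reference, a fuller pool configuration sketch (the extra keys are standard Spring Boot HikariCP properties, but the values shown are illustrative assumptions, not what we currently run):

```yaml
spring.datasource.hikari:
  max-lifetime: 1500000       # 25 minutes, as above
  keepalive-time: 120000      # 2 minutes, applies to idle connections only
  maximum-pool-size: 20       # illustrative value
  connection-timeout: 30000   # 30 s to acquire a connection from the pool
  validation-timeout: 5000    # 5 s for the aliveness check (JDBC isValid)
```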
Strange Behavior Observed
The batch processes millions of records successfully, but the only place it ever fails is inside the Spring Batch metadata update:
JdbcExecutionContextDao.persistSerializedContext(...)
Using Fabric8 client → job runs fine, no EOF errors
Using io.kubernetes:client-java → EOFException always appears
There is no obvious link between the Kubernetes client library and JDBC behavior, which makes this more confusing.
Even with keep-alive settings, the connection sometimes appears to be dead at the moment Spring Batch tries to persist the execution context.
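One thing we may still try is driver-level TCP keepalive. tcpKeepAlive and socketTimeout are standard pgjdbc URL parameters, and since the YugabyteDB driver is a pgjdbc fork we assume they carry over; the host, port, and database below are placeholders:

```properties
# socketTimeout (seconds) aborts any statement that runs longer, so it must
# comfortably exceed the slowest expected query.
spring.datasource.url=jdbc:yugabytedb://<host>:5433/<db>?tcpKeepAlive=true&socketTimeout=300
```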
Stack Trace (Trimmed)
```
org.springframework.dao.DataAccessResourceFailureException:
PreparedStatementCallback; SQL [...] An I/O error occurred while sending to the backend.
Caused by: com.yugabyte.util.PSQLException: An I/O error occurred while sending to the backend.
Caused by: java.io.EOFException
    at com.yugabyte.core.PGStream.receiveChar(PGStream.java:469)
```
What We Want to Know
1. Is there a supported extension point to intercept I/O failures on metadata updates, specifically when Spring Batch internally executes JdbcExecutionContextDao.updateExecutionContext(...)?
2. Is Spring Batch reusing the same connection differently for metadata writes? Chunk processing, partition execution, and reader/writer logic all run fine; only the metadata update query fails. This is the most puzzling part. (One isolation idea we have in mind is sketched right after this list.)
3. Could thread scheduling, resource pressure, or network buffering indirectly affect HikariCP behavior?
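On question 2, one way to at least isolate the metadata writes would be a dedicated pool for the BATCH_* tables. A sketch, assuming Spring Boot's @BatchDataSource marker annotation; the app.datasource and batch.datasource property prefixes are hypothetical, and defining our own DataSource beans means Boot's auto-configured one backs off:

```java
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.boot.autoconfigure.batch.BatchDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class BatchMetadataDataSourceConfig {

    // Business pool: stays the primary DataSource for readers/writers.
    @Bean
    @Primary
    @ConfigurationProperties("app.datasource")
    public HikariDataSource appDataSource() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }

    // Dedicated pool that Spring Batch uses for its metadata tables, so it
    // can be sized, tuned, and evicted independently of the business pool.
    @Bean
    @BatchDataSource
    @ConfigurationProperties("batch.datasource")
    public HikariDataSource batchDataSource() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }
}
```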
We considered:
Catching Yugabyte PSQLException
Detecting underlying EOFException
Calling HikariCP's soft-evict to rebuild the pool
But since Spring Batch performs the metadata update inside its own transaction callback, we are unsure whether this is safe. A sketch of what we mean follows.
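A minimal sketch of that recovery path (the class and method names are ours; softEvictConnections() is HikariCP's documented pool MBean operation, which marks connections for eviction as they are returned to the pool rather than killing in-flight work):

```java
import com.zaxxer.hikari.HikariDataSource;
import java.io.EOFException;
import javax.sql.DataSource;

public final class PoolRecovery {

    // Walk the cause chain to see whether the failure bottoms out in the
    // EOFException we observe under com.yugabyte.util.PSQLException.
    static boolean causedByEof(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof EOFException) {
                return true;
            }
        }
        return false;
    }

    // If the failure looks like a dead socket, ask Hikari to rebuild the
    // pool softly: idle connections are evicted immediately, active ones
    // only once they are returned.
    static void softEvictIfEof(DataSource dataSource, Throwable failure) {
        if (causedByEof(failure) && dataSource instanceof HikariDataSource hikari) {
            hikari.getHikariPoolMXBean().softEvictConnections();
        }
    }
}
```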
Request to the Community
Any insights on the following would be greatly appreciated:
How Spring Batch handles JDBC connection reuse for metadata updates
Whether others have seen EOF failures only during execution context updates
Best practices for integrating long-running batch jobs with YugabyteDB and HikariCP
How to safely refresh pool connections while a batch is running
Thank you!