Spring Batch with Spring Boot 3.5.4 failing with EOFException with YugabyteDB #5135
rahulpoddar-fyndna asked this question in Q&A
Spring Batch + YugabyteDB + HikariCP — EOFException when updating STEP_EXECUTION_CONTEXT after long-running batch
Environment
Spring Boot: 3.5.4
Spring Batch: Partitioned job with multiple threads
Database: YugabyteDB (version 2024.2.3-b1)
Connection Pool: HikariCP
Runtime Environment: Google Kubernetes Engine (GKE)
Job Scheduling: io.kubernetes:client-java (Kubernetes client)
Alternative Tested: Fabric8 Kubernetes client
Issue Summary
We run a long-running Spring Batch job (multiple partitions, multiple threads).
After ~2 hours, the job consistently fails with:
```
PreparedStatementCallback; SQL [UPDATE batch_STEP_EXECUTION_CONTEXT SET ...];
An I/O error occurred while sending to the backend.
Caused by: java.io.EOFException
```
This corresponds to:
```sql
UPDATE batch_STEP_EXECUTION_CONTEXT
SET SHORT_CONTEXT = ?, SERIALIZED_CONTEXT = ?
WHERE STEP_EXECUTION_ID = ?
```
HikariCP settings used
```yaml
spring.datasource.hikari:
  max-lifetime: 1500000   # 25 minutes
  keepalive-time: 120000  # 2 minutes
```
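One detail worth noting: HikariCP's keepalive-time only pings connections that are sitting idle in the pool; a connection checked out for a long-running transaction receives no keepalive traffic. For reference, a fuller pool configuration sketch (the extra keys are standard Spring Boot HikariCP properties, but the values shown are illustrative assumptions, not what we currently run):

```yaml
spring.datasource.hikari:
  max-lifetime: 1500000       # 25 minutes, as above
  keepalive-time: 120000      # 2 minutes, applies to idle connections only
  maximum-pool-size: 20       # illustrative value
  connection-timeout: 30000   # 30 s to acquire a connection from the pool
  validation-timeout: 5000    # 5 s for the aliveness check (JDBC isValid)
```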
Strange Behavior Observed
The batch processes millions of records successfully, but the only place it ever fails is inside the Spring Batch metadata update:
JdbcExecutionContextDao.persistSerializedContext(...)
Using Fabric8 client → job runs fine, no EOF errors
Using io.kubernetes:client-java → EOFException always appears
There is no obvious link between the Kubernetes client library and JDBC behavior, which makes this more confusing.
Even with keep-alive settings, the connection sometimes appears to be dead at the moment Spring Batch tries to persist the execution context.
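One thing we may still try is driver-level TCP keepalive. tcpKeepAlive and socketTimeout are standard pgjdbc URL parameters, and since the YugabyteDB driver is a pgjdbc fork we assume they carry over; the host, port, and database below are placeholders:

```properties
# socketTimeout (seconds) aborts any statement that runs longer, so it must
# comfortably exceed the slowest expected query.
spring.datasource.url=jdbc:yugabytedb://<host>:5433/<db>?tcpKeepAlive=true&socketTimeout=300
```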
Stack Trace (Trimmed)
```
org.springframework.dao.DataAccessResourceFailureException:
PreparedStatementCallback; SQL [...] An I/O error occurred while sending to the backend.
Caused by: com.yugabyte.util.PSQLException: An I/O error occurred while sending to the backend.
Caused by: java.io.EOFException
    at com.yugabyte.core.PGStream.receiveChar(PGStream.java:469)
```
What We Want to Know
1. Is there a supported extension point to intercept I/O failures on metadata updates, specifically when Spring Batch internally executes JdbcExecutionContextDao.updateExecutionContext(...)?
2. Is Spring Batch reusing the same connection differently for metadata writes? Chunk processing, partition execution, and reader/writer logic all run fine; only the metadata update query fails. This is the most puzzling part. (One isolation idea we have in mind is sketched right after this list.)
3. Could thread scheduling, resource pressure, or network buffering indirectly affect HikariCP behavior?
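On question 2, one way to at least isolate the metadata writes would be a dedicated pool for the BATCH_* tables. A sketch, assuming Spring Boot's @BatchDataSource marker annotation; the app.datasource and batch.datasource property prefixes are hypothetical, and defining our own DataSource beans means Boot's auto-configured one backs off:

```java
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.boot.autoconfigure.batch.BatchDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class BatchMetadataDataSourceConfig {

    // Business pool: stays the primary DataSource for readers/writers.
    @Bean
    @Primary
    @ConfigurationProperties("app.datasource")
    public HikariDataSource appDataSource() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }

    // Dedicated pool that Spring Batch uses for its metadata tables, so it
    // can be sized, tuned, and evicted independently of the business pool.
    @Bean
    @BatchDataSource
    @ConfigurationProperties("batch.datasource")
    public HikariDataSource batchDataSource() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }
}
```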
We considered:
Catching Yugabyte PSQLException
Detecting underlying EOFException
Calling HikariCP's soft-evict to rebuild the pool
But since Spring Batch performs the metadata update inside its own transaction callback, we are unsure whether this is safe. A sketch of what we mean follows.
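A minimal sketch of that recovery path (the class and method names are ours; softEvictConnections() is HikariCP's documented pool MBean operation, which marks connections for eviction as they are returned to the pool rather than killing in-flight work):

```java
import com.zaxxer.hikari.HikariDataSource;
import java.io.EOFException;
import javax.sql.DataSource;

public final class PoolRecovery {

    // Walk the cause chain to see whether the failure bottoms out in the
    // EOFException we observe under com.yugabyte.util.PSQLException.
    static boolean causedByEof(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof EOFException) {
                return true;
            }
        }
        return false;
    }

    // If the failure looks like a dead socket, ask Hikari to rebuild the
    // pool softly: idle connections are evicted immediately, active ones
    // only once they are returned.
    static void softEvictIfEof(DataSource dataSource, Throwable failure) {
        if (causedByEof(failure) && dataSource instanceof HikariDataSource hikari) {
            hikari.getHikariPoolMXBean().softEvictConnections();
        }
    }
}
```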
Request to the Community
Any insights on the following would be greatly appreciated:
How Spring Batch handles JDBC connection reuse for metadata updates
Whether others have seen EOF failures only during execution context updates
Best practices for integrating long-running batch jobs with YugabyteDB and HikariCP
How to safely refresh pool connections while a batch is running
Thank you!