[SPARK-57274][CONNECT] Support fetch/type accessors and getMoreResults for SparkConnectStatement#56341
[SPARK-57274][CONNECT] Support fetch/type accessors and getMoreResults for SparkConnectStatement#56341j1wonpark wants to merge 5 commits into
Conversation
…s for SparkConnectStatement Signed-off-by: Jiwon Park <jpark92@outlook.kr>
| if (seconds < 0) { | ||
| throw new SQLException("Query timeout must be zero or a positive integer.") | ||
| } | ||
| queryTimeout = seconds |
There was a problem hiding this comment.
IIRC, DataGrip fails immediately when calling this method, because the JDBC API javadoc does not say it may throw SQLFeatureNotSupportedException, so the fix is indeed required.
For the implementation, I feel we should not store a dummy state, the API javadoc does not say the get* method must return the same value accepted by the set* the, so getQueryTimeout should always return 0 to reflect the real state.
So the suggestion here is: keep the checkOpen() and if (seconds < 0) check, silent drop the value, and keep existing getQueryTimeout
There was a problem hiding this comment.
Done. Removed the stored field — getQueryTimeout now always returns 0 to reflect the real state, and setQueryTimeout keeps the checkOpen()/seconds < 0 checks but silently drops the value.
| if (resultSetConcurrency != ResultSet.CONCUR_READ_ONLY) { | ||
| throw new SQLFeatureNotSupportedException( | ||
| s"ResultSet concurrency $resultSetConcurrency is not supported; " + | ||
| "only CONCUR_READ_ONLY.") |
There was a problem hiding this comment.
resultSetConcurrency is an int, you should stringify it to a readable string, see JdbcErrorUtils.stringify* methods
There was a problem hiding this comment.
Done. Added JdbcErrorUtils.stringifyResultSetConcurrency and applied it here.
| resultSetHoldability: Int): Statement = | ||
| throw new SQLFeatureNotSupportedException | ||
| resultSetHoldability: Int): Statement = { | ||
| // holdability is ignored |
There was a problem hiding this comment.
do we really need to override this method? the javadoc clearly states that SQLFeatureNotSupportedException may be thrown – if the JDBC driver does not support this method or this method is not supported for the specified result set type, result set holdability and result set concurrency.
There was a problem hiding this comment.
Done. Reverted the 3-arg createStatement(type, concurrency, holdability) to throw SQLFeatureNotSupportedException, matching the javadoc and the Hive JDBC driver. Holdability has no meaning without transaction support, so accepting it while ignoring the value was misleading.
| override def setFetchSize(rows: Int): Unit = | ||
| throw new SQLFeatureNotSupportedException | ||
| // stored as a hint; Spark Connect results are forward-only and server-paginated | ||
| override def setFetchSize(rows: Int): Unit = { |
There was a problem hiding this comment.
Done. Same treatment as get/setQueryTimeout: getFetchSize returns 0 and setFetchSize validates then silently drops the value. Also removed the now-unused DEFAULT_FETCH_SIZE.
| override def setFetchDirection(direction: Int): Unit = { | ||
| checkOpen() | ||
| if (direction != ResultSet.FETCH_FORWARD) { | ||
| throw new SQLException(s"Fetch direction $direction is not supported.") |
There was a problem hiding this comment.
ditto: stringify, and please check all places
There was a problem hiding this comment.
Done. Applied stringify* here and audited all call sites: the remaining ones (ResultSet fetch direction, Connection transaction isolation / holdability) were already using stringify*. No raw int constants are left in error messages.
…t/fetchSize state getQueryTimeout and getFetchSize now always return 0 to reflect the real state; the setters validate and silently drop the value instead of echoing it back. Remove the now-unused DEFAULT_FETCH_SIZE constant. Signed-off-by: Jiwon Park <jpark92@outlook.kr>
… in error messages Use JdbcErrorUtils.stringify* so error messages show readable names instead of raw int codes. Add stringifyResultSetConcurrency and apply it, along with the existing stringifyFetchDirection/stringifyResultSetType, to the remaining call sites in SparkConnectStatement and SparkConnectConnection. Signed-off-by: Jiwon Park <jpark92@outlook.kr>
| } | ||
|
|
||
| // SCROLL_INSENSITIVE is accepted but the returned statement is forward-only. | ||
| // Mirrors the Hive JDBC driver policy used by the Spark Thrift Server. |
There was a problem hiding this comment.
I don't think it's correct. If you look at the Hive JDBC driver implementation, you will find it has the corresponding code for SCROLL_INSENSITIVE in ResultSet impl, which is different from FORWARD_ONLY
There was a problem hiding this comment.
Good catch. Spark Connect has no server-side scrollable cursor (unlike HiveServer2's FETCH_FIRST), so the code accepted SCROLL_INSENSITIVE but silently downgraded it to forward-only, which—as you said—doesn't match Hive. I've changed checkSupportedResultSet to only allow TYPE_FORWARD_ONLY. This also makes it consistent with supportsResultSetType/supportsResultSetConcurrency, which already advertise FORWARD_ONLY + READ_ONLY only. Proper scrollable support would need client-side materialization, so I'd prefer to do it in a follow-up. Does that sound reasonable?
… with holdability Revert the 3-arg createStatement(type, concurrency, holdability) to throw SQLFeatureNotSupportedException, matching the JDBC javadoc (which permits it) and the Hive JDBC driver. Holdability has no meaning without transaction support, so accepting it while ignoring the value is misleading. Signed-off-by: Jiwon Park <jpark92@outlook.kr>
… result sets Reject TYPE_SCROLL_INSENSITIVE instead of accepting it and silently downgrading to forward-only. Spark Connect has no server-side scrollable cursor, and the previous code claimed to mirror the Hive JDBC driver while behaving differently from it (Hive backs SCROLL_INSENSITIVE with distinct ResultSet logic). Scrollable result sets can be added in a follow-up with proper client-side support. Signed-off-by: Jiwon Park <jpark92@outlook.kr>
What changes were proposed in this pull request?
Implement the
SparkConnectStatementaccessors that previously threwSQLFeatureNotSupportedException(setFetchSize/getFetchSize,setFetchDirection/getFetchDirection,getResultSetType,setQueryTimeout/getQueryTimeout), fixgetMoreResults(which threw, breaking JDBC result-drain loops), and implement thecreateStatementtype/concurrency overloads onSparkConnectConnection. Follow-up to SPARK-54108 / SPARK-54014.Fetch size / query timeout are stored as hints (Spark Connect is forward-only and server-paginated);
createStatementaccepts forward-only / scroll-insensitive and rejects updatable / scroll-sensitive, mirroring the Spark Thrift Server's Hive JDBC policy. See SPARK-57274 for details.Why are the changes needed?
JDBC client tools (e.g. DataGrip) call these methods around every query, so the throws abort the query path.
getMoreResultsthrowing in particular makes the standard drain loopwhile (getMoreResults() || getUpdateCount() != -1)error out or spin forever; DataGrip hangs on a result-less command such asUSE <db>.Does this PR introduce any user-facing change?
Yes, in the unreleased master only: methods that previously threw
SQLFeatureNotSupportedExceptionare now implemented. No existing behavior is removed.How was this patch tested?
New cases in
SparkConnectStatementSuite: accessor defaults/validation, drain-loop termination, and the typedcreateStatementoverloads.Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8)