feat(cubesql): add disable_post_processing to /v1/cubesql endpoint #11183

Open
rynmccrmck wants to merge 2 commits into cube-js:master from rynmccrmck:feat/cubesql-disable-post-processing

rynmccrmck commented Jun 30, 2026

Closes here.

Queries that require DataFusion post-processing fetch intermediate rows from the source DB capped at CUBEJS_DB_QUERY_LIMIT (default 50k). On large datasets this silently truncates the intermediate result, producing wrong aggregates.

The sql4sql (/v1/sql?format=sql) path already classifies plans as pushdown/regular/post-processing and reports the classification to the caller, but it never executes the query, so it cannot truncate. The /v1/cubesql path executes and streams results, so to guarantee accuracy it must block before execution.

This adds the same control to /v1/cubesql with a hard guarantee: if the final plan still requires post-processing after the cost model is biased against it, the request fails (the error is delivered in the streamed response body) rather than returning truncated data.
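Because the failure arrives in the streamed body rather than as an HTTP status, a client has to inspect the body to notice it. The sketch below is one way a caller might do that. It assumes the stream is newline-delimited JSON and that a failure appears as an object with an `error` field; both are assumptions about the wire format, not something this PR specifies, and `findStreamError` is a hypothetical helper name.

```typescript
// Hypothetical client-side helper, not part of this PR.
// Assumption: the /v1/cubesql body is newline-delimited JSON and an error
// is reported as an object carrying an `error` field.
function findStreamError(body: string): string | null {
  for (const line of body.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") {
      continue; // skip blank separators between chunks
    }

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // ignore partial or non-JSON chunks in this sketch
    }

    if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
      const err = (parsed as { error: unknown }).error;
      // Return the first error found; stringify non-string payloads.
      return typeof err === "string" ? err : JSON.stringify(err);
    }
  }
  return null; // no error chunk seen
}
```

A caller would treat a non-null result as a failed query and discard any rows already received from the same response.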

Changes:

  • env.ts: register CUBESQL_DISABLE_POST_PROCESSING env var (server-wide default, off by default) so all services are covered without per-request changes
  • gateway.ts: read disable_post_processing from request body, fall back to env var, pass through to execSql
  • sql-server.ts, index.ts: thread disablePostProcessing param to native
  • node_export.rs: set CUBESQL_PENALIZE_POST_PROCESSING_VAR before planning (biases the cost model toward pushdown) and, after planning, fail if the plan root is not a single CubeScan(Wrapped) node
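The request-body-then-env fallback described for gateway.ts can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `resolveDisablePostProcessing` and `CubeSqlRequestBody` are hypothetical names, and treating only an explicit boolean as an override (so that `false` in the body can turn the check off when the env default is on) is an assumption.

```typescript
// Hypothetical shape of the /v1/cubesql request body for this sketch.
type CubeSqlRequestBody = {
  query: string;
  disable_post_processing?: unknown;
};

// Hypothetical helper: decide the effective flag for one request.
// `envDefault` stands in for the parsed CUBESQL_DISABLE_POST_PROCESSING value.
function resolveDisablePostProcessing(
  body: CubeSqlRequestBody,
  envDefault: boolean,
): boolean {
  // Assumption: an explicit boolean in the body wins over the server default.
  if (typeof body.disable_post_processing === "boolean") {
    return body.disable_post_processing;
  }
  // Missing or non-boolean values fall back to the env-level default.
  return envDefault;
}
```

The resolved value is what would then be passed through to execSql as the new trailing argument.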

Tests:

  • smoke-cubesql.test.ts: error path (asserts the post-processing error is surfaced in the streamed body), push-down success, default-off baseline
  • api-gateway/test/index.test.ts: gateway unit test asserting the param is forwarded to execSql; existing /v1/cubesql call-signature assertions updated for the new trailing argument
  • backend-shared/test/env.test.ts: unit test for CUBESQL_DISABLE_POST_PROCESSING

Check List

  • Tests have been run in packages where changes have been made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

rynmccrmck requested review from a team and keydunov as code owners June 30, 2026 15:39
github-actions bot added the rust, javascript, and pr:community labels Jun 30, 2026

Development

Successfully merging this pull request may close these issues.

SQL API: option to reject queries that don't fully push down (instead of silently falling back to in-memory)