feat: add SparkPow UDF returning Infinity for pow(0, negative)#22605

Merged
comphead merged 10 commits into apache:main from Brijesh-Thakkar:spark-pow-func
May 29, 2026

@Brijesh-Thakkar
Contributor

Which issue does this PR close?

Closes #22598

Rationale for this change

In Apache Spark, the pow(base, exp) function follows IEEE 754 semantics where raising 0 (or -0.0) to a negative exponent yields positive Infinity.

Currently, DataFusion's default core PowerFunc mimics PostgreSQL behavior, throwing an explicit error ("zero raised to a negative power is undefined"). To support standard Spark compatibility without breaking core DataFusion expectations, this PR introduces a specialized SparkPow UDF inside the datafusion-spark crate.

What changes are included in this PR?

This PR introduces the following changes within the datafusion-spark integration crate:

  • Added SparkPow UDF (datafusion/spark/src/function/math/pow.rs): Overrides the Float64 execution path to evaluate base == 0.0 && exp < 0.0 as f64::INFINITY (safely catching both 0.0 and -0.0 due to IEEE 754 equality rules).
  • Decimal Delegation: Preserves correctness by delegating non-float types (like decimals) back to the standard PowerFunc, as decimals cannot represent infinity.
  • Function Registration (datafusion/spark/src/function/math/mod.rs): Registers the new pow function and establishes power as a valid alias.
  • SQL Integration Tests (datafusion/sqllogictest/test_files/spark/math/pow.slt): Updates and adds test coverage ensuring pow(0, -1), power(0, -1), and pow(0.0, -1.0) successfully return Infinity.
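The Float64 override described in the list above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the code merged in pow.rs: the name `spark_pow_f64` is hypothetical, and the sketch covers only a single pair of non-null values.

```rust
/// Hypothetical helper illustrating the Float64 rule from the PR description.
/// It is not the merged implementation.
fn spark_pow_f64(base: f64, exp: f64) -> f64 {
    // IEEE 754 equality treats 0.0 and -0.0 as equal, so a single
    // comparison against 0.0 catches both signed zeros.
    if base == 0.0 && exp < 0.0 {
        f64::INFINITY
    } else {
        // Every other input falls through to the standard power function.
        // NaN never compares equal to 0.0 or less than 0.0, so NaN inputs
        // also take this branch.
        base.powf(exp)
    }
}
```

Under this rule, both `spark_pow_f64(0.0, -1.0)` and `spark_pow_f64(-0.0, -1.0)` take the first branch and return `f64::INFINITY`.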

Are these changes tested?

Yes, the changes are covered via both unit and integration tests:

  1. Unit Tests: Added test_spark_pow_zero_negative_returns_infinity and test_spark_pow_normal_cases within pow.rs to validate the core scalar execution logic.
  2. Integration Tests: Extended datafusion/sqllogictest/test_files/spark/math/pow.slt to verify the end-to-end SQL evaluation behavior.

Are there any user-facing changes?

Yes, but only for users utilizing the datafusion-spark compatibility features. When the Spark dialect/crate is active, evaluating pow(0, <negative>) will now return Infinity instead of throwing an evaluation error. Core DataFusion behavior remains completely unchanged.

Spark returns Infinity for pow(0, <negative>) following IEEE 754,
while the DataFusion default (PowerFunc) raises an error to match
PostgreSQL behavior.

This adds SparkPow to the datafusion-spark crate which overrides
the Float64 path to explicitly return +Infinity when base == 0.0
and exp < 0.0 (covers both 0.0 and -0.0), and delegates all
decimal types to the existing PowerFunc.

Both 'pow' and 'power' aliases are covered.

Closes apache#22598
Copilot AI review requested due to automatic review settings May 28, 2026 20:42
@github-actions github-actions Bot added sqllogictest SQL Logic Tests (.slt) spark labels May 28, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a Spark-compatible pow/power implementation and updates SLT expectations to match Spark’s 0 ^ negative = Infinity behavior.

Changes:

  • Introduces SparkPow UDF that overrides DataFusion’s default pow/power semantics for 0 ^ negative.
  • Registers the new function in the Spark math module and exposes it via expr_fn.
  • Updates SLT cases to assert Spark’s Infinity results for pow(0, -1) variants.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| datafusion/sqllogictest/test_files/spark/math/pow.slt | Updates Spark SLT expectations for pow and adds 0 ^ negative coverage expecting Infinity. |
| datafusion/spark/src/function/math/pow.rs | Adds SparkPow UDF wrapping PowerFunc with Spark-specific edge-case behavior and unit tests. |
| datafusion/spark/src/function/math/mod.rs | Registers pow UDF module, exports it, and adds to the function registry. |

Comment on lines +82 to +86
    // Only Float64 needs the Spark override.
    // Decimal / integer paths are delegated to the standard PowerFunc which
    // already handles them correctly (decimal can't represent Infinity anyway).
    if !matches!(args.args[0].data_type(), DataType::Float64) {
        return self.inner.invoke_with_args(args);

Comment on lines +132 to +138
    (Some(b), Some(e)) => {
        if b == 0.0 && e < 0.0 {
            Some(f64::INFINITY)
        } else {
            Some(b.powf(e))
        }
    }

Comment on lines +110 to +114
    // ── Array path ───────────────────────────────────────────────────────
    let [base, exponent] = take_function_args(self.name(), &args.args)?;

    let base_arr: ArrayRef = base.to_array(num_rows)?;
    let exp_arr: ArrayRef = exponent.to_array(num_rows)?;

    query R
    SELECT pow(2::int, 3::int);
    ----
    8
Contributor

@comphead comphead left a comment

Thanks @Brijesh-Thakkar, this is a solid PR. Please remove the tests from pow.rs; they are repetitive.

For pow.slt, let's have more double edge cases, specifically: nulls, NaNs, -0, +0, -Inf, +Inf.

Spark returns Infinity for pow(0, <negative>) following IEEE 754,
while the DataFusion default (PowerFunc) raises an error to match
PostgreSQL behavior.

This adds SparkPow to the datafusion-spark crate which overrides
the Float64 path to return +Infinity when base == 0.0 and exp < 0.0
(covers both 0.0 and -0.0). All decimal types delegate to PowerFunc.

Both 'pow' and 'power' aliases are covered.

Adds sqllogictest edge cases for: nulls, NaN, signed zeros (-0/+0),
and signed infinities (-Inf/+Inf) including array and mixed paths.

Closes apache#22598
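The edge cases listed in this commit message (nulls, NaN, signed zeros, signed infinities) can be exercised against a small element-wise sketch. This is an illustration only: `spark_pow_nullable` is a hypothetical name, plain slices of `Option<f64>` stand in for Arrow arrays, and the rule that a null on either side yields a null result is an assumption about the merged code rather than something stated in the PR.

```rust
/// Hypothetical element-wise kernel; slices of Option<f64> stand in for
/// nullable Float64 arrays. Not the merged implementation.
fn spark_pow_nullable(bases: &[Option<f64>], exps: &[Option<f64>]) -> Vec<Option<f64>> {
    bases
        .iter()
        .zip(exps.iter())
        .map(|(base, exp)| match (base, exp) {
            (Some(base), Some(exp)) => {
                // Same rule as the scalar case: a zero base (either sign)
                // with a negative exponent maps to +Infinity.
                if *base == 0.0 && *exp < 0.0 {
                    Some(f64::INFINITY)
                } else {
                    Some(base.powf(*exp))
                }
            }
            // Assumption: a null on either side produces a null result.
            _ => None,
        })
        .collect()
}
```

Because `f64::NEG_INFINITY < 0.0` holds, a zero base raised to negative infinity also takes the override branch in this sketch, while a NaN base falls through to `powf`.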
@Brijesh-Thakkar
Contributor Author

> Thanks @Brijesh-Thakkar this a solid PR, please remove tests from pow.rs, those are repetitive.
>
> for the pow.slt lets have more double edgecases, specifically, nulls, nans, -0, +0, -inf, +Inf

@comphead I removed the tests from pow.rs as you suggested and added more edge cases.
Thank you

] = args.args.as_slice()
{
// b and e are &Option<f64>; Option<f64> is Copy.
let result = (*b).zip(*e).map(|(b, e)| {
Contributor

Let's call these base and exp instead of b and e.

let result: Float64Array = base_f64
.iter()
.zip(exp_f64.iter())
.map(|(b, e)| match (b, e) {
Contributor

same
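For the scalar path quoted in the first of these two comments, the requested renaming might look like the sketch below. The function name `spark_pow_scalar` is hypothetical and the body is a simplified stand-in for the merged code. It relies on `Option::zip`, which returns `Some` only when both operands are `Some`.

```rust
/// Hypothetical scalar-path sketch using the names requested in review
/// (`base`, `exp`). Not the merged implementation.
fn spark_pow_scalar(base: Option<f64>, exp: Option<f64>) -> Option<f64> {
    // Option::zip pairs the two values, or yields None if either is None.
    base.zip(exp).map(|(base, exp)| {
        if base == 0.0 && exp < 0.0 {
            f64::INFINITY
        } else {
            base.powf(exp)
        }
    })
}
```

Using descriptive names inside the closure keeps the zero check readable without a comment explaining which operand is which.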

@Brijesh-Thakkar
Contributor Author

@comphead I will fix this and commit the changes.
Thanks for the review

@Brijesh-Thakkar
Contributor Author

@comphead I have made the changes you suggested and committed them.
Thanks

@comphead comphead enabled auto-merge May 29, 2026 19:01
@Brijesh-Thakkar
Contributor Author

@comphead All requested changes have been addressed and checks are passing. Could you please approve the PR when you get a chance? Thanks!

Contributor

@comphead comphead left a comment

Thanks @Brijesh-Thakkar lgtm!

@comphead comphead added this pull request to the merge queue May 29, 2026
Merged via the queue into apache:main with commit d8c4588 May 29, 2026
35 checks passed
Labels

spark sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Implement Spark pow function

3 participants