Implement native function of pow/power #1513

@slfan1989

Description

In Apache Spark, both pow and power are implemented as the same function.
In FunctionRegistry.scala, the two names are registered to the same expression class:

expression[Pow]("pow", true),
expression[Pow]("power")

The corresponding expression class Pow extends BinaryMathExpression and internally calls
java.lang.StrictMath.pow(left, right):

case class Pow(left: Expression, right: Expression)
  extends BinaryMathExpression(StrictMath.pow, "POWER") {
  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    defineCodeGen(ctx, ev, (c1, c2) => s"java.lang.StrictMath.pow($c1, $c2)")
  }
  override protected def withNewChildrenInternal(
    newLeft: Expression, newRight: Expression): Expression = copy(left = newLeft, right = newRight)
}

StrictMath.pow follows Java’s standard semantics: it takes two double arguments and always returns a double.
Spark implicitly casts other numeric inputs to double, so mixed numeric types are accepted:

SELECT pow(2, 3.0);  -- valid
SELECT pow(2.0, 3);  -- valid

Difference from DataFusion

In DataFusion, the built-in PowerFunc only supports limited type signatures:

(Int64, Int64)   → Int64
(Float64, Float64) → Float64

This narrow typing cannot match Spark’s behavior directly,
because Spark accepts any mix of integer and floating-point arguments.
We therefore implemented a native Spark-compatible function called spark_pow.

Design

To stay close to Spark’s semantics and simplify the implementation:

  • All supported numeric inputs (Short, Int, Long, Float, Double)
    are converted internally to Float64 (f64 in Rust) for computation.

  • This matches Spark’s approach of using double precision for StrictMath.pow.

  • Integer exponents use powi() for better performance and stability,
    while floating-point exponents use powf().

  • Special cases follow Spark’s semantics:

    • 0 ** (negative exponent) → +∞
    • Nulls propagate

  • For arrays, both sides are converted element-wise to Option<f64>
    and computed in a vectorized manner to return a Float64Array.
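
The design points above can be sketched in plain Rust using only the standard library. This is an illustration under stated assumptions, not the actual spark_pow code: the Num enum and the functions spark_pow_scalar and spark_pow_array are hypothetical names, and a Vec<Option<f64>> stands in for an Arrow Float64Array. It assumes that "integer exponent" means an integer-typed exponent that fits in the i32 accepted by powi.

```rust
/// Hypothetical stand-in for the supported Spark numeric input types.
#[derive(Clone, Copy, Debug)]
enum Num {
    Short(i16),
    Int(i32),
    Long(i64),
    Float(f32),
    Double(f64),
}

impl Num {
    /// Widen any supported input to f64 before computing.
    fn to_f64(self) -> f64 {
        match self {
            Num::Short(v) => f64::from(v),
            Num::Int(v) => f64::from(v),
            Num::Long(v) => v as f64, // may round for magnitudes above 2^53
            Num::Float(v) => f64::from(v),
            Num::Double(v) => v,
        }
    }

    /// Some(n) when the value is integer-typed and fits powi's i32 exponent.
    fn as_powi_exp(self) -> Option<i32> {
        match self {
            Num::Short(v) => Some(i32::from(v)),
            Num::Int(v) => Some(v),
            Num::Long(v) => i32::try_from(v).ok(),
            Num::Float(_) | Num::Double(_) => None,
        }
    }
}

/// Scalar power: a None (SQL NULL) in either operand yields None.
fn spark_pow_scalar(base: Option<Num>, exp: Option<Num>) -> Option<f64> {
    let (b, e) = (base?, exp?);
    let b = b.to_f64();
    Some(match e.as_powi_exp() {
        Some(n) => b.powi(n),       // integer exponent
        None => b.powf(e.to_f64()), // floating-point (or oversized) exponent
    })
}

/// Element-wise power over two columns; stops at the shorter one.
fn spark_pow_array(bases: &[Option<Num>], exps: &[Option<Num>]) -> Vec<Option<f64>> {
    bases
        .iter()
        .zip(exps)
        .map(|(b, e)| spark_pow_scalar(*b, *e))
        .collect()
}
```

With this sketch, spark_pow_scalar(Some(Num::Int(2)), Some(Num::Double(3.0))) takes the powf path and spark_pow_scalar(Some(Num::Double(2.0)), Some(Num::Int(3))) takes the powi path. Both return a Double, regardless of the input types.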

Result

spark_pow provides a type-compatible and behavior-compatible replacement for Spark’s pow / power,
enabling mixed numeric expressions like:

SELECT spark_pow(2, 3.0);   -- returns 8.0
SELECT spark_pow(2.0, 3);   -- returns 8.0
SELECT spark_pow(1.5, 2);   -- returns 2.25

while keeping the code path concise and close to StrictMath.pow semantics through f64 unification.
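
One point worth checking when aiming for StrictMath.pow semantics: Java documents a few special cases that differ from the C pow conventions that Rust's powf typically inherits. Java returns NaN for pow(1, NaN) and for pow(±1, ±∞), whereas C's pow returns 1 in both cases. The issue does not show whether spark_pow handles these cases. The sketch below uses a hypothetical helper, strict_pow_like, to show one way such guards could be written.

```rust
/// Hypothetical wrapper: apply Java's documented special cases first,
/// then defer to Rust's powf for everything else.
fn strict_pow_like(base: f64, exp: f64) -> f64 {
    if exp == 0.0 {
        // Java: a zero exponent (+0.0 or -0.0) gives 1.0, even for a NaN base.
        return 1.0;
    }
    if base.is_nan() || exp.is_nan() {
        // Java: any remaining NaN operand gives NaN (C's pow(1, NaN) gives 1).
        return f64::NAN;
    }
    if base.abs() == 1.0 && exp.is_infinite() {
        // Java: |base| == 1 with an infinite exponent gives NaN (C gives 1).
        return f64::NAN;
    }
    base.powf(exp)
}
```

The guards only cover the cases where the two specifications are known to disagree. Ordinary inputs such as strict_pow_like(2.0, 3.0) go straight to powf.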
