Description
In Apache Spark, both pow and power are implemented as the same function.
In FunctionRegistry.scala, the two names are registered to the same expression class:
expression[Pow]("pow", true),
expression[Pow]("power")
The corresponding expression class Pow extends BinaryMathExpression and internally calls
java.lang.StrictMath.pow(left, right):
case class Pow(left: Expression, right: Expression)
  extends BinaryMathExpression(StrictMath.pow, "POWER") {
  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    defineCodeGen(ctx, ev, (c1, c2) => s"java.lang.StrictMath.pow($c1, $c2)")
  }
  override protected def withNewChildrenInternal(
      newLeft: Expression, newRight: Expression): Expression = copy(left = newLeft, right = newRight)
}
StrictMath.pow follows Java’s standard semantics and always returns a double.
Because BinaryMathExpression implicitly casts both arguments to double, mixed numeric types are accepted, such as:
SELECT pow(2, 3.0); -- valid
SELECT pow(2.0, 3); -- valid
Difference from DataFusion
In DataFusion, the built-in PowerFunc only supports limited type signatures:
(Int64, Int64) → Int64
(Float64, Float64) → Float64
This narrow typing makes it impossible to match Spark’s flexible behavior directly,
as Spark allows cross-type arithmetic between integers and floating types.
Therefore, we implemented a native Spark-compatible function called spark_pow.
Design
To stay close to Spark’s semantics and simplify the implementation:
- All supported numeric inputs (Short, Int, Long, Float, Double) are converted internally to Float64 (f64 in Rust) for computation. This matches Spark’s approach of using double precision for StrictMath.pow.
- Integer exponents use powi() for better performance and stability, while floating-point exponents use powf().
- Special cases follow Spark’s semantics:
  - 0 ** negative → +∞
  - Nulls propagate
- For arrays, both sides are converted element-wise to Option<f64> and computed in a vectorized manner to return a Float64Array.
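The dispatch rules above can be sketched in plain Rust. This is a minimal illustration, not the actual spark_pow code: the names pow_f64 and pow_column are hypothetical, Option<f64> stands in for a nullable value, and Vec<Option<f64>> stands in for an Arrow Float64Array. The sketch also assumes that "integer exponent" means a value with no fractional part that fits in an i32, since that is the exponent type powi() accepts.

```rust
/// Hypothetical scalar kernel. Both operands are assumed to be widened to
/// f64 already; `None` stands in for SQL NULL.
fn pow_f64(base: Option<f64>, exp: Option<f64>) -> Option<f64> {
    // Nulls propagate: if either side is None, the result is None.
    let (b, e) = (base?, exp?);

    // Assumption: an exponent with no fractional part that fits in i32
    // takes the powi() path; NaN and infinite exponents fall through to powf().
    let is_small_int = e.fract() == 0.0 && e.abs() <= i32::MAX as f64;

    Some(if is_small_int {
        b.powi(e as i32)
    } else {
        b.powf(e)
    })
}

/// Hypothetical element-wise kernel over two equal-length columns.
/// Vec<Option<f64>> is a stand-in for an Arrow Float64Array.
fn pow_column(bases: &[Option<f64>], exps: &[Option<f64>]) -> Vec<Option<f64>> {
    bases
        .iter()
        .zip(exps)
        .map(|(b, e)| pow_f64(*b, *e))
        .collect()
}
```

Under these assumptions, a zero base with a negative exponent yields positive infinity through ordinary IEEE 754 arithmetic, so the special case needs no explicit branch. Note that powi() may round slightly differently from powf() for large exponents, so a real implementation would need to check results against StrictMath.pow.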
Result
spark_pow provides a type-compatible and behavior-compatible replacement for Spark’s pow / power,
enabling mixed numeric expressions like:
SELECT spark_pow(2, 3.0); -- returns 8.0
SELECT spark_pow(2.0, 3); -- returns 8.0
SELECT spark_pow(1.5, 2); -- returns 2.25
while keeping the code path concise and close to StrictMath.pow semantics through f64 unification.
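The f64 unification behind these mixed-type calls might look roughly like the following. The Num enum and the pow_mixed function are hypothetical names introduced only for illustration. The real function operates on DataFusion values rather than a hand-written enum, and this sketch always uses powf() to keep the focus on type widening.

```rust
/// Hypothetical input value covering the numeric types listed in the design.
#[derive(Debug, Clone, Copy)]
enum Num {
    Short(i16),
    Int(i32),
    Long(i64),
    Float(f32),
    Double(f64),
}

impl Num {
    /// Widen any supported numeric type to f64.
    /// The i64 conversion is lossy for magnitudes above 2^53.
    fn to_f64(self) -> f64 {
        match self {
            Num::Short(v) => f64::from(v),
            Num::Int(v) => f64::from(v),
            Num::Long(v) => v as f64,
            Num::Float(v) => f64::from(v),
            Num::Double(v) => v,
        }
    }
}

/// Mixed-type power: unify both sides to f64 first, then compute,
/// so (Int, Double) and (Double, Int) take the same code path.
fn pow_mixed(base: Num, exp: Num) -> f64 {
    base.to_f64().powf(exp.to_f64())
}
```

With this shape, pow_mixed(Num::Int(2), Num::Double(3.0)) and pow_mixed(Num::Double(2.0), Num::Int(3)) go through the same computation, which is the property the SQL examples above rely on.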