Fix java.lang.NoClassDefFoundError: io/netty/buffer/Unpooled when running with shaded-spark #1597

@ShreyeshArangath

Describe the bug

In production, we run a shaded build of Spark that relocates all of Spark's dependencies, including Netty. When running one of our production jobs with Auron, we hit the error below. The cause is that Auron relies on the Netty classes provided by Spark under their original io.netty package instead of bundling and shading Netty within its own namespace, so io/netty/buffer/Unpooled cannot be found at runtime.

java.lang.RuntimeException: poll record batch error: Execution error: native execution panics: Execution error: Execution error: output_with_sender[Shuffle] error: Execution error: output_with_sender[Shuffle]: output() returns error: shuffle: executing insert_batch() error
caused by
IO error: External error: Java exception thrown at native-engine/datafusion-ext-plans/src/memmgr/spill.rs:235: java.lang.NoClassDefFoundError: io/netty/buffer/Unpooled
	at org.apache.spark.sql.auron.JniBridge.nextBatch(Native Method)
	at org.apache.spark.sql.auron.AuronCallNativeWrapper$$anon$1.hasNext(AuronCallNativeWrapper.scala:86)
	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at org.apache.spark.util.CompletionIterator.foreach(CompletionIterator.scala:25)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
	at scala.collection.TraversableOnce.to(TraversableOnce.scala:348)
	at scala.collection.TraversableOnce.to$(TraversableOnce.scala:346)
	at org.apache.spark.util.CompletionIterator.to(CompletionIterator.scala:25)
	at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:340)
	at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:340)
	at org.apache.spark.util.CompletionIterator.toBuffer(CompletionIterator.scala:25)
	at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:327)
	at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:321)
	at org.apache.spark.util.CompletionIterator.toArray(CompletionIterator.scala:25)
	at org.apache.spark.sql.execution.auron.shuffle.AuronShuffleWriterBase.nativeShuffleWrite(AuronShuffleWriterBase.scala:68)
	at org.apache.spark.sql.execution.auron.plan.NativeShuffleExchangeExec$$anon$1.write(NativeShuffleExchangeExec.scala:157)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:531)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1575)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

To Reproduce

  1. Use a shaded Spark build that relocates Netty
  2. Run a production job with Auron (the failure above occurred during a native shuffle write)
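
To confirm the classpath condition behind step 1 before running a full job, a small probe along these lines may help. It is only a sketch: the class name NettyClasspathProbe and the relocated package org.example.shaded.io.netty are hypothetical placeholders (substitute whatever prefix your shaded Spark build actually uses), and it needs to run in a JVM with the same classpath as the failing executor to be meaningful.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NettyClasspathProbe {

    /** Returns true if the class can be located by the loader (the class is not initialized). */
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or present but unlinkable: both are treated as "not usable".
            return false;
        }
    }

    /** Probes each class name in order and records whether it is loadable. */
    public static Map<String, Boolean> probe(ClassLoader loader, String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isLoadable(name, loader));
        }
        return result;
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        probe(loader,
              // The class named in the NoClassDefFoundError above.
              "io.netty.buffer.Unpooled",
              // Hypothetical relocated name; replace with your build's shading prefix.
              "org.example.shaded.io.netty.buffer.Unpooled")
            .forEach((name, found) ->
                System.out.println(name + " -> " + (found ? "found" : "MISSING")));
    }
}
```

If the unrelocated name is reported as missing while the relocated one is found, the environment matches the setup described in this issue.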

Expected behavior

Auron should ideally bring in its own dependencies (Netty, in this case) and shade them at build time.

Screenshots

N/A

Additional context

N/A
