Unsupported MapType in RoundRobinPartitioning leads to shuffle execution failure #1100

@merrily01

Describe the bug

MapType is not supported in RoundRobinPartitioning. Related to #1024.

When using repartition(...) on a column (or nested structure) containing a MapType, the query fails with:

Not yet implemented: not yet implemented: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)

This error is triggered during ShuffleExchangeExec planning when the output partitioning is RoundRobinPartitioning and one or more output attributes are (or contain) MapType.

Currently, there is no early validation or warning about this limitation, which leads to unexpected runtime failures.

To Reproduce

  1. Create a table with a MapType column:

       CREATE TABLE t_map USING parquet AS
       SELECT map('a', '1', 'b', '2') AS data_map;

  2. Run a query with repartition:

       SELECT /*+ repartition(10) */ data_map FROM t_map;

  3. Observe the following error:

       Not yet implemented: not yet implemented: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)

This issue also occurs with nested structures like StructType or ArrayType that contain MapType.

Expected behavior

Ideally, either:

  • Blaze should support MapType in RoundRobinPartitioning, or

  • The framework should fail the conversion and fall back early, with a clear and specific message such as:

WARN BlazeConverters: Falling back exec: ShuffleExchangeExec: assertion failed: Unsupported type in RoundRobinPartitioning(10): 'MapType(StringType,StringType,true)'
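
The early check described above can be sketched as a recursive walk over the output types. This is only an illustration in Python using a tiny stand-in type model, not Spark's or Blaze's real classes; the names `find_map_type` and `check_round_robin_supported` are hypothetical, and the real check would presumably live in the Scala conversion code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


# Stand-in type model for illustration only (NOT Spark's DataType classes).
@dataclass(frozen=True)
class DataType:
    pass


@dataclass(frozen=True)
class StringType(DataType):
    pass


@dataclass(frozen=True)
class ArrayType(DataType):
    element: DataType


@dataclass(frozen=True)
class MapType(DataType):
    key: DataType
    value: DataType


@dataclass(frozen=True)
class StructType(DataType):
    fields: Tuple[Tuple[str, DataType], ...]


def find_map_type(dt: DataType) -> Optional[MapType]:
    """Return the first MapType found in dt (at any nesting depth), else None."""
    if isinstance(dt, MapType):
        return dt
    if isinstance(dt, ArrayType):
        return find_map_type(dt.element)
    if isinstance(dt, StructType):
        for _name, field_type in dt.fields:
            found = find_map_type(field_type)
            if found is not None:
                return found
    return None


def check_round_robin_supported(output_types: Iterable[DataType],
                                num_partitions: int) -> None:
    """Hypothetical validation: raise before execution if any output type
    is, or contains, a MapType."""
    for dt in output_types:
        found = find_map_type(dt)
        if found is not None:
            raise ValueError(
                f"Unsupported type in RoundRobinPartitioning({num_partitions}): {found!r}"
            )
```

Calling `check_round_robin_supported` at conversion time would let the caller catch the error and fall back, rather than hitting the "not yet implemented" failure during shuffle execution.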

Additional context

Add test cases that demonstrate failure scenarios including:

  • Top-level MapType

  • MapType with ArrayType as value

  • StructType with a MapType field

  • ArrayType of MapType

  • Nested MapType (Map as value in a Map)

  • MapType with StructType as value

  • ArrayType of StructType containing MapType
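
As a starting point for those test cases, the sketch below generates one pair of repro statements per scenario, following the same create-then-repartition steps as the reproduction above. The scenario names, table names, column name, and the helper `repro_statements` are assumptions made for illustration; the expressions use the Spark SQL functions `map`, `array`, and `named_struct`, and whether each one hits the same failure has not been verified here.

```python
# One Spark SQL expression per scenario listed above (names are illustrative).
SCENARIOS = {
    "top_level_map": "map('a', '1')",
    "map_with_array_value": "map('a', array(1, 2))",
    "struct_with_map_field": "named_struct('m', map('a', '1'))",
    "array_of_map": "array(map('a', '1'))",
    "nested_map": "map('a', map('b', '1'))",
    "map_with_struct_value": "map('a', named_struct('x', 1))",
    "array_of_struct_with_map": "array(named_struct('m', map('a', '1')))",
}


def repro_statements(name, expr, partitions=10):
    """Build the CREATE TABLE and repartition query for one scenario."""
    table = f"t_{name}"  # hypothetical table naming scheme
    create = f"CREATE TABLE {table} USING parquet AS SELECT {expr} AS data_col"
    query = f"SELECT /*+ repartition({partitions}) */ data_col FROM {table}"
    return create, query


if __name__ == "__main__":
    for scenario_name, expression in SCENARIOS.items():
        for statement in repro_statements(scenario_name, expression):
            print(statement + ";")
```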
