Description
Describe the bug
MapType is not supported in RoundRobinPartitioning. Related to #1024.
When repartition(...) is applied to output that includes a MapType column (or a nested structure containing a MapType), the query fails with:
Not yet implemented: not yet implemented: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)
This error is triggered during ShuffleExchangeExec planning when the output partitioning is RoundRobinPartitioning and one or more output attributes are (or contain) MapType.
Currently, there is no early validation or warning about this limitation, which leads to unexpected runtime failures.
To Reproduce
- Create a table with a MapType column:
CREATE TABLE t_map USING parquet AS
SELECT map('a', '1', 'b', '2') AS data_map;
- Run a query with repartition:
SELECT /*+ repartition(10) */ data_map FROM t_map;
- Observe the following error:
Not yet implemented: not yet implemented: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)
This issue also occurs with nested structures like StructType or ArrayType that contain MapType.
Expected behavior
Ideally, either:
- Blaze should support MapType in RoundRobinPartitioning, or
- The framework should fail the conversion and fall back early with a clear and specific message such as:
WARN BlazeConverters: Falling back exec: ShuffleExchangeExec: assertion failed: Unsupported type in RoundRobinPartitioning(10): 'MapType(StringType,StringType,true)'
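The early check in the second option amounts to a recursive walk over each output type. Below is a minimal, language-neutral sketch in Python. The type classes and the function names `find_map` and `check_round_robin_supported` are hypothetical stand-ins for illustration, not Spark or Blaze classes, and the sketch assumes that any type which is or contains a map should be rejected.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Stand-in type model for illustration only; not Spark or Blaze classes.
@dataclass(frozen=True)
class StringType:
    pass

@dataclass(frozen=True)
class ArrayType:
    element: object

@dataclass(frozen=True)
class MapType:
    key: object
    value: object

@dataclass(frozen=True)
class StructType:
    fields: Tuple[Tuple[str, object], ...]  # (field name, field type) pairs

def find_map(dt) -> Optional[MapType]:
    """Return the first MapType found in dt (depth-first), or None."""
    if isinstance(dt, MapType):
        return dt
    if isinstance(dt, ArrayType):
        return find_map(dt.element)
    if isinstance(dt, StructType):
        for _, field_type in dt.fields:
            found = find_map(field_type)
            if found is not None:
                return found
    return None

def check_round_robin_supported(output_types, num_partitions):
    """Raise before execution if any output type is or contains a map."""
    for dt in output_types:
        found = find_map(dt)
        if found is not None:
            raise ValueError(
                f"Unsupported type in RoundRobinPartitioning({num_partitions}): {found!r}"
            )
```

Raising at planning time is what would let the converter log the failure and fall back, rather than surfacing the native "not yet implemented" error at runtime.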
Additional context
Add test cases that demonstrate failure scenarios including:
- Top-level MapType
- MapType with ArrayType as value
- StructType with a MapType field
- ArrayType of MapType
- Nested MapType (Map as value in a Map)
- MapType with StructType as value
- ArrayType of StructType containing MapType
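
As a starting point for these cases, the sketch below generates a pair of Spark SQL statements per scenario, following the same pattern as the reproduction steps. The expressions use the Spark SQL built-ins map, array, and named_struct. The case names, the table prefix `t_`, the column name `data_col`, and the helper `repro_statements` are hypothetical, and the statements have not been run against Blaze.

```python
# One Spark SQL expression per scenario listed above.
CASES = {
    "top_level_map": "map('a', '1')",
    "map_with_array_value": "map('a', array('1', '2'))",
    "struct_with_map_field": "named_struct('m', map('a', '1'))",
    "array_of_map": "array(map('a', '1'))",
    "nested_map": "map('a', map('b', '1'))",
    "map_with_struct_value": "map('a', named_struct('x', '1'))",
    "array_of_struct_with_map": "array(named_struct('m', map('a', '1')))",
}

def repro_statements(name, partitions=10):
    """Return [CREATE TABLE ..., SELECT with repartition hint] for one case."""
    table = f"t_{name}"  # hypothetical table name
    expr = CASES[name]
    return [
        f"CREATE TABLE {table} USING parquet AS SELECT {expr} AS data_col",
        f"SELECT /*+ repartition({partitions}) */ data_col FROM {table}",
    ]

if __name__ == "__main__":
    for case in CASES:
        for stmt in repro_statements(case):
            print(stmt + ";")
```

Each generated SELECT is expected to hit the same code path as the original reproduction, so a test could assert either correct results (if MapType becomes supported) or a clean fallback.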