
Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema #683

Merged
richox merged 2 commits into apache:master from wForget:BLAZE-682
Dec 6, 2024

wForget (Member) commented on Dec 5, 2024

Which issue does this PR close?

Closes #682

Rationale for this change

Native schema of BroadcastJoin cannot be empty.

https://github.com/kwai/blaze/blob/b662097e7379c4a5912706d58562658b02274335/native-engine/datafusion-ext-plans/src/joins/bhj/full_join.rs#L126

Are there any user-facing changes?

richox (Contributor) commented on Dec 6, 2024

Thank you for reporting this bug. We can fix it by specifying the number of rows of the batch in the code:
https://github.com/kwai/blaze/blob/b662097e7379c4a5912706d58562658b02274335/native-engine/datafusion-ext-plans/src/joins/join_hash_map.rs#L375
Replace RecordBatch::try_new with RecordBatch::try_new_with_options, so we don't need to fall back for this operator.

wForget (Member, Author) commented on Dec 6, 2024

> Thank you for reporting this bug. We can fix it by specifying the number of rows of the batch in the code:
>
> https://github.com/kwai/blaze/blob/b662097e7379c4a5912706d58562658b02274335/native-engine/datafusion-ext-plans/src/joins/join_hash_map.rs#L375
>
> Replace RecordBatch::try_new with RecordBatch::try_new_with_options, so we don't need to fall back for this operator.

Creating a new RecordBatch with options does not seem to solve this issue: the columns will still be empty, because the schema fields are used as the projection.


Sorry, I have just noticed that RecordBatch seems to allow the columns to be empty. I will try that.

wForget changed the title from "Fallback when BroadcastJoin output is empty" to "Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema" on Dec 6, 2024
richox merged commit 44ad0c3 into apache:master on Dec 6, 2024
richox mentioned this pull request on Dec 6, 2024
Development

Successfully merging this pull request may close these issues.

Error when bnlj output is empty