support nulleq in tiflash#10726
support nulleq in tiflash#10726ti-chi-bot[bot] merged 10 commits intopingcap:feature/release-8.5-materialized-viewfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a TiDB null-safe equality function ("tidbNullEQ"), maps multiple NullEQ scalar signatures (including VectorFloat32) to that function, updates DeltaMerge filter-parser to treat NullEQ with NULL literal as is-null, and adds unit tests in Coprocessor, Functions, and DeltaMerge.
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds support for TiDB’s null-safe equality (<=>) in TiFlash by introducing a dedicated tidbNullEQ function, wiring TiDB DAG signatures to it, and extending DM rough-set filter parsing to treat <=> NULL as isnull(col).
Changes:
- Add
tidbNullEQcomparison function with correct NULL semantics and collation forwarding. - Map
tipb::ScalarFuncSig::NullEQ*totidbNullEQin DAG function name resolution and toEqualin DM rough-set filter parsing (with<=> NULLlowered toisnull). - Add unit tests covering function behavior, DAG mapping, and DM filter parsing cases.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| dbms/src/Storages/DeltaMerge/tests/gtest_dm_filter_parser_nulleq.cpp | Adds DM filter parser unit tests for <=> lowering to equal / isnull. |
| dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp | Adds <=> NULL handling in rough-set compare parsing; maps NullEQ* sigs to Equal. |
| dbms/src/Functions/tests/gtest_tidb_null_eq.cpp | Adds unit tests validating tidbNullEQ semantics (nullable/const/null-only/collation). |
| dbms/src/Functions/FunctionsComparison.cpp | Implements and registers new tidbNullEQ function. |
| dbms/src/Flash/Coprocessor/tests/gtest_tidb_null_eq_func.cpp | Adds unit tests ensuring DAGUtils maps NullEQ* sigs to tidbNullEQ. |
| dbms/src/Flash/Coprocessor/DAGUtils.cpp | Wires NullEQ* signatures to function name tidbNullEQ. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ((expr.sig() == tipb::ScalarFuncSig::NullEQInt || expr.sig() == tipb::ScalarFuncSig::NullEQReal | ||
| || expr.sig() == tipb::ScalarFuncSig::NullEQString || expr.sig() == tipb::ScalarFuncSig::NullEQDecimal | ||
| || expr.sig() == tipb::ScalarFuncSig::NullEQTime || expr.sig() == tipb::ScalarFuncSig::NullEQDuration | ||
| || expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32) |
There was a problem hiding this comment.
parseTiCompareExpr special-cases <=> NULL to isnull(col) for several NullEQ* signatures, but NullEQJson is mapped to RSFilterType::Equal below and is not included in this signature list. If a NullEQJson comparison against NULL ever reaches this code path, it would incorrectly produce equal(col, NULL) instead of isnull(col). Consider including tipb::ScalarFuncSig::NullEQJson in this check (or refactor to a shared helper for all NullEQ* sigs).
| || expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32) | |
| || expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32 | |
| || expr.sig() == tipb::ScalarFuncSig::NullEQJson) |
| /// - both NULL => 1 | ||
| /// - one NULL => 0 | ||
| /// - no NULL => equals result | ||
| auto eq_mutable = (*std::move(eq_col)).mutate(); |
There was a problem hiding this comment.
auto eq_mutable = (*std::move(eq_col)).mutate(); uses std::move on eq_col even though it isn't being moved-from (no move-construction happens). This is a bit misleading and can be simplified to avoid confusion about eq_col's lifetime/ownership.
| auto eq_mutable = (*std::move(eq_col)).mutate(); | |
| auto eq_mutable = eq_col->mutate(); |
There was a problem hiding this comment.
Actually, the mutate() can only be called on rvalue
a393697 to
80b0894
Compare
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
80b0894 to
af5acc1
Compare
|
@pantheon-bot review |
|
Review failed due to infrastructure/execution failure after retries. Please re-trigger review. ℹ️ Learn more details on Pantheon AI. |
|
@pantheon-bot review |
|
Review failed due to infrastructure/execution failure after retries. Please re-trigger review. ℹ️ Learn more details on Pantheon AI. |
|
|
||
| /// Use equals to validate that the input types are comparable. | ||
| /// Always return non-nullable UInt8 because `NULL <=> x` is always true/false (not NULL). | ||
| FunctionEquals().getReturnTypeImpl({removeNullable(arguments[0]), removeNullable(arguments[1])}); |
There was a problem hiding this comment.
Maybe we can directly return at L65 as FunctionEquals().getReturnTypeImpl always returns DataTypeUInt8.
There was a problem hiding this comment.
It also does some input types validation in FunctionEquals().getReturnTypeImpl(), so I think it's better to keep it.
|
@pantheon-bot review this pull request again |
|
Hi @Mini256, You're not in the allow list for this repository. Only users in the allow list can trigger Pantheon here. Please contact the repo maintainer if you need access. |
|
@pantheon-bot review this pull request again |
|
Review Complete Findings: 1 issues ℹ️ Learn more details on Pantheon AI. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gengliqi, xzhangxian1008 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
7d306f9
into
pingcap:feature/release-8.5-materialized-view
| //{tipb::ScalarFuncSig::NullEQTime, "cast"}, | ||
| //{tipb::ScalarFuncSig::NullEQDuration, "cast"}, | ||
| //{tipb::ScalarFuncSig::NullEQJson, "cast"}, | ||
| {tipb::ScalarFuncSig::NullEQInt, FilterParser::RSFilterType::Equal}, |
There was a problem hiding this comment.
[P1] DeltaMerge rough-set filter incorrectly prunes packs for NOT (col <=> non-null_literal)
Why: <=> is NULL-safe (never returns NULL), so for nullable col and non-NULL literal k, NOT (col <=> k) must match col IS NULL rows. However, the rough-set path lowers col <=> k to the same RS operator as col = k, which can produce AllNull for packs containing both {k} and NULL. Then NOT flips that to NoneNull, causing the pack to be skipped entirely and silently dropping matching NULL rows under default rough-set filtering.
Evidence:
- Null-safe equality mapped to
Equalat FilterParser.cpp:592, lowered tocreateEqual(...)for non-NULL literals (line 214) Equalcan returnAllwhenmin==max==k(RoughCheck.h:42), and null-mark forcesAllNullviaaddNullIfHasNull(MinMaxIndex.cpp:402, 603)NOTapplied as!RSResult(Not.h:31), and!flipsAll -> Nonewhile preservinghas_null(RSResult.cpp:26, RSResult.h:69), soAllNull -> NoneNull- Packs with
None/NoneNullare skipped (RSResult.h:63), DMFileReader advances past them (DMFileReader.cpp:115) - Reachable in default config:
dt_enable_rough_set_filterdefaults totrue(Settings.h:176)
There was a problem hiding this comment.
@pantheon-ai Can you construct a unit test to reproduce this "incorrectly prunes packs" problem?
There was a problem hiding this comment.
@JaySon-Huang I've created a unit test that reproduces this issue. The test is located at:
dbms/src/Storages/DeltaMerge/tests/gtest_dm_filter_parser_nulleq_not.cpp
Test Name: DMFilterParserTest.RoughSet_NotNullEQ_PrunesPacksWithLiteralAndNull_P1BugRepro
What it tests:
- Creates a nullable column pack with values
{1, NULL} - Applies filter
NOT (a <=> 1) - Expects rough-set result
Some(read the pack, apply precise filter) - Current buggy behavior: returns
NoneNull(skips pack, dropping matching NULL rows)
Key assertion (intentionally fails on current code):
EXPECT_EQ(results[0], RSResult::Some);To run:
cd tiflash
git submodule update --init --recursive
CC=clang CXX=clang++ cmake --preset dev
cmake --build cmake-build-debug --target gtests_dm
./cmake-build-debug/dbms/src/Storages/DeltaMerge/tests/gtests_dm \
--gtest_filter=DMFilterParserTest.RoughSet_NotNullEQ_PrunesPacksWithLiteralAndNull_P1BugReproThe test demonstrates that NOT (col <=> non-null_literal) incorrectly prunes packs containing both the literal value AND NULL values, causing wrong query results.
ℹ️ Learn more details on Pantheon AI.
There was a problem hiding this comment.
Thanks @windtalker! I see the fix PR #10769 has been submitted.
The unit test I created (gtest_dm_filter_parser_nulleq_not.cpp) can be used to verify the fix works correctly. Once the fix is merged, the test assertion EXPECT_EQ(results[0], RSResult::Some) should pass instead of failing.
Let me know if you'd like me to review the fix PR to ensure it addresses all the edge cases identified in the original issue.
ℹ️ Learn more details on Pantheon AI.
DeltaMerge: fix rough-set semantics for NullEQ - add a dedicated `NullEqual` rough-set operator instead of lowering to `Equal` - teach minmax index to evaluate nullable `NullEQ` with null-safe semantics - keep the old minmax-index compatibility path conservative when it cannot distinguish pure-null from mixed packs - add regression coverage for nullable packs and old minmax-index compatibility - update the new NullEQ test files introduced in #10726 to use copyright year 2026
What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
New Features
Behavior
Tests