Skip to content

support nulleq in tiflash#10726

Merged
ti-chi-bot[bot] merged 10 commits intopingcap:feature/release-8.5-materialized-viewfrom
windtalker:add_nulleq_support
Mar 24, 2026
Merged

support nulleq in tiflash#10726
ti-chi-bot[bot] merged 10 commits intopingcap:feature/release-8.5-materialized-viewfrom
windtalker:add_nulleq_support

Conversation

@windtalker
Copy link
Copy Markdown
Contributor

@windtalker windtalker commented Feb 14, 2026

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What is changed and how it works?


Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Summary by CodeRabbit

  • New Features

    • Added a null-aware equality operator covering more types (including vector float32) with proper string collation forwarding.
  • Behavior

    • Query/filter parsing now recognizes null-safe equality and emits is-null predicates when comparing to NULL.
  • Tests

    • Added unit tests validating mapping, runtime null propagation, collation behavior, and filter-parser null-safe cases.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 14, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a TiDB null-safe equality function ("tidbNullEQ"), maps multiple NullEQ scalar signatures (including VectorFloat32) to that function, updates DeltaMerge filter-parser to treat NullEQ with NULL literal as is-null, and adds unit tests in Coprocessor, Functions, and DeltaMerge.

Cohort / File(s) Summary
Functions / Implementation & Tests
dbms/src/Functions/FunctionsComparison.cpp, dbms/src/Functions/tests/gtest_tidb_null_eq.cpp
Adds FunctionTiDBNullEQ (tidbNullEQ) with custom NULL semantics and collator forwarding; registers it and adds unit tests for non-null, nullable, all-null, const-null, and collator cases.
Filter Parser / DeltaMerge tests
dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp, dbms/src/Storages/DeltaMerge/tests/gtest_dm_filter_parser_nulleq.cpp
Maps NullEQ* signatures (including NullEQVectorFloat32) to RSFilterType::Equal; parseTiCompareExpr now emits isnull when the compared literal is NULL; adds parser tests for null-safe equality scenarios.
Coprocessor DAG utils & tests
dbms/src/Flash/Coprocessor/DAGUtils.cpp, dbms/src/Flash/Coprocessor/tests/gtest_tidb_null_eq_func.cpp
Updates scalar_func_map to map NullEQ variants to "tidbNullEQ" (adds NullEQVectorFloat32); adds a unit test asserting getFunctionName(expr) returns "tidbNullEQ" for those signatures.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client/Planner
participant DAG as Coprocessor / DAGUtils
participant Parser as FilterParser
participant Func as tidbNullEQ
participant Storage as DeltaMerge / Storage
Client->>DAG: construct tipb::Expr (NullEQ*)
DAG->>Parser: forward mapped function name ("tidbNullEQ")
Parser->>Storage: produce predicate (equal() or isnull() if literal NULL)
Client->>Func: execute tidbNullEQ on Block columns
Func->>Func: unwrap nullable/const, delegate to equals, post-process NULL semantics
Func-->>Client: return UInt8 result per row

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

"I nibbled on bytes beneath the moon's soft glow,
tidbNullEQ hops where nulls and values go.
A test, a map, a parser's tiny cheer,
equal when matched — or isnull when clear.
— Rabbit 🐇"

🚥 Pre-merge checks | ✅ 1 | ❌ 3
❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (11 files):

⚔️ dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp (content)
⚔️ dbms/src/Flash/Coprocessor/DAGStorageInterpreter.cpp (content)
⚔️ dbms/src/Flash/Coprocessor/DAGUtils.cpp (content)
⚔️ dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp (content)
⚔️ dbms/src/Flash/Coprocessor/RemoteRequest.cpp (content)
⚔️ dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp (content)
⚔️ dbms/src/Functions/FunctionsComparison.cpp (content)
⚔️ dbms/src/Storages/DeltaMerge/Filter/PushDownFilter.cpp (content)
⚔️ dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp (content)
⚔️ dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp (content)
⚔️ dbms/src/Storages/KVStore/Types.h (content)

These conflicts must be resolved before merging into feature/release-8.5-materialized-view.
Resolve conflicts locally and push changes to this branch.
Description check ⚠️ Warning The pull request description is incomplete. Critical sections lack substantive content: 'Problem Summary' is empty, the 'What is changed and how it works?' section contains only an empty commit-message block, and no implementation details are provided. Provide a detailed problem summary explaining the nulleq feature being added, include a non-empty commit message or description of changes, and update the checklist with actually selected items (currently only 'Unit test' is checked but appears to be a template artifact).
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'support nulleq in tiflash' directly and clearly summarizes the main objective of the PR, which is to add support for null-safe equality operations across multiple components in TiFlash.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 14, 2026
@windtalker windtalker changed the title [DNM] support nulleq in tiflash support nulleq in tiflash Feb 14, 2026
@windtalker windtalker requested a review from Copilot February 14, 2026 13:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for TiDB’s null-safe equality (<=>) in TiFlash by introducing a dedicated tidbNullEQ function, wiring TiDB DAG signatures to it, and extending DM rough-set filter parsing to treat <=> NULL as isnull(col).

Changes:

  • Add tidbNullEQ comparison function with correct NULL semantics and collation forwarding.
  • Map tipb::ScalarFuncSig::NullEQ* to tidbNullEQ in DAG function name resolution and to Equal in DM rough-set filter parsing (with <=> NULL lowered to isnull).
  • Add unit tests covering function behavior, DAG mapping, and DM filter parsing cases.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
dbms/src/Storages/DeltaMerge/tests/gtest_dm_filter_parser_nulleq.cpp Adds DM filter parser unit tests for <=> lowering to equal / isnull.
dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp Adds <=> NULL handling in rough-set compare parsing; maps NullEQ* sigs to Equal.
dbms/src/Functions/tests/gtest_tidb_null_eq.cpp Adds unit tests validating tidbNullEQ semantics (nullable/const/null-only/collation).
dbms/src/Functions/FunctionsComparison.cpp Implements and registers new tidbNullEQ function.
dbms/src/Flash/Coprocessor/tests/gtest_tidb_null_eq_func.cpp Adds unit tests ensuring DAGUtils maps NullEQ* sigs to tidbNullEQ.
dbms/src/Flash/Coprocessor/DAGUtils.cpp Wires NullEQ* signatures to function name tidbNullEQ.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if ((expr.sig() == tipb::ScalarFuncSig::NullEQInt || expr.sig() == tipb::ScalarFuncSig::NullEQReal
|| expr.sig() == tipb::ScalarFuncSig::NullEQString || expr.sig() == tipb::ScalarFuncSig::NullEQDecimal
|| expr.sig() == tipb::ScalarFuncSig::NullEQTime || expr.sig() == tipb::ScalarFuncSig::NullEQDuration
|| expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32)
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseTiCompareExpr special-cases <=> NULL to isnull(col) for several NullEQ* signatures, but NullEQJson is mapped to RSFilterType::Equal below and is not included in this signature list. If a NullEQJson comparison against NULL ever reaches this code path, it would incorrectly produce equal(col, NULL) instead of isnull(col). Consider including tipb::ScalarFuncSig::NullEQJson in this check (or refactor to a shared helper for all NullEQ* sigs).

Suggested change
|| expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32)
|| expr.sig() == tipb::ScalarFuncSig::NullEQVectorFloat32
|| expr.sig() == tipb::ScalarFuncSig::NullEQJson)

Copilot uses AI. Check for mistakes.
/// - both NULL => 1
/// - one NULL => 0
/// - no NULL => equals result
auto eq_mutable = (*std::move(eq_col)).mutate();
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto eq_mutable = (*std::move(eq_col)).mutate(); uses std::move on eq_col even though it isn't being moved-from (no move-construction happens). This is a bit misleading and can be simplified to avoid confusion about eq_col's lifetime/ownership.

Suggested change
auto eq_mutable = (*std::move(eq_col)).mutate();
auto eq_mutable = eq_col->mutate();

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the mutate() can only be called on rvalue

Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@windtalker
Copy link
Copy Markdown
Contributor Author

@pantheon-bot review

@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 23, 2026

Review failed due to infrastructure/execution failure after retries. Please re-trigger review.

ℹ️ Learn more details on Pantheon AI.

@windtalker
Copy link
Copy Markdown
Contributor Author

@pantheon-bot review

@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 23, 2026

Review failed due to infrastructure/execution failure after retries. Please re-trigger review.

ℹ️ Learn more details on Pantheon AI.

Copy link
Copy Markdown
Contributor

@xzhangxian1008 xzhangxian1008 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other LGTM


/// Use equals to validate that the input types are comparable.
/// Always return non-nullable UInt8 because `NULL <=> x` is always true/false (not NULL).
FunctionEquals().getReturnTypeImpl({removeNullable(arguments[0]), removeNullable(arguments[1])});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can directly return at L65 as FunctionEquals().getReturnTypeImpl always returns DataTypeUInt8.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also does some input types validation in FunctionEquals().getReturnTypeImpl(), so I think it's better to keep it.

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Mar 24, 2026
@Mini256
Copy link
Copy Markdown
Member

Mini256 commented Mar 24, 2026

@pantheon-bot review this pull request again

@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 24, 2026

Hi @Mini256,

You're not in the allow list for this repository. Only users in the allow list can trigger Pantheon here. Please contact the repo maintainer if you need access.

@Mini256
Copy link
Copy Markdown
Member

Mini256 commented Mar 24, 2026

@pantheon-bot review this pull request again

@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 24, 2026

Review Complete

Findings: 1 issues
Posted: 1
Duplicates/Skipped: 0

ℹ️ Learn more details on Pantheon AI.

Copy link
Copy Markdown
Contributor

@gengliqi gengliqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gengliqi, xzhangxian1008

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm approved and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 24, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 24, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-24 01:18:49.791546334 +0000 UTC m=+231125.827616594: ☑️ agreed by xzhangxian1008.
  • 2026-03-24 03:31:07.513691165 +0000 UTC m=+239063.549761425: ☑️ agreed by gengliqi.

@ti-chi-bot ti-chi-bot bot merged commit 7d306f9 into pingcap:feature/release-8.5-materialized-view Mar 24, 2026
5 checks passed
//{tipb::ScalarFuncSig::NullEQTime, "cast"},
//{tipb::ScalarFuncSig::NullEQDuration, "cast"},
//{tipb::ScalarFuncSig::NullEQJson, "cast"},
{tipb::ScalarFuncSig::NullEQInt, FilterParser::RSFilterType::Equal},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] DeltaMerge rough-set filter incorrectly prunes packs for NOT (col <=> non-null_literal)

Why: <=> is NULL-safe (never returns NULL), so for nullable col and non-NULL literal k, NOT (col <=> k) must match col IS NULL rows. However, the rough-set path lowers col <=> k to the same RS operator as col = k, which can produce AllNull for packs containing both {k} and NULL. Then NOT flips that to NoneNull, causing the pack to be skipped entirely and silently dropping matching NULL rows under default rough-set filtering.

Evidence:

  • Null-safe equality mapped to Equal at FilterParser.cpp:592, lowered to createEqual(...) for non-NULL literals (line 214)
  • Equal can return All when min==max==k (RoughCheck.h:42), and null-mark forces AllNull via addNullIfHasNull (MinMaxIndex.cpp:402, 603)
  • NOT applied as !RSResult (Not.h:31), and ! flips All -> None while preserving has_null (RSResult.cpp:26, RSResult.h:69), so AllNull -> NoneNull
  • Packs with None/NoneNull are skipped (RSResult.h:63), DMFileReader advances past them (DMFileReader.cpp:115)
  • Reachable in default config: dt_enable_rough_set_filter defaults to true (Settings.h:176)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pantheon-ai Can you construct a unit test to reproduce this "incorrectly prunes packs" problem?

Copy link
Copy Markdown

@pantheon-ai pantheon-ai bot Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JaySon-Huang I've created a unit test that reproduces this issue. The test is located at:

dbms/src/Storages/DeltaMerge/tests/gtest_dm_filter_parser_nulleq_not.cpp

Test Name: DMFilterParserTest.RoughSet_NotNullEQ_PrunesPacksWithLiteralAndNull_P1BugRepro

What it tests:

  • Creates a nullable column pack with values {1, NULL}
  • Applies filter NOT (a <=> 1)
  • Expects rough-set result Some (read the pack, apply precise filter)
  • Current buggy behavior: returns NoneNull (skips pack, dropping matching NULL rows)

Key assertion (intentionally fails on current code):

EXPECT_EQ(results[0], RSResult::Some);

To run:

cd tiflash
git submodule update --init --recursive
CC=clang CXX=clang++ cmake --preset dev
cmake --build cmake-build-debug --target gtests_dm
./cmake-build-debug/dbms/src/Storages/DeltaMerge/tests/gtests_dm \
  --gtest_filter=DMFilterParserTest.RoughSet_NotNullEQ_PrunesPacksWithLiteralAndNull_P1BugRepro

The test demonstrates that NOT (col <=> non-null_literal) incorrectly prunes packs containing both the literal value AND NULL values, causing wrong query results.

ℹ️ Learn more details on Pantheon AI.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

submit a new pr to fix this #10769

Copy link
Copy Markdown

@pantheon-ai pantheon-ai bot Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @windtalker! I see the fix PR #10769 has been submitted.

The unit test I created (gtest_dm_filter_parser_nulleq_not.cpp) can be used to verify the fix works correctly. Once the fix is merged, the test assertion EXPECT_EQ(results[0], RSResult::Some) should pass instead of failing.

Let me know if you'd like me to review the fix PR to ensure it addresses all the edge cases identified in the original issue.

ℹ️ Learn more details on Pantheon AI.

ti-chi-bot bot pushed a commit that referenced this pull request Mar 25, 2026
 

DeltaMerge: fix rough-set semantics for NullEQ

- add a dedicated `NullEqual` rough-set operator instead of lowering to `Equal`
- teach minmax index to evaluate nullable `NullEQ` with null-safe semantics
- keep the old minmax-index compatibility path conservative when it cannot distinguish pure-null from mixed packs
- add regression coverage for nullable packs and old minmax-index compatibility
- update the new NullEQ test files introduced in #10726 to use copyright year 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants