From fe781a57bb71ee9875e2f08d301e6a79ab510fc2 Mon Sep 17 00:00:00 2001
From: xiaohongbo
Date: Wed, 11 Feb 2026 14:08:38 +0800
Subject: [PATCH 1/2] add doc for filter by _ROW_ID

---
 docs/content/pypaimon/data-evolution.md | 10 ++++++++++
 docs/content/pypaimon/python-api.md     |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/content/pypaimon/data-evolution.md b/docs/content/pypaimon/data-evolution.md
index f941136e9d63..567c2c07f680 100644
--- a/docs/content/pypaimon/data-evolution.md
+++ b/docs/content/pypaimon/data-evolution.md
@@ -98,6 +98,16 @@ table_commit.close()
 # 'f1': [-1001, 1002]
 ```
 
+## Filter by _ROW_ID
+
+On data evolution tables you can filter by `_ROW_ID` to prune files at scan time. Supported: `equal('_ROW_ID', id)`, `is_in('_ROW_ID', [id1, ...])`, `between('_ROW_ID', low, high)`.
+
+```python
+pb = table.new_read_builder().new_predicate_builder()
+rb = table.new_read_builder().with_filter(pb.equal('_ROW_ID', 0))
+result = rb.new_read().to_arrow(rb.new_scan().plan().splits())
+```
+
 ## Update Columns By Shards
 
 If you want to **compute a derived column** (or **update an existing column based on other columns**) without providing
diff --git a/docs/content/pypaimon/python-api.md b/docs/content/pypaimon/python-api.md
index cf838372fafb..9da4a07b66a7 100644
--- a/docs/content/pypaimon/python-api.md
+++ b/docs/content/pypaimon/python-api.md
@@ -258,7 +258,7 @@ predicate_5 = predicate_builder.and_predicates([predicate3, predicate4])
 read_builder = read_builder.with_filter(predicate_5)
 ```
 
-See [Predicate]({{< ref "python-api#predicate" >}}) for all supported filters and building methods.
+See [Predicate]({{< ref "python-api#predicate" >}}) for all supported filters and building methods. Filter by `_ROW_ID`: see [Data Evolution]({{< ref "pypaimon/data-evolution#filter-by-_row_id" >}}).
 
 You can also pushdown projection by `ReadBuilder`:

From c37bc04972d4d4366a6af2ebaa9a2835504cd7d6 Mon Sep 17 00:00:00 2001
From: xiaohongbo
Date: Wed, 11 Feb 2026 14:26:33 +0800
Subject: [PATCH 2/2] add Prerequisites for data evolution filter

---
 docs/content/pypaimon/data-evolution.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/content/pypaimon/data-evolution.md b/docs/content/pypaimon/data-evolution.md
index 567c2c07f680..e35e40fe9ddd 100644
--- a/docs/content/pypaimon/data-evolution.md
+++ b/docs/content/pypaimon/data-evolution.md
@@ -100,7 +100,7 @@ table_commit.close()
 
 ## Filter by _ROW_ID
 
-On data evolution tables you can filter by `_ROW_ID` to prune files at scan time. Supported: `equal('_ROW_ID', id)`, `is_in('_ROW_ID', [id1, ...])`, `between('_ROW_ID', low, high)`.
+Requires the same [Prerequisites](#prerequisites) (row-tracking and data-evolution enabled). On such tables you can filter by `_ROW_ID` to prune files at scan time. Supported: `equal('_ROW_ID', id)`, `is_in('_ROW_ID', [id1, ...])`, `between('_ROW_ID', low, high)`.
 
 ```python
 pb = table.new_read_builder().new_predicate_builder()