Implement fast bazel target lookup with index caching #16974

titusfortner · 2026-01-22T05:28:27Z

User description

I've tried several times to improve how we determine which targets to run tests against.

The options are all either inaccurate (just run everything) or slow (so slow they were timing out).
This is the hybrid approach. Do the slow thing 2x a day when scheduled CI runs, and the fast thing every other time. Should be good enough.

The pre-release PR found 2437 matching targets, this PR finds the correct 570.

💥 What does this PR do?

Adds Rake task bazel:build_test_index to index test targets based on affected BUILD packages

./go bazel:build_test_index            # --> path: 'build/bazel-test-target-index.json'
./go bazel:build_test_index index.json # --> path: 'index.json'

Adds Rake task bazel:affected_targets to provide a list of test targets that apply

./go bazel:affected_targets            # --> HEAD^..HEAD (last commit)
./go bazel:affected_targets abc123     # --> abc123^..abc123 (that commit)
./go bazel:affected_targets abc..def   # --> explicit range

When ci.yml event is scheduled it runs the indexing
When ci.yml event is anything else it checks the index for matching targets
Indexes get zipped and stored in GitHub cache (very small, very quick to zip/unzip)
Deleted scripts/github-actions/check-bazel-targets.sh (replaced by Rake tasks)
Added cache-name input to bazel.yml workflow
Simplified per-binding conditionals in ci.yml

Edge Cases:

Add src file not in index: #find_bazel_package locates the package and runs tests for that package
Delete src file: #find_bazel_package locates the package and runs tests for that package
Add test file not in index: all changed test files get corresponding targets added directly
Delete test file: ignored
Add new package: generates target list from package (slow but infrequent)
.NET tests depend on Java server causing all .NET tests to run each time java server code is changed; since .NET does not have any tests running against a grid, this does a complete filter
If one test file is updated only one test file should run

🔧 Implementation Notes

I've spent way too long on this. I tried the tools, I tried various shortcuts, no good way to get accuracy with any speed.
I went back and forth on shell vs python vs rake, and Rake gives me the most flexibility, and lets us easily do this from local or on CI

💡 Additional Considerations

JavaScript will need to be added later
Rust is ignored in the indexing because we aren't toggling rust tests from this
This doesn't actually use the targets in the ci.yml yet, so we're still just seeing if there are any and running everything.
Actually limiting things is next

🔄 Types of changes

New feature (non-breaking change which adds functionality and tests!)

PR Type

Enhancement, Tests

Description

Replace slow shell script with fast Rake-based Bazel target lookup using index caching
Add bazel:build_test_index task to generate test target index from BUILD packages
Add bazel:affected_targets task to find affected targets using index or fallback
Simplify CI conditionals by splitting target output into per-binding flags
Switch index generation from macOS to Linux for faster, cheaper builds
Support workflow dispatch to rebuild index and run all tests

Diagram Walkthrough

flowchart LR
  A["Changed Files"] --> B["affected_targets Task"]
  B --> C{"Index Exists?"}
  C -->|Yes| D["Query Index"]
  C -->|No| E["Directory Fallback"]
  D --> F["Test Targets"]
  E --> F
  G["build_test_index Task"] --> H["Generate Index from BUILD Packages"]
  H --> I["Cache Index"]
  I --> D

File Walkthrough

Relevant files

Enhancement

check-bazel-targets.sh `Remove slow shell-based target lookup script` scripts/github-actions/check-bazel-targets.sh Deleted entire shell script that performed slow Bazel target lookup Functionality replaced by Rake tasks in Rakefile Script used rdeps queries which were timing out	+0/-108
bazel.yml `Add cache restore support for index files` .github/workflows/bazel.yml Added `cache-name` input parameter for cache restore functionality Added cache restore step using actions/cache/restore@v4 Allows workflows to restore pre-built index files	+12/-0
ci-build-index.yml `Switch index generation to Linux platform` .github/workflows/ci-build-index.yml Changed index build OS from macOS to Linux Improves build speed and reduces CI costs Requires trunk fix for macOS-only archive rules	+1/-1
ci.yml `Refactor CI workflow to use Rake tasks and per-binding flags` .github/workflows/ci.yml Updated build-index job to run on schedule or workflow_dispatch events Replaced shell script invocation with inline Rake task calls Refactored read-targets job to output per-binding flags instead of combined targets Simplified per-binding conditionals from complex multi-line expressions to simple flag checks Added cache-name parameter to check job for index restoration	+41/-48
Rakefile `Add Rake tasks for fast Bazel target lookup with index` Rakefile Commented out lint task implementation (formatting, mypy, ruff, steep checks) Added `bazel:affected_targets` task to find test targets affected by file changes Added `bazel:build_test_index` task to generate index from BUILD packages Added helper functions: `find_bazel_package`, `affected_targets_with_index`, `targets_from_lookup`, `query_package_dep`, `targets_from_tests`, `affected_targets_fallback` Supports index-based lookup with directory fallback for missing packages Handles edge cases like test files, library files, and Java/dotnet dependencies	+144/-17

qodo-code-review · 2026-01-22T05:29:03Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
⚪	Shell command injection Description: The `bazel:affected_targets` task builds a shell command using backticks with `base_rev`/`head_rev` derived from the user-supplied `commit_range` (`git diff --name-only` `#{base_rev} #{head_rev}`), which can enable command injection if a malicious range is provided (e.g., including shell metacharacters). Rakefile [1658-1669] Referred Code task :affected_targets, %i[commit_range index_file] do \|_task, args\| range = args[:commit_range] \|\| 'HEAD' index_file = args[:index_file] \|\| 'build/bazel-test-target-index.json' base_rev, head_rev = if range.include?('..') range.split('..', 2) else ["#{range}^", range] end puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?)
	Bazel query injection Description: Bazel query strings are constructed by interpolating the repository file path (via `rel`) into `attr(srcs, '#{rel}', ...)`, which could permit Bazel query injection or query breakage if a crafted filename contains quotes/special characters (e.g., `foo' ) + kind(".",` `//...`), especially when filenames are sourced from `git diff`. Rakefile [1793-1806]* Referred Code query = test_files.filter_map { \|f\| pkg = find_bazel_package(f) next if pkg.nil? # Bazel srcs often use paths relative to the package, not basenames. rel = f.sub(%r{\A#{Regexp.escape(pkg)}/}, '') "attr(srcs, '#{rel}', //#{pkg}:*)" }.join(' + ') return [] if query.empty? targets = [] Bazel.execute('query', ['--output=label'], query) do \|out\| targets = out.lines.map(&:strip).select { \|l\| l.start_with?('//') } end
	Cache path traversal Description: The workflow uses `inputs.cache-name` directly to form a filesystem `path` (`${{` `inputs.cache-name }}.gz`) for cache restore, which could allow unintended file overwrite/path traversal within the workspace if an untrusted caller supplies a value like `../somefile` to a `workflow_call`-exposed input. bazel.yml [108-114] Referred Code - name: Restore cache if: inputs.cache-name != '' uses: actions/cache/restore@v4 with: path: ${{ inputs.cache-name }}.gz key: ${{ inputs.cache-name }}- restore-keys: ${{ inputs.cache-name }}-
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
🔴	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Unhandled failures: The new tasks do not robustly handle failures from `git`/file/JSON/Bazel operations (e.g., missing/invalid refs or malformed index) and will raise or silently proceed without actionable error context. Referred Code task :affected_targets, %i[commit_range index_file] do \|_task, args\| range = args[:commit_range] \|\| 'HEAD' index_file = args[:index_file] \|\| 'build/bazel-test-target-index.json' base_rev, head_rev = if range.include?('..') range.split('..', 2) else ["#{range}^", range] end puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?) puts "Changed files: #{changed_files.size}" targets = if index_file && File.exist?(index_file) affected_targets_with_index(changed_files, index_file) else puts 'No index found, using directory-based fallback' affected_targets_fallback(changed_files) end ... (clipped 130 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Command injection risk: The `commit_range` argument is interpolated directly into a backtick shell command (`git diff` `--name-only #{base_rev} #{head_rev}`) without validation/escaping, allowing potential command injection if an attacker can influence the task arguments. Referred Code task :affected_targets, %i[commit_range index_file] do \|_task, args\| range = args[:commit_range] \|\| 'HEAD' index_file = args[:index_file] \|\| 'build/bazel-test-target-index.json' base_rev, head_rev = if range.include?('..') range.split('..', 2) else ["#{range}^", range] end puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?) puts "Changed files: #{changed_files.size}" Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Unstructured logging: The new tasks emit multiple free-form `puts` logs (including target labels and file counts) rather than structured logs, and it is unclear from the diff whether this output could include sensitive repo/internal path information in some environments. Referred Code puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?) puts "Changed files: #{changed_files.size}" targets = if index_file && File.exist?(index_file) affected_targets_with_index(changed_files, index_file) else puts 'No index found, using directory-based fallback' affected_targets_fallback(changed_files) end if targets.empty? puts 'No test targets affected' File.write('bazel-targets.txt', '') else puts "Found #{targets.size} affected test targets" File.write('bazel-targets.txt', targets.sort.join(' ')) targets.sort.each { \|t\| puts t } end end ... (clipped 42 lines) Learn more about managing compliance generic rules or creating your own custom rules
Update

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

.github/workflows/ci-build-index.yml

qodo-code-review · 2026-01-22T05:30:43Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Optimize the Bazel index creation The `build_test_index` task is inefficient because it runs a `bazel query` for every test target. This can be optimized by running a single `bazel query` with a graph output format (`--output=xml`) and parsing the result in memory to build the index. Examples: Rakefile [1692-1729] task :build_test_index, [:index_file] do \|_task, args\| output = args[:index_file] \|\| 'build/bazel-test-target-index.json' index = {} tests = [] exclude_tags = %w[manual spotbugs ie] all_bindings = BINDING_TARGETS.values.join(' + ') tag_exclusions = exclude_tags.map { \|tag\| "except attr(tags, #{tag}, #{all_bindings})" }.join(' ') kind = '_test' # do not match test_suite or pytest_runner ... (clipped 28 lines) Solution Walkthrough: Before: # Rakefile task :build_test_index do # 1. Find all test targets all_tests = bazel_query("kind(_test, //...)") # 2. Loop through each test and query its dependencies individually all_tests.each do \|test\| # This runs a new bazel query for every single test dependencies = bazel_query("deps(#{test})") dependencies.each do \|dep\| # build index: index[package_of(dep)] << test end end end After: # Rakefile task :build_test_index do # 1. Get all tests and their dependencies in one go query = "deps(kind(_test, //...))" all_deps_xml = bazel_query(query, output: 'xml') # 2. Parse the XML output in memory graph = parse_xml_dependency_graph(all_deps_xml) # 3. Build the index from the parsed graph # This avoids running any more bazel commands. index = build_index_from_graph(graph) File.write(output, JSON.pretty_generate(index)) end Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies a major performance bottleneck (N+1 queries) in the `build_test_index` task and proposes a standard, effective solution using a single `bazel query`, which would significantly speed up the scheduled index creation job.	High
General	Fix array intersection check Fix a potential runtime error by replacing the non-standard `Array#intersect?` method with the correct Ruby idiom `(array1 & array2).any?` for checking array intersection. Rakefile [1814] -return BINDING_TARGETS.values if top_level_dirs.intersect?(%w[common rust]) +return BINDING_TARGETS.values if (top_level_dirs & %w[common rust]).any? Apply / Chat Suggestion importance[1-10]: 9 __ Why: This suggestion correctly identifies that `Array#intersect?` is not a standard Ruby method and would cause a runtime error, proposing a valid fix that prevents the script from crashing.	High
Possible issue	Default outputs for bindings Modify the `check_binding` function to always produce an output for each binding by adding an `else` branch that sets the output to `false` when the condition is not met. .github/workflows/ci.yml [72-77] check_binding() { local pattern=$1 tag=$2 if [[ "$targets" == "$pattern" ]] \|\| [[ "$COMMIT_MESSAGES" == "[$tag]" ]] \|\| [[ "$PR_TITLE" == "[$tag]" ]]; then echo "$tag=true" >> "$GITHUB_OUTPUT" + else + echo "$tag=false" >> "$GITHUB_OUTPUT" fi } Apply / Chat Suggestion importance[1-10]: 7 __ Why: This suggestion correctly identifies that downstream jobs check for non-empty strings, not boolean `true`. Explicitly setting an output makes the workflow's logic clearer and more robust against future changes.	Medium
Learned best practice	Avoid mutating input arrays Don’t mutate the input array with `select!`; create a filtered copy so callers can safely reuse their array without side effects. Rakefile [1789-1808] def targets_from_tests(test_files) - test_files.select! { \|f\| File.exist?(f) } - return [] if test_files.empty? + existing_files = test_files.select { \|f\| File.exist?(f) } + return [] if existing_files.empty? - query = test_files.filter_map { \|f\| + query = existing_files.filter_map { \|f\| pkg = find_bazel_package(f) next if pkg.nil? # Bazel srcs often use paths relative to the package, not basenames. rel = f.sub(%r{\A#{Regexp.escape(pkg)}/}, '') "attr(srcs, '#{rel}', //#{pkg}:)" }.join(' + ') return [] if query.empty? targets = [] Bazel.execute('query', ['--output=label'], query) do \|out\| targets = out.lines.map(&:strip).select { \|l\| l.start_with?('//') } end targets end Apply / Chat* Suggestion importance[1-10]: 5 __ Why: Relevant best practice - Avoid mutating caller-provided collections; prefer defensive/immutable handling of collection-like state passed into helpers.	Low
Update

Copilot

Pull request overview

This PR implements a hybrid approach to determining which Bazel test targets to run in CI by introducing an index-based caching mechanism. The system builds a comprehensive index of test targets and their dependencies during scheduled runs, then uses this cached index for faster lookups during regular PR and push events.

Changes:

Replaces shell-based target discovery with Ruby Rake tasks for better flexibility and performance
Implements index caching via GitHub Actions cache to enable fast target lookups
Simplifies CI workflow conditionals by consolidating binding-specific logic

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
scripts/github-actions/check-bazel-targets.sh	Deleted shell script that performed slow Bazel queries, replaced by Rake tasks
Rakefile	Added two new Rake tasks (bazel:build_test_index and bazel:affected_targets) with supporting helper functions for index creation and target lookup
rb/.rubocop.yml	Excluded Rakefile from Lint/RedundantRequireStatement check for the new 'set' requirement
.github/workflows/ci.yml	Added build-index job for scheduled runs, updated check job to use cached index, simplified per-binding conditionals
.github/workflows/ci-build-index.yml	New workflow for building and caching the test target index
.github/workflows/bazel.yml	Added cache-name input parameter to support restoring cached index files

.github/workflows/bazel.yml

Rakefile

.github/workflows/ci.yml

Rakefile

rb/.rubocop.yml

Rakefile

titusfortner · 2026-01-22T15:31:08Z

I committed 0850cd0 which does the indexing part of this. Once a scheduled CI job has run building the index without issues, I'll merge the rest of this.

titusfortner · 2026-01-23T02:05:18Z

I'm putting in draft until #16985 merges

qodo-code-review · 2026-01-24T02:58:16Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
⚪	Command injection Description: The new `bazel:affected_targets` task builds and executes a shell command via backticks (e.g., git diff --name-only #{base_rev} #{head_rev}), which can enable command injection if the user-supplied `range` argument contains shell metacharacters (e.g., `HEAD; rm -rf ~`) when run locally or in any environment where arguments are not strictly controlled. Rakefile [1367-1380] Referred Code values = args.to_a index_file = values.find { \|val\| File.exist?(val) } range = (values - [index_file]).first \|\| 'HEAD' index_file \|\|= 'build/bazel-test-target-index' base_rev, head_rev = if range.include?('..') range.split('..', 2) else ["#{range}^", range] end puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?) puts "Changed files: #{changed_files.size}"
	Cache path poisoning Description: The new cache restore uses a caller-controlled `inputs.cache-name` for both `path` and `key`, which could allow unintended file placement/overwrite in the workspace (e.g., setting `cache-name` to `.` or another sensitive path) if an untrusted caller can invoke the reusable workflow, potentially leading to executing poisoned restored content in later steps. bazel.yml [74-115] Referred Code cache-name: description: Name for cache restore (restores {name}.gz with key {name}-) required: false type: string default: '' jobs: bazel: name: ${{ inputs.name }} runs-on: ${{ contains(inputs.os, '-') && inputs.os \|\| format('{0}-latest', inputs.os) }} env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} SEL_M2_USER: ${{ secrets.SEL_M2_USER }} SEL_M2_PASS: ${{ secrets.SEL_M2_PASS }} TWINE_PASSWORD: ${{ secrets.TWINE_PASSWORD }} GEM_HOST_API_KEY: ${{ secrets.GEM_HOST_API_KEY }} NUGET_API_KEY: ${{ secrets.NUGET_API_KEY }} NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }} steps: - name: Checkout source tree uses: actions/checkout@v4 ... (clipped 21 lines)
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Unhandled command failures: The new `bazel:affected_targets` task shells out to `git diff` and calls Bazel queries without clearly handling non-zero exit cases, which may cause confusing failures depending on the CI environment and revision availability. Referred Code puts "Commit range: #{base_rev}..#{head_rev}" changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?) puts "Changed files: #{changed_files.size}" targets = if index_file && File.exist?(index_file) affected_targets_with_index(changed_files, index_file) else puts 'No index found, using directory-based fallback' affected_targets_fallback(changed_files) end if targets.empty? puts 'No test targets affected' File.write('bazel-targets.txt', '') else puts "Found #{targets.size} affected test targets" File.write('bazel-targets.txt', targets.sort.join(' ')) targets.sort.each { \|t\| puts t } end end ... (clipped 6 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Unvalidated cache path: The new `cache-name` input is used directly as an `actions/cache/restore` `path`, so callers could supply unexpected paths unless constrained/validated at the workflow boundary. Referred Code cache-name: description: Name for cache restore (restores {name}.gz with key {name}-) required: false type: string default: '' jobs: bazel: name: ${{ inputs.name }} runs-on: ${{ contains(inputs.os, '-') && inputs.os \|\| format('{0}-latest', inputs.os) }} env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} SEL_M2_USER: ${{ secrets.SEL_M2_USER }} SEL_M2_PASS: ${{ secrets.SEL_M2_PASS }} TWINE_PASSWORD: ${{ secrets.TWINE_PASSWORD }} GEM_HOST_API_KEY: ${{ secrets.GEM_HOST_API_KEY }} NUGET_API_KEY: ${{ secrets.NUGET_API_KEY }} NODE_AUTH_TOKEN: ${{ secrets.NODE_AUTH_TOKEN }} steps: - name: Checkout source tree uses: actions/checkout@v4 ... (clipped 20 lines) Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

titusfortner · 2026-01-24T02:58:36Z

It should index on linux now.

qodo-code-review · 2026-01-24T02:59:40Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Prevent an infinite loop in path traversal To prevent a potential infinite loop in `find_bazel_package`, add a check to terminate the directory traversal when the root directory is reached. Rakefile [1451-1460] def find_bazel_package(filepath) path = File.dirname(filepath) until path.empty? return path if File.exist?(File.join(path, 'BUILD.bazel')) \|\| File.exist?(File.join(path, 'BUILD')) return nil if path == '.' + prev_path = path path = File.dirname(path) + return nil if path == prev_path # Reached root end nil end Apply / Chat Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies a critical bug that could cause an infinite loop when processing files at the filesystem root, and the proposed fix is correct and robust.	High
Possible issue	Correct Python pattern check In the `read-targets` job, change the Python binding check from `//py:` to `//py/` to correctly match affected targets and trigger the Python CI run. .github/workflows/ci.yml [81-85] check_binding "//java/" "java" -check_binding "//py:" "py" +check_binding "//py/" "py" check_binding "//rb/" "rb" check_binding "//dotnet/" "dotnet" check_binding "//rust/" "rust" Apply / Chat Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies a pattern mismatch (`//py:` instead of `//py/`) that would prevent Python tests from being triggered correctly, which is a significant correctness issue in the CI logic.	Medium
High-level	Consider the risk of stale indexes The dependency index can become stale because it's only updated on a schedule. To prevent missed tests, detect changes in `BUILD.bazel` files and fall back to a full dependency analysis or rebuild the index. Examples: Rakefile [1462-1483] def affected_targets_with_index(changed_files, index_file) puts "Using index: #{index_file}" begin index = JSON.parse(File.read(index_file)) rescue JSON::ParserError => e puts "Invalid JSON in index file: #{e.message}" puts 'Falling back to directory-based detection' return affected_targets_fallback(changed_files) end affected = Set.new ... (clipped 12 lines) Solution Walkthrough: Before: # Rakefile def affected_targets_with_index(changed_files, index_file) index = JSON.parse(File.read(index_file)) # ... lib_files, _ = changed_files.partition(...) lib_files.each do \|filepath\| pkg = find_bazel_package(filepath) # `index[pkg]` is used even if a BUILD file in `pkg` was modified in the PR. # This can use stale dependency data from the old index. test_targets = index[pkg] \|\| query_package_dep(pkg) affected.merge(test_targets) end # ... end After: # Rakefile def affected_targets_with_index(changed_files, index_file) build_files_changed = changed_files.any? { \|f\| f.end_with?('BUILD.bazel', 'BUILD') } if build_files_changed puts 'BUILD file changed, falling back to directory-based detection for accuracy.' return affected_targets_fallback(changed_files) end index = JSON.parse(File.read(index_file)) # ... lib_files, _ = changed_files.partition(...) lib_files.each do \|filepath\| pkg = find_bazel_package(filepath) test_targets = index[pkg] \|\| query_package_dep(pkg) affected.merge(test_targets) end # ... end Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies a critical flaw where changes to `BUILD.bazel` files can cause the cached index to be stale, leading to missed test runs and undermining the reliability of the new target selection logic.	Medium
Learned best practice	Avoid mutating input collections Don’t use in-place filters like `select!` on an argument; create a filtered local array instead so callers aren’t unexpectedly modified. Rakefile [1505-1524] def targets_from_tests(test_files) - test_files.select! { \|f\| File.exist?(f) } - return [] if test_files.empty? + existing = test_files.select { \|f\| File.exist?(f) } + return [] if existing.empty? - query = test_files.filter_map { \|f\| + query = existing.filter_map { \|f\| pkg = find_bazel_package(f) next if pkg.nil? # Bazel srcs often use paths relative to the package, not basenames. rel = f.sub(%r{\A#{Regexp.escape(pkg)}/}, '') "attr(srcs, '#{rel}', //#{pkg}:)" }.join(' + ') return [] if query.empty? targets = [] Bazel.execute('query', ['--output=label'], query) do \|out\| targets = out.lines.map(&:strip).select { \|l\| l.start_with?('//') } end targets end Apply / Chat* Suggestion importance[1-10]: 5 __ Why: Relevant best practice - When accepting collection-like inputs, avoid mutating caller-provided arrays/hashes; make a defensive copy to prevent hidden side effects.	Low
More

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.

Rakefile

.github/workflows/ci.yml

Rakefile

.github/workflows/ci.yml

Rakefile

…of building it

Now that trunk has the fix for macOS-only archive rules, the index can be built on Linux which is faster and cheaper. Co-Authored-By: Claude Opus 4.5 <[email protected]>

selenium-ci added C-rb Ruby Bindings B-build Includes scripting, bazel and CI integrations labels Jan 22, 2026

titusfortner requested review from Copilot and diemol January 22, 2026 05:28

qodo-code-review bot added the Review effort 4/5 label Jan 22, 2026

Copilot started reviewing on behalf of titusfortner January 22, 2026 05:29 View session

github-advanced-security bot found potential problems Jan 22, 2026

View reviewed changes

.github/workflows/ci-build-index.yml Fixed Show fixed Hide fixed

.github/workflows/ci-build-index.yml Fixed Show fixed Hide fixed

Copilot AI reviewed Jan 22, 2026

View reviewed changes

titusfortner added a commit that referenced this pull request Jan 22, 2026

[build] build test target index as precursor to #16974

0850cd0

titusfortner force-pushed the affected_targets branch from 9a825c6 to 9ba59a8 Compare January 22, 2026 15:29

titusfortner added a commit that referenced this pull request Jan 22, 2026

[build] build test target index as precursor to #16974

dd90b08

titusfortner force-pushed the affected_targets branch from 6a70092 to 5955cb3 Compare January 23, 2026 02:00

titusfortner marked this pull request as draft January 23, 2026 02:04

titusfortner force-pushed the affected_targets branch 5 times, most recently from 8e89e2a to f01506a Compare January 24, 2026 02:57

titusfortner marked this pull request as ready for review January 24, 2026 02:57

titusfortner requested a review from Copilot January 24, 2026 03:15

Copilot started reviewing on behalf of titusfortner January 24, 2026 03:16 View session

Copilot AI reviewed Jan 24, 2026

View reviewed changes

implement fast bazel target lookup

850a1b1

titusfortner and others added 16 commits January 23, 2026 22:45

fix the conditional logic

96a3608

run build-index on manual workflow dispatch

26294d5

add comment when check target script is skipped

8d7fcca

remove indexing when manual executed

1bae1e8

WIP - just build index again

a35251d

WIP - now try to use the indexed file with workflow dispatch instead …

a757f33

…of building it

test: trigger remote tests from index

f6e8f7a

detect index file by existence in affected_targets args

714d44c

revert test comment in RemoteWebDriver

71742de

python tests are all on //py: not nested with //py/

b20d1a4

test: trigger grid tests from index

55925c9

build: switch index generation to linux

198538c

Now that trunk has the fix for macOS-only archive rules, the index can be built on Linux which is faster and cheaper. Co-Authored-By: Claude Opus 4.5 <[email protected]>

workflow dispatch always re-indexes and always runs everything

22a6404

ubuntu not linux

a0a3f26

python tests from the root

13ff451

add explicit permissions to cache job

aff0568

titusfortner force-pushed the affected_targets branch from f5b6646 to aff0568 Compare January 24, 2026 04:54

titusfortner merged commit 5baa84b into trunk Jan 24, 2026
19 checks passed

titusfortner deleted the affected_targets branch January 24, 2026 05:53

Uh oh!

Implement fast bazel target lookup with index caching #16974

Implement fast bazel target lookup with index caching #16974

Conversation

titusfortner commented Jan 22, 2026 • edited by qodo-code-review bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

💥 What does this PR do?

🔧 Implementation Notes

💡 Additional Considerations

🔄 Types of changes

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

qodo-code-review bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Compliance Guide 🔍

Uh oh!

Uh oh!

Uh oh!

qodo-code-review bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

titusfortner commented Jan 22, 2026

Uh oh!

titusfortner commented Jan 23, 2026

Uh oh!

qodo-code-review bot commented Jan 24, 2026

PR Compliance Guide 🔍

Uh oh!

titusfortner commented Jan 24, 2026

Uh oh!

qodo-code-review bot commented Jan 24, 2026

PR Code Suggestions ✨

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

titusfortner commented Jan 22, 2026 •

edited by qodo-code-review bot

Loading

qodo-code-review bot commented Jan 22, 2026 •

edited

Loading

qodo-code-review bot commented Jan 22, 2026 •

edited

Loading