
Test/benchmarking #59

Merged
acgetchell merged 6 commits into main from test/benchmarking
Sep 2, 2025

Conversation

@acgetchell (Owner)

This pull request updates the performance benchmarking workflow to improve how baseline artifacts are found and handled, and clarifies the messaging and artifact retention policy. The workflow now more reliably locates baseline artifacts, provides clearer instructions when no baseline is found, and aligns artifact retention with repository limits.

Improvements to baseline artifact discovery and handling:

  • The baseline search logic in .github/workflows/benchmarks.yml was refactored to first look for release-tagged baseline artifacts, and if none are found, to search for any recent baseline artifacts from manual or other workflow runs. The search is now more robust and logs the source type of the found artifact (release or manual).
  • The fallback step was changed: if no baseline is found, the workflow now sets status variables and logs a message, rather than attempting to generate a new baseline on the fly.

Messaging and instructions:

  • The messaging when no baseline is found has been improved to clarify the steps required to enable performance regression testing, including creating a release tag or manually triggering the baseline workflow.
  • The regression test step was simplified to always use the full comparison mode, removing the special case for generated baselines.

Artifact retention policy:

  • The baseline artifact retention period in .github/workflows/generate-baseline.yml was reduced from 365 days to 90 days to match the repository maximum.

Refactors the benchmark workflow to rely solely on release baselines for performance comparison.

This removes the generation of fallback baselines during pull request workflows, streamlining the process and ensuring comparisons are made against accurate, full-setting benchmarks.

Updates the reporting to provide clearer guidance on how to enable performance regression testing via release tags.

Reduces baseline artifact retention from 1 year (365 days) to 90 days to match the repository maximum.

Modifies the workflow to improve the way performance baseline artifacts are located.

The workflow now first attempts to find baseline artifacts associated with release tags by searching recent successful workflow runs of 'generate-baseline.yml'.

If no release-tagged baseline is found, it falls back to searching for any recent baseline artifact generated by the same workflow, regardless of its origin (e.g., manual trigger).

This ensures that a baseline is found even if a tagged release is not available, or the baseline was manually generated.
It also changes the failure message if no baseline is found, and provides improved instructions on how to generate a baseline.
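The release-first search order described above can be sketched as a pure selection function. This is an illustrative assumption, not the workflow's actual code: names like `selectBaseline` and the `performance-baseline-` prefix are hypothetical, and the real step runs inside actions/github-script against the GitHub API.

```javascript
// Hypothetical sketch of the selection order: release match first,
// then any recent baseline, then an explicit "none found" result.
function sanitizeTag(tag) {
  // Allow-list sanitization, mirroring the kind of cleanup a baseline
  // generator might apply to release tags used in artifact names.
  return tag.replace(/[^a-zA-Z0-9._-]/g, '_');
}

function selectBaseline(artifacts, releaseTag) {
  // artifacts: [{ name, run_id, expired }], assumed newest first.
  const live = artifacts.filter((a) => a.expired !== true);

  // 1. Prefer the artifact matching the sanitized release tag.
  const expected = `performance-baseline-${sanitizeTag(releaseTag)}`;
  const release = live.find((a) => a.name === expected);
  if (release) {
    return { found: true, source_type: 'release', artifact_name: release.name, run_id: release.run_id };
  }

  // 2. Fall back to the most recent baseline of any origin (e.g. manual trigger).
  const fallback = live.find((a) => a.name.startsWith('performance-baseline-'));
  if (fallback) {
    return { found: true, source_type: 'manual', artifact_name: fallback.name, run_id: fallback.run_id };
  }

  // 3. Nothing usable: regression testing is unavailable for this run.
  return { found: false, source_type: 'none' };
}
```

The key property is the ordering: an exact release match always wins, and the manual fallback only applies when no release-tagged artifact survives the expiry filter.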
Copilot AI review requested due to automatic review settings September 2, 2025 16:32

coderabbitai bot commented Sep 2, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Overhauled baseline discovery to build a paginated artifact cache (searching up to 150 runs), prefer a release-matched artifact, fall back to the most recent manual artifact, expose `source_type`, remove automatic baseline generation, and shorten uploaded baseline retention to ~90 days; the run phase always uses full comparison mode.

Changes

Cohort / File(s) Change Summary
Benchmark workflow overhaul
.github/workflows/benchmarks.yml
Replaced per-release iterative search with a paginated artifact cache (fetch up to 150 successful generate-baseline runs). Derives expected artifact name from a sanitized release tag; if exact release artifact found emits outputs {found, release_tag, artifact_name, run_id, source_type="release"}. If none, selects the most-recent manual artifact and emits {found, release_tag="manual-baseline", artifact_name, run_id, source_type="manual"}. Removed per-release nested try/catch, removed automatic fallback baseline generation (replaced with setting BASELINE_EXISTS=false, BASELINE_SOURCE=none), clarified step names and logs, and standardized run phase to full comparison mode. Added source_type step output for find_baseline.
Baseline generator retention tweak
.github/workflows/generate-baseline.yml
Shortened uploaded baseline artifact retention from 365 days to 90 days and updated the comment; no other workflow logic changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer
  participant BW as Benchmarks Workflow
  participant GH as GitHub API
  participant AC as Artifact Cache
  participant RT as Regression Test

  Dev->>BW: Trigger benchmarks
  BW->>GH: List successful runs of generate-baseline (paginate, up to 150)
  GH-->>BW: Runs + artifacts
  BW->>AC: Build artifactCache of `performance-baseline-*` artifacts

  note right of BW: Derive expectedName from sanitized release tag

  alt Exact release baseline found
    BW->>BW: emit outputs {found=true, source_type="release", release_tag, artifact_name, run_id}
  else Manual fallback found
    BW->>BW: emit outputs {found=true, source_type="manual", release_tag="manual-baseline", artifact_name, run_id}
  else None found
    BW->>BW: set BASELINE_EXISTS=false, BASELINE_SOURCE=none
  end

  opt If baseline found
    BW->>RT: Run full comparison against discovered baseline
    RT-->>BW: Results
  end

  BW-->>Dev: Summary with Baseline Origin and outcome

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

I hop through artifact lanes at night,
sniffing tags till one feels right.
Release first — then manual cheer,
else I set the baseline clear.
Ninety days to keep the prize — then off I bound to test and rise 🐇✨




Copilot AI left a comment


Pull Request Overview

Updates the performance benchmarking workflow to improve baseline artifact discovery and handling, clarify messaging when baselines are missing, and align artifact retention with repository limits.

  • Refactored baseline search to first look for release-tagged artifacts, then fall back to any recent baseline artifacts
  • Replaced on-the-fly baseline generation with clearer messaging about enabling performance regression testing
  • Reduced artifact retention from 365 to 90 days to match repository maximum

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
.github/workflows/generate-baseline.yml Updated artifact retention period from 365 to 90 days
.github/workflows/benchmarks.yml Refactored baseline discovery logic, improved fallback handling, and simplified regression testing workflow


Comment on lines +125 to +126
core.setOutput('source_type', 'release');
return;

Copilot AI Sep 2, 2025


The return statement exits the entire script block abruptly. Consider using a proper control flow mechanism or setting a flag to break out of nested loops more cleanly, as this makes the code harder to follow and maintain.
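One way to address this comment (a hypothetical sketch, not the workflow's actual code) is a labeled break, which exits both loops at once while keeping any shared post-processing after the loops on a single exit path:

```javascript
// Labeled break as an alternative to an early `return` inside nested loops.
// `findArtifact` and the run/artifact shapes are illustrative assumptions.
function findArtifact(runs, expectedName) {
  let hit = null;
  search:
  for (const run of runs) {
    for (const artifact of run.artifacts) {
      if (artifact.name === expectedName) {
        hit = { run_id: run.id, name: artifact.name };
        break search; // exits BOTH loops, unlike a bare `break`
      }
    }
  }
  // Shared post-processing (logging, core.setOutput calls) can live here,
  // running exactly once whether or not a match was found.
  return hit;
}
```

Whether this reads better than an early `return` is a style call; the label makes the multi-level exit explicit without restructuring the loops.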

Comment on lines +167 to +168
core.setOutput('source_type', 'manual');
return;

Copilot AI Sep 2, 2025


Same issue as above - the return statement exits the script abruptly from within nested loops. This pattern makes the control flow difficult to understand and maintain.


codacy-production bot commented Sep 2, 2025

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: +0.36% (target: -1.00%)

Coverage variation details:
Commit                        Coverable lines  Covered lines  Coverage
Common ancestor (ea87521)     2488             1734           69.69%
Head (13b1b51)                2488 (+0)        1743 (+9)      70.06% (+0.36%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:
Pull request (#59): 0 coverable lines, 0 covered, ∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%



codecov bot commented Sep 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.11%. Comparing base (ea87521) to head (13b1b51).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #59      +/-   ##
==========================================
+ Coverage   69.75%   70.11%   +0.36%     
==========================================
  Files          17       17              
  Lines        2486     2486              
==========================================
+ Hits         1734     1743       +9     
+ Misses        752      743       -9     
Flag Coverage Δ
unittests 70.11% <ø> (+0.36%) ⬆️

Flags with carried forward coverage won't be shown.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
.github/workflows/generate-baseline.yml (1)

144-146: Avoid asserting “repository maximum” unless enforced; consider making retention configurable.

90 days is fine, but “repository maximum” may be plan/settings-dependent. Recommend either softening the comment or wiring retention via an input/env so it can be adjusted without code changes.

Apply:

-          retention-days: 90   # Keep baselines for 90 days (repository maximum)
+          retention-days: 90   # Keep baselines ~90 days (align with repo settings; adjust if needed)
.github/workflows/benchmarks.yml (5)

93-96: Keep tag sanitization in lockstep with generate-baseline.

Regex mirrors sed in generate-baseline; add a short comment referencing that file to prevent drift when one changes.

Apply:

-                const cleanTag = release.tag_name.replace(/[^a-zA-Z0-9._-]/g, '_');
+                // Must match sanitize step in .github/workflows/generate-baseline.yml
+                const cleanTag = release.tag_name.replace(/[^a-zA-Z0-9._-]/g, '_');

190-211: Propagate origin (“release”/“manual”) to the env for later logging.

Currently BASELINE_SOURCE is always set to artifact, losing the new source_type. Export an additional BASELINE_ORIGIN.

Apply:

       - name: Extract baseline from artifact
         if: steps.find_baseline.outputs.found == 'true'
         shell: bash
         env:
           RELEASE_TAG: ${{ steps.find_baseline.outputs.release_tag }}
+          SOURCE_TYPE: ${{ steps.find_baseline.outputs.source_type }}
         run: |
           set -euo pipefail
           if [[ -f "baseline-artifact/baseline_results.txt" ]]; then
             echo "📦 Extracted baseline from artifact for release ${RELEASE_TAG:-unknown}"
             mkdir -p "benches"
             cp "baseline-artifact/baseline_results.txt" "benches/baseline_results.txt"
             echo "BASELINE_EXISTS=true" >> "$GITHUB_ENV"
             echo "BASELINE_SOURCE=artifact" >> "$GITHUB_ENV"
+            echo "BASELINE_ORIGIN=${SOURCE_TYPE:-unknown}" >> "$GITHUB_ENV"

213-221: Also set BASELINE_ORIGIN when none found.

Keeps summary consistent.

Apply:

       - name: Set baseline status if none found
         if: steps.find_baseline.outputs.found != 'true'
         shell: bash
         run: |
           set -euo pipefail
           echo "📈 No baseline artifact found for performance comparison"
           echo "BASELINE_EXISTS=false" >> "$GITHUB_ENV"
           echo "BASELINE_SOURCE=none" >> "$GITHUB_ENV"
+          echo "BASELINE_ORIGIN=none" >> "$GITHUB_ENV"

292-299: Fix message: it may be a manual baseline; use origin in logs.

Current log says “release baseline” unconditionally.

Apply:

           echo "🚀 Running performance regression test..."
-          echo "   Baseline source: ${BASELINE_SOURCE:-unknown}"
+          echo "   Baseline source: ${BASELINE_SOURCE:-unknown}"
+          echo "   Baseline origin: ${BASELINE_ORIGIN:-unknown}"
 
           # This will exit with code 1 if significant regressions are found
-          echo "   Using full comparison mode against release baseline"
+          echo "   Using full comparison mode against ${BASELINE_ORIGIN:-unknown} baseline"
           uv run benchmark-utils compare --baseline benches/baseline_results.txt \
             || uv run python -m scripts.benchmark_utils compare --baseline benches/baseline_results.txt

326-333: Include origin in the summary.

Surfaces the new source_type signal that the PR introduces.

Apply:

           echo "Baseline source: ${BASELINE_SOURCE:-none}"
+          echo "Baseline origin: ${BASELINE_ORIGIN:-unknown}"
           echo "Baseline exists: ${BASELINE_EXISTS:-false}"
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ea87521 and 2ae746c.

📒 Files selected for processing (2)
  • .github/workflows/benchmarks.yml (5 hunks)
  • .github/workflows/generate-baseline.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Code Coverage
  • GitHub Check: build (windows-latest)
  • GitHub Check: build (macos-latest)
  • GitHub Check: Codacy Security Scan

Ensures that expired artifacts are ignored when searching for performance baselines. This prevents the benchmark workflow from erroneously using outdated data.

Adds more detailed logging about the baseline source, differentiating between baselines from artifacts and releases. The origin of the baseline (release or artifact) is now included in the comparison output and summary.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
.github/workflows/benchmarks.yml (2)

124-129: Early returns inside github-script.

Previous reviews flagged early returns. Here they’re acceptable for control flow in this isolated script; no action needed unless you prefer a single-exit style.


169-171: Same note about early return.

Same rationale as above.

🧹 Nitpick comments (1)
.github/workflows/benchmarks.yml (1)

97-131: Reduce API calls by caching run artifacts once.

Current approach lists artifacts for up to 3×50 runs per release, multiplying API calls by number of releases. Fetch runs and their artifacts once, build a Set/map of artifact names → run_id, then match expectedName per release. This cuts calls from O(releases × runs) to O(runs).

Also applies to: 136-173
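The build-once cache suggested in this nitpick might look like the following sketch. `buildArtifactCache` and the data shapes are illustrative assumptions; the real step would populate the cache from paginated GitHub API responses inside actions/github-script.

```javascript
// Sketch of the O(runs) caching idea: enumerate each run's artifacts once,
// build a name → {run_id} map, then match per-release expected names against
// the map with no further API calls.
function buildArtifactCache(runs) {
  const cache = new Map();
  for (const run of runs) {               // runs assumed newest first
    for (const artifact of run.artifacts) {
      if (artifact.expired === true) continue;      // skip expired uploads
      if (!cache.has(artifact.name)) {
        // Set-if-absent: because runs are newest first, the first sighting
        // of a name is the newest artifact with that name.
        cache.set(artifact.name, { run_id: run.id });
      }
    }
  }
  return cache;
}
```

With the cache in hand, checking each release's expected artifact name is a `Map.get` lookup, so total API cost is proportional to the number of runs rather than releases × runs.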


📥 Commits

Reviewing files that changed from the base of the PR and between 2ae746c and e35f04c.

📒 Files selected for processing (2)
  • .github/workflows/benchmarks.yml (8 hunks)
  • .github/workflows/generate-baseline.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/generate-baseline.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: build (windows-latest)
  • GitHub Check: Code Coverage
  • GitHub Check: Codacy Security Scan
🔇 Additional comments (5)
.github/workflows/benchmarks.yml (5)

115-117: Good fix: exclude expired artifacts.

Guarding on a.expired !== true prevents 404s when downloading expired artifacts. This addresses prior feedback.


155-158: Good fix in fallback: also exclude expired artifacts.

Consistent with the primary search; avoids dead downloads.


192-207: Nice: propagate artifact origin.

BASELINE_SOURCE=artifact and BASELINE_ORIGIN={release|manual} improve observability.


217-226: Clear “no baseline” path.

Explicitly setting BASELINE_EXISTS=false/SOURCE=none/ORIGIN=none is clean and predictable.


299-300: Consistent origin reporting.

Surfacing Baseline origin in run and summary output is helpful for triage.

Also applies to: 335-335

Comment on lines +302 to 305
echo " Using full comparison mode against ${BASELINE_ORIGIN:-unknown} baseline"
uv run benchmark-utils compare --baseline benches/baseline_results.txt \
|| uv run python -m scripts.benchmark_utils compare --baseline benches/baseline_results.txt


⚠️ Potential issue

Bug: ‘||’ fallback masks real regressions (exit code 1).

If benchmark-utils correctly exits 1 on regression, the right-hand command runs and can overwrite the outcome. Gate the fallback on command availability, not on exit status.

Apply this diff:

-          echo "   Using full comparison mode against ${BASELINE_ORIGIN:-unknown} baseline"
-          uv run benchmark-utils compare --baseline benches/baseline_results.txt \
-            || uv run python -m scripts.benchmark_utils compare --baseline benches/baseline_results.txt
+          echo "   Using full comparison mode against ${BASELINE_ORIGIN:-unknown} baseline"
+          if uv run benchmark-utils --help >/dev/null 2>&1; then
+            uv run benchmark-utils compare --baseline benches/baseline_results.txt
+          elif uv run python - <<'PY' >/dev/null 2>&1
+import importlib; importlib.import_module("scripts.benchmark_utils")
+PY
+          then
+            uv run python -m scripts.benchmark_utils compare --baseline benches/baseline_results.txt
+          else
+            echo "❌ benchmark-utils entrypoint and module not found" >&2
+            exit 2
+          fi

Optimizes the benchmark workflow by caching baseline artifacts from recent successful workflow runs. This significantly reduces the number of API calls required to locate the correct baseline, particularly for release baselines. It also falls back to the most recent baseline artifact if no release baseline is found.

Also, the `compare` command is executed using an `if` statement that determines whether the benchmark utils is installed as a module or an entrypoint, to avoid errors.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2


📥 Commits

Reviewing files that changed from the base of the PR and between e35f04c and 7b64982.

📒 Files selected for processing (1)
  • .github/workflows/benchmarks.yml (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: build (windows-latest)
  • GitHub Check: build (ubuntu-latest)
  • GitHub Check: Code Coverage
  • GitHub Check: Codacy Security Scan
🔇 Additional comments (3)
.github/workflows/benchmarks.yml (3)

142-145: Confirm tag sanitization matches generate-baseline.

The regex must be identical to the producer to avoid misses. Please verify .github/workflows/generate-baseline.yml uses the same /[^a-zA-Z0-9._-]/g replacement.


152-158: Early returns are appropriate here.

Exiting after setting outputs keeps control flow simple in actions/github-script. No change needed.

Also applies to: 182-188


319-327: Good fix: don’t mask regressions with “||” fallback.

Tool detection before invocation preserves non-zero exit codes from real regressions. Solid.

Limits the number of workflow runs fetched to avoid excessive API calls, and ensures that only unique, non-expired baseline artifacts are cached, enhancing the reliability and efficiency of the benchmark workflow.

Fixes a bug where the oldest baseline was used: GitHub lists runs newest first, so the first match should be preferred.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
.github/workflows/benchmarks.yml (1)

320-329: Fix preserves true regression exit codes (no ‘||’ masking).

Availability-gated fallback addresses the prior concern about masking exit code 1 on regressions.

🧹 Nitpick comments (3)
.github/workflows/benchmarks.yml (3)

91-96: Cap pagination to exactly 150 items (avoid overshooting).

Currently you may collect >150 runs from the last page. Slice the current page to the remaining budget and update the counter from the slice length.

-                (response, done) => {
-                  // Limit to 150 runs total across pages
-                  count += response.data.length;
-                  if (count >= 150) done();
-                  return response.data;
-                }
+                (response, done) => {
+                  // Limit to 150 runs total across pages (no overshoot)
+                  const remaining = Math.max(0, 150 - count);
+                  if (remaining === 0) { done(); return []; }
+                  const slice = response.data.slice(0, remaining);
+                  count += slice.length;
+                  if (count >= 150) done();
+                  return slice;
+                }

218-225: Surface the baseline tag in the Summary.

Persist the release tag into GITHUB_ENV when present and show it in the summary.

           cp "baseline-artifact/baseline_results.txt" "benches/baseline_results.txt"
           echo "BASELINE_EXISTS=true" >> "$GITHUB_ENV"
           echo "BASELINE_SOURCE=artifact" >> "$GITHUB_ENV"
           echo "BASELINE_ORIGIN=${SOURCE_TYPE:-unknown}" >> "$GITHUB_ENV"
+          if [[ -n "${RELEASE_TAG:-}" ]]; then
+            echo "BASELINE_TAG=${RELEASE_TAG}" >> "$GITHUB_ENV"
+          fi
-          echo "Baseline origin: ${BASELINE_ORIGIN:-unknown}"
+          echo "Baseline origin: ${BASELINE_ORIGIN:-unknown}"
+          echo "Baseline tag: ${BASELINE_TAG:-n/a}"

Also applies to: 359-361


76-99: Optional: reduce API calls by early-hit strategy.

Precompute a Set of expected baseline names from releases, then stop artifact enumeration as soon as you cache any exact match. This can short-circuit before walking all runs.


📥 Commits

Reviewing files that changed from the base of the PR and between 7b64982 and 174b8c4.

📒 Files selected for processing (1)
  • .github/workflows/benchmarks.yml (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Code Coverage
  • GitHub Check: Codacy Security Scan
  • GitHub Check: build (windows-latest)
🔇 Additional comments (3)
.github/workflows/benchmarks.yml (3)

110-118: Good: filters expired artifacts and preserves newest cache entry.

The expired !== true guard plus “set-if-absent” avoids 404s and prevents older runs from overwriting newer ones.


148-160: Release-first match logic looks solid.

Tag sanitization mirrors the generator, outputs are complete, and early exit is appropriate here.


165-189: Sane fallback: most-recent non-expired baseline.

Clear logging and outputs; good tie-break via run_created_at.

Optimizes the retrieval of baseline artifacts by pre-computing expected release baseline names and short-circuiting the search when a release baseline is found.

This significantly reduces the number of API calls and speeds up the process, especially when release baselines are available.

Also exports the `BASELINE_TAG` to the environment, making it available for subsequent steps.

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
.github/workflows/benchmarks.yml (2)

170-182: Early return from github-script is acceptable here.

Setting outputs then return; is the right way to short-circuit in actions/github-script despite prior comments.


344-354: Great fix: no ‘||’ fallback masking regressions.

Capability checks gate the fallback runner, preserving exit code 1 on real regressions as intended.

🧹 Nitpick comments (2)
.github/workflows/benchmarks.yml (2)

246-249: Only set BASELINE_TAG for release-sourced baselines.

Avoid writing a synthetic tag like “manual-baseline” into BASELINE_TAG; reserve it for actual release tags.

Apply:

-            echo "BASELINE_ORIGIN=${SOURCE_TYPE:-unknown}" >> "$GITHUB_ENV"
-            if [[ -n "${RELEASE_TAG:-}" ]]; then
-              echo "BASELINE_TAG=${RELEASE_TAG}" >> "$GITHUB_ENV"
-            fi
+            echo "BASELINE_ORIGIN=${SOURCE_TYPE:-unknown}" >> "$GITHUB_ENV"
+            if [[ "${SOURCE_TYPE:-}" == "release" && -n "${RELEASE_TAG:-}" ]]; then
+              echo "BASELINE_TAG=${RELEASE_TAG}" >> "$GITHUB_ENV"
+            fi

326-334: Clarify retention window in guidance.

Mention 90-day artifact retention to explain why older baselines might be missing.

Apply:

           echo "⚠️ No performance baseline available for comparison."
           echo "   - No baseline artifacts found in recent workflow runs"
           echo "   - Performance regression testing requires a baseline"
           echo ""
           echo "💡 To enable performance regression testing:"
           echo "   1. Create a release tag (e.g., v0.4.3), or"
           echo "   2. Manually trigger the 'Generate Performance Baseline' workflow"
           echo "   3. Future PRs and pushes will use that baseline for comparison"
-          echo "   4. Baselines use full benchmark settings for accurate comparisons"
+          echo "   4. Baselines use full benchmark settings for accurate comparisons"
+          echo "   5. Baseline artifacts are retained for ~90 days; older ones may expire"

📥 Commits

Reviewing files that changed from the base of the PR and between 174b8c4 and 13b1b51.

📒 Files selected for processing (1)
  • .github/workflows/benchmarks.yml (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: build (windows-latest)
  • GitHub Check: build (ubuntu-latest)
  • GitHub Check: Codacy Security Scan
  • GitHub Check: Code Coverage
🔇 Additional comments (5)
.github/workflows/benchmarks.yml (5)

48-49: Confirm macOS 15 runner availability or use macos-latest.

macos-15 may not be available in all orgs/tiers. Consider macos-latest to avoid queued failures.

Apply if desired:

-    runs-on: macos-15
+    runs-on: macos-latest

94-114: Good pagination cap fix.

The aggregate counter with slice prevents over-fetching and keeps API calls bounded. Nicely done.


129-148: Artifact cache logic looks solid.

Filtering expired artifacts and only setting when absent (to preserve newest) prevents 404s and stale hits. Early-hit flag plus outer-loop break is clean.


201-212: Sane fallback to most-recent manual baseline.

Selecting by run_created_at gives deterministic behavior when no release-tagged baseline exists.


283-307: Skip logic reads correctly and is safe under set -euo pipefail.

Covers unknown baseline, identical commit, missing commit, and relevant-change detection without false exits in conditionals.

@acgetchell acgetchell merged commit fda43d7 into main Sep 2, 2025
11 checks passed
@acgetchell acgetchell deleted the test/benchmarking branch September 2, 2025 17:48
@coderabbitai coderabbitai bot mentioned this pull request Sep 13, 2025
