[CI] Add manifest diff report generation tool (Part 1/3)#3295
[CI] Add manifest diff report generation tool (Part 1/3)#3295amd-hsivasun wants to merge 3 commits intomainfrom
Conversation
This script generates HTML reports comparing submodule versions between two TheRock commits, with support for: - Superrepo component-level analysis (rocm-libraries, rocm-systems) - Submodule - Status detection: added, removed, changed, unchanged, reverted
- Add 24 tests covering pure functions, mocked API calls, and integration scenarios for the generate_manifest_diff_report.py script - Fix ImportError: use gha_query_workflow_run_by_id (correct function name) instead of gha_query_workflow_run_information Co-authored-by: Cursor <cursoragent@cursor.com>
Allow specifying a custom output directory for the generated HTML report. When --output-dir is provided, the script creates the directory (if needed) and writes TheRockReport.html there. This makes integration with CI workflows easier, as the report can be written directly to a reports/ directory for upload scripts. Co-authored-by: Cursor <cursoragent@cursor.com>
b62701d to
b84cd7c
Compare
| # ============================================================================= | ||
| # Manifest Loading | ||
| # ============================================================================= | ||
|
|
||
|
|
||
| def load_submodules_at_commit(commit_sha: str) -> dict[str, dict[str, str]]: | ||
| """Load all submodules from TheRock at a specific commit.""" | ||
| try: |
There was a problem hiding this comment.
I can review this PR in more detail after another developer has given it a first pass review (tagged @erman-gurses ), but this is still recomputing information that should already be populated by https://github.com/ROCm/TheRock/blob/main/build_tools/generate_therock_manifest.py and included in share/therock/ with this code:
TheRock/base/aux-overlay/CMakeLists.txt
Lines 39 to 59 in a658818
We should be reusing data formats across the project, not inventing new schemas for each feature.
There was a problem hiding this comment.
+1 we can use the same schema.
There was a problem hiding this comment.
Development for this script started before this was in TheRock. Thanks for pointing it out, I will refactor the code and switch to using this Thanks.
| # Pagination constants | ||
| MAX_PAGES = 20 | ||
| PER_PAGE = 100 |
There was a problem hiding this comment.
Can you add some comments where those numbers come from? Any specific reason for those numbers.
There was a problem hiding this comment.
These numbers were moreso used as a "limit" for the number of commits to fetch set to 2000. Ill add a comment to clarify.
| def generate_html_report(diff: ManifestDiff, output_dir: Path | None = None) -> Path: | ||
| """Generate TheRockReport.html from template. | ||
|
|
||
| Args: | ||
| diff: The manifest diff to generate the report from. | ||
| output_dir: Optional output directory. If provided, creates the directory | ||
| and writes the report there. Otherwise writes to TheRock root. | ||
|
|
||
| Returns: | ||
| Path to the generated report file. | ||
| """ | ||
| print("\n=== Writing HTML Report ===") | ||
|
|
||
| # Determine output path | ||
| if output_dir: | ||
| output_dir.mkdir(parents=True, exist_ok=True) | ||
| report_path = output_dir / "TheRockReport.html" | ||
| else: | ||
| report_path = HTML_REPORT_PATH | ||
|
|
||
| if not HTML_TEMPLATE_PATH.exists(): | ||
| print(f" ERROR: Template not found at {HTML_TEMPLATE_PATH}") | ||
| raise FileNotFoundError(f"Template not found at {HTML_TEMPLATE_PATH}") | ||
|
|
||
| html = HTML_TEMPLATE_PATH.read_text(encoding="utf-8") | ||
|
|
||
| # Inject TheRock commit range | ||
| html = html.replace( | ||
| '<span id="therock-start-commit">START_COMMIT</span>', | ||
| f'<span id="therock-start-commit">{diff.start_commit[:8]}</span>', | ||
| ) | ||
| html = html.replace( | ||
| '<span id="therock-end-commit">END_COMMIT</span>', | ||
| f'<span id="therock-end-commit">{diff.end_commit[:8]}</span>', | ||
| ) | ||
|
|
||
| # Inject submodule summary | ||
| status_groups = diff.get_status_groups() | ||
| submodule_summary = generate_summary_html(status_groups, "submodules") | ||
| html = html.replace( | ||
| '<div id="submodule-content"></div>', | ||
| f'<div id="submodule-content">{submodule_summary}</div>', | ||
| ) | ||
|
|
||
| # Inject superrepo summaries | ||
| superrepo_parts = [] | ||
| for superrepo in diff.superrepos.values(): | ||
| comp_groups = { | ||
| "added": superrepo.added_components, | ||
| "removed": superrepo.removed_components, | ||
| "changed": superrepo.changed_components, | ||
| "unchanged": superrepo.unchanged_components, | ||
| } | ||
| comp_summary = generate_summary_html(comp_groups, "components") | ||
| if comp_summary: | ||
| superrepo_parts.append( | ||
| f"<div class='summary-section'><h2>{superrepo.name.title()} Components</h2>" | ||
| f"{comp_summary}</div>" | ||
| ) | ||
| html = html.replace( | ||
| '<div id="superrepo-content"></div>', | ||
| f'<div id="superrepo-content">{"".join(superrepo_parts)}</div>', | ||
| ) | ||
|
|
||
| # Inject section commit badges | ||
| for repo_name in ["rocm-libraries", "rocm-systems"]: | ||
| superrepo = diff.superrepos.get(repo_name) | ||
| start_sha = superrepo.start_sha if superrepo else "" | ||
| end_sha = superrepo.end_sha if superrepo else "" | ||
| start_badge = create_commit_badge(start_sha, repo_name) if start_sha else "N/A" | ||
| end_badge = create_commit_badge(end_sha, repo_name) if end_sha else "N/A" | ||
| html = html.replace( | ||
| f'<span id="commit-diff-start-{repo_name}-superrepo"></span>', | ||
| f'<span id="commit-diff-start-{repo_name}-superrepo">{start_badge}</span>', | ||
| ) | ||
| html = html.replace( | ||
| f'<span id="commit-diff-end-{repo_name}-superrepo"></span>', | ||
| f'<span id="commit-diff-end-{repo_name}-superrepo">{end_badge}</span>', | ||
| ) | ||
|
|
||
| # Non-superrepo section badges | ||
| start_badge = create_commit_badge(diff.start_commit, THEROCK_REPO) | ||
| end_badge = create_commit_badge(diff.end_commit, THEROCK_REPO) | ||
| html = html.replace( | ||
| '<span id="commit-diff-start-non-superrepo"></span>', | ||
| f'<span id="commit-diff-start-non-superrepo">{start_badge}</span>', | ||
| ) | ||
| html = html.replace( | ||
| '<span id="commit-diff-end-non-superrepo"></span>', | ||
| f'<span id="commit-diff-end-non-superrepo">{end_badge}</span>', | ||
| ) | ||
|
|
||
| # Inject section content | ||
| for repo_name in ["rocm-libraries", "rocm-systems"]: | ||
| superrepo = diff.superrepos.get(repo_name) | ||
| if superrepo: | ||
| # Pass removed submodules if this superrepo was newly added | ||
| removed_subs = diff.removed if superrepo.status == "added" else None | ||
| content = generate_superrepo_html( | ||
| superrepo, removed_submodules=removed_subs | ||
| ) | ||
| else: | ||
| content = "" | ||
| html = html.replace( | ||
| f'<div id="commit-diff-job-content-{repo_name}-superrepo" style="margin-top:8px;">\n </div>', | ||
| f'<div id="commit-diff-job-content-{repo_name}-superrepo" style="margin-top:8px;">\n {content}\n </div>', | ||
| ) | ||
|
|
||
| non_superrepo_html = generate_non_superrepo_html(diff) | ||
| html = html.replace( | ||
| '<div id="commit-diff-job-content-non-superrepo" style="margin-top:8px;">\n </div>', | ||
| f'<div id="commit-diff-job-content-non-superrepo" style="margin-top:8px;">\n {non_superrepo_html}\n </div>', | ||
| ) | ||
|
|
||
| report_path.write_text(html, encoding="utf-8") | ||
| if not report_path.exists() or report_path.stat().st_size == 0: | ||
| raise RuntimeError(f"Failed to write HTML report to {report_path}") | ||
| print(f" Report written to: {report_path}") | ||
| return report_path |
There was a problem hiding this comment.
It is a long function - maybe we can make it as class with functions?
Some examples:
inject_commit_range(html, diff)
inject_summaries(html, diff)
inject_section_content(html, diff)
Same for the other long functions.
There was a problem hiding this comment.
Sure I will split this up.
Motivation
With submodule bumps from the monorepos, it's difficult to determine which components changed between builds. This tool generates comprehensive HTML reports showing:
This is Part 1/3 of the manifest diff report feature (split from #2161 per review feedback). This PR must be merged first before the subsequent PRs.
Technical Details
New files:
build_tools/generate_manifest_diff_report.py- Main script for generating diff reportsbuild_tools/report_template.html- HTML/CSS template for styled reportsbuild_tools/tests/generate_manifest_diff_report_test.py- Test suite (24 tests)Features:
--startand--endflags--find-last-successful <workflow_file.yml>--workflow-mode--output-dirUsage:
Basic commit comparison:
python build_tools/generate_manifest_diff_report.py --start abc123 --end def456Compare with last successful CI run:
python build_tools/generate_manifest_diff_report.py --end $GITHUB_SHA --find-last-successful ci_nightly.ymlCompare two workflow runs by their run IDs:
python build_tools/generate_manifest_diff_report.py --workflow-mode --start 21029550570 --end 21071591566Test Plan
python -m pytest build_tools/tests/generate_manifest_diff_report_test.py -v--find-last-successful ci_nightly.ymlagainst real GitHub data--workflow-modewith real workflow run IDsTest Result
All 24 tests pass. Manual testing with real GitHub API confirmed
--find-last-successfuland--workflow-modework correctly.