Skip to content

[CI] Add manifest diff report generation tool (Part 1/3)#3295

Open
amd-hsivasun wants to merge 3 commits intomainfrom
amd/hsivasun/add-manifest-diff-scripts
Open

[CI] Add manifest diff report generation tool (Part 1/3)#3295
amd-hsivasun wants to merge 3 commits intomainfrom
amd/hsivasun/add-manifest-diff-scripts

Conversation

@amd-hsivasun
Copy link
Contributor

@amd-hsivasun amd-hsivasun commented Feb 6, 2026

Motivation

With submodule bumps from the monorepos, it's difficult to determine which components changed between builds. This tool generates comprehensive HTML reports showing:

  • Which submodules changed, were added, removed, or reverted
  • For superrepos (rocm-libraries, rocm-systems): component-level commit analysis
  • Visual indicators and commit links for easy navigation

This is Part 1/3 of the manifest diff report feature (split from #2161 per review feedback). This PR must be merged first before the subsequent PRs.

Technical Details

New files:

  • build_tools/generate_manifest_diff_report.py - Main script for generating diff reports
  • build_tools/report_template.html - HTML/CSS template for styled reports
  • build_tools/tests/generate_manifest_diff_report_test.py - Test suite (24 tests)

Features:

  • Compare any two TheRock commits via --start and --end flags
  • Find last successful workflow run via --find-last-successful <workflow_file.yml>
  • Support workflow run IDs via --workflow-mode
  • Output to custom directory via --output-dir
  • Revert detection using GitHub compare API
  • Component-level analysis for superrepos (projects/ and shared/ directories)

Usage:

Basic commit comparison:
python build_tools/generate_manifest_diff_report.py --start abc123 --end def456

Compare with last successful CI run:
python build_tools/generate_manifest_diff_report.py --end $GITHUB_SHA --find-last-successful ci_nightly.yml

Compare two workflow runs by their run IDs:
python build_tools/generate_manifest_diff_report.py --workflow-mode --start 21029550570 --end 21071591566

Test Plan

  • Run test suite: python -m pytest build_tools/tests/generate_manifest_diff_report_test.py -v
  • Test --find-last-successful ci_nightly.yml against real GitHub data
  • Test --workflow-mode with real workflow run IDs
  • Verify generated HTML report renders correctly

Test Result

All 24 tests pass. Manual testing with real GitHub API confirmed --find-last-successful and --workflow-mode work correctly.

@amd-hsivasun amd-hsivasun changed the title Amd/hsivasun/add manifest diff scripts [CI] Add manifest diff report generation tool (Part 1/3) Feb 6, 2026
amd-hsivasun and others added 3 commits February 6, 2026 17:35
This script generates HTML reports comparing submodule versions
between two TheRock commits, with support for:
- Superrepo component-level analysis (rocm-libraries, rocm-systems)
- Submodule
- Status detection: added, removed, changed, unchanged, reverted
- Add 24 tests covering pure functions, mocked API calls, and integration
  scenarios for the generate_manifest_diff_report.py script
- Fix ImportError: use gha_query_workflow_run_by_id (correct function name)
  instead of gha_query_workflow_run_information

Co-authored-by: Cursor <cursoragent@cursor.com>
Allow specifying a custom output directory for the generated HTML
report. When --output-dir is provided, the script creates the
directory (if needed) and writes TheRockReport.html there.

This makes integration with CI workflows easier, as the report can
be written directly to a reports/ directory for upload scripts.

Co-authored-by: Cursor <cursoragent@cursor.com>
Comment on lines +463 to +470
# =============================================================================
# Manifest Loading
# =============================================================================


def load_submodules_at_commit(commit_sha: str) -> dict[str, dict[str, str]]:
"""Load all submodules from TheRock at a specific commit."""
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can review this PR in more detail after another developer has given it a first pass review (tagged @erman-gurses ), but this is still recomputing information that should already be populated by https://github.com/ROCm/TheRock/blob/main/build_tools/generate_therock_manifest.py and included in share/therock/ with this code:

# ---------------------------------------------------------------------------
# TheRock manifest generation
# ---------------------------------------------------------------------------
find_package(Python3 COMPONENTS Interpreter REQUIRED)
set(THEROCK_MANIFEST_JSON "${CMAKE_CURRENT_BINARY_DIR}/therock_manifest.json")
set(THEROCK_MANIFEST_SCRIPT "${THEROCK_SOURCE_DIR}/build_tools/generate_therock_manifest.py")
add_custom_command(
OUTPUT "${THEROCK_MANIFEST_JSON}"
COMMAND "${Python3_EXECUTABLE}" "${THEROCK_MANIFEST_SCRIPT}" -o "${THEROCK_MANIFEST_JSON}"
WORKING_DIRECTORY "${THEROCK_SOURCE_DIR}"
DEPENDS "${THEROCK_MANIFEST_SCRIPT}" "${THEROCK_SOURCE_DIR}/.gitmodules"
COMMENT "Generating TheRock manifest"
VERBATIM
)
add_custom_target(aux-overlay-manifest ALL DEPENDS "${THEROCK_MANIFEST_JSON}")
install(FILES "${THEROCK_MANIFEST_JSON}" DESTINATION "share/therock")

We should be reusing data formats across the project, not inventing new schemas for each feature.

Copy link
Contributor

@erman-gurses erman-gurses Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 we can use the same schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Development for this script started before this was in TheRock. Thanks for pointing it out, I will refactor the code and switch to using this Thanks.

Comment on lines +52 to +54
# Pagination constants
MAX_PAGES = 20
PER_PAGE = 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments where those numbers come from? Any specific reason for those numbers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These numbers were moreso used as a "limit" for the number of commits to fetch set to 2000. Ill add a comment to clarify.

Comment on lines +1106 to +1224
def generate_html_report(diff: ManifestDiff, output_dir: Path | None = None) -> Path:
"""Generate TheRockReport.html from template.

Args:
diff: The manifest diff to generate the report from.
output_dir: Optional output directory. If provided, creates the directory
and writes the report there. Otherwise writes to TheRock root.

Returns:
Path to the generated report file.
"""
print("\n=== Writing HTML Report ===")

# Determine output path
if output_dir:
output_dir.mkdir(parents=True, exist_ok=True)
report_path = output_dir / "TheRockReport.html"
else:
report_path = HTML_REPORT_PATH

if not HTML_TEMPLATE_PATH.exists():
print(f" ERROR: Template not found at {HTML_TEMPLATE_PATH}")
raise FileNotFoundError(f"Template not found at {HTML_TEMPLATE_PATH}")

html = HTML_TEMPLATE_PATH.read_text(encoding="utf-8")

# Inject TheRock commit range
html = html.replace(
'<span id="therock-start-commit">START_COMMIT</span>',
f'<span id="therock-start-commit">{diff.start_commit[:8]}</span>',
)
html = html.replace(
'<span id="therock-end-commit">END_COMMIT</span>',
f'<span id="therock-end-commit">{diff.end_commit[:8]}</span>',
)

# Inject submodule summary
status_groups = diff.get_status_groups()
submodule_summary = generate_summary_html(status_groups, "submodules")
html = html.replace(
'<div id="submodule-content"></div>',
f'<div id="submodule-content">{submodule_summary}</div>',
)

# Inject superrepo summaries
superrepo_parts = []
for superrepo in diff.superrepos.values():
comp_groups = {
"added": superrepo.added_components,
"removed": superrepo.removed_components,
"changed": superrepo.changed_components,
"unchanged": superrepo.unchanged_components,
}
comp_summary = generate_summary_html(comp_groups, "components")
if comp_summary:
superrepo_parts.append(
f"<div class='summary-section'><h2>{superrepo.name.title()} Components</h2>"
f"{comp_summary}</div>"
)
html = html.replace(
'<div id="superrepo-content"></div>',
f'<div id="superrepo-content">{"".join(superrepo_parts)}</div>',
)

# Inject section commit badges
for repo_name in ["rocm-libraries", "rocm-systems"]:
superrepo = diff.superrepos.get(repo_name)
start_sha = superrepo.start_sha if superrepo else ""
end_sha = superrepo.end_sha if superrepo else ""
start_badge = create_commit_badge(start_sha, repo_name) if start_sha else "N/A"
end_badge = create_commit_badge(end_sha, repo_name) if end_sha else "N/A"
html = html.replace(
f'<span id="commit-diff-start-{repo_name}-superrepo"></span>',
f'<span id="commit-diff-start-{repo_name}-superrepo">{start_badge}</span>',
)
html = html.replace(
f'<span id="commit-diff-end-{repo_name}-superrepo"></span>',
f'<span id="commit-diff-end-{repo_name}-superrepo">{end_badge}</span>',
)

# Non-superrepo section badges
start_badge = create_commit_badge(diff.start_commit, THEROCK_REPO)
end_badge = create_commit_badge(diff.end_commit, THEROCK_REPO)
html = html.replace(
'<span id="commit-diff-start-non-superrepo"></span>',
f'<span id="commit-diff-start-non-superrepo">{start_badge}</span>',
)
html = html.replace(
'<span id="commit-diff-end-non-superrepo"></span>',
f'<span id="commit-diff-end-non-superrepo">{end_badge}</span>',
)

# Inject section content
for repo_name in ["rocm-libraries", "rocm-systems"]:
superrepo = diff.superrepos.get(repo_name)
if superrepo:
# Pass removed submodules if this superrepo was newly added
removed_subs = diff.removed if superrepo.status == "added" else None
content = generate_superrepo_html(
superrepo, removed_submodules=removed_subs
)
else:
content = ""
html = html.replace(
f'<div id="commit-diff-job-content-{repo_name}-superrepo" style="margin-top:8px;">\n </div>',
f'<div id="commit-diff-job-content-{repo_name}-superrepo" style="margin-top:8px;">\n {content}\n </div>',
)

non_superrepo_html = generate_non_superrepo_html(diff)
html = html.replace(
'<div id="commit-diff-job-content-non-superrepo" style="margin-top:8px;">\n </div>',
f'<div id="commit-diff-job-content-non-superrepo" style="margin-top:8px;">\n {non_superrepo_html}\n </div>',
)

report_path.write_text(html, encoding="utf-8")
if not report_path.exists() or report_path.stat().st_size == 0:
raise RuntimeError(f"Failed to write HTML report to {report_path}")
print(f" Report written to: {report_path}")
return report_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a long function - maybe we can make it as class with functions?
Some examples:

inject_commit_range(html, diff)
inject_summaries(html, diff)
inject_section_content(html, diff)

Same for the other long functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will split this up.

Copy link
Contributor

@erman-gurses erman-gurses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add an example of the report?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: TODO

Development

Successfully merging this pull request may close these issues.

3 participants