📊 Executive Summary
The gh-aw-firewall repository has a mature, security-focused portfolio of 21 agentic workflows, well above average for its size, with strong coverage of security scanning, smoke testing, CI failure investigation, and documentation maintenance. The primary gaps are in meta-monitoring (no workflow health manager despite multiple recent workflow failures), issue triage (issues are auto-assigned to Copilot but never labeled or categorized first), and code quality automation (no simplifier or schema drift checker for this structurally complex codebase).
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns from the Documentation Site
From crawling the Pelis Agent Factory and its 19-part series:
Specialization over generalism: 100+ focused workflows outperform monolithic agents. This repo already does this well.
Meta-agents are essential at scale: When running 20+ workflows, you need agents that monitor other agents. Workflow Health Manager (from Pelis) created 40 issues, 14 PRs, 5 direct merges.
Triage before dispatch: Issue triage should happen before the Issue Monster dispatches work — labeling, categorizing, and assessing duplicates first produces better Copilot agent outcomes.
Observability layer: Metrics Collector + Audit Workflows = nervous system for an agent factory.
CI Coach pattern: Not just diagnosing failures (CI Doctor) but actively optimizing CI pipelines — 100% merge rate in Pelis.
Schema consistency checking: Type drift between code, config, and docs is a silent quality killer.
Breaking change awareness: Especially critical for a tool with a CLI API used by external CI pipelines.
From the githubnext/agentics Repository
The agentics repo showcases:
daily-repo-goals: Goal-driven daily automation that tracks progress toward explicit targets
daily-workflow-sync: Keeps workflow templates up to date from upstream sources
import-workflow: Slash command to import workflows from registries
link-checker: Automated broken link detection across docs
maintainer: Generalist daily maintenance agent
Comparison to This Repo
| Pattern | Pelis Factory | This Repo |
| --- | --- | --- |
| Issue triage | ✅ Issue Triage Agent | ❌ Missing |
| Issue dispatch | ✅ Issue Monster | ✅ Issue Monster |
| CI failure investigation | ✅ CI Doctor | ✅ CI Doctor |
| CI optimization | ✅ CI Coach (100% merge rate) | ❌ Missing |
| Meta-monitoring | ✅ Workflow Health Manager | ❌ Missing |
| Agent metrics | ✅ Metrics Collector | ❌ Missing |
| Code simplifier | ✅ Code Simplifier (83%) | ❌ Missing |
| Schema drift | ✅ Schema Consistency Checker | ❌ Missing |
| Breaking changes | ✅ Breaking Change Checker | ❌ Missing |
| Security scanning | ✅ Multiple | ✅ Very strong |
| Secret scanning | ✅ Daily Secrets Analysis | ✅ Hourly (3 engines!) |
| Smoke testing | ✅ Firewall tests | ✅ 4 smoke workflows |
| Test coverage | ✅ Daily Test Improver | ✅ Weekly Coverage Improver |
| Release automation | ✅ Changeset | ⚠️ Partial (update-release-notes) |
📋 Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| build-test | Run build test suite with Copilot | PR + dispatch | ✅ Well-configured |
| ci-cd-gaps-assessment | Identify CI/CD coverage gaps | Daily | ✅ Good insight tool |
| ci-doctor | Investigate CI failures | workflow_run failure | ✅ Highly effective |
| cli-flag-consistency-checker | Detect CLI doc drift | Weekly | ✅ Domain-specific gem |
| dependency-security-monitor | Vulnerability monitoring + safe updates | Daily | ✅ Creates PRs, very useful |
| doc-maintainer | Sync docs with code changes | Daily | ✅ Active, skip-if-match guarded |
| issue-duplication-detector | Detect duplicate issues | Issue opened | ✅ Cache-memory powered |
| issue-monster | Auto-assign issues to Copilot | Issue opened + hourly | ✅ Core dispatch agent |
| pelis-agent-factory-advisor | Workflow opportunity analysis | Daily (this!) | ✅ Meta-advisory |
| plan | /plan slash command for task breakdown | Slash command | ✅ Interactive planning |
| secret-digger-* | Red team secret scanning (3 engines) | Hourly | ✅ Unique security asset |
| security-guard | PR security review | PR opened/updated | ✅ Claude-powered guard |
| security-review | Daily threat modeling | Daily | ✅ Comprehensive |
| smoke-chroot | Validate chroot mode | PR (path-filtered) | ✅ Efficient path filter |
| smoke-claude | Claude engine smoke test | Every 12h + PR | ✅ Reaction-triggered too |
| smoke-codex | Codex engine smoke test | Every 12h + PR | ✅ Multi-engine coverage |
| smoke-copilot | Copilot engine smoke test | Every 12h + PR | ✅ Multi-engine coverage |
| test-coverage-improver | Improve test coverage (PRs) | Weekly | ⚠️ Coverage only 38%; needs more frequency |
| update-release-notes | Enhance release notes | Release published | ✅ Good automation |
🚀 Actionable Recommendations
P0 — Implement Immediately
🏥 Issue Triage Agent
What: Automatically label incoming issues with categories like bug, feature, documentation, question, security, performance based on content analysis.
Why: The Issue Monster currently dispatches every open issue to Copilot agents, but without prior triage, agents work on unlabeled issues with no priority signal. Multiple recent failures (#1308, #1291, #1287, #1284, #1283, #1282) would benefit from the bug label applied immediately for faster triage. In Pelis Factory, triage is the foundational workflow everything else builds on.
How: Classic triage pattern: trigger on issues: opened, analyze the title and body, apply a label, and comment with the reasoning.
Effort: Low (1-2 hours)
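The analyze-then-label step could be sketched as follows. This is a minimal keyword-based stand-in for the content analysis an agent would do, not the Pelis implementation. The label names come from the recommendation above; the keyword lists, the TriageRule shape, and the triageIssue function are hypothetical.

```typescript
// Labels come from the recommendation above; the keyword lists are
// illustrative assumptions, not rules taken from the repository.
type Label =
  | "bug"
  | "feature"
  | "documentation"
  | "question"
  | "security"
  | "performance";

interface TriageResult {
  label: Label;
  reasoning: string; // text the workflow would post as a comment
}

interface TriageRule {
  label: Label;
  keywords: string[];
}

// Order matters: the first rule with a matching keyword wins,
// so security is checked before the more generic bug keywords.
const TRIAGE_RULES: TriageRule[] = [
  { label: "security", keywords: ["vulnerability", "cve", "exploit"] },
  { label: "bug", keywords: ["workflow failed", "error", "crash", "regression"] },
  { label: "performance", keywords: ["slow", "timeout", "latency"] },
  { label: "documentation", keywords: ["docs", "readme", "typo"] },
  { label: "feature", keywords: ["feature request", "add support"] },
];

function triageIssue(title: string, body: string): TriageResult {
  const text = (title + "\n" + body).toLowerCase();
  for (const rule of TRIAGE_RULES) {
    for (const keyword of rule.keywords) {
      if (text.indexOf(keyword) !== -1) {
        return {
          label: rule.label,
          reasoning: 'matched keyword "' + keyword + '"',
        };
      }
    }
  }
  // Nothing matched: fall back to "question" so a human can re-label.
  return { label: "question", reasoning: "no keyword matched" };
}
```

A real triage agent would replace the keyword table with model-based analysis, but the output contract (one label plus a short reasoning string for the comment) can stay the same.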
📊 Workflow Health Manager
What: A meta-monitoring workflow that inspects all 21+ agentic workflows, detects unhealthy patterns (repeated failures, no-ops, high cost, zero PRs), and creates targeted issues.
Why: The open issues list shows 6+ recent workflow failures (#1308, #1291, #1287, #1284, #1283, #1282) but there's no automated agent tracking these patterns over time. In Pelis Factory, the Workflow Health Manager created 40 issues, directly caused 14 PRs, and caught problems like missing runtime files and configuration drift. At 21 workflows and growing, this is now needed.
How: Use the agentic-workflows tool to get status and recent logs. Analyze failure patterns, using cache-memory to track trends across runs. Create issues for workflows with 3+ consecutive failures or zero successful runs in 7 days.
Effort: Medium (2-4 hours)
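The issue-creation rule (3+ consecutive failures, or zero successful runs in 7 days) might look like the sketch below. The WorkflowRun shape and the findUnhealthyWorkflows function are assumptions made for illustration; the real run data would come from the agentic-workflows tool, whose output format is not shown in this report.

```typescript
// Hypothetical run record; field names are assumptions for this sketch.
type Conclusion = "success" | "failure" | "skipped";

interface WorkflowRun {
  workflow: string;
  conclusion: Conclusion;
  finishedAt: number; // epoch milliseconds
}

interface HealthFinding {
  workflow: string;
  reason: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function findUnhealthyWorkflows(runs: WorkflowRun[], now: number): HealthFinding[] {
  // Group runs by workflow name.
  const byWorkflow: { [workflow: string]: WorkflowRun[] } = {};
  for (const run of runs) {
    if (!byWorkflow[run.workflow]) byWorkflow[run.workflow] = [];
    byWorkflow[run.workflow].push(run);
  }

  const findings: HealthFinding[] = [];
  for (const workflow of Object.keys(byWorkflow).sort()) {
    // Newest run first, so the failure streak is counted from the latest run.
    const history = byWorkflow[workflow]
      .slice()
      .sort((a, b) => b.finishedAt - a.finishedAt);

    let streak = 0;
    for (const run of history) {
      if (run.conclusion !== "failure") break; // a success or skip ends the streak
      streak++;
    }

    const hasRecentSuccess = history.some(
      (run) => run.conclusion === "success" && now - run.finishedAt <= 7 * DAY_MS
    );

    if (streak >= 3) {
      findings.push({ workflow, reason: streak + " consecutive failures" });
    } else if (!hasRecentSuccess) {
      findings.push({ workflow, reason: "no successful run in the last 7 days" });
    }
  }
  return findings;
}
```

Each finding maps to at most one issue per workflow. Persisting the previous findings (for example in cache-memory) would let the workflow avoid reopening an issue for a streak it has already reported.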
P1 — Plan for Near-Term
🔍 Breaking Change Checker
What: On each PR, detect changes that could break backward compatibility: CLI flag removals/renames, API changes, Docker Compose schema changes, configuration file format changes.
Why: This is a security/firewall tool used in CI pipelines by other teams. Breaking changes to awf CLI flags or Docker Compose structure could silently break downstream workflows. The codebase has src/types.ts (WrapperConfig), src/cli.ts (flag definitions), and containers/*/Dockerfile, all prime sources of breaking changes.
How: Trigger on PRs, use bash to diff flag definitions and type signatures against main, create alert issues for backward-incompatible changes.
Effort: Medium
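Once the flag definitions on main and on the PR branch have been extracted, the comparison itself is small. The sketch below assumes a hypothetical FlagDef shape and placeholder flag names such as --alpha; it does not reflect how src/cli.ts actually declares flags.

```typescript
// Hypothetical, simplified view of one CLI flag definition.
interface FlagDef {
  name: string; // placeholder names such as "--alpha"
  takesValue: boolean;
  required: boolean;
}

type BreakingKind = "removed" | "value-shape-changed" | "new-required";

interface BreakingChange {
  flag: string;
  kind: BreakingKind;
}

function indexFlags(flags: FlagDef[]): { [name: string]: FlagDef } {
  const index: { [name: string]: FlagDef } = {};
  for (const flag of flags) index[flag.name] = flag;
  return index;
}

// base = flags on main, head = flags on the PR branch.
function detectBreakingChanges(base: FlagDef[], head: FlagDef[]): BreakingChange[] {
  const baseIndex = indexFlags(base);
  const headIndex = indexFlags(head);
  const changes: BreakingChange[] = [];

  for (const old of base) {
    const current = headIndex[old.name];
    if (!current) {
      // A rename also lands here: the old name is gone for existing callers.
      changes.push({ flag: old.name, kind: "removed" });
    } else if (current.takesValue !== old.takesValue) {
      changes.push({ flag: old.name, kind: "value-shape-changed" });
    } else if (current.required && !old.required) {
      changes.push({ flag: old.name, kind: "new-required" });
    }
  }
  for (const added of head) {
    // New optional flags are compatible; new required flags break callers.
    if (!baseIndex[added.name] && added.required) {
      changes.push({ flag: added.name, kind: "new-required" });
    }
  }
  return changes;
}
```

An empty result means the PR only added optional flags or left the flag surface unchanged, so no alert issue is needed.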
🎭 Static Analysis Report (zizmor/actionlint/poutine)
What: Daily workflow that runs security scanners (zizmor for GitHub Actions security, actionlint for syntax, poutine for supply chain) on all .lock.yml files and posts findings as a discussion.
Why: This repo is a security tool, so its own workflow security should be exemplary. The security-review workflow is comprehensive but narrative; a dedicated scanner workflow provides machine-readable, reproducible findings. Pelis Factory's Static Analysis Report created 57 discussions + 12 Zizmor security reports. The agenticworkflows-compile tool already supports --zizmor, --poutine, and --actionlint.
How: Daily schedule, run gh aw compile --zizmor --poutine --actionlint, post structured findings.
Effort: Low
⚡ CI Optimization Coach
What: Weekly analysis of CI workflow performance (build times, cache hit rates, flakiness, redundant steps) with PRs to optimize.
Why: CI Doctor fixes failures reactively; CI Coach improves proactively. With 4 smoke workflows running every 12 hours and multiple integration test suites, there are likely optimization opportunities. In Pelis Factory, CI Coach had a 100% merge rate on 9 PRs.
How: Analyze GitHub Actions timing data, identify slowest steps, propose caching improvements, parallelization, or test deduplication.
Effort: Medium
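Identifying the slowest steps reduces to grouping timing samples by step and ranking by mean duration. The sketch below assumes the timing data has already been exported into a flat list; the StepTiming shape and the slowestSteps function are hypothetical.

```typescript
// Hypothetical flat export of step durations collected across several runs.
interface StepTiming {
  step: string;
  seconds: number;
}

interface StepStats {
  step: string;
  meanSeconds: number;
  samples: number;
}

function slowestSteps(timings: StepTiming[], topN: number): StepStats[] {
  // Accumulate total duration and sample count per step.
  const totals: { [step: string]: { sum: number; count: number } } = {};
  for (const t of timings) {
    if (!totals[t.step]) totals[t.step] = { sum: 0, count: 0 };
    totals[t.step].sum += t.seconds;
    totals[t.step].count += 1;
  }

  const stats: StepStats[] = Object.keys(totals).map((step) => ({
    step,
    meanSeconds: totals[step].sum / totals[step].count,
    samples: totals[step].count,
  }));

  // Slowest mean first; ties are broken by step name for stable output.
  stats.sort(
    (a, b) => b.meanSeconds - a.meanSeconds || (a.step < b.step ? -1 : 1)
  );
  return stats.slice(0, topN);
}
```

The top entries are the natural candidates for caching or parallelization proposals. The samples count helps separate consistently slow steps from one-off outliers.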
🏗️ Code Simplifier
What: Daily agent that reviews recently-changed code and proposes simplification PRs without changing functionality.
Why: docker-manager.ts has only 18% test coverage and is the most complex file in the codebase (250 statements), and cli.ts has 0% coverage. Complexity accumulates after rapid development. The Code Simplifier in Pelis Factory achieved an 83% merge rate on its PRs. For a security tool, simpler code means a smaller audit surface.
How: Daily schedule, look at commits from last 3 days, identify TypeScript files with complexity issues, propose refactoring PRs.
Effort: Medium
P2 — Consider for Roadmap
🗂️ Schema Consistency Checker
What: Weekly check that WrapperConfig interface in src/types.ts, CLI flag definitions in src/cli.ts, documentation in docs/, and the action.yml schema all stay in sync.
Why: Type drift is common in TypeScript codebases. AGENTS.md already documents a complex config model; drift between implementation and docs could mislead contributors. Pelis Factory's Schema Checker created 55 analysis discussions.
Effort: Medium
📦 Changeset Auto-Generator
What: When a release is tagged, automatically generate a structured changeset (version bump type + categorized changelog) as a PR, complementing the existing update-release-notes workflow.
Why: update-release-notes enhances existing release notes but a full changeset workflow would propose the version bump (major/minor/patch) based on commit analysis. Pelis Factory's Changeset had 78% PR merge rate.
Effort: Low-Medium
🔭 MCP Inspector
What: Weekly validation that all MCP servers configured in workflow files (GitHub, playwright, etc.) are reachable and exposing the expected tools.
Why: The smoke workflows use MCP servers (ghcr.io/github/gh-aw-mcpg); a silent MCP misconfiguration would fail all smoke tests. The agenticworkflows-mcp-inspect tool makes this easy.
Effort: Low
🔄 Mergefest (Auto-Merge Main into PRs)
What: Hourly workflow that merges the main branch into stale PR branches to prevent large merge conflicts.
Why: The repo has multiple long-running PRs (dependency updates, docs). Keeping them current reduces review friction. Pelis Factory called this an "orchestrator workflow."
Effort: Low
P3 — Future Ideas
📈 Agent Performance Metrics Collector
Track daily performance across all workflows: cost, turn counts, merge rates, no-op rates. Use agentic-workflows.logs + cache-memory to build trend data. Pelis Factory's Metrics Collector created 41 daily discussions.
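The per-workflow aggregation could be sketched as below. The RunRecord fields are assumptions about what the logs expose, and summarizeRuns is a hypothetical helper, not part of any existing tool.

```typescript
// Hypothetical per-run record; field names are assumptions for this sketch.
interface RunRecord {
  workflow: string;
  costUsd: number;
  turns: number;
  openedPr: boolean;
  prMerged: boolean;
  noOp: boolean;
}

interface WorkflowSummary {
  runs: number;
  totalCostUsd: number;
  meanTurns: number;
  noOpRate: number; // share of runs that did nothing
  mergeRate: number | null; // merged PRs / opened PRs, null when no PR was opened
}

interface RunTotals {
  runs: number;
  cost: number;
  turns: number;
  noOps: number;
  prs: number;
  merged: number;
}

function summarizeRuns(records: RunRecord[]): { [workflow: string]: WorkflowSummary } {
  const totals: { [workflow: string]: RunTotals } = {};
  for (const r of records) {
    if (!totals[r.workflow]) {
      totals[r.workflow] = { runs: 0, cost: 0, turns: 0, noOps: 0, prs: 0, merged: 0 };
    }
    const t = totals[r.workflow];
    t.runs += 1;
    t.cost += r.costUsd;
    t.turns += r.turns;
    if (r.noOp) t.noOps += 1;
    if (r.openedPr) {
      t.prs += 1;
      if (r.prMerged) t.merged += 1;
    }
  }

  const summaries: { [workflow: string]: WorkflowSummary } = {};
  for (const workflow of Object.keys(totals)) {
    const t = totals[workflow];
    summaries[workflow] = {
      runs: t.runs,
      totalCostUsd: t.cost,
      meanTurns: t.turns / t.runs,
      noOpRate: t.noOps / t.runs,
      mergeRate: t.prs === 0 ? null : t.merged / t.prs,
    };
  }
  return summaries;
}
```

Storing one such summary per day is enough to chart trends. Keeping mergeRate as null rather than 0 stops workflows that never open PRs from looking like failures.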
🌳 Issue Arborist
Automatically link related issues as sub-issues (e.g., link all CI failure issues under a parent tracking issue). Pelis created 18 parent issues and 77 reports.
🤝 PR Review Comment Responder
Slash command /address that reads unresolved PR review comments and creates sub-issues or draft PRs to address them.
🎪 Daily Multi-Device Docs Tester
Use Playwright to test the Astro/Starlight docs site (at docs-site/) on mobile and desktop viewports. The docs site deploys to GitHub Pages — ensuring it renders correctly on all devices adds quality. Pelis achieved 100% merge rate on 2 PRs from this workflow.
📈 Maturity Assessment
Only test coverage improver; missing simplifier, duplicate detector
Meta-monitoring: 1/5. No workflow health manager, no metrics collector
Release automation: 3/5. update-release-notes exists; missing full changeset generator
Current Overall Level: 3.5/5 — "Accomplished Practitioner"
Strong security and testing automation. Core dev loop workflows in place. Gap is in meta-monitoring and code quality automation.
Target Level: 4.5/5 — "Advanced Practitioner"
Gap to close: Add workflow health manager (P0), issue triage (P0), breaking change checker (P1), CI coach (P1), and static analysis report (P1) to reach level 4.5.
🔄 Comparison with Best Practices
What This Repository Does Exceptionally Well
Multi-engine smoke testing: Running smoke tests on Claude, Codex, and Copilot simultaneously at 12h intervals is ahead of most repositories.
Hourly secret scanning: 3 engines scanning every hour is more thorough than Pelis Factory's daily scan.
Domain-specific workflows: secret-digger, cli-flag-consistency-checker, and security-guard are all tailored specifically to this security tool's domain.
Reaction-triggered workflows: Using emoji reactions (❤️ 🚀 🎉 👀) to trigger smoke tests on demand is elegant UX.
Skip-if-match guards: Preventing workflow pile-up is a mature pattern.
What Could Improve
Meta-monitoring gap: With 6 recent workflow failures in open issues, a Workflow Health Manager would provide systematic oversight rather than ad-hoc issue creation.
Issue triage before dispatch: Issue Monster would perform better if issues were labeled and triaged first.
Test coverage trajectory: 38% is above threshold but docker-manager.ts (18%) and cli.ts (0%) represent the most critical paths in a security tool. Weekly coverage improver may not be aggressive enough.
Static analysis automation: The compile tool supports --zizmor --poutine --actionlint — not using these in an automated daily report is a missed opportunity for a security-first repo.
Unique Opportunities Given This Repository's Domain
Firewall escape attempt reporting: The secret-digger and smoke workflows already test containment. A daily "Firewall Escape Report" discussion summarizing all escape attempts would be valuable public documentation of security posture.
Domain allowlist governance: An agent that reviews PRs changing domain allowlists in tests (tests/fixtures/) to ensure only legitimate domains are being whitelisted.
Container image security: Weekly analysis of GHCR image freshness and vulnerability scanning for the published squid and agent images.
Generated by Pelis Agent Factory Advisor on March 16, 2026. Sources: Pelis Agent Factory docs, githubnext/agentics, repository workflow analysis.