Migrate Kubernetes workloads to Azure AKS using GitHub Copilot's multi-agent consensus architecture
> [!IMPORTANT]
> This project is a proof-of-concept demonstrating the potential of GitHub Copilot's multi-agent workflow automation for complex infrastructure tasks. It showcases usage patterns for agent orchestration, sub-agent consensus, skill-based domain knowledge, and structured sign-off protocols.
>
> This is NOT production-ready migration tooling. LLM outputs are non-deterministic — the same input may produce different results across runs. All generated YAML and reports require human review before use. Do not apply converted manifests to production clusters without thorough validation.
A plugin pack for GitHub Copilot that converts Kubernetes configurations from any source platform (EKS, GKE, OpenShift, Rancher, Tanzu, on-premises) to Azure Kubernetes Service (AKS) using multi-agent consensus — with expert sign-off at every phase.
The Container Migration Solution Accelerator v2 introduced a multi-agent consensus approach for Kubernetes migrations — multiple AI agents (EKS expert, AKS expert, Chief Architect, etc.) independently analyze source manifests and reach consensus through structured discussion. This consensus pattern produces significantly better results than single-agent approaches because it catches design issues that any individual expert would miss.
However, while working with v2, I identified several pain points in its architecture:
- Slow agent discussions — Every agent reads and writes files through Azure Blob Storage MCP tools, adding 50-200ms per I/O operation. A single consensus round involves dozens of these calls.
- Costly infrastructure — Running the full stack (Container Apps, Cosmos DB, Storage Queues, Azure OpenAI, VNet, ACR) costs ~$500-1000/month even for development.
- Long setup time — Provisioning all Azure resources with `azd up` takes 30-60 minutes, plus quota approvals.
- Serial broadcast — The GroupChat pattern sends every message to all agents sequentially, leading to up to 100 rounds per phase.
I realized that GitHub Copilot's sub-agent architecture could preserve the multi-agent consensus quality while solving all of these issues — agents run in parallel, files live on local disk, and there's no infrastructure to manage. This project is my experiment to validate that hypothesis.
This project is a proof-of-concept that explores:
- Copilot agent orchestration — using `migrator.agent.md` as a hub that delegates to specialized sub-agents
- Multi-agent consensus in Copilot — parallel independent assessment → synthesis → structured sign-off (PASS/FAIL gating)
- SKILL.md as domain knowledge — encoding expert knowledge (platform detection, conversion rules, quality standards) in skill files
- Quality enforcement patterns — sign-off protocols, blocker boards, evidence-based review, retry cycles
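To make the SKILL.md idea concrete, a platform skill might encode its knowledge roughly like this. The frontmatter fields and layout below are illustrative assumptions, not the project's actual skill schema:

```markdown
---
name: platform-eks
description: EKS-specific detection signals and AKS conversion mappings
---

# EKS Platform Skill

## Detection signals
- `provisioner: ebs.csi.aws.com` on StorageClass resources
- `eks.amazonaws.com/*` annotations on ServiceAccounts

## Conversion mappings
| EKS | AKS |
|---|---|
| `ebs.csi.aws.com` | `disk.csi.azure.com` |
| IAM Roles for Service Accounts (IRSA) | Microsoft Entra Workload ID |
```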
| v2 (Azure Cloud) | This Project (Copilot On-Premise) |
|---|---|
| Azure Blob Storage I/O (~50-200ms/op) | Local filesystem (~1ms/op) |
| GroupChat broadcast (serial rounds) | Parallel sub-agents |
| ~$500-1000/month infrastructure | $0 additional (Copilot subscription) |
| 30-60 min setup (azd up + quotas) | Minutes (clone + place YAMLs) |
| Up to 100 rounds per phase | 5-10 targeted sub-agent calls |
```mermaid
graph TB
User([fa:fa-user User]) --> |"Place YAMLs in migration/source/"| Migrator
subgraph "GitHub Copilot Agent"
Migrator["🤖 Migrator Agent<br/>(Orchestrator)"]
subgraph "Phase 1: Analysis"
A1["🔍 Platform Expert<br/>(EKS/GKE/etc.)"]
A2["🔍 AKS Expert"]
A3["🔍 Chief Architect"]
AS["🔍 Synthesizer"]
end
subgraph "Phase 2: Design"
D1["📐 Design Agent"]
D2["📐 AKS Expert"]
D3["📐 EKS Expert"]
D4["📐 Chief Architect"]
end
subgraph "Phase 3: Convert"
C1["⚙️ YAML Converter"]
C2["⚙️ YAML Expert"]
C3["⚙️ QA Engineer"]
C4["⚙️ AKS Expert"]
C5["⚙️ Azure Architect"]
C6["⚙️ Chief Architect"]
end
subgraph "Phase 4: Documentation"
Doc1["📝 Documentation Agent"]
Doc2["📝 AKS Expert"]
Doc3["📝 Azure Architect"]
Doc4["📝 Chief Architect"]
end
end
Migrator --> A1 & A2 & A3
A1 & A2 & A3 --> AS
Migrator --> D1
D1 --> D2 & D3 & D4
Migrator --> C1
C1 --> C2 & C3 & C4 & C5 & C6
Migrator --> Doc1
Doc1 --> Doc2 & Doc3 & Doc4
AS --> |analysis_report.md| D1
D4 --> |design_report.md| C1
C6 --> |converted YAMLs| Doc1
Doc4 --> |migration_report.md| Output([fa:fa-file-alt Complete Migration Package])
```
Each phase uses an independent assessment → synthesis → sign-off consensus model:
- Independent Assessment — Expert sub-agents run in parallel isolated contexts (no anchoring bias)
- Synthesis — A synthesizer merges findings, identifies agreements and conflicts
- Sign-off Review — Each expert reviews the draft and produces `SIGN-OFF: PASS` or `SIGN-OFF: FAIL`
- Resolution — FAILs are fixed and re-reviewed (max 2 cycles), then the report is finalized
This ensures collective intelligence validates every output — a real design issue (like deletionPolicy: Delete contradicting DR goals) was caught and fixed by the Chief Architect reviewer during our E2E testing.
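The sign-off gate described above can be sketched in a few lines of Python. This is a minimal sketch of the pattern, not the project's actual implementation; the `reviewers` and `revise` callables stand in for sub-agent calls:

```python
import re

MAX_CYCLES = 2  # FAILs trigger at most two fix-and-re-review cycles


def parse_signoff(review: str) -> bool:
    """Extract the structured PASS/FAIL verdict from an expert's review text."""
    match = re.search(r"SIGN-OFF:\s*(PASS|FAIL)", review)
    return bool(match) and match.group(1) == "PASS"


def consensus_gate(draft: str, reviewers, revise) -> str:
    """Review a draft in sign-off cycles; finalize only when every expert passes."""
    for _ in range(MAX_CYCLES + 1):
        reviews = [reviewer(draft) for reviewer in reviewers]  # parallel in practice
        if all(parse_signoff(r) for r in reviews):
            return draft  # all experts signed off; finalize the report
        failures = [r for r in reviews if not parse_signoff(r)]
        draft = revise(draft, failures)  # fix blockers, then re-review
    raise RuntimeError("Consensus not reached within retry budget")
```

The key design choice is that the verdict is a structured string rather than free prose, so the orchestrator can gate mechanically instead of interpreting sentiment.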
- GitHub Copilot (Business or Enterprise with agent mode)
- GitHub Copilot CLI or VS Code with Copilot Chat
- A repository with Kubernetes manifests to migrate
```powershell
# Windows
git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot
.\install.ps1 -TargetRepo "C:\path\to\your\repo"
```

```shell
# Linux / macOS
git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot
chmod +x install.sh
./install.sh /path/to/your/repo
```

This copies the agent, skills, prompts, and instructions into your repo's `.github/` directory.
```shell
git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot

# Place your source K8s YAML files
cp /path/to/your/*.yaml migration/source/

# Open Copilot and run the migration
```

Full pipeline (all 4 phases):

```
@migrator Analyze and migrate my Kubernetes files to AKS
```

Phase by phase (recommended for first run):
```
# Phase 1: Analyze source manifests
Use prompt: .github/prompts/01-analysis.prompt.md

# Phase 2: Design AKS architecture
Use prompt: .github/prompts/02-design.prompt.md

# Phase 3: Convert YAML files
Use prompt: .github/prompts/03-convert.prompt.md

# Phase 4: Generate documentation
Use prompt: .github/prompts/04-documentation.prompt.md
```
| Source Platform | Detection | Sample Files |
|---|---|---|
| Amazon EKS | `ebs.csi.aws.com`, `eks.amazonaws.com` annotations | `data/samples/eks/` |
| Google GKE | `pd.csi.storage.gke.io`, `iam.gke.io` annotations | `data/samples/gke/` |
| Red Hat OpenShift | `DeploymentConfig`, `Route`, SCCs | `data/samples/openshift/` |
| Rancher | `cattle.io` CRDs, Fleet resources | `data/samples/rancher/` |
| VMware Tanzu | `TanzuKubernetesCluster`, Pinniped, Carvel | `data/samples/tanzu/` |
| On-Premises | MetalLB, Rook-Ceph, Harbor, bare-metal ingress | `data/samples/onprem/` |
Target Platform: Azure Kubernetes Service (AKS)
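Platform detection boils down to scanning manifests for platform-specific markers. A minimal sketch of the idea, where the signal lists approximate the table above and the naive count-based scoring is an assumption, not the skill files' actual logic:

```python
# Marker strings per platform, approximating the detection table above.
SIGNALS = {
    "eks": ["ebs.csi.aws.com", "eks.amazonaws.com"],
    "gke": ["pd.csi.storage.gke.io", "iam.gke.io"],
    "openshift": ["DeploymentConfig", "route.openshift.io"],
    "rancher": ["cattle.io"],
    "tanzu": ["TanzuKubernetesCluster", "pinniped"],
}


def detect_platform(manifest_text: str) -> str:
    """Return the platform whose markers appear most often (naive scoring)."""
    scores = {
        platform: sum(manifest_text.count(sig) for sig in signals)
        for platform, signals in SIGNALS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

In the actual workflow the agent also weighs context (CRDs, image registries, ingress classes) to produce a confidence score rather than a bare label.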
After a complete migration run, the migration/ directory contains:
```
migration/
├── source/                         # Your input K8s manifests (untouched)
├── analysis/
│   └── analysis_report.md          # Platform detection, risk register, complexity scoring
├── design/
│   ├── design_report.md            # AKS architecture, service mapping, WAF assessment
│   └── architecture.mermaid        # Visual architecture diagram
├── converted/
│   ├── aks-*.yaml                  # AKS-ready YAML files with mandatory headers
│   ├── conversion_summary.md       # Conversion decisions and expert sign-offs
│   └── converted_yaml_inventory.json  # Machine-readable inventory
└── docs/
    └── migration_report.md         # 19-section comprehensive report (operator-ready)
```
See examples/eks-dr-pipeline/ for a complete reference output from an EKS DR snapshot pipeline migration.
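A quick way to sanity-check a finished run is to verify that the expected artifacts exist. This sketch uses the paths from the tree above; the helper itself is illustrative, not part of the tooling:

```python
from pathlib import Path

# Artifacts a complete run should produce, per the output tree above.
EXPECTED = [
    "analysis/analysis_report.md",
    "design/design_report.md",
    "converted/conversion_summary.md",
    "converted/converted_yaml_inventory.json",
    "docs/migration_report.md",
]


def missing_outputs(migration_dir: str) -> list[str]:
    """Return expected artifacts that a migration run failed to produce."""
    root = Path(migration_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```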
```
.github/
├── agents/
│   └── migrator.agent.md          # Main orchestrator — consensus protocol, quality enforcement
├── plugins/container-migration/
│   └── skills/
│       ├── k8s-analysis/          # Phase 1: platform detection, complexity scoring
│       ├── k8s-design/            # Phase 2: AKS architecture, WAF pillars
│       ├── k8s-yaml-convert/      # Phase 3: YAML transformation rules, self-test
│       ├── k8s-documentation/     # Phase 4: 19-section report structure
│       ├── platform-eks/          # EKS-specific knowledge
│       ├── platform-gke/          # GKE-specific knowledge
│       ├── platform-openshift/    # OpenShift-specific knowledge
│       ├── platform-rancher/      # Rancher-specific knowledge
│       ├── platform-tanzu/        # Tanzu-specific knowledge
│       ├── platform-onprem/       # On-prem/bare-metal knowledge
│       ├── aks-expert/            # AKS target platform expertise
│       └── yaml-inventory/        # Manifest inventory and ordering
├── prompts/                       # Phase trigger prompts (01-04)
└── instructions/                  # Auto-applied quality rules
data/samples/                      # Sample source manifests per platform
examples/eks-dr-pipeline/          # Reference E2E output
migration/                         # Working directory (your migration runs here)
install.ps1 / install.sh           # Plugin installer scripts
```
This workflow enforces the same quality standards as the v2 Solution Accelerator:
- Complexity scoring (1-5 scale) per resource category
- Priority classification (P0-P3) for all findings
- Assumptions table with 5 columns (Assumption, Rationale, Impact if Wrong, What to Confirm, Owner)
- Mandatory YAML header on every converted file (source platform, date, author, notes, AI disclaimer)
- Self-test validation — re-reads converted files to verify structure
- Expert sign-off at every phase with ≥2 verification bullets required for PASS
- Blocker resolution protocol — FAILs must include evidence and acceptance criteria
- Multi-paragraph expert insights — narrative format, not bullet lists
- Azure references — minimum 3-5 validated Microsoft Learn URLs per report
- WAF 5-pillar assessment — Reliability, Security, Cost, Ops Excellence, Performance
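The self-test idea above (re-reading converted files to verify structure) can be sketched as a header check. The required marker strings here are assumptions inferred from the example output shown later in this README, not the workflow's exact rule set:

```python
# Marker lines every converted file must carry (per the mandatory-header rule).
REQUIRED_MARKERS = [
    "# Converted from",
    "# AI GENERATED CONTENT - REVIEW BEFORE USE",
]


def header_self_test(yaml_text: str) -> list[str]:
    """Re-read a converted file and report any missing header markers."""
    head = "\n".join(yaml_text.splitlines()[:10])  # header must lead the file
    return [m for m in REQUIRED_MARKERS if m not in head]
```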
From our test migration of an EKS DR snapshot pipeline (4 source files, 7 resources):
| Phase | Sign-offs | Result | Key Outcome |
|---|---|---|---|
| Analysis | 3/3 | ✅ All PASS | 13 risk register entries, 4/5 storage complexity |
| Design | 3/3 | ✅ All PASS | Chief Architect caught deletionPolicy: Delete → fixed to Retain |
| Convert | 5/5 | ✅ All PASS | 5 AKS-ready files, 8 resources, zero AWS remnants |
| Documentation | 3/3 | ✅ All PASS | 50KB report, 19 sections, 10 MS Learn references |
14 total sign-off reviews, 14 PASS — the consensus protocol caught a real design issue and resolved it within the workflow.
Source (EKS):

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: disaster-recovery
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete
```

Converted (AKS):

```yaml
# Converted from Amazon EKS to Azure AKS
# Author: GitHub Copilot Container Migration
# AI GENERATED CONTENT - REVIEW BEFORE USE
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-disk-dr-sc
  annotations:
    migration.azure.com/source-platform: "eks"
provisioner: disk.csi.azure.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  skuName: StandardSSD_LRS
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-dr-vsc
  annotations:
    migration.azure.com/source-platform: "eks"
driver: disk.csi.azure.com
deletionPolicy: Retain  # Fixed by Chief Architect
parameters:
  incremental: "true"
```
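One of the conversion checks — verifying that no AWS-specific identifiers survive in the output — can be done mechanically. A minimal sketch; the marker list is illustrative, not exhaustive:

```python
# AWS-specific markers that must not survive conversion (an illustrative list).
AWS_REMNANTS = ["ebs.csi.aws.com", "eks.amazonaws.com", ".dkr.ecr.", "arn:aws:"]


def find_aws_remnants(yaml_text: str) -> list[str]:
    """Return AWS markers still present in a converted manifest."""
    return [marker for marker in AWS_REMNANTS if marker in yaml_text]
```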
```shell
# Step 0: Verify AKS prerequisites
kubectl get csidriver disk.csi.azure.com
kubectl get crd volumesnapshots.snapshot.storage.k8s.io

# Step 1-5: Apply converted manifests in dependency order
kubectl apply -f aks-namespace.yaml    # Namespace first
kubectl apply -f aks-dr-classes.yaml   # StorageClass + VolumeSnapshotClass
kubectl apply -f aks-dr-writer.yaml    # PVC + Writer Pod
kubectl apply -f aks-dr-snapshot.yaml  # VolumeSnapshot (wait for data)
kubectl apply -f aks-dr-restore.yaml   # Restored PVC + Recovery Pod

# Step 6: Validate data integrity
kubectl exec appserver -n disaster-recovery -- wc -l /data/out.txt
kubectl exec appserver-recovery -n disaster-recovery -- wc -l /data/out.txt
```

✅ YAML Expert: PASS — Valid syntax, proper multi-doc separators, K8s schema compliance
✅ QA Engineer: PASS — All 7 source resources accounted for, security contexts preserved
✅ AKS Expert: PASS — Azure Disk CSI parameters verified, StandardSSD_LRS supported
✅ Azure Architect: PASS — WAF pillars ≥4⭐, naming conventions followed, Retain policy correct
✅ Chief Architect: PASS — Dependency order verified, all cross-references valid, zero gaps

📂 See `examples/eks-dr-pipeline/` for the complete output of all 4 phases.
Auto-generated platform detection, file classification, and complexity scoring:
| Process ID | Platform | Confidence | File Count | Resource Count | Readiness |
|---|---|---|---|---|---|
| EKS-DR-001 | Amazon EKS | High (95%) | 4 | 7 | Migration-Ready with Modifications |
| Filename | Resource Types | Complexity (1–5) | Azure Mapping |
|---|---|---|---|
| `ebs-kc-classes.yaml` | StorageClass, VolumeSnapshotClass | 4 | `disk.csi.azure.com` provisioner |
| `ebs-kc.yaml` | PVC, Pod | 3 | Azure Disk PVC; MCR/ACR image |
| `ebs-kc-snapshot.yaml` | VolumeSnapshot | 2 | Azure Disk VolumeSnapshot |
| `ebs-kc-restore.yaml` | PVC (restore), Pod (recovery) | 3 | Snapshot dataSource; MCR image |
Auto-generated 50KB report with 19 sections. Here's the executive summary:
This report documents the complete migration of a disaster-recovery snapshot pipeline from Amazon EKS with the EBS CSI driver to Azure AKS with the Azure Disk CSI driver. The migration scope encompassed 4 source files containing 7 resources that implement a write-snapshot-restore DR chain. All resources were successfully converted to AKS-native equivalents, yielding 5 output files with 8 resources, achieving a 100% conversion success rate with zero blockers.
A key architectural decision — changing the VolumeSnapshotClass `deletionPolicy` from `Delete` to `Retain` — was made during the design phase by Chief Architect review to prevent accidental snapshot loss in DR scenarios.

Recommendation: Proceed to deploy the converted manifests to a staging AKS cluster, execute the end-to-end DR drill (write → snapshot → restore → verify), and upon successful validation, promote to production.
| ID | Priority | Finding | Status | Detail |
|---|---|---|---|---|
| R-001 | P0 | EBS CSI provisioner incompatible with AKS | ✅ Mitigated | Replaced with disk.csi.azure.com |
| R-002 | P0 | EBS CSI snapshot driver incompatible | ✅ Mitigated | Replaced with disk.csi.azure.com |
| R-003 | P0 | DR RTO/RPO unvalidated on Azure | ⏳ Pending | Requires E2E DR drill on AKS |
| R-004 | P1 | ECR registry cross-cloud dependency | ✅ Mitigated | Image changed to MCR busybox |
| R-005 | P1 | Default namespace pollution | ✅ Mitigated | Dedicated disaster-recovery namespace |
| ... | ... | + 8 more entries with full mitigation tracking | ... | ... |
Q: What models does this work with? A: Any model available through GitHub Copilot. Tested with Claude Opus 4.6, Sonnet, and GPT-5.x family.
Q: Can I use this without GitHub Copilot CLI?
A: Yes — it also works with VS Code Copilot Chat in agent mode. The .github/agents/ and prompts are compatible with both.
Q: How do I add support for a new source platform?
A: Create a new skill at .github/plugins/container-migration/skills/platform-<name>/SKILL.md with detection signals and conversion mappings. See CONTRIBUTING.md.
Q: Does this handle stateful workloads with data migration? A: This tool handles manifest conversion (YAML files). Actual data migration (PV contents, databases) requires separate tooling like Velero or Azure Migrate.
Q: How long does a full migration take? A: Typically 5-15 minutes for a small workload (< 20 files). Larger workloads with more resources take proportionally longer due to consensus review rounds.
- v1.0 — Plugin Pack — Agent, skills, prompts, install scripts (current release)
- v1.1 — Copilot Extension (Skillset) — GitHub App with API endpoints wrapping migration skills, installable without copying files
- v1.2 — Copilot Extension (Agent) — Full backend server with orchestration engine, multi-agent consensus as a service, Marketplace listing
See CONTRIBUTING.md for guidelines on adding platforms, improving prompts, and reporting issues.