Dongbumlee/container-migration-copilot
Container Migration Copilot

Migrate Kubernetes workloads to Azure AKS using GitHub Copilot's multi-agent consensus architecture


Important

This project is a proof-of-concept demonstrating the potential of GitHub Copilot's multi-agent workflow automation for complex infrastructure tasks. It showcases usage patterns for agent orchestration, sub-agent consensus, skill-based domain knowledge, and structured sign-off protocols.

This is NOT production-ready migration tooling. LLM outputs are non-deterministic — the same input may produce different results across runs. All generated YAML and reports require human review before use. Do not apply converted manifests to production clusters without thorough validation.

A plugin pack for GitHub Copilot that converts Kubernetes configurations from any source platform (EKS, GKE, OpenShift, Rancher, Tanzu, on-premises) to Azure Kubernetes Service (AKS) using multi-agent consensus — with expert sign-off at every phase.


Why This Exists

Background

The Container Migration Solution Accelerator v2 introduced a multi-agent consensus approach for Kubernetes migrations — multiple AI agents (EKS expert, AKS expert, Chief Architect, etc.) independently analyze source manifests and reach consensus through structured discussion. This consensus pattern produces significantly better results than single-agent approaches because it catches design issues that any single expert, working alone, might miss.

However, while working with v2, I identified several pain points in its architecture:

  • Slow agent discussions — Every agent reads and writes files through Azure Blob Storage MCP tools, adding 50-200ms per I/O operation. A single consensus round involves dozens of these calls.
  • Costly infrastructure — Running the full stack (Container Apps, Cosmos DB, Storage Queues, Azure OpenAI, VNet, ACR) costs ~$500-1000/month even for development.
  • Long setup time — Provisioning all Azure resources with azd up takes 30-60 minutes, plus quota approvals.
  • Serial broadcast — The GroupChat pattern sends every message to all agents sequentially, leading to up to 100 rounds per phase.

I realized that GitHub Copilot's sub-agent architecture could preserve the multi-agent consensus quality while solving all of these issues — agents run in parallel, files live on local disk, and there's no infrastructure to manage. This project is my experiment to validate that hypothesis.

What This Demonstrates

This project is a proof-of-concept that explores:

  1. Copilot agent orchestration — using migrator.agent.md as a hub that delegates to specialized sub-agents
  2. Multi-agent consensus in Copilot — parallel independent assessment → synthesis → structured sign-off (PASS/FAIL gating)
  3. SKILL.md as domain knowledge — encoding expert knowledge (platform detection, conversion rules, quality standards) in skill files
  4. Quality enforcement patterns — sign-off protocols, blocker boards, evidence-based review, retry cycles

Comparison

| v2 (Azure Cloud) | This Project (Copilot On-Premise) |
| --- | --- |
| Azure Blob Storage I/O (~50-200ms/op) | Local filesystem (~1ms/op) |
| GroupChat broadcast (serial rounds) | Parallel sub-agents |
| ~$500-1000/month infrastructure | $0 additional (Copilot subscription) |
| 30-60 min setup (azd up + quotas) | Minutes (clone + place YAMLs) |
| Up to 100 rounds per phase | 5-10 targeted sub-agent calls |

Architecture

graph TB
    User([fa:fa-user User]) --> |"Place YAMLs in migration/source/"| Migrator

    subgraph "GitHub Copilot Agent"
        Migrator["🤖 Migrator Agent<br/>(Orchestrator)"]
        
        subgraph "Phase 1: Analysis"
            A1["🔍 Platform Expert<br/>(EKS/GKE/etc.)"]
            A2["🔍 AKS Expert"]
            A3["🔍 Chief Architect"]
            AS["🔍 Synthesizer"]
        end
        
        subgraph "Phase 2: Design"
            D1["📐 Design Agent"]
            D2["📐 AKS Expert"]
            D3["📐 EKS Expert"]
            D4["📐 Chief Architect"]
        end
        
        subgraph "Phase 3: Convert"
            C1["⚙️ YAML Converter"]
            C2["⚙️ YAML Expert"]
            C3["⚙️ QA Engineer"]
            C4["⚙️ AKS Expert"]
            C5["⚙️ Azure Architect"]
            C6["⚙️ Chief Architect"]
        end
        
        subgraph "Phase 4: Documentation"
            Doc1["📝 Documentation Agent"]
            Doc2["📝 AKS Expert"]
            Doc3["📝 Azure Architect"]
            Doc4["📝 Chief Architect"]
        end
    end

    Migrator --> A1 & A2 & A3
    A1 & A2 & A3 --> AS

    Migrator --> D1
    D1 --> D2 & D3 & D4

    Migrator --> C1
    C1 --> C2 & C3 & C4 & C5 & C6

    Migrator --> Doc1
    Doc1 --> Doc2 & Doc3 & Doc4

    AS --> |analysis_report.md| D1
    D4 --> |design_report.md| C1
    C6 --> |converted YAMLs| Doc1
    Doc4 --> |migration_report.md| Output([fa:fa-file-alt Complete Migration Package])

Multi-Agent Consensus Pattern

Each phase uses an independent assessment → synthesis → sign-off consensus model:

  1. Independent Assessment — Expert sub-agents run in parallel isolated contexts (no anchoring bias)
  2. Synthesis — A synthesizer merges findings, identifies agreements and conflicts
  3. Sign-off Review — Each expert reviews the draft and produces SIGN-OFF: PASS or SIGN-OFF: FAIL
  4. Resolution — FAILs are fixed and re-reviewed (max 2 cycles), then the report is finalized

This ensures collective intelligence validates every output — a real design issue (like deletionPolicy: Delete contradicting DR goals) was caught and fixed by the Chief Architect reviewer during our E2E testing.
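The gating mechanics in step 4 can be sketched in a few lines of shell — an illustration only, not the actual agent implementation; it assumes each reviewer's verdict appears as a literal SIGN-OFF: PASS or SIGN-OFF: FAIL line in a review transcript, per the marker format described above:

```shell
# Illustrative sign-off gate: count FAIL markers in a review transcript and
# decide whether to finalize or trigger a fix/re-review cycle (max 2 in the workflow).
gate() {
  fails=$(grep -c "SIGN-OFF: FAIL" "$1" || true)
  if [ "$fails" -eq 0 ]; then
    echo "finalize"
  else
    echo "retry ($fails blocker(s))"
  fi
}
```

In the real workflow this decision is made by the orchestrator agent, but the invariant is the same: a single FAIL blocks finalization until its blocker is resolved.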


Quick Start

Prerequisites

  • GitHub Copilot (Business or Enterprise with agent mode)
  • GitHub Copilot CLI or VS Code with Copilot Chat
  • A repository with Kubernetes manifests to migrate

Option A: Install Plugin into Your Repo

# Windows
git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot
.\install.ps1 -TargetRepo "C:\path\to\your\repo"
# Linux / macOS
git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot
chmod +x install.sh
./install.sh /path/to/your/repo

This copies the agent, skills, prompts, and instructions into your repo's .github/ directory.

Option B: Use This Repo Directly

git clone https://github.com/YOUR_ORG/container-migration-copilot.git
cd container-migration-copilot

# Place your source K8s YAML files
cp /path/to/your/*.yaml migration/source/

# Open Copilot and run the migration

Run the Migration

Full pipeline (all 4 phases):

@migrator Analyze and migrate my Kubernetes files to AKS

Phase by phase (recommended for first run):

# Phase 1: Analyze source manifests
Use prompt: .github/prompts/01-analysis.prompt.md

# Phase 2: Design AKS architecture
Use prompt: .github/prompts/02-design.prompt.md

# Phase 3: Convert YAML files
Use prompt: .github/prompts/03-convert.prompt.md

# Phase 4: Generate documentation
Use prompt: .github/prompts/04-documentation.prompt.md

Supported Platforms

| Source Platform | Detection | Sample Files |
| --- | --- | --- |
| Amazon EKS | ebs.csi.aws.com, eks.amazonaws.com annotations | data/samples/eks/ |
| Google GKE | pd.csi.storage.gke.io, iam.gke.io annotations | data/samples/gke/ |
| Red Hat OpenShift | DeploymentConfig, Route, SCCs | data/samples/openshift/ |
| Rancher | cattle.io CRDs, Fleet resources | data/samples/rancher/ |
| VMware Tanzu | TanzuKubernetesCluster, Pinniped, Carvel | data/samples/tanzu/ |
| On-Premises | MetalLB, Rook-Ceph, Harbor, bare-metal ingress | data/samples/onprem/ |

Target Platform: Azure Kubernetes Service (AKS)
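The detection signals above can be probed with a plain grep sweep. This is a rough sketch, not the skills' actual detector — the signal list is abridged, and the migration/source/ path follows the Quick Start layout:

```shell
# Rough platform sniff over the source manifests; prints one line per matched platform.
# The signals are a small subset of what the platform-* skills encode.
SRC="${SRC:-migration/source}"
detect() { grep -rq "$1" "$SRC" 2>/dev/null && echo "$2"; return 0; }
detect "ebs.csi.aws.com"         "Amazon EKS"
detect "pd.csi.storage.gke.io"   "Google GKE"
detect "DeploymentConfig"        "Red Hat OpenShift"
detect "cattle.io"               "Rancher"
detect "TanzuKubernetesCluster"  "VMware Tanzu"
detect "metallb"                 "On-Premises (MetalLB)"
```

The real Phase 1 analysis goes further — it also weighs annotations, CRDs, and image registries to produce the confidence score shown in the sample analysis report below.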


Output Artifacts

After a complete migration run, the migration/ directory contains:

migration/
├── source/              # Your input K8s manifests (untouched)
├── analysis/
│   └── analysis_report.md    # Platform detection, risk register, complexity scoring
├── design/
│   ├── design_report.md      # AKS architecture, service mapping, WAF assessment
│   └── architecture.mermaid  # Visual architecture diagram
├── converted/
│   ├── aks-*.yaml            # AKS-ready YAML files with mandatory headers
│   ├── conversion_summary.md # Conversion decisions and expert sign-offs
│   └── converted_yaml_inventory.json  # Machine-readable inventory
└── docs/
    └── migration_report.md   # 19-section comprehensive report (operator-ready)

See examples/eks-dr-pipeline/ for a complete reference output from an EKS DR snapshot pipeline migration.


Project Structure

.github/
├── agents/
│   └── migrator.agent.md            # Main orchestrator — consensus protocol, quality enforcement
├── plugins/container-migration/
│   └── skills/
│       ├── k8s-analysis/            # Phase 1: platform detection, complexity scoring
│       ├── k8s-design/              # Phase 2: AKS architecture, WAF pillars
│       ├── k8s-yaml-convert/        # Phase 3: YAML transformation rules, self-test
│       ├── k8s-documentation/       # Phase 4: 19-section report structure
│       ├── platform-eks/            # EKS-specific knowledge
│       ├── platform-gke/            # GKE-specific knowledge
│       ├── platform-openshift/      # OpenShift-specific knowledge
│       ├── platform-rancher/        # Rancher-specific knowledge
│       ├── platform-tanzu/          # Tanzu-specific knowledge
│       ├── platform-onprem/         # On-prem/bare-metal knowledge
│       ├── aks-expert/              # AKS target platform expertise
│       └── yaml-inventory/          # Manifest inventory and ordering
├── prompts/                         # Phase trigger prompts (01-04)
└── instructions/                    # Auto-applied quality rules

data/samples/                        # Sample source manifests per platform
examples/eks-dr-pipeline/            # Reference E2E output
migration/                           # Working directory (your migration runs here)
install.ps1 / install.sh             # Plugin installer scripts

Quality Standards

This workflow enforces the same quality standards as the v2 Solution Accelerator:

  • Complexity scoring (1-5 scale) per resource category
  • Priority classification (P0-P3) for all findings
  • Assumptions table with 5 columns (Assumption, Rationale, Impact if Wrong, What to Confirm, Owner)
  • Mandatory YAML header on every converted file (source platform, date, author, notes, AI disclaimer)
  • Self-test validation — re-reads converted files to verify structure
  • Expert sign-off at every phase with ≥2 verification bullets required for PASS
  • Blocker resolution protocol — FAILs must include evidence and acceptance criteria
  • Multi-paragraph expert insights — narrative format, not bullet lists
  • Azure references — minimum 3-5 validated Microsoft Learn URLs per report
  • WAF 5-pillar assessment — Reliability, Security, Cost, Ops Excellence, Performance

Example E2E Results

From our test migration of an EKS DR snapshot pipeline (4 source files, 7 resources):

| Phase | Sign-offs | Result | Key Outcome |
| --- | --- | --- | --- |
| Analysis | 3/3 | ✅ All PASS | 13 risk register entries, 4/5 storage complexity |
| Design | 3/3 | ✅ All PASS | Chief Architect caught deletionPolicy: Delete → fixed to Retain |
| Convert | 5/5 | ✅ All PASS | 5 AKS-ready files, 8 resources, zero AWS remnants |
| Documentation | 3/3 | ✅ All PASS | 50KB report, 19 sections, 10 MS Learn references |

14 total sign-off reviews, 14 PASS — the consensus protocol caught a real design issue and resolved it within the workflow.

Before / After: YAML Conversion

Source (EKS):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: disaster-recovery
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete

Converted (AKS):

# Converted from Amazon EKS to Azure AKS
# Author: GitHub Copilot Container Migration
# AI GENERATED CONTENT - REVIEW BEFORE USE
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-disk-dr-sc
  annotations:
    migration.azure.com/source-platform: "eks"
provisioner: disk.csi.azure.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  skuName: StandardSSD_LRS
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-dr-vsc
  annotations:
    migration.azure.com/source-platform: "eks"
driver: disk.csi.azure.com
deletionPolicy: Retain  # Fixed by Chief Architect
parameters:
  incremental: "true"
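A conversion like the one above can be spot-checked mechanically by sweeping the output for source-platform identifiers — the check behind the "zero AWS remnants" outcome. A minimal sketch; the output path and the pattern list are assumptions (illustrative, not exhaustive):

```shell
# Flag any AWS-specific identifiers that survived conversion.
OUT="${OUT:-migration/converted}"
scan_remnants() {
  # Matches EBS CSI driver, EKS annotations, and ECR registry hostnames.
  grep -rnE 'ebs\.csi\.aws\.com|eks\.amazonaws\.com|dkr\.ecr\.' "$1" 2>/dev/null
}
if leftovers=$(scan_remnants "$OUT"); then
  echo "AWS remnants found:"
  echo "$leftovers"
else
  echo "clean"
fi
```

In the workflow this kind of verification is part of the Phase 3 self-test, where the converter re-reads its own output before expert sign-off.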

Sample Deployment Runbook (from generated report)

# Step 0: Verify AKS prerequisites
kubectl get csidriver disk.csi.azure.com
kubectl get crd volumesnapshots.snapshot.storage.k8s.io

# Step 1-5: Apply converted manifests in dependency order
kubectl apply -f aks-namespace.yaml          # Namespace first
kubectl apply -f aks-dr-classes.yaml         # StorageClass + VolumeSnapshotClass
kubectl apply -f aks-dr-writer.yaml          # PVC + Writer Pod
kubectl apply -f aks-dr-snapshot.yaml        # VolumeSnapshot (wait for data)
kubectl apply -f aks-dr-restore.yaml         # Restored PVC + Recovery Pod

# Step 6: Validate data integrity
kubectl exec appserver -n disaster-recovery -- wc -l /data/out.txt
kubectl exec appserver-recovery -n disaster-recovery -- wc -l /data/out.txt
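The "wait for data" note in the runbook can be made explicit with kubectl wait, gating the restore step on the snapshot's readyToUse status (requires kubectl ≥1.23 for --for=jsonpath; the snapshot name below is hypothetical — use whatever name aks-dr-snapshot.yaml declares):

```shell
# Block until the VolumeSnapshot reports readyToUse before applying the restore.
wait_for_snapshot() {
  kubectl wait "volumesnapshot/$1" -n "$2" \
    --for=jsonpath='{.status.readyToUse}'=true --timeout=300s
}

# Example (hypothetical snapshot name):
# wait_for_snapshot azure-disk-dr-snapshot disaster-recovery \
#   && kubectl apply -f aks-dr-restore.yaml
```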

Sample Sign-off Output (from Phase 3 Conversion)

✅ YAML Expert:      PASS — Valid syntax, proper multi-doc separators, K8s schema compliance
✅ QA Engineer:      PASS — All 7 source resources accounted for, security contexts preserved
✅ AKS Expert:       PASS — Azure Disk CSI parameters verified, StandardSSD_LRS supported
✅ Azure Architect:  PASS — WAF pillars ≥4⭐, naming conventions followed, Retain policy correct
✅ Chief Architect:  PASS — Dependency order verified, all cross-references valid, zero gaps

📂 See examples/eks-dr-pipeline/ for the complete output of all 4 phases.

Sample Analysis Report (Phase 1)

Auto-generated platform detection, file classification, and complexity scoring:

| Process ID | Platform | Confidence | File Count | Resource Count | Readiness |
| --- | --- | --- | --- | --- | --- |
| EKS-DR-001 | Amazon EKS | High (95%) | 4 | 7 | Migration-Ready with Modifications |

| Filename | Resource Types | Complexity (1–5) | Azure Mapping |
| --- | --- | --- | --- |
| ebs-kc-classes.yaml | StorageClass, VolumeSnapshotClass | 4 | disk.csi.azure.com provisioner |
| ebs-kc.yaml | PVC, Pod | 3 | Azure Disk PVC; MCR/ACR image |
| ebs-kc-snapshot.yaml | VolumeSnapshot | 2 | Azure Disk VolumeSnapshot |
| ebs-kc-restore.yaml | PVC (restore), Pod (recovery) | 3 | Snapshot dataSource; MCR image |

Sample Migration Report (Phase 4 — Executive Summary)

Auto-generated 50KB report with 19 sections. Here's the executive summary:

This report documents the complete migration of a disaster-recovery snapshot pipeline from Amazon EKS with the EBS CSI driver to Azure AKS with the Azure Disk CSI driver. The migration scope encompassed 4 source files containing 7 resources that implement a write-snapshot-restore DR chain. All resources were successfully converted to AKS-native equivalents, yielding 5 output files with 8 resources, achieving a 100% conversion success rate with zero blockers.

A key architectural decision — changing the VolumeSnapshotClass deletionPolicy from Delete to Retain — was made during the design phase by Chief Architect review to prevent accidental snapshot loss in DR scenarios.

Recommendation: Proceed to deploy the converted manifests to a staging AKS cluster, execute the end-to-end DR drill (write → snapshot → restore → verify), and upon successful validation, promote to production.

Sample Risk Register (from final report)

| ID | Priority | Finding | Status | Detail |
| --- | --- | --- | --- | --- |
| R-001 | P0 | EBS CSI provisioner incompatible with AKS | ✅ Mitigated | Replaced with disk.csi.azure.com |
| R-002 | P0 | EBS CSI snapshot driver incompatible | ✅ Mitigated | Replaced with disk.csi.azure.com |
| R-003 | P0 | DR RTO/RPO unvalidated on Azure | ⏳ Pending | Requires E2E DR drill on AKS |
| R-004 | P1 | ECR registry cross-cloud dependency | ✅ Mitigated | Image changed to MCR busybox |
| R-005 | P1 | Default namespace pollution | ✅ Mitigated | Dedicated disaster-recovery namespace |

... + 8 more entries with full mitigation tracking

FAQ

Q: What models does this work with? A: Any model available through GitHub Copilot. Tested with Claude Opus 4.6, Sonnet, and GPT-5.x family.

Q: Can I use this without GitHub Copilot CLI? A: Yes — it also works with VS Code Copilot Chat in agent mode. The .github/agents/ and prompts are compatible with both.

Q: How do I add support for a new source platform? A: Create a new skill at .github/plugins/container-migration/skills/platform-<name>/SKILL.md with detection signals and conversion mappings. See CONTRIBUTING.md.

Q: Does this handle stateful workloads with data migration? A: This tool handles manifest conversion (YAML files). Actual data migration (PV contents, databases) requires separate tooling like Velero or Azure Migrate.

Q: How long does a full migration take? A: Typically 5-15 minutes for a small workload (< 20 files). Larger workloads with more resources take proportionally longer due to consensus review rounds.


Roadmap

  • v1.0 — Plugin Pack — Agent, skills, prompts, install scripts (current release)
  • v1.1 — Copilot Extension (Skillset) — GitHub App with API endpoints wrapping migration skills, installable without copying files
  • v1.2 — Copilot Extension (Agent) — Full backend server with orchestration engine, multi-agent consensus as a service, Marketplace listing

Contributing

See CONTRIBUTING.md for guidelines on adding platforms, improving prompts, and reporting issues.

License

MIT
