# InspectAI

AI-powered code review assistant — CodeT5 + CodeBERT + FAISS RAG + MCP orchestration

Phase 1 (Refined) | April 2026 | Research Prototype → Structured Codebase
InspectAI analyzes pull request diffs and generates actionable code review comments:

```
PR Diff → [CodeT5 + RAG] → Review Comment + [CodeBERT] → Severity Label → GitHub Comment
                                                              └─ If critical/major → Jira Ticket
```
```bash
# 1. Setup
git clone <repo> && cd InspectAI
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,train]"
cp .env.example .env  # Fill in INSPECTAI_GITHUB_TOKEN

# 2. Collect & train (first time)
make collect         # Fetch PR reviews from GitHub
make preprocess      # Parse diffs
make split           # Train/valid split
make train-codet5    # Fine-tune CodeT5
make train-codebert  # Fine-tune CodeBERT classifier
make index           # Build FAISS index

# 3. Serve
make serve

# 4. Test
curl -X POST http://localhost:8000/review \
  -H "Content-Type: application/json" \
  -d '{"diff": "+def foo(x):\n+ eval(x)", "use_rag": true}'
```

```
src/
├── core/          # Shared: config, embeddings, retrieval, models, logger
├── data/          # ETL: collect → preprocess → split
├── training/      # Model training: CodeT5, CodeBERT, FAISS index, evaluation
├── inference/     # Hot path: generator, classifier, pipeline
├── integrations/  # External APIs: GitHub, Jira, static analysis
├── mcp/           # Orchestration: MCP workflow
├── feedback/      # Feedback loop (Phase 2)
└── api/           # FastAPI: routes, schemas, middleware
```
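The hot path in `src/inference/` chains the generator and the classifier. A hedged sketch of that composition, with stub callables standing in for the fine-tuned checkpoints (the function and class names here are illustrative assumptions, not the actual module API):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewResult:
    comment: str
    severity: str


def review_pipeline(diff: str,
                    generate: Callable[[str], str],
                    classify: Callable[[str], str]) -> ReviewResult:
    """Run the two-stage hot path: generate a comment, then label its severity."""
    comment = generate(diff)      # CodeT5 (+ retrieved RAG context)
    severity = classify(comment)  # CodeBERT 5-class head
    return ReviewResult(comment, severity)


# Usage with stub models:
result = review_pipeline(
    "+def foo(x):\n+    eval(x)",
    generate=lambda d: "Avoid eval() on untrusted input.",
    classify=lambda c: "critical",
)
print(result.severity)  # critical
```

Keeping the two models behind plain callables is what lets the same pipeline run with real checkpoints in serving and with stubs in tests.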
Key design decisions are documented in `docs/architecture.md`.
| Endpoint | Description |
|---|---|
| `GET /health` | Service status + model info |
| `POST /review` | Submit diff → get review + severity |
| `POST /feedback` | Accept/reject feedback (Phase 2) |
Full API reference: docs/api_reference.md
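For a quick smoke test of `POST /review` without curl, a stdlib-only helper can mirror the quickstart example. The request fields (`diff`, `use_rag`) come from that example; the response schema is not shown here, so it is left to `docs/api_reference.md`:

```python
import json
import urllib.request


def build_review_request(diff: str, use_rag: bool = True,
                         url: str = "http://localhost:8000/review") -> urllib.request.Request:
    """Build the same POST request as the curl example in the quickstart."""
    payload = json.dumps({"diff": diff, "use_rag": use_rag}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_review(diff: str) -> dict:
    """Send the request; requires `make serve` to be running locally."""
    with urllib.request.urlopen(build_review_request(diff)) as resp:
        return json.load(resp)
```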
| Model | Role | Parameters |
|---|---|---|
| `Salesforce/codet5-small` | Review generation (seq2seq) | 60M |
| `microsoft/codebert-base` | Severity classification (5 classes) | 125M |
| `microsoft/codebert-base` | Text embedding for FAISS RAG | 125M |
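The embedding model feeds the FAISS index that powers RAG retrieval. As a hedged, dependency-free stand-in for the index, this sketch does exact inner-product search over L2-normalized vectors — the same ranking FAISS's `IndexFlatIP` produces on normalized embeddings (i.e. cosine similarity):

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length (no-op on the zero vector)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v


def top_k(query: list[float], corpus: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k corpus vectors most similar to the query."""
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in corpus]
    return sorted(range(len(corpus)), key=lambda i: -scores[i])[:k]


# Toy usage: vector 0 is parallel to the query, vector 2 points the other way.
hits = top_k([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], k=2)
# hits == [0, 1]
```

The real index trades this O(n) scan for FAISS's optimized search, but the ranking semantics are the same.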
| Label | Triggers Jira | Meaning |
|---|---|---|
| `critical` | ✅ | Security issues, data loss, crashes |
| `major` | ✅ | Logic bugs, performance issues |
| `minor` | ❌ | Edge cases, missing validation |
| `style` | ❌ | PEP 8, naming, formatting |
| `nit` | ❌ | Optional suggestions |
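The Jira-escalation rule above is a direct lookup on the label. A minimal sketch (the function name is illustrative, not the actual `integrations/` API):

```python
# Encodes the severity table: only critical and major open Jira tickets.
SEVERITY_LABELS = ("critical", "major", "minor", "style", "nit")
JIRA_TRIGGER_LABELS = {"critical", "major"}


def should_open_jira(label: str) -> bool:
    """Return True if a review with this severity should create a Jira ticket."""
    if label not in SEVERITY_LABELS:
        raise ValueError(f"unknown severity label: {label}")
    return label in JIRA_TRIGGER_LABELS
```

Rejecting unknown labels up front keeps a misbehaving classifier from silently skipping escalation.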
| Document | Contents |
|---|---|
| `docs/MASTER_KNOWLEDGE.md` | Start here — complete project reference |
| `docs/architecture.md` | Component internals and data flow |
| `docs/api_reference.md` | Endpoint documentation with examples |
| `docs/research_notes.md` | Paper plan, experiments, venues |
| `docs/PHASE2_PLAN.md` | What to build next |
InspectAI is a research project targeting publication at MSR/SANER/ICSE workshops.
Core contribution: the first unified system combining code review generation, severity classification, RAG-augmented context, static analysis signals, and a developer feedback loop.
Novel metric: Actionability Score, which measures how actionable a review comment is (implemented in `src/training/evaluate.py`).
See docs/research_notes.md for paper plan, experiment designs, and target venues.
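To make the idea concrete, here is a purely illustrative heuristic — NOT the actual Actionability Score in `src/training/evaluate.py`, whose definition is not reproduced here. It only shows the kind of surface signals such a metric might count:

```python
import re


def toy_actionability(comment: str) -> float:
    """Toy score in [0, 1]: fraction of 'actionable' surface signals present."""
    signals = [
        # References a concrete code element in backticks.
        bool(re.search(r"`[^`]+`", comment)),
        # Contains an imperative suggestion verb.
        bool(re.search(r"\b(use|replace|rename|add|remove|avoid)\b", comment, re.I)),
        # Points at a specific location.
        bool(re.search(r"\bline \d+\b", comment, re.I)),
    ]
    return sum(signals) / len(signals)
```

A comment like "Avoid \`eval\`, use \`ast.literal_eval\` on line 3" hits all three signals, while "Looks good" hits none — which matches the intuition the real metric formalizes.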
| Phase | Status | Description |
|---|---|---|
| Phase 0 | ✅ Complete | Original flat prototype (archived) |
| Phase 1 | ✅ Current | Refactored, modular, documented codebase |
| Phase 2A | ⬜ Next | Code hygiene (delete dead code, ruff, real .gitignore) |
| Phase 2B | ⬜ | Evaluation framework (BLEU, ROUGE, Actionability) |
| Phase 2C | ⬜ | Model upgrades (codet5-base, real severity annotations) |
| Phase 2D | ⬜ | Real integrations (GitHub webhook, Jira, static analysis) |
| Phase 2E | ⬜ | Feedback loop (SQLite store, active learning retraining) |