# InspectAI

AI-powered code review assistant — CodeT5 + CodeBERT + FAISS RAG + MCP orchestration

Phase 1 (Refined) | April 2026 | Research Prototype → Structured Codebase
InspectAI analyzes pull request diffs and generates actionable code review comments:

```
PR Diff → [CodeT5 + RAG] → Review Comment + [CodeBERT] → Severity Label → GitHub Comment
                                                              └─ If critical/major → Jira Ticket
```
```bash
# 1. Setup
git clone <repo> && cd InspectAI
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,train]"
cp .env.example .env  # Fill in INSPECTAI_GITHUB_TOKEN

# 2. Collect & train (first time)
make collect         # Fetch PR reviews from GitHub
make preprocess      # Parse diffs
make split           # Train/valid split
make train-codet5    # Fine-tune CodeT5
make train-codebert  # Fine-tune CodeBERT classifier
make index           # Build FAISS index

# 3. Serve
make serve

# 4. Test
curl -X POST http://localhost:8000/review \
  -H "Content-Type: application/json" \
  -d '{"diff": "+def foo(x):\n+ eval(x)", "use_rag": true}'
```

```
src/
├── core/          # Shared: config, embeddings, retrieval, models, logger
├── data/          # ETL: collect → preprocess → split
├── training/      # Model training: CodeT5, CodeBERT, FAISS index, evaluation
├── inference/     # Hot path: generator, classifier, pipeline
├── integrations/  # External APIs: GitHub, Jira, static analysis
├── mcp/           # Orchestration: MCP workflow
├── feedback/      # Feedback loop (Phase 2)
└── api/           # FastAPI: routes, schemas, middleware
```
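The hot path in `src/inference/` chains the generator and the classifier. A hedged sketch of that composition, with stub callables standing in for the fine-tuned checkpoints (the function and class names here are illustrative assumptions, not the actual module API):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewResult:
    comment: str
    severity: str


def review_pipeline(diff: str,
                    generate: Callable[[str], str],
                    classify: Callable[[str], str]) -> ReviewResult:
    """Run the two-stage hot path: generate a comment, then label its severity."""
    comment = generate(diff)      # CodeT5 (+ retrieved RAG context)
    severity = classify(comment)  # CodeBERT 5-class head
    return ReviewResult(comment, severity)


# Usage with stub models:
result = review_pipeline(
    "+def foo(x):\n+    eval(x)",
    generate=lambda d: "Avoid eval() on untrusted input.",
    classify=lambda c: "critical",
)
print(result.severity)  # critical
```

Keeping the two models behind plain callables is what lets the same pipeline run with real checkpoints in serving and with stubs in tests.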
Key design decisions are documented in `docs/architecture.md`.
| Endpoint | Description |
|---|---|
| `GET /health` | Service status + model info |
| `POST /review` | Submit diff → get review + severity |
| `POST /feedback` | Accept/reject feedback (Phase 2) |
Full API reference: docs/api_reference.md
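For a quick smoke test of `POST /review` without curl, a stdlib-only helper can mirror the quickstart example. The request fields (`diff`, `use_rag`) come from that example; the response schema is not shown here, so it is left to `docs/api_reference.md`:

```python
import json
import urllib.request


def build_review_request(diff: str, use_rag: bool = True,
                         url: str = "http://localhost:8000/review") -> urllib.request.Request:
    """Build the same POST request as the curl example in the quickstart."""
    payload = json.dumps({"diff": diff, "use_rag": use_rag}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_review(diff: str) -> dict:
    """Send the request; requires `make serve` to be running locally."""
    with urllib.request.urlopen(build_review_request(diff)) as resp:
        return json.load(resp)
```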
| Model | Role | Parameters |
|---|---|---|
| `Salesforce/codet5-small` | Review generation (seq2seq) | 60M |
| `microsoft/codebert-base` | Severity classification (5 classes) | 125M |
| `microsoft/codebert-base` | Text embedding for FAISS RAG | 125M |
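The embedding model feeds the FAISS index that powers RAG retrieval. As a hedged, dependency-free stand-in for the index, this sketch does exact inner-product search over L2-normalized vectors — the same ranking FAISS's `IndexFlatIP` produces on normalized embeddings (i.e. cosine similarity):

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length (no-op on the zero vector)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v


def top_k(query: list[float], corpus: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k corpus vectors most similar to the query."""
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in corpus]
    return sorted(range(len(corpus)), key=lambda i: -scores[i])[:k]


# Toy usage: vector 0 is parallel to the query, vector 2 points the other way.
hits = top_k([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], k=2)
# hits == [0, 1]
```

The real index trades this O(n) scan for FAISS's optimized search, but the ranking semantics are the same.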
| Label | Triggers Jira | Meaning |
|---|---|---|
| `critical` | ✅ | Security issues, data loss, crashes |
| `major` | ✅ | Logic bugs, performance issues |
| `minor` | ❌ | Edge cases, missing validation |
| `style` | ❌ | PEP 8, naming, formatting |
| `nit` | ❌ | Optional suggestions |
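The Jira-escalation rule above is a direct lookup on the label. A minimal sketch (the function name is illustrative, not the actual `integrations/` API):

```python
# Encodes the severity table: only critical and major open Jira tickets.
SEVERITY_LABELS = ("critical", "major", "minor", "style", "nit")
JIRA_TRIGGER_LABELS = {"critical", "major"}


def should_open_jira(label: str) -> bool:
    """Return True if a review with this severity should create a Jira ticket."""
    if label not in SEVERITY_LABELS:
        raise ValueError(f"unknown severity label: {label}")
    return label in JIRA_TRIGGER_LABELS
```

Rejecting unknown labels up front keeps a misbehaving classifier from silently skipping escalation.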
| Document | Contents |
|---|---|
| `docs/MASTER_KNOWLEDGE.md` | Start here — complete project reference |
| `docs/architecture.md` | Component internals and data flow |
| `docs/api_reference.md` | Endpoint documentation with examples |
| `docs/research_notes.md` | Paper plan, experiments, venues |
| `docs/PHASE2_PLAN.md` | What to build next |
InspectAI is a research project targeting publication at MSR/SANER/ICSE workshops.
Core contribution: the first unified system combining code review generation, severity classification, RAG-augmented context, static analysis signals, and a developer feedback loop.
Novel metric: Actionability Score, which measures how actionable a review comment is (implemented in `src/training/evaluate.py`).
See docs/research_notes.md for paper plan, experiment designs, and target venues.
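To make the idea concrete, here is a purely illustrative heuristic — NOT the actual Actionability Score in `src/training/evaluate.py`, whose definition is not reproduced here. It only shows the kind of surface signals such a metric might count:

```python
import re


def toy_actionability(comment: str) -> float:
    """Toy score in [0, 1]: fraction of 'actionable' surface signals present."""
    signals = [
        # References a concrete code element in backticks.
        bool(re.search(r"`[^`]+`", comment)),
        # Contains an imperative suggestion verb.
        bool(re.search(r"\b(use|replace|rename|add|remove|avoid)\b", comment, re.I)),
        # Points at a specific location.
        bool(re.search(r"\bline \d+\b", comment, re.I)),
    ]
    return sum(signals) / len(signals)
```

A comment like "Avoid \`eval\`, use \`ast.literal_eval\` on line 3" hits all three signals, while "Looks good" hits none — which matches the intuition the real metric formalizes.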
| Phase | Status | Description |
|---|---|---|
| Phase 0 | ✅ Complete | Original flat prototype (archived) |
| Phase 1 | ✅ Current | Refactored, modular, documented codebase |
| Phase 2A | ⬜ Next | Code hygiene (delete dead code, ruff, real .gitignore) |
| Phase 2B | ⬜ | Evaluation framework (BLEU, ROUGE, Actionability) |
| Phase 2C | ⬜ | Model upgrades (codet5-base, real severity annotations) |
| Phase 2D | ⬜ | Real integrations (GitHub webhook, Jira, static analysis) |
| Phase 2E | ⬜ | Feedback loop (SQLite store, active learning retraining) |