repo-intel

Unified static analysis for AI agents - git history, AST symbols, project metadata, doc-code sync, and (optionally) LLM-augmented file descriptors plus a 3-depth narrative summary, via a cached, incrementally-updatable Rust binary.

Part of the agentsys ecosystem.

Scan a repo once, cache the result, then query it repeatedly. The heavy lifting runs in the agent-analyzer Rust binary - this plugin provides the JavaScript interface, the skill layer that other plugins consume, and the orchestration that spawns Haiku subagents to enrich the artifact with semantic signals.

Why this plugin

Use this when you need to identify high-churn files before a refactor
Use this when evaluating bus factor risk across your codebase
Use this when finding files that always change together (coupling)
Use this when an agent (or you) needs a fast first-foothold in an unfamiliar repo (find <concept> and summary)
Use this when other plugins need repo intelligence (deslop, sync-docs, drift-detect, audit-project, next-task, onboard, can-i-help)

Installation

agentsys install repo-intel

Quick start

/repo-intel init                          # Scan repo (first time, deterministic)
/repo-intel enrich                        # OPTIONAL: spawn Haiku agents to add descriptors + summary
/repo-intel query hotspots                # Most active files, recency-weighted
/repo-intel query find "auth flow"        # Concept search across files (uses descriptors when present)
/repo-intel query summary --depth=1       # One-sentence repo description (needs enrich)
/repo-intel query ownership src/auth/     # Who owns a path
/repo-intel query bus-factor              # Knowledge distribution risk
/repo-intel query painspots               # Hot x buggy x complex
/repo-intel query entry-points            # Where execution starts (binaries, mains, scripts)
/repo-intel query stale-docs              # Docs with stale symbol references

After init, the artifact is cached as repo-intel.json in the platform state dir (.claude/, .opencode/, or .codex/). Subsequent queries are instant. Run /repo-intel update to add new commits incrementally.

Actions

Action	What it does
`init`	Full scan - git history, AST symbols, project metadata, doc-code sync
`update`	Incremental update (only new commits since last scan)
`enrich`	Spawn the `repo-intel-summarizer` and `repo-intel-weighter` Haiku subagents to populate `summary` (3 depths) and `fileDescriptors` (top-500 most-active files). Also runs the embedder when opted in (see below). Optional - all deterministic queries work without it.
`status`	Check cache staleness - commits behind, last analyzed date
`query <type>`	Run a specific analysis query
`embed status`	Show embedder install state, variant + detail, sidecar info
`embed update`	Delta re-embed of changed files only (CI-friendly)
`embed reset`	Clear cached embedder preference; next `enrich` re-prompts

Query types

Activity

Query	Description
`hotspots`	Most-changed files, recency-weighted
`coldspots`	Least-changed files (unmaintained)
`file-history <file>`	Change timeline for a specific file

Quality

Query	Description
`bugspots`	Files with highest bug-fix density (fix commits / total)
`test-gaps`	Hot source files without co-changing test files
`diff-risk <files>`	Risk score for recently changed files
`painspots`	Hotspot x (1 + bug rate) x (1 + complexity/30) - requires AST data

People

Query	Description
`ownership <path>`	Who owns a path, with staleness flags
`contributors`	All contributors with commit counts and AI ratio
`bus-factor`	Knowledge concentration risk with at-risk areas

Coupling

Query	Description
`coupling <file>`	Files that always change together

Standards

Query	Description
`norms`	Detected commit conventions (conventional, freeform, mixed)
`conventions`	Commit style prefixes and scopes

Health

Query	Description
`areas`	Directory-level health (healthy / needs-attention / at-risk)
`health`	Repo-wide health overview
`release-info`	Release cadence and tag history

LLM-augmented (requires `/repo-intel enrich` first)

Query	Description
`find <concept>`	Concept-to-file search. With descriptors, catches synonyms (worker ↔ executor); without, falls back to deterministic substring scoring across paths/symbols/imports/doc-headers.
`summary [--depth=1\|3\|10]`	Cached 3-depth narrative description: one sentence / one paragraph / one-page technical overview.

Slop targeting (consumed by `/deslop`)

Query	Description
`slop-fixes`	Pinpoint structured fix actions (Haiku tier): tracked artifacts, stale CI configs, duplicate tooling, orphan exports, empty catches, tautological tests. Each finding is self-contained for direct apply.
`slop-targets [--limit=N]`	Ranked Sonnet (file-level) and Opus (cross-file) targets. Sonnet: defensive cargo cult, bot-authored, could-be-shorter. Opus: cliché clusters, wrapper towers, single-impl traits, high-bug communities. With the embedder installed: also stylistic outliers and semantic duplicates.

Contributor guidance

Query	Description
`onboard`	Project orientation data (tech stack, key areas, pain points)
`can-i-help`	Good-first areas, test gaps, doc drift, bugspots for contributors

Documentation

Query	Description
`doc-drift`	Documentation files with low code coupling (likely stale)
`stale-docs`	Symbol-level references in docs that no longer exist in code

AST symbols

Query	Description
`symbols <file>`	Exports, imports, and definitions for a file
`dependents <symbol>`	Reverse dependency lookup - who imports this symbol
`entry-points`	Every place execution can start - binaries (`Cargo.toml [[bin]]`, `package.json bin`, `pyproject [project.scripts]`), AST `main` functions, npm `scripts`. Cargo workspace-aware.

Scoring

Hotspot score: (recent_changes * 2 + total_changes) / (total_changes + 1) - recent activity gets 2x weight.

Recency window: 90 days relative to the repo's last commit date (snapshot-relative, not wall clock).

Staleness: A contributor is stale if their last-seen date is > 90 days before the repo's last commit.

Area health:

healthy - active non-stale owner + bug fix rate < 30%
needs-attention - stale owner OR high bug rate
at-risk - stale owner AND high bug rate

Query flags

Flag	Applies to	Description
`--limit N`	most queries	Limit result rows
`--min-changes N`	test-gaps	Minimum change threshold
`--depth 1\|3\|10`	summary	Print just one depth as plain text (omit for full JSON)
`--since <date>`	init	Limit history scan to a date
`--max-commits N`	init	Cap total commits scanned

Architecture

/repo-intel query hotspots
    |
    +-- lib/repo-intel/queries.js   (thin JS wrapper)
    |       |
    |       +-- agent-analyzer repo-intel query hotspots <path>
    |                              (Rust binary, all computation)
    |
    +-- repo-intel.json            (cached in .claude/, .opencode/, or .codex/)

The JavaScript layer is intentionally thin - it resolves paths and parses JSON. All analysis logic lives in the agent-analyzer Rust binary.

Post-init enrichment (LLM-augmented signals)

/repo-intel enrich is opt-in. The Rust binary stays offline-only - the orchestration that produces semantic signals lives entirely in this plugin's JS layer plus two Haiku-backed Task subagents:

/repo-intel enrich
    |
    +-- Task: repo-intel-summarizer (haiku)
    |       reads README + manifests + top-10 hotspot heads,
    |       returns {depth1, depth3, depth10} as JSON between markers
    |       --> piped through `agent-analyzer set-summary --input -`
    |
    +-- Task: repo-intel-weighter (haiku, batched)
            reads top-500 most-active files in batches of 30,
            returns {path: descriptor} as JSON between markers
            --> piped through `agent-analyzer set-descriptors --input -`

After enrich:

query find <concept> adds a 2.5/term descriptor signal that catches semantic synonyms (worker ↔ executor, queue ↔ channel) the deterministic scorer can't see.
query summary [--depth=1|3|10] returns the cached narrative.

Cost is bounded by the top-500 cap regardless of repo size.

Embedder (opt-in)

The first time enrich runs, the skill prompts (via AskUserQuestion) for two choices and caches them in <stateDir>/sources/preference.json:

embedder — none (default) / small (BAAI/bge-small-en-v1.5 Q8 ~30 MB) / big (google/embeddinggemma-300m Q4 ~195 MB, code-aware, multilingual, recommended)
embedderDetail — compact (per-file × 128 dim) / balanced (per-function × 256 dim, recommended) / maximum (per-function × 768 dim)

When embedder !== 'none':

The separate agent-analyzer-embed binary is downloaded into ~/.agent-sh/bin/ (one-time, latest release).
Model files are fetched on first use into the fastembed cache (no bundling — keeps the binary small).
enrich runs agent-analyzer-embed update, pipes the JSON document into agent-analyzer repo-intel set-embeddings.
Embeddings live in a sidecar file <map_stem>.embeddings.bin (packed fp16, deterministic). The main JSON stays diffable.
All consumers degrade gracefully when no sidecar is present — the find and slop-targets queries return AST/graph-only results in that case.

To change variant or detail later: /repo-intel embed reset then /repo-intel enrich.

Keeping embeddings fresh in CI

The embed update action only re-embeds files whose content hash differs from the existing sidecar — fast on small PRs.

repo-intel itself is distributed via the agentsys plugin marketplace (not npm). For CI hooks where Claude Code isn't available, invoke the agent-analyzer binaries directly — they're published as standalone GitHub release assets:

# .github/workflows/repo-intel-embed.yml
name: repo-intel embed update
on:
  push:
    branches: [main]

jobs:
  embed:
    if: hashFiles('.claude/repo-intel.json') != ''
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5

      - name: Download agent-analyzer + agent-analyzer-embed
        run: |
          set -euo pipefail
          mkdir -p "$HOME/.local/bin"
          TAG=$(curl -fsSL https://api.github.com/repos/agent-sh/agent-analyzer/releases/latest | jq -r .tag_name)
          for bin in agent-analyzer agent-analyzer-embed; do
            curl -fsSL -o "$HOME/.local/bin/$bin.tar.gz" \
              "https://github.com/agent-sh/agent-analyzer/releases/download/$TAG/$bin-x86_64-unknown-linux-gnu.tar.gz"
            tar xzf "$HOME/.local/bin/$bin.tar.gz" -C "$HOME/.local/bin"
            rm "$HOME/.local/bin/$bin.tar.gz"
          done
          chmod +x "$HOME/.local/bin/agent-analyzer" "$HOME/.local/bin/agent-analyzer-embed"
          echo "$HOME/.local/bin" >> "$GITHUB_PATH"

      - name: Update embeddings sidecar
        run: |
          agent-analyzer-embed update . \
              --map-file .claude/repo-intel.json \
              --variant big \
              --detail balanced \
            | agent-analyzer repo-intel set-embeddings \
              --map-file .claude/repo-intel.json --input -

      - uses: actions/upload-artifact@v4
        with:
          name: repo-intel-embeddings
          path: |
            .claude/repo-intel.json
            .claude/repo-intel.embeddings.bin

GitLab CI / Buildkite / Jenkins users: same idea — download the two binaries from the GitHub release, pipe one into the other.

For local hooks where Claude Code IS installed (so the agentsys plugin is available), the cleanest path is to invoke through the skill via your Claude Code CLI. For users who want a standalone hook that doesn't depend on Claude Code, use the same direct-binary pattern from the workflow above.

Consumer plugins

Other plugins use repo-intel data automatically when available:

Plugin	Queries used	Purpose
deslop	slop-fixes, slop-targets, test-gaps	Pinpoint mechanical fixes; route Sonnet/Opus scans where slop is likely
sync-docs	doc-drift, stale-docs	Find stale documentation
drift-detect	doc-drift, areas	Plan vs reality comparison
audit-project	test-gaps	Prioritize review of untested code
next-task	hotspots, bugspots, bus-factor, diff-risk	Risk-aware planning and review
enhance	doc-drift	Prioritize documentation improvements
ship	health, bugspots	Pre-release health check
onboard	onboard	Project orientation data
can-i-help	can-i-help	Contributor guidance signals

Requirements

Git repository with history
agent-analyzer binary (auto-downloaded on first use)

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

repo-intel

Why this plugin

Installation

Quick start

Actions

Query types

Activity

Quality

People

Coupling

Standards

Health

LLM-augmented (requires `/repo-intel enrich` first)

Slop targeting (consumed by `/deslop`)

Contributor guidance

Documentation

AST symbols

Scoring

Query flags

Architecture

Post-init enrichment (LLM-augmented signals)

Embedder (opt-in)

Keeping embeddings fresh in CI

Consumer plugins

Requirements

License

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

repo-intel

Why this plugin

Installation

Quick start

Actions

Query types

Activity

Quality

People

Coupling

Standards

Health

LLM-augmented (requires /repo-intel enrich first)

Slop targeting (consumed by /deslop)

Contributor guidance

Documentation

AST symbols

Scoring

Query flags

Architecture

Post-init enrichment (LLM-augmented signals)

Embedder (opt-in)

Keeping embeddings fresh in CI

Consumer plugins

Requirements

License

LLM-augmented (requires `/repo-intel enrich` first)

Slop targeting (consumed by `/deslop`)