Shared Rust binary for static analysis in the agent-sh ecosystem. Extracts temporal, social, and behavioral signals from git history, AST-based symbol maps, project data, and doc-code sync.
Repository: https://github.com/agent-sh/agent-analyzer
`CLAUDE.md` is the project memory entrypoint for Claude Code. `AGENTS.md` is a byte-for-byte copy of `CLAUDE.md` for tools that read `AGENTS.md` (Codex CLI, OpenCode, Cursor, Cline, Copilot). Keep them identical.
- Rust workspace - analyzer-core (shared), analyzer-git-map (history), analyzer-repo-map (AST), analyzer-collectors (data), analyzer-sync-check (docs), analyzer-cli (binary)
- `ai_signatures.json` is updateable data - Add new AI tool signatures there, not in code. Embedded via `include_str!()` at compile time.
- Plain text output - No emojis, no ASCII art. Use `[OK]`, `[ERROR]`, `[WARN]`, `[CRITICAL]` for status markers.
- All JSON output to stdout - Progress and errors go to stderr. Consumers parse stdout.
- Release binaries - Compile with LTO, strip symbols (profile.release in workspace Cargo.toml)
- Task is not done until tests pass - Every feature/fix must have quality tests.
- Create PRs for non-trivial changes - No direct pushes to main.
- Always run git hooks - Never bypass pre-commit or pre-push hooks.
- No unnecessary files - Don't create summary files, plan files, audit files, or temp docs.
- Use single dash for em-dashes - In prose, use ` - ` (single dash with spaces), never `--`.
- Address all PR review comments - Even minor ones. If you disagree, respond in the review thread.
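
The output rules above (plain-text status markers, JSON on stdout, diagnostics on stderr) could be sketched roughly as follows. This is a minimal illustration using only the standard library. The `Status` enum and the `status_line` and `emit` helpers are hypothetical names, not the crate's actual API.

```rust
use std::io::{self, Write};

/// Plain-text status markers (hypothetical enum, not from analyzer-core).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Error,
    Warn,
    Critical,
}

impl Status {
    /// Maps each status to its bracketed marker.
    fn marker(self) -> &'static str {
        match self {
            Status::Ok => "[OK]",
            Status::Error => "[ERROR]",
            Status::Warn => "[WARN]",
            Status::Critical => "[CRITICAL]",
        }
    }
}

/// Formats one diagnostic line, e.g. "[WARN] shallow clone".
fn status_line(status: Status, message: &str) -> String {
    format!("{} {}", status.marker(), message)
}

/// Writes the diagnostic to `err` and the JSON payload to `out`.
/// In a real binary, `out` would be stdout and `err` would be stderr.
fn emit<O: Write, E: Write>(
    out: &mut O,
    err: &mut E,
    json: &str,
    status: Status,
    message: &str,
) -> io::Result<()> {
    writeln!(err, "{}", status_line(status, message))?;
    writeln!(out, "{}", json)
}
```

Passing the two writers in explicitly keeps the stream split easy to test: the helper can be exercised with in-memory buffers instead of the process streams.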
analyzer-core (shared types, git2 wrapper, AI detection, file walking, JSON output)
|
v
+-- analyzer-git-map (git history extraction, aggregation, queries, incremental)
+-- analyzer-repo-map (AST-based symbol mapping - stub)
+-- analyzer-collectors (project data gathering - stub)
+-- analyzer-sync-check (doc-code sync analysis - stub)
|
v
analyzer-cli (unified binary, clap dispatch, depends on all above)
crates/
analyzer-core/ # Shared library
src/
types.rs # RepoIntelData, Contributors, FileActivity, CouplingEntry, AiSignal, CommitDelta
git.rs # git2 wrapper: open_repo, walk_commits, get_commit_diff_stats, renames, deletions
ai_detect.rs # AI commit detection using embedded signature registry
ai_signatures.json # Updateable AI tool signatures (trailers, emails, patterns, bots)
walk.rs # File walking with noise filtering (lockfiles, dist, build, vendor)
output.rs # JSON serialization (pretty + compact)
analyzer-git-map/ # Git history analysis
src/
extractor.rs # extract_full(), extract_delta() using git2
aggregator.rs # create_empty_map(), merge_delta() - full spec implementation
queries.rs # hotspots, bugspots, ownership, bus_factor, areas, norms, coupling, etc.
incremental.rs # check_status(), needs_rebuild(), get_since_sha()
analyzer-repo-map/ # AST symbol extraction (Phase 2)
src/
parser.rs # Language detection, tree-sitter grammar init (6 languages)
extractor.rs # Walk files, parse, extract symbols (exports, imports, definitions, fields)
complexity.rs # Cyclomatic complexity via AST branch-point counting
conventions.rs # Naming pattern detection (snake_case, PascalCase), test framework detection
queries.rs # symbols(), dependents() queries
analyzer-collectors/ # Project metadata (Phase 3)
src/
readme.rs # README detection and heading extraction
ci.rs # CI provider detection (GitHub Actions, GitLab CI, etc.)
license.rs # License detection (SPDX from manifests + file pattern matching)
languages.rs # Language distribution by file extension
analyzer-sync-check/ # Doc-code cross-reference (Phase 4)
src/
parser.rs # Markdown parsing with pulldown-cmark, code ref extraction
matcher.rs # Symbol matching against AST symbol table, camelCase-to-snake_case
checker.rs # Staleness detection (deleted, renamed, hotspot references)
queries.rs # stale_docs(), build_doc_refs()
analyzer-cli/ # Unified CLI binary
src/
main.rs # clap dispatch
commands/
repo_intel.rs # init, update, status, query subcommands
repo_map.rs # stub
collect.rs # stub
sync_check.rs # stub
.github/workflows/
ci.yml # cargo test + clippy + fmt
release.yml # 5-target cross-platform build, GitHub release
RepoIntelData // Full JSON output artifact (repo-intel.json)
git: GitInfo // analyzedUpTo, totalCommitsAnalyzed, dates, scope, shallow
contributors: Contributors // humans (HashMap<String, HumanContributor>), bots
file_activity: HashMap<String, FileActivity> // per-file: changes, recent_changes, authors, ai metrics
coupling: HashMap<String, HashMap<String, CouplingEntry>> // co-change pairs
conventions: ConventionInfo // conventional commit prefixes, style, usesScopes
ai_attribution: AiAttribution // attributed/heuristic/none counts, per-tool breakdown, confidence
releases: Releases // tags, cadence
renames: Vec<RenameEntry> // file rename tracking
deletions: Vec<DeletionEntry> // file deletion tracking
FileActivity // Per-file metrics
changes, recent_changes, authors, created, last_changed
additions, deletions, ai_changes, ai_additions, ai_deletions
bug_fix_changes, refactor_changes, last_bug_fix
HumanContributor // commits, recent_commits, first_seen, last_seen, ai_assisted_commits
BotContributor // commits, recent_commits, first_seen, last_seen
CommitDelta // Raw extraction output (commits, renames, deletions)
CommitInfo // Parsed commit (hash, author, date, subject, body, trailers, files)
AiSignal // Detection result (detected, tool, method)
CommitSize // Tiny(<10), Small(10-50), Medium(50-200), Large(200-500), Huge(>500)

Check order (highest confidence first):
- Trailer emails (Co-Authored-By containing known AI emails)
- Author emails (known AI tool domains)
- Bot authors (exact name match: `dependabot[bot]`, `renovate[bot]`, etc.)
- Author name patterns (regex: `\(aider\)$`, `\[bot\]$`)
- Message body patterns ("Generated with Claude Code", "^aider: ")
- Trailer names (Co-Authored-By name field: Claude, Cursor, Copilot, etc.)
Signatures loaded from embedded ai_signatures.json - update that file to add new tools.
Note: AI detection confidence is "low" - metadata-based detection catches <15% of AI commits. Phase 5 will add code stylometry for higher accuracy.
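
The check order above amounts to a first-match-wins cascade. The sketch below illustrates that ordering only and is not the code in `ai_detect.rs`. The `Commit` struct, the method labels, and the email and domain values are placeholders, and the regex patterns are approximated with suffix and prefix checks so the example needs no external crates. The real signatures live in `ai_signatures.json`.

```rust
/// Minimal commit view (hypothetical; not the crate's CommitInfo).
struct Commit<'a> {
    author_name: &'a str,
    author_email: &'a str,
    body: &'a str,
    /// (name, email) pairs parsed from Co-Authored-By trailers.
    co_authors: Vec<(&'a str, &'a str)>,
}

// Placeholder data: the email and domain entries are invented for illustration.
const AI_EMAILS: &[&str] = &["ai-tool@example.com"];
const AI_EMAIL_DOMAINS: &[&str] = &["@ai-tool.example.com"];
const BOT_AUTHORS: &[&str] = &["dependabot[bot]", "renovate[bot]"];
const NAME_SUFFIXES: &[&str] = &["(aider)", "[bot]"];
const BODY_MARKERS: &[&str] = &["Generated with Claude Code"];
const TRAILER_NAMES: &[&str] = &["Claude", "Cursor", "Copilot"];

/// True if `value` equals any entry in `list`.
fn any_eq(list: &[&str], value: &str) -> bool {
    list.iter().any(|item| *item == value)
}

/// Returns a label for the first matching check, in the documented order.
/// The labels are illustrative, not the crate's method names.
fn detect_method(c: &Commit) -> Option<&'static str> {
    if c.co_authors.iter().any(|(_, email)| any_eq(AI_EMAILS, email)) {
        return Some("trailer_email");
    }
    if AI_EMAIL_DOMAINS.iter().any(|d| c.author_email.ends_with(d)) {
        return Some("author_email");
    }
    if any_eq(BOT_AUTHORS, c.author_name) {
        return Some("bot_author");
    }
    // Suffix checks stand in for the `\(aider\)$` and `\[bot\]$` regexes.
    if NAME_SUFFIXES.iter().any(|s| c.author_name.ends_with(s)) {
        return Some("author_name_pattern");
    }
    // Assumption: `^aider: ` is anchored at the start of the message text.
    if BODY_MARKERS.iter().any(|m| c.body.contains(m)) || c.body.starts_with("aider: ") {
        return Some("message_body");
    }
    // Assumption: a substring match on the trailer name field.
    if c.co_authors.iter().any(|(name, _)| TRAILER_NAMES.iter().any(|t| name.contains(t))) {
        return Some("trailer_name");
    }
    None
}
```

Because the checks return early, a commit authored by `dependabot[bot]` is reported by the exact bot-author match rather than the weaker `[bot]` suffix pattern.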
All queries operate on the cached RepoIntelData - no git commands needed.
| Function | Returns | Notes |
|---|---|---|
| `hotspots(map, _months, limit)` | `Vec<HotspotEntry>` | Recency-weighted score, sorted by score. `months` reserved (90-day window is snapshot-relative) |
| `coldspots(map, _months)` | `Vec<ColdspotEntry>` | Sorted by `last_changed` ascending. `months` reserved |
| `bugspots(map, limit)` | `Vec<BugspotEntry>` | Bug-fix density (fixes/changes ratio) |
| `coupling(map, file, human_only)` | `Vec<CouplingResult>` | Bidirectional lookup |
| `ownership(map, path)` | `OwnershipResult` | With staleness, `bus_factor_risk` |
| `bus_factor(map, adjust_for_ai)` | `usize` | People covering 80% of commits |
| `bus_factor_detailed(map, adjust_for_ai)` | `BusFactorResult` | With `critical_owners`, `at_risk_areas` |
| `norms(map)` | `NormsResult` | Commit conventions (Phase 2 adds code norms) |
| `areas(map)` | `Vec<AreaEntry>` | Directory-level health with `total_symbols`, `complexity_median`, `complexity_max` from Phase 2 |
| `painspots(map, limit)` | `Vec<PainspotEntry>` | Files ranked by hotspot × (1+bug_rate) × (1+complexity/30) - requires Phase 2 for full score |
| `contributors(map, months)` | `Vec<ContributorEntry>` | Sorted by commit count; includes `recent_activity` (90-day commits) and `stale: bool` |
| `ai_ratio(map, path_filter)` | `AiRatioResult` | Repo-wide or per-path |
| `release_info(map)` | `ReleaseInfo` | Cadence, last release, unreleased |
| `health(map)` | `HealthResult` | Active, bus_factor, frequency, ai_ratio |
| `file_history(map, path)` | `Option<&FileActivity>` | Single file lookup |
| `conventions(map)` | `ConventionResult` | Style, prefixes, scopes |
| `test_gaps(map, min_changes, limit)` | `Vec<TestGapEntry>` | Hot files with no co-changing test file |
| `diff_risk(map, files)` | `Vec<DiffRiskEntry>` | Score file list by composite risk |
| `doc_drift(map, limit)` | `Vec<DocDriftEntry>` | Doc files with low code coupling |
| `recent_ai(map, limit)` | `Vec<RecentAiEntry>` | Files with recent AI changes |
| `onboard(map)` | `OnboardResult` | Newcomer-oriented repo summary (structure, key areas, pain points) |
| `can_i_help(map)` | `CanIHelpResult` | Contributor guidance (good-first areas, needs-help areas) |
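
Two of the scores in the table are simple enough to illustrate directly. The sketch below is one plausible reading, not the code in `queries.rs`. It treats the bus factor as the smallest number of top contributors whose commits reach 80% of the total, ignores `adjust_for_ai`, and takes the painspot inputs as plain numbers. The function signatures are simplified stand-ins for the `map`-based API.

```rust
/// Smallest number of top contributors whose commits cover at least 80% of
/// all commits. Assumption: contributors are ranked by commit count.
fn bus_factor(commit_counts: &[u64]) -> usize {
    let total: u64 = commit_counts.iter().sum();
    if total == 0 {
        return 0;
    }
    let mut sorted = commit_counts.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // descending
    let mut covered = 0u64;
    for (index, count) in sorted.iter().enumerate() {
        covered += count;
        // Integer form of covered / total >= 0.8.
        if covered * 5 >= total * 4 {
            return index + 1;
        }
    }
    sorted.len()
}

/// Painspot score as written in the table:
/// hotspot × (1 + bug_rate) × (1 + complexity / 30).
fn painspot_score(hotspot: f64, bug_rate: f64, complexity: f64) -> f64 {
    hotspot * (1.0 + bug_rate) * (1.0 + complexity / 30.0)
}
```

For example, commit counts of 50, 30, 10 and 10 give a bus factor of 2, since the top two contributors account for exactly 80% of the commits.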
- Recency window: 90 days relative to repo's `last_commit_date` (snapshot-relative, not wall clock)
- recent_changes/recent_commits: Counted within the 90-day window
- Stale: A contributor is stale if their `last_seen` is >90 days before `last_commit_date`
- Hotspot score: `(recent_changes * 2 + total_changes) / (total_changes + 1)`
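
The recency rules above can be worked through in a few lines. This sketch assumes timestamps are Unix seconds and that a change exactly 90 days old still counts as recent. Neither detail is stated in this document, and the function names are illustrative.

```rust
const RECENCY_WINDOW_DAYS: i64 = 90;
const SECONDS_PER_DAY: i64 = 86_400;

/// Hotspot score: (recent_changes * 2 + total_changes) / (total_changes + 1).
fn hotspot_score(recent_changes: u32, total_changes: u32) -> f64 {
    let recent = f64::from(recent_changes);
    let total = f64::from(total_changes);
    (recent * 2.0 + total) / (total + 1.0)
}

/// True if `timestamp` falls inside the 90-day window ending at the repo's
/// last commit (snapshot-relative, not wall clock).
fn is_recent(timestamp: i64, last_commit_date: i64) -> bool {
    last_commit_date - timestamp <= RECENCY_WINDOW_DAYS * SECONDS_PER_DAY
}

/// True if a contributor was last seen more than 90 days before the last commit.
fn is_stale(last_seen: i64, last_commit_date: i64) -> bool {
    last_commit_date - last_seen > RECENCY_WINDOW_DAYS * SECONDS_PER_DAY
}
```

A file with 10 total changes, 5 of them recent, scores (5 * 2 + 10) / (10 + 1) = 20 / 11, about 1.82. The same file with no recent changes scores 10 / 11, about 0.91, so recent activity roughly doubles its rank weight here.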
Excluded from coupling and hotspot analysis:
- `package-lock.json`, `yarn.lock`, `Cargo.lock`, `go.sum`, `pnpm-lock.yaml`
- `.min.js`, `.min.css`
- `dist/`, `build/`, `vendor/`
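
A filter for the exclusions above might look like the sketch below. It is an illustration, not the logic in `walk.rs`. The `is_noise` name is hypothetical, and it assumes the directory names match at any depth of a `/`-separated relative path.

```rust
use std::path::{Component, Path};

const LOCKFILES: [&str; 5] = [
    "package-lock.json",
    "yarn.lock",
    "Cargo.lock",
    "go.sum",
    "pnpm-lock.yaml",
];
const MINIFIED_SUFFIXES: [&str; 2] = [".min.js", ".min.css"];
const NOISE_DIRS: [&str; 3] = ["dist", "build", "vendor"];

/// Returns true if `path` should be left out of coupling and hotspot analysis.
fn is_noise(path: &str) -> bool {
    let p = Path::new(path);
    let file_name = p.file_name().and_then(|n| n.to_str()).unwrap_or("");

    // Lockfiles are matched by exact file name.
    if LOCKFILES.iter().any(|l| *l == file_name) {
        return true;
    }
    // Minified assets are matched by suffix.
    if MINIFIED_SUFFIXES.iter().any(|s| file_name.ends_with(s)) {
        return true;
    }
    // Generated or vendored directories are matched on directory components
    // only, so a source file named `build.rs` is not excluded.
    p.parent().map_or(false, |dir| {
        dir.components().any(|c| match c {
            Component::Normal(name) => name
                .to_str()
                .map_or(false, |n| NOISE_DIRS.iter().any(|d| *d == n)),
            _ => false,
        })
    })
}
```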
agent-analyzer --version
agent-analyzer repo-intel init [--max-commits=N] <path>
agent-analyzer repo-intel update --map-file=<file> <path>
agent-analyzer repo-intel status --map-file=<file> <path>
agent-analyzer repo-intel query hotspots [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query coldspots [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query bugspots [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query coupling <file> --map-file=<file> <path>
agent-analyzer repo-intel query ownership <file> --map-file=<file> <path>
agent-analyzer repo-intel query bus-factor [--adjust-for-ai] --map-file=<file> <path>
agent-analyzer repo-intel query norms --map-file=<file> <path>
agent-analyzer repo-intel query areas --map-file=<file> <path>
agent-analyzer repo-intel query contributors [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query ai-ratio [--path-filter=<path>] --map-file=<file> <path>
agent-analyzer repo-intel query release-info --map-file=<file> <path>
agent-analyzer repo-intel query health --map-file=<file> <path>
agent-analyzer repo-intel query file-history <file> --map-file=<file> <path>
agent-analyzer repo-intel query conventions --map-file=<file> <path>
agent-analyzer repo-intel query test-gaps [--top=N] [--min-changes=N] --map-file=<file> <path>
agent-analyzer repo-intel query diff-risk --files=<a,b,c> --map-file=<file> <path>
agent-analyzer repo-intel query doc-drift [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query recent-ai [--top=N] --map-file=<file> <path>
agent-analyzer repo-intel query onboard --map-file=<file> <path>
agent-analyzer repo-intel query can-i-help --map-file=<file> <path>
agent-analyzer repo-intel query painspots [--top=N] --map-file=<file> <path>
agent-analyzer repo-map generate <path>
agent-analyzer repo-map symbols <file> --map-file=<file>
agent-analyzer repo-map dependents <symbol> [--file=<file>] --map-file=<file>
agent-analyzer collect run <path>
agent-analyzer sync-check check <path> --map-file=<file>
agent-analyzer sync-check stale-docs <path> [--top=N] --map-file=<file>
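
A Rust caller could assemble a `repo-intel query` invocation from the synopsis above along these lines. This is a sketch. `query_args` and `run_query` are hypothetical helpers, `run_query` assumes the binary is on `PATH`, and `--top` should only be passed to queries that list it above.

```rust
use std::process::Command;

/// Builds the argument list for `agent-analyzer repo-intel query <name> ...`.
fn query_args(query: &str, top: Option<usize>, map_file: &str, repo_path: &str) -> Vec<String> {
    let mut args = vec![
        "repo-intel".to_string(),
        "query".to_string(),
        query.to_string(),
    ];
    if let Some(n) = top {
        args.push(format!("--top={}", n));
    }
    args.push(format!("--map-file={}", map_file));
    args.push(repo_path.to_string());
    args
}

/// Runs the binary and returns its stdout, which carries the JSON output.
/// Progress and errors on stderr are not captured in the returned string.
fn run_query(args: &[String]) -> std::io::Result<String> {
    let output = Command::new("agent-analyzer").args(args).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

For instance, `query_args("hotspots", Some(10), "repo-intel.json", ".")` yields the arguments for `agent-analyzer repo-intel query hotspots --top=10 --map-file=repo-intel.json .`.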
5 targets (same as agnix):
- `x86_64-unknown-linux-gnu`
- `x86_64-unknown-linux-musl`
- `aarch64-unknown-linux-gnu` (via cross)
- `aarch64-apple-darwin`
- `x86_64-pc-windows-msvc`
cargo check # Compile check
cargo test # Run all tests (146 tests)
cargo build --release # Build release binary
cargo clippy -- -D warnings # Lint (treat warnings as errors)
cargo fmt --check # Format check
cargo run -p analyzer-cli -- --version # Run CLI

- Phase 1-4 complete
- 146 passing tests (24 analyzer-core, 57 analyzer-git-map, 30 analyzer-repo-map, 16 analyzer-collectors, 19 analyzer-sync-check)
- CI: cargo test + clippy + fmt on push/PR
- Release: 5-target cross-platform builds on tag push
| Phase | Crate | Status | Description |
|---|---|---|---|
| 1 | analyzer-core, analyzer-git-map, analyzer-cli | Complete | Git intelligence (recency, staleness, bugspots, norms, areas, onboard, can-i-help) |
| 2 | analyzer-repo-map | Complete | AST symbol extraction (tree-sitter, 6 languages: Rust, TS, JS, Python, Go, Java) |
| 3 | analyzer-collectors | Complete | Project metadata (README, CI, license, languages, package manager) |
| 4 | analyzer-sync-check | Complete | Doc-code cross-reference (inline code matching, hotspot detection, staleness) |
| 5 | analyzer-core | Planned | AI code stylometry (replace metadata-based detection) |
This binary is consumed by JS plugins via the binary resolver in agent-core/lib/binary/:
- JS calls `binary.ensureBinary()` which auto-downloads from GitHub releases
- Binary location: `~/.agent-sh/bin/agent-analyzer[.exe]`
- Distribution: lazy download on first use, no manual install
Consumers:
- `git-map` plugin (JS wrapper using `repo-intel` CLI namespace)
- `repo-map` plugin (uses `repo-map` CLI for AST symbol extraction)
- `agent-core/lib/collectors/` (uses `collect` CLI for project metadata)
- `sync-docs` plugin (uses `sync-check` CLI for doc-code cross-references)
- Part of the agent-sh ecosystem
- Spec: `agent-analyzer/SPEC.md`
- AI detection: `agent-knowledge/ai-commit-detection-forensics.md`
- Git analysis research: `agent-knowledge/git-history-analysis-developer-tools.md`
- https://agentskills.io