Description
Overview
Add automatic secrets detection and prevention to stop AI agents from accidentally exposing API keys, credentials, tokens, and other sensitive data in workspace files. Scan for secrets before commits, alert on detection, and optionally block operations that would expose credentials.
Motivation
Current Problem:
- AI agents might accidentally hardcode API keys in source code
- Credentials could be written to .env files that get committed
- SSH private keys could be copied to workspace
- Database connection strings with passwords exposed
- No warning before pushing sensitive data
With Secrets Detection:
coi shell --block-secrets
# AI tries to write:
# API_KEY = "sk-abc123..."
⚠️ BLOCKED: Potential secret detected in src/config.py
Type: API Key (Anthropic)
Line: 15: API_KEY = "sk-abc123..."
This appears to be a sensitive credential.
Use environment variables instead: os.getenv('API_KEY')
Use Cases
1. Pre-Commit Protection
# Scan workspace before committing
coi secrets scan
# Output:
⚠️ Found 3 potential secrets in workspace:
src/config.py:15
Type: API Key
Pattern: sk-[a-zA-Z0-9]{48}
Line: API_KEY = "sk-abc123..."
.env.example:5
Type: AWS Access Key
Pattern: AKIA[0-9A-Z]{16}
Line: AWS_ACCESS_KEY=AKIAI...
database.yml:12
Type: Database Password
Line: password: "super_secret_123"
Run: coi secrets clean
2. Real-Time Protection
# Block AI from writing secrets
coi shell --block-secrets
# AI can still work, but:
# - Can't write files with secrets
# - Can't commit files with secrets
# - Gets warning to use env vars instead
3. Historical Scanning
# Scan past sessions for exposed secrets
coi secrets scan --session session-abc123
# Scan all sessions for a project
coi secrets scan --project backend-api --all-sessions
# Generate audit report
coi secrets audit --project backend-api > secrets-audit.json
4. Cleanup & Remediation
# Find and remove secrets
coi secrets clean
# Shows each secret and asks:
# Remove from file? [y/N]
# Replace with env var? [Y/n]
# Add to .gitignore? [Y/n]
# Automatic cleanup (dangerous)
coi secrets clean --auto --replace-with-env-vars
Proposed Implementation
Detection Strategies
1. Pattern-Based Detection
// SecretPattern describes one detection rule.
type SecretPattern struct {
	Name     string
	Pattern  *regexp.Regexp
	Severity string
}

var secretPatterns = []SecretPattern{
	{
		Name:     "Anthropic API Key",
		Pattern:  regexp.MustCompile(`sk-ant-[a-zA-Z0-9-]{95}`),
		Severity: "high",
	},
	{
		Name:     "OpenAI API Key",
		Pattern:  regexp.MustCompile(`sk-[a-zA-Z0-9]{48}`),
		Severity: "high",
	},
	{
		Name:     "AWS Access Key",
		Pattern:  regexp.MustCompile(`AKIA[0-9A-Z]{16}`),
		Severity: "high",
	},
	{
		Name:     "GitHub Token",
		Pattern:  regexp.MustCompile(`ghp_[a-zA-Z0-9]{36}`),
		Severity: "high",
	},
	{
		Name:     "Stripe API Key",
		Pattern:  regexp.MustCompile(`sk_live_[a-zA-Z0-9]{24}`),
		Severity: "high",
	},
	{
		Name: "Private Key",
		// OPENSSH covers the default format written by current ssh-keygen
		Pattern:  regexp.MustCompile(`-----BEGIN (RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----`),
		Severity: "critical",
	},
	{
		Name:     "Generic Secret",
		Pattern:  regexp.MustCompile(`(?i)(secret|password|token|api_?key)\s*[:=]\s*["']([^"']{8,})["']`),
		Severity: "medium",
	},
}
2. Entropy-Based Detection
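func hasHighEntropy(value string) bool {
	// Calculate Shannon entropy in bits per character. A string with n
	// distinct characters has entropy of at most log2(n), so exceeding 4.5
	// requires at least 23 distinct characters.
	entropy := calculateEntropy(value)
	// High entropy strings are likely secrets
	return entropy > 4.5 && len(value) > 16
}

func calculateEntropy(s string) float64 {
	freq := make(map[rune]float64)
	// Count runes rather than bytes so the probabilities sum to 1
	// for non-ASCII input as well.
	var length float64
	for _, c := range s {
		freq[c]++
		length++
	}
	var entropy float64
	for _, count := range freq {
		p := count / length
		entropy -= p * math.Log2(p)
	}
	return entropy
}
3. Integration with Existing Tools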
# Use gitleaks
gitleaks detect --source /workspace --no-git
# Use trufflehog
trufflehog filesystem /workspace
# Use detect-secrets
detect-secrets scan /workspace
File Monitoring
Monitor workspace file writes in real-time:
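func monitorWorkspaceWrites(container string) error {
	// Use fsnotify (inotify on Linux) to watch the workspace
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer watcher.Close()

	// fsnotify watches are not recursive; subdirectories need their own Add calls
	if err := watcher.Add(getWorkspacePath(container)); err != nil {
		return err
	}
	for event := range watcher.Events {
		if event.Op&fsnotify.Write == fsnotify.Write {
			// Scan newly written file
			if hasSecrets(event.Name) {
				alertUser(event.Name)
				// Optionally block/remove
			}
		}
	}
	return nil
}
Command-Line Interface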
Scanning
# Scan workspace
coi secrets scan
# Scan specific files
coi secrets scan src/config.py .env
# Scan with specific tools
coi secrets scan --tool gitleaks
coi secrets scan --tool trufflehog
coi secrets scan --tool detect-secrets
# Output formats
coi secrets scan --format table
coi secrets scan --format json
coi secrets scan --format sarif # GitHub compatible
# Severity filtering
coi secrets scan --severity high
coi secrets scan --severity critical
Prevention
# Enable real-time protection
coi shell --block-secrets
# Different modes
coi shell --warn-secrets # Warn but don't block
coi shell --block-secrets # Block file writes with secrets
coi shell --audit-secrets # Log all secrets to audit trail
# Per-session scanning
coi secrets scan --session session-abc123
# Historical audit
coi secrets audit --all-sessions
Cleanup
# Interactive cleanup
coi secrets clean
# Auto-replace with env vars
coi secrets clean --auto-fix
# Remove detected secrets
coi secrets clean --remove
# Preview changes
coi secrets clean --dry-run
Configuration
# Configure detection rules
coi secrets config --add-pattern "custom_token:[a-z0-9]{32}"
coi secrets config --ignore-file test_fixtures.py
coi secrets config --ignore-pattern "EXAMPLE_.*"
# Whitelist known false positives
coi secrets whitelist add "sk-test-123" # Test API key
Example Output
Scan Results
Secrets Scan Results
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CRITICAL (1)
src/keys.py:23
Private Key: -----BEGIN RSA PRIVATE KEY-----
Risk: Critical - Never commit private keys to source control
HIGH (3)
src/config.py:15
Anthropic API Key: sk-ant-api03-...
Risk: High - API key with full account access
.env:7
AWS Access Key: AKIAIOSFODNN7EXAMPLE
Risk: High - AWS credentials with potential broad access
database.yml:12
Database Password: password: "MySuperSecret123"
Risk: High - Database credentials in plaintext
MEDIUM (2)
test/fixtures.py:45
Generic Secret: api_key = "test_key_12345678"
Risk: Medium - May be test data (review manually)
SUMMARY
Total files scanned: 142
Secrets found: 6 (1 critical, 3 high, 2 medium)
Files affected: 4
RECOMMENDATIONS
1. Move all secrets to environment variables
2. Add .env to .gitignore
3. Rotate exposed API keys immediately
4. Use secret management (e.g., 1Password, AWS Secrets Manager)
Run: coi secrets clean --interactive
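A report like the one above could be assembled by applying each pattern line by line and bucketing the matches by severity. The following is a minimal sketch, not the proposed implementation: the `Finding` type and the `scanText` and `groupBySeverity` helpers are hypothetical names, and only three of the patterns from the proposal are included.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// SecretPattern mirrors the rule shape used in the proposal.
type SecretPattern struct {
	Name     string
	Pattern  *regexp.Regexp
	Severity string
}

// Finding is a hypothetical record for one match.
type Finding struct {
	File     string
	Line     int
	Type     string
	Severity string
}

// patterns is an illustrative subset of the proposed pattern table.
var patterns = []SecretPattern{
	{"AWS Access Key", regexp.MustCompile(`AKIA[0-9A-Z]{16}`), "high"},
	{"GitHub Token", regexp.MustCompile(`ghp_[a-zA-Z0-9]{36}`), "high"},
	{"Generic Secret", regexp.MustCompile(`(?i)(secret|password|token|api_?key)\s*[:=]\s*["']([^"']{8,})["']`), "medium"},
}

// scanText applies every pattern to each line and records 1-based line numbers.
func scanText(file, text string) []Finding {
	var findings []Finding
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		for _, p := range patterns {
			if p.Pattern.MatchString(sc.Text()) {
				findings = append(findings, Finding{file, n, p.Name, p.Severity})
			}
		}
	}
	return findings
}

// groupBySeverity buckets findings so a report can print one section per level.
func groupBySeverity(findings []Finding) map[string][]Finding {
	groups := make(map[string][]Finding)
	for _, f := range findings {
		groups[f.Severity] = append(groups[f.Severity], f)
	}
	return groups
}

func main() {
	text := "import os\nAWS_ACCESS_KEY = \"AKIAIOSFODNN7EXAMPLE\"\npassword: \"MySuperSecret123\"\n"
	groups := groupBySeverity(scanText("demo.txt", text))
	for _, sev := range []string{"critical", "high", "medium"} {
		fmt.Printf("%s (%d)\n", strings.ToUpper(sev), len(groups[sev]))
		for _, f := range groups[sev] {
			fmt.Printf("  %s:%d  %s\n", f.File, f.Line, f.Type)
		}
	}
}
```

Note that `bufio.Scanner` stops at lines longer than its default 64 KiB token limit, so a real scanner would need a larger buffer or a different reader for minified files.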
Real-Time Block
❌ BLOCKED: Secret detected
File: src/config.py
Line: 15
Type: Anthropic API Key
Pattern: sk-ant-api03-xxxxxxxxxxxxx
AI attempted to write:
API_KEY = "sk-ant-api03-xxxxxxxxx..."
This appears to be a sensitive credential.
RECOMMENDED FIXES:
1. Use environment variable:
import os
API_KEY = os.getenv('ANTHROPIC_API_KEY')
2. Use configuration file (not committed):
# In config.py
from config_local import API_KEY # Add config_local.py to .gitignore
3. Use secrets manager:
from secretsmanager import get_secret
API_KEY = get_secret('anthropic_api_key')
The file was NOT written. Please fix and try again.
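One way to get the "file was NOT written" behavior is to scan the content before it reaches the filesystem, rather than reacting to a write event afterwards. The sketch below assumes writes can be routed through a single function. `guardedWrite`, `checkContent`, and `ErrSecretDetected` are hypothetical names, and the rule list is a small subset of the proposed patterns.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// ErrSecretDetected is a hypothetical sentinel error for blocked writes.
var ErrSecretDetected = errors.New("secret detected")

type rule struct {
	name string
	re   *regexp.Regexp
}

// blockRules is an illustrative subset of the proposed pattern table.
var blockRules = []rule{
	{"AWS Access Key", regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
	{"GitHub Token", regexp.MustCompile(`ghp_[a-zA-Z0-9]{36}`)},
}

// checkContent returns an error wrapping ErrSecretDetected that names the
// first rule (in list order) matching the content, or nil if none match.
func checkContent(content []byte) error {
	for _, r := range blockRules {
		if r.re.Match(content) {
			return fmt.Errorf("%w: %s", ErrSecretDetected, r.name)
		}
	}
	return nil
}

// guardedWrite scans first and only touches the filesystem when the scan is clean.
func guardedWrite(path string, content []byte) error {
	if err := checkContent(content); err != nil {
		return fmt.Errorf("blocked write to %s: %w", path, err)
	}
	return os.WriteFile(path, content, 0o644)
}

func main() {
	dir, err := os.MkdirTemp("", "coi-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "config.py")

	err = guardedWrite(target, []byte(`AWS_KEY = "AKIAIOSFODNN7EXAMPLE"`))
	fmt.Println("blocked:", errors.Is(err, ErrSecretDetected))

	err = guardedWrite(target, []byte(`AWS_KEY = os.getenv("AWS_KEY")`))
	fmt.Println("clean write error:", err)
}
```

A file watcher sees a write only after it has happened, so a check-before-write path like this (or removing the file after detection) is what would make blocking possible.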
Cleanup Interactive
Secret Cleanup Wizard
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found: Anthropic API Key in src/config.py:15
API_KEY = "sk-ant-api03-..."
Options:
[1] Replace with environment variable (recommended)
[2] Remove line entirely
[3] Comment out with warning
[4] Skip (keep as-is)
[5] Whitelist (mark as false positive)
Choice: 1
✓ Replaced with: API_KEY = os.getenv('ANTHROPIC_API_KEY')
✓ Added import: import os
✓ Created .env.example with: ANTHROPIC_API_KEY=your_key_here
Next: Found AWS Access Key in .env:7
...
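Option [1] of the wizard could be sketched as a line rewrite that never copies the secret value anywhere. This is only an illustration. `replaceWithEnvVar` is a hypothetical helper, the regular expression handles only simple Python-style `NAME = "literal"` assignments, and the caller is assumed to supply the environment variable name.

```go
package main

import (
	"fmt"
	"regexp"
)

// assignment matches a simple `NAME = "literal"` line and captures the
// indentation and the variable name. Real source files would need
// language-aware parsing; this regex is an assumption for the sketch.
var assignment = regexp.MustCompile(`^(\s*)([A-Za-z_][A-Za-z0-9_]*)\s*=\s*["'][^"']*["']\s*$`)

// replaceWithEnvVar rewrites the line to read from envName and returns the
// matching .env.example entry. ok is false when the line is not a simple
// assignment, in which case the line is returned unchanged.
func replaceWithEnvVar(line, envName string) (newLine, envExample string, ok bool) {
	m := assignment.FindStringSubmatch(line)
	if m == nil {
		return line, "", false
	}
	newLine = fmt.Sprintf("%s%s = os.getenv('%s')", m[1], m[2], envName)
	envExample = envName + "=your_key_here" // placeholder only, never the real value
	return newLine, envExample, true
}

func main() {
	newLine, example, ok := replaceWithEnvVar(`API_KEY = "sk-ant-api03-..."`, "ANTHROPIC_API_KEY")
	fmt.Println(ok)      // true
	fmt.Println(newLine) // API_KEY = os.getenv('ANTHROPIC_API_KEY')
	fmt.Println(example) // ANTHROPIC_API_KEY=your_key_here
}
```

Replacing the value in the file does not undo the exposure: a key that was ever written to disk or committed should still be rotated.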
Implementation Phases
Phase 1: Basic Scanning (MVP)
- Pattern-based secret detection
- Common secret patterns (API keys, tokens, passwords)
- coi secrets scan command
- Integration with gitleaks or trufflehog
- Text output with findings
Phase 2: Real-Time Protection
- File write monitoring in containers
- --block-secrets flag for coi shell
- Real-time alerts when secrets are detected
- Block file writes with secrets
- Suggest fixes (use env vars)
Phase 3: Cleanup & Remediation
- coi secrets clean command
- Interactive cleanup wizard
- Auto-replace with env vars
- Generate .env.example files
- Whitelist management
Phase 4: Advanced Detection
- Entropy-based detection
- Machine learning models for secret detection
- Context-aware detection (reduce false positives)
- Custom pattern support
- Integration with multiple scanning tools
Phase 5: Integration & Reporting
- GitHub Actions integration (SARIF output)
- Pre-commit hook generation
- Audit trail of detected secrets
- Rotation recommendations
- Secret manager integration suggestions
Configuration
# ~/.config/coi/config.toml
[secrets]
enabled = true
block_by_default = false # Warn by default, don't block
[secrets.scan]
tools = ["gitleaks", "trufflehog"] # Tools to use for scanning
severity_threshold = "medium" # minimum severity to report
[secrets.patterns]
# Custom patterns
custom = [
{ name = "Company Token", pattern = "COMP_[A-Z0-9]{32}", severity = "high" }
]
[secrets.ignore]
# Ignore patterns (false positives)
patterns = [
"EXAMPLE_.*",
"TEST_KEY_.*",
]
# Ignore files
files = [
"test/fixtures/*.py",
"**/*_test.go",
]
[secrets.whitelist]
# Known safe values
values = [
"sk-test-12345", # Test API key
]
[secrets.auto_fix]
replace_with_env_vars = true
create_env_example = true
add_to_gitignore = true
Integration with Git Hooks
Generate pre-commit hook:
coi secrets install-hook
# Creates .git/hooks/pre-commit:
#!/bin/sh
coi secrets scan --severity high || exit 1
Or integrate with the existing pre-commit framework:
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: coi-secrets-check
        name: COI Secrets Detection
        entry: coi secrets scan
        language: system
        pass_filenames: false
Secret Types Detected
API Keys
- Anthropic (Claude)
- OpenAI (ChatGPT)
- Google Cloud
- AWS
- Azure
- Stripe
- Twilio
- SendGrid
- GitHub
- GitLab
Credentials
- Database connection strings
- JDBC URLs with passwords
- Redis connection strings
- SMTP credentials
Keys
- SSH private keys
- PGP private keys
- TLS/SSL certificates
- JWT tokens
- Session tokens
- OAuth tokens
Cloud Provider Secrets
- AWS access keys
- GCP service account keys
- Azure connection strings
- DigitalOcean tokens
- Heroku API keys
Generic Patterns
- High-entropy strings
- Password fields
- Secret/token fields
- Base64-encoded credentials
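For the high-entropy item in the list above, it helps to see what Shannon entropy actually reports for typical strings. The sketch below uses a hypothetical `shannonEntropy` helper that computes bits per character and prints each value next to the upper bound log2(length) for that string.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the entropy of s in bits per character,
// counting runes rather than bytes. An empty string yields 0.
func shannonEntropy(s string) float64 {
	freq := make(map[rune]int)
	total := 0
	for _, r := range s {
		freq[r]++
		total++
	}
	var h float64
	for _, n := range freq {
		p := float64(n) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	samples := []string{
		"aaaaaaaaaaaaaaaaaaaa",  // one distinct character: entropy 0
		"super_secret_password", // human-chosen password
		"AKIAIOSFODNN7EXAMPLE",  // AWS documentation example key
	}
	for _, s := range samples {
		// len(s) equals the character count here because the samples are ASCII.
		fmt.Printf("%-24q entropy=%.2f  max=%.2f\n",
			s, shannonEntropy(s), math.Log2(float64(len(s))))
	}
}
```

Because entropy can never exceed log2 of the string length, a 20-character value such as the AWS example key tops out at about 4.32 bits and cannot cross a 4.5 threshold. Entropy checks therefore work best as a complement to the fixed-format patterns, or with a threshold scaled to length.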
Benefits
Security:
- Prevent accidental credential exposure
- Reduce attack surface
- Comply with security policies
- Protect production systems
Education:
- Teach AI agents best practices
- Show proper secret management
- Guide to environment variables
- Prevent bad habits
Compliance:
- Meet security audit requirements
- SOC2/ISO27001 compliance
- Prevent data breaches
- Maintain audit trails
Cost:
- Prevent compromised API keys
- Avoid key rotation costs
- Prevent unauthorized usage
- Reduce security incidents
Technical Considerations
Performance
Scanning strategies:
- Incremental scan (only changed files)
- Background scanning (don't block AI)
- Cached results (avoid re-scanning)
- Parallel scanning
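The incremental and cached strategies above can be combined by remembering a content hash per file and skipping files whose hash has not changed. This is a minimal in-memory sketch. The `scanCache` type and its `needsScan` method are hypothetical, and persisting the cache between runs is left out.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// scanCache remembers the SHA-256 of each file's content at its last scan.
type scanCache struct {
	seen map[string][sha256.Size]byte
}

func newScanCache() *scanCache {
	return &scanCache{seen: make(map[string][sha256.Size]byte)}
}

// needsScan reports whether content differs from what was last recorded for
// path, and records the new hash so an unchanged file is skipped next time.
func (c *scanCache) needsScan(path string, content []byte) bool {
	sum := sha256.Sum256(content)
	if prev, ok := c.seen[path]; ok && prev == sum {
		return false
	}
	c.seen[path] = sum
	return true
}

func main() {
	cache := newScanCache()
	fmt.Println(cache.needsScan("src/config.py", []byte("a = 1"))) // true: first sighting
	fmt.Println(cache.needsScan("src/config.py", []byte("a = 1"))) // false: unchanged
	fmt.Println(cache.needsScan("src/config.py", []byte("a = 2"))) // true: modified
}
```

A cache like this must be invalidated whenever the pattern set changes, since a file that was clean under the old rules may match a newly added one.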
False Positives
Reduce with:
- Context awareness (test files, examples)
- Entropy analysis (high randomness = likely secret)
- Whitelist management
- Pattern refinement
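Whitelist management and pattern-based ignores could be applied as a final filter over matched values. The sketch below is loosely modeled on the `[secrets.ignore]` and `[secrets.whitelist]` settings in this proposal. The `falsePositiveFilter` type and `newFilter` constructor are hypothetical, and anchoring ignore patterns at the start of the value is an assumption about the intended semantics.

```go
package main

import (
	"fmt"
	"regexp"
)

// falsePositiveFilter holds compiled ignore patterns and exact safe values.
type falsePositiveFilter struct {
	ignore    []*regexp.Regexp
	whitelist map[string]bool
}

// newFilter compiles the ignore patterns, anchoring each at the start of the
// value so that "EXAMPLE_.*" does not match in the middle of a longer string.
func newFilter(ignorePatterns, whitelist []string) (*falsePositiveFilter, error) {
	f := &falsePositiveFilter{whitelist: make(map[string]bool)}
	for _, p := range ignorePatterns {
		re, err := regexp.Compile("^(?:" + p + ")")
		if err != nil {
			return nil, fmt.Errorf("ignore pattern %q: %w", p, err)
		}
		f.ignore = append(f.ignore, re)
	}
	for _, v := range whitelist {
		f.whitelist[v] = true
	}
	return f, nil
}

// suppress reports whether a matched value should be dropped from the findings.
func (f *falsePositiveFilter) suppress(value string) bool {
	if f.whitelist[value] {
		return true
	}
	for _, re := range f.ignore {
		if re.MatchString(value) {
			return true
		}
	}
	return false
}

func main() {
	f, err := newFilter([]string{"EXAMPLE_.*", "TEST_KEY_.*"}, []string{"sk-test-12345"})
	if err != nil {
		panic(err)
	}
	for _, v := range []string{"sk-test-12345", "EXAMPLE_TOKEN", "AKIAIOSFODNN7EXAMPLE"} {
		fmt.Printf("%-22s suppressed=%v\n", v, f.suppress(v))
	}
}
```

Exact-value whitelisting is the narrower of the two mechanisms: it suppresses one known string, whereas a broad ignore pattern can also hide a real secret that happens to share the prefix.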
False Negatives
Improve detection:
- Multiple scanning tools
- Custom patterns for company-specific secrets
- Regular expression updates
- Community-contributed patterns
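Custom patterns for company-specific secrets need validation before they join the built-in set, because a malformed expression should fail at configuration time rather than during a scan. The sketch below parses the `name:regex` shape shown for the proposed `--add-pattern` flag. `parseCustomPattern` is a hypothetical helper, the severity argument is an assumption, and `strings.Cut` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// SecretPattern mirrors the rule shape used in the proposal.
type SecretPattern struct {
	Name     string
	Pattern  *regexp.Regexp
	Severity string
}

// parseCustomPattern parses a "name:regex" spec. Splitting on the first colon
// only lets the regex itself contain colons.
func parseCustomPattern(spec, severity string) (SecretPattern, error) {
	name, expr, found := strings.Cut(spec, ":")
	if !found || name == "" || expr == "" {
		return SecretPattern{}, fmt.Errorf("invalid pattern spec %q: want name:regex", spec)
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return SecretPattern{}, fmt.Errorf("pattern %q: %w", name, err)
	}
	return SecretPattern{Name: name, Pattern: re, Severity: severity}, nil
}

func main() {
	p, err := parseCustomPattern("custom_token:[a-z0-9]{32}", "high")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name, p.Pattern.MatchString(strings.Repeat("a1", 16))) // custom_token true

	_, err = parseCustomPattern("broken:[a-z", "high")
	fmt.Println(err != nil) // true: unterminated character class
}
```

Go's `regexp` package uses RE2 syntax, so community patterns written for PCRE engines that rely on lookaheads or backreferences would be rejected by `regexp.Compile` and need rewriting.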
Related Issues
- Monitoring (#112, feat: Add container monitoring and observability (coi monitor)) - Monitor secret exposure attempts
- Session management - Track which sessions exposed secrets
- Audit mode - Full audit trail of secret detections
Integration with Secret Managers
Suggest integration with:
# After detecting secrets
⚠️ Secrets detected. Consider using a secret manager:
1Password CLI:
op inject -i config.template.yml -o config.yml
AWS Secrets Manager:
aws secretsmanager get-secret-value --secret-id api-key
HashiCorp Vault:
vault kv get secret/api-key
Environment variables:
export ANTHROPIC_API_KEY=$(cat ~/.secrets/anthropic)
coi shell
Open Questions
- Should we block or warn by default?
  - Proposal: Warn by default, configurable to block
- How to handle test fixtures with fake secrets?
  - Proposal: Whitelist patterns, special markers in code
- Should we scan container filesystem or just workspace?
  - Proposal: Workspace only (main risk), optional full scan
- Should we auto-rotate detected secrets?
  - Proposal: No (too risky), provide rotation instructions
- How to handle secrets in git history?
  - Proposal: Separate tool/command, use git-filter-repo
References
- gitleaks - Secret scanning tool
- trufflehog - Find secrets in git
- detect-secrets - Yelp's secret scanner
- OWASP Secrets Management Cheat Sheet