A curated list of open-source tools, frameworks, and resources for securing autonomous AI agents.
This list is organized by the security lifecycle of an autonomous agent, covering red teaming, runtime protection, sandboxing, and governance.
- Agent Firewalls & Gateways (Runtime Protection)
- Red Teaming & Vulnerability Scanners
- Static Analysis & Linters
- Sandboxing & Isolation Environments
- Guardrails & Compliance
- Benchmarks & Datasets
- Identity & Authentication
- Contributing
Tools that sit between the agent and the world to filter traffic, prevent unauthorized tool access, and block prompt injections.
- AgentGateway - A Linux Foundation project providing an AI-native proxy for secure connectivity (A2A & MCP protocols). It adds RBAC, observability, and policy enforcement to agent-tool interactions.
- Envoy AI Gateway - An Envoy-based gateway that manages request traffic to GenAI services, providing a control point for rate limiting and policy enforcement.
Offensive tools to test agents for security flaws, loop conditions, and unauthorized actions.
- Strix - An autonomous AI agent designed for penetration testing. It runs inside a docker sandbox to actively probe applications and generate verified exploit capabilities.
- PyRIT - Microsoft’s open-source red teaming framework for generative AI. It automates multi-turn adversarial attacks to test if an agent can be coerced into harmful behavior.
- Agentic Security - A dedicated vulnerability scanner for agent workflows and LLMs capable of running multi-step jailbreaks and fuzzing attacks against agent logic.
- Garak - The "Nmap for LLMs." A vulnerability scanner that probes models for hallucination, data leakage, and prompt injection susceptibilities.
- A2A Scanner - A scanner by Cisco designed to inspect "Agent-to-Agent" communication protocols for threats, validating agent identities and ensuring compliance with communication specs.
- Cybersecurity AI (CAI) - A framework for building specialized security agents for offensive and defensive operations, often used in CTF (Capture The Flag) scenarios.
Tools to analyze agent configuration and logic code before deployment.
- Agentic Radar - A static analysis tool that visualizes agent workflows (LangGraph, CrewAI, AutoGen). It detects risky tool usage, permission loops, and maps them to known vulnerabilities.
- Agent Bound - A design-time analysis tool that calculates "Agentic Entropy"—a metric to quantify the unpredictability and risk of infinite loops or unconstrained actions in agent architectures.
- Checkov - While primarily for IaC, Checkov includes policies for scanning AI infrastructure and configurations to prevent misconfigurations in deployment.
Secure runtimes to prevent agents from damaging the host system during code execution.
- SandboxAI - An open-source runtime for executing AI-generated code (Python/Shell) in isolated containers with granular permission controls.
- Kubernetes Agent Sandbox - A Kubernetes Native project providing a Sandbox Custom Resource Definition (CRD) to manage isolated, stateful workloads for AI agents.
- Agent-Infra Sandbox - An "All-In-One" sandbox combining Browser, Shell, VSCode, and File System access in a single Docker container, optimized for agentic tasks.
- OpenHands - Formerly OpenDevin, this platform includes a secure runtime environment for autonomous coding agents to operate without accessing the host machine's sensitive files.
Middleware to enforce business logic and safety policies on inputs and outputs.
- NeMo Guardrails - NVIDIA’s toolkit for adding programmable rails to LLM-based apps. It ensures agents stay on topic, avoid jailbreaks, and adhere to defined safety policies.
- Guardrails - A Python framework for validating LLM outputs against structural and semantic rules (e.g., "must return valid JSON," "must not contain PII").
- LiteLLM Guardrails - While known for model proxying, LiteLLM includes built-in guardrail features to filter requests and responses across multiple LLM providers.
Resources to evaluate agent security performance.
- CVE Bench - A benchmark for evaluating an AI agent's ability to exploit real-world web application vulnerabilities (useful for testing defensive agents).
Tools to manage agent identity (non-human identities).
- WSO2 - An identity management solution that treats AI agents as first-class identities, enabling secure authentication and authorization for agent actions.
Contributions are welcome! Please read the contribution guidelines first.
- Fork the project.
- Create your feature branch (
git checkout -b feature/AmazingFeature). - Commit your changes (
git commit -m 'Add some AmazingFeature'). - Push to the branch (
git push origin feature/AmazingFeature). - Open a Pull Request.