A security enumeration tool designed to detect and analyze sandbox environments in AI code assistants, identifying execution capabilities, system access, and potential security boundaries.
Sandbox Probe is specifically designed to fingerprint the execution environment of AI code assistants (such as Claude Code, Gemini CLI, and similar tools) and identify:
- Sandbox/Container Detection: Docker, Podman, LXC, Firejail, Bubblewrap, gVisor, WSL, OpenVZ, Seatbelt, Landlock
- File System Permissions: Writable system paths and readable sensitive files
- Network Capabilities: DNS resolution, external connectivity, open TCP/UDP ports
- Process Information: Running processes and parent process detection
- System Context: User/group information, hostname, mounted volumes
- Proxy Configuration: Environment-based proxy detection
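As a rough illustration of what container detection can involve, the sketch below checks for two conventional marker files: `/.dockerenv` (created by Docker) and `/run/.containerenv` (created by Podman). It is an assumption-laden example, not Sandbox Probe's own detection logic; the names `marker`, `knownMarkers`, and `detectRuntime` are hypothetical, and the marker list is deliberately incomplete.

```go
package main

import (
	"fmt"
	"os"
)

// marker pairs a well-known file path with the runtime it usually indicates.
// Hypothetical type for this sketch only.
type marker struct {
	path    string
	runtime string
}

// knownMarkers is an illustrative, incomplete list; it is not the tool's own table.
var knownMarkers = []marker{
	{"/.dockerenv", "docker"},
	{"/run/.containerenv", "podman"},
}

// detectRuntime returns the runtime of the first marker that exists,
// or "unknown" when none match. The exists callback keeps it testable.
func detectRuntime(markers []marker, exists func(string) bool) string {
	for _, m := range markers {
		if exists(m.path) {
			return m.runtime
		}
	}
	return "unknown"
}

// fileExists reports whether path can be stat'ed on the real filesystem.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	fmt.Println("detected runtime:", detectRuntime(knownMarkers, fileExists))
}
```

Marker files are only one signal. A fuller probe would likely combine them with other evidence such as cgroup or mount information, since a sandbox can hide or fake any single indicator.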
- Go 1.25 or later
- Protocol Buffer compiler (buf): install via make install-buf; required only if changing the protobuf definitions
- jq: JSON processor for parsing outputs; provides pretty printing for JSON reports
Depending on which sandboxing you want to test, a combination of these may be required:
- docker: For containerized testing
- podman: For containerized testing
- claude-code: Claude Code CLI for Claude testing
- gemini-cli: Gemini Code Assist CLI for Gemini testing
- nono: A sandboxing tool for wrapping AI agents and other programs
# Clone the repository
git clone https://github.com/controlplaneio/sandbox-probe.git
cd sandbox-probe
# Build the binary
make build
If running sandbox-probe inside a container, make sure that it was built statically or with standard library paths, or find a way to mount the additional paths.
This is not typically an issue, but it can be on non-glibc or non-FHS systems such as Alpine or NixOS (and when building via Nix).
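One way to check whether a built binary is statically linked is to look for a `PT_INTERP` program header, which requests a dynamic loader. The following is a minimal sketch using Go's standard `debug/elf` package; the helper name `isStaticELF` is hypothetical, and the check only applies to ELF binaries (Linux), not to Mach-O or PE files.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isStaticELF reports whether the ELF binary at path has no PT_INTERP
// program header, meaning it does not request a dynamic loader.
func isStaticELF(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// Default to the build output path used in this README.
	path := "./bin/sandbox-probe"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	static, err := isStaticELF(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot inspect binary:", err)
		return
	}
	fmt.Printf("%s static=%v\n", path, static)
}
```

A binary that fails this check depends on a loader and shared libraries being present at the expected paths inside the container, which is exactly what can go wrong on Alpine or NixOS.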
Run security enumeration probes on the current environment.
./bin/sandbox-probe scan [flags]
Flags:
- --tasks: Additional individual tasks to run (comma-separated)
- --tasksets: Group of tasks to select: baseline, ps, all (default: baseline)
- --output_path: Path to write the JSON report (default: report.json)
- --tags: Metadata tags to append to the report (comma-separated)
- --log_level: Set log level (default: info)
Examples:
Run all baseline probes:
./bin/sandbox-probe scan
Run specific tasksets:
./bin/sandbox-probe scan --tasksets baseline,ps
Run with custom output path and tags:
./bin/sandbox-probe scan --output_path results.json --tags "test,docker"
Run specific tasks:
./bin/sandbox-probe scan --tasks baseline_network_task,baseline_process_task
List all available tasks and tasksets with their descriptions.
./bin/sandbox-probe tasks list
This command displays a formatted table of all available tasks, including:
- Task names (color-coded in blue)
- Task descriptions
Example output:
baseline_path_task : Scans filesystem for writable and sensitive readable paths
baseline_network_task : Scans network for DNS resolution, connectivity, and open TCP/UDP ports
baseline_proxy_task : Detects proxy configuration from environment variables
...
Display version information for the sandbox-probe binary.
./bin/sandbox-probe version
Example output:
version v1.0.0
git commit abc1234
build date 2026-02-13T10:30:00Z
Run all baseline probes outside the context of an AI assistant. This is mainly useful for testing the Go code. On a desktop machine in daily use, the report can be very large. Dedicated servers and containerized environments (such as those used by some AI tooling) expose less of the system and therefore produce less output.
./bin/sandbox-probe scan
Important
Because these test scripts are meant to run inside the sandbox, many of them are executed by the agent itself. Consider the risk that the agent could take other actions, especially in non-interactive/YOLO modes.
This does not apply to pure sandboxes, such as nono, inside which you run other AI agents.
You can reduce this risk by using the interactive versions of these scripts, but the agent may still take autonomous actions that you do not expect or trust.
If you have the prerequisite dependencies, consider running a script in ./tests, such as ./tests/sandbox_nono.sh.
AI agent tooling such as Gemini and Claude will need login details.
tests/
├── baseline_nono.sh
├── baseline_claude.sh
├── baseline_gemini_interactive.sh
├── sandbox_nono.sh
├── sandbox_claude.sh
├── sandbox_gemini_interactive.sh
└── ...
These scripts will output to the ./reports subdirectory.
For more details please see here.
The tool generates multiple outputs:
- Console Output: Structured logs showing probe execution progress
- report.json: Detailed findings in JSON format
- Log Files: Timestamped logs in the logs/ directory (e.g., logs/sandbox-probe-2026-02-09-15-30-45.log)
{
"version": "1.0.0",
"timestamp": "2026-02-09T15:30:45Z",
"probeBinary": {
"goVersion": "go1.21.0",
"os": "linux",
"arch": "amd64",
"static": false
},
"findings": [
{
"findingType": "sandbox_detection",
"task": "baseline_sandbox_detector",
"description": "Sandbox/container runtime",
"value": "docker"
},
{
"//": "..."
}
]
}
The baseline probe includes the following tasks:
| Task Name | Description |
|---|---|
| baseline_path_task | Scans filesystem for writable and sensitive readable paths |
| baseline_network_task | Scans network for DNS resolution, connectivity, and open TCP/UDP ports |
| baseline_proxy_task | Detects proxy configuration from environment variables |
| baseline_socket_task | Scans filesystem for Unix domain sockets |
| baseline_process_task | Detects running processes and parent process information |
| baseline_user_context_task | Detects user and group context information (UID, GID, EUID, EGID) |
| ps_all_task | Lists all running processes using ps command |
| baseline_hostname_task | Detects the system hostname |
| baseline_sandbox_task | Detects container runtime and sandbox environments (Docker, Podman, LXC, etc.) |
| baseline_mount_task | Detects host-mounted volumes and filesystem mounts |
| ps_parent_task | Gets parent process information using ps command |
| ps_single_task | Gets information about the running process using ps command |
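
To post-process a report programmatically rather than with jq, a consumer could decode the JSON layout shown in the report example above. The sketch below is an assumption based only on that example: the struct names and the helpers `parseReport` and `valuesByType` are hypothetical, only the fields visible in the example are modeled, and `value` is assumed to always be a string, which may not hold for every finding type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// finding mirrors one entry of the "findings" array in the example report.
// Assumption: "value" is a string, as in the example.
type finding struct {
	FindingType string `json:"findingType"`
	Task        string `json:"task"`
	Description string `json:"description"`
	Value       string `json:"value"`
}

// report models only the top-level fields used by this sketch.
type report struct {
	Version   string    `json:"version"`
	Timestamp string    `json:"timestamp"`
	Findings  []finding `json:"findings"`
}

// parseReport decodes raw report bytes into a report value.
func parseReport(data []byte) (report, error) {
	var r report
	err := json.Unmarshal(data, &r)
	return r, err
}

// valuesByType collects the values of all findings with the given type.
func valuesByType(r report, findingType string) []string {
	var out []string
	for _, f := range r.Findings {
		if f.FindingType == findingType {
			out = append(out, f.Value)
		}
	}
	return out
}

func main() {
	// report.json is the default output path of the scan command.
	data, err := os.ReadFile("report.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read report:", err)
		return
	}
	r, err := parseReport(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse report:", err)
		return
	}
	fmt.Println("sandbox findings:", valuesByType(r, "sandbox_detection"))
}
```

Filtering by `findingType` makes it easy to compare runs, for example the detected runtime in a baseline report against the one produced inside an agent's sandbox.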