
feat(code-interpreter): add DOCKER driver for self-hosted sandbox execution #20105

Open

krzysztof7363 wants to merge 6 commits into twentyhq:main from krzysztof7363:feat/docker-code-interpreter

Conversation

Contributor

@krzysztof7363 krzysztof7363 commented Apr 28, 2026

Summary

Adds a DOCKER option to CODE_INTERPRETER_TYPE so self-hosters can run Twenty's code_interpreter against a local Docker sandbox instead of E2B (paid SaaS) or LOCAL (refused in production).

The driver speaks dockerode against the host Docker socket. Each execute() call:

  1. Stages a per-request work dir on a host path (DOCKER_SANDBOX_WORK_DIR) — the same path bind-mounted into the server container so the host daemon resolves it.
  2. Creates a hardened sandbox container (read-only rootfs, CapDrop: ALL, no-new-privileges, configurable memory + PIDs limits) on a configurable Docker network.
  3. Streams stdout/stderr from docker exec back through the existing StreamCallbacks interface; harvests files written under output/ from the host side of the bind mount.
  4. Tears down the container and removes the work dir.
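The hardening described in step 2 maps onto Docker Engine HostConfig fields roughly like this (a sketch of the shape, not the driver's actual code; the bind-mount path and request id are hypothetical examples):

```typescript
// Sketch of the hardened-sandbox HostConfig described above. Field names
// follow the Docker Engine API; the numeric values mirror the documented
// config defaults. The bind path under /var/run/twenty-sandbox is a
// hypothetical per-request work dir, not a real one.
const memoryMb = 512; // DOCKER_SANDBOX_MEMORY_MB default
const pidsLimit = 256; // DOCKER_SANDBOX_PIDS_LIMIT default

const hostConfig = {
  ReadonlyRootfs: true, // read-only rootfs
  CapDrop: ['ALL'], // drop all capabilities
  SecurityOpt: ['no-new-privileges'],
  Memory: memoryMb * 1024 * 1024, // bytes
  PidsLimit: pidsLimit,
  NetworkMode: 'twenty_sandbox', // DOCKER_SANDBOX_NETWORK, when set
  Binds: ['/var/run/twenty-sandbox/req-123:/home/user'], // hypothetical path
};
```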

Five commits, each independently revertable:

  1. chore(deps): add dockerode to twenty-server
  2. feat(code-interpreter): add DOCKER driver for self-hosted sandbox execution
  3. feat(docker): add twentycrm/sandbox image for DOCKER code interpreter
  4. feat(code-interpreter): add CODE_INTERPRETER_SERVER_URL override (sandbox-internal MCP callback URL when the sandbox network can't resolve the public SERVER_URL)
  5. feat(code-interpreter): allow selecting a non-default Docker runtime for sandboxes (lets operators route sandboxes through gVisor's runsc or Sysbox; no-op when unset)

Notes for review

  • New opinionated driver — happy to discuss whether it belongs in core or as a documented extension. If interest is mixed, the helper-plumbing commits (4 and 5) are useful even with the existing E2B driver in restricted-network deployments and could be split out.
  • The sandbox image (twentycrm/sandbox) ships pandas/numpy/matplotlib/pillow/python-docx/python-pptx/openpyxl/pypdf/pdfplumber/reportlab — same tier as E2B's data-analysis template. Built from packages/twenty-docker/sandbox/Dockerfile.
  • Single-user threat model assumed; multi-tenant deployments should layer gVisor (DOCKER_SANDBOX_RUNTIME=runsc, commit 5) or stay on a microVM driver. Documented in the sandbox image README.
  • Companion PR #20103 (fix(code-interpreter): three correctness fixes for the TwentyMCP helper + tool output shape) carries three correctness fixes for the existing code_interpreter tool (helper routing, UI rendering, prompt) that this stack surfaced. Not a hard dependency for compilation, but it operationally pairs with this PR.

Test plan

  • Build the sandbox image: docker build -f packages/twenty-docker/sandbox/Dockerfile -t twentycrm/sandbox:dev ..
  • Set CODE_INTERPRETER_TYPE=DOCKER, DOCKER_SANDBOX_WORK_DIR=<host-path>, DOCKER_SANDBOX_NETWORK=twenty_sandbox, mount /var/run/docker.sock into the server container.
  • Run a chat turn requesting a Python computation; confirm code_interpreter returns stdout + exitCode === 0; matplotlib output renders inline.
  • Run a chat turn that uses the injected TwentyMCP helper to fetch records and compute over them.
  • With gVisor installed on the host: set DOCKER_SANDBOX_RUNTIME=runsc. Inspect a live sandbox: docker inspect --format '{{.HostConfig.Runtime}}' <id> reports runsc.
  • With CODE_INTERPRETER_SERVER_URL=http://server:3000: sandbox-internal MCP calls reach the server even when SERVER_URL is a public URL the sandbox network can't resolve.
  • docker ps -a shows no lingering exited sandbox containers after a run.


github-actions Bot commented Apr 28, 2026

Welcome!

Hello there, congrats on your first PR! We're excited to have you contributing to this project.
By submitting your Pull Request, you acknowledge that you agree with the terms of our Contributor License Agreement.

Generated by 🚫 dangerJS against a8fd1f0

Commit 1: chore(deps): add dockerode to twenty-server

Prepares for a new DOCKER code-interpreter driver that talks to the host
Docker daemon.

Commit 2: feat(code-interpreter): add DOCKER driver for self-hosted sandbox execution

Adds a fourth value to CodeInterpreterDriverType alongside LOCAL, E_2_B,
and DISABLED. The DOCKER driver runs user code in a per-request container
on the host Docker daemon, giving self-hosters a production-safe option
without the E2B SaaS dependency.

Design:
 - Per-call lifecycle: stage a host work dir, create + start a container
   with a long-running placeholder cmd, exec python -u -c <code>, stream
   demuxed stdout/stderr, harvest outputs from the host side of the bind
   mount, remove the container.
 - Bind mount a host tempdir at /home/user instead of using
   putArchive/getArchive: Docker's archive API targets the rootfs layer,
   which is shadowed by tmpfs/bind mount overlays, so the sandbox process
   and the archive API see two different views of the same path. Bind
   mount lets both actors share one filesystem (see moby/moby#40885).
 - Hardening defaults: ReadonlyRootfs, CapDrop: ALL, no-new-privileges,
   bounded Memory and PidsLimit, optional NetworkMode for isolated-egress
   configurations.
 - Timeout: races exec against a SIGKILL on the placeholder; on timeout
   returns exit code 124 and an "Execution timed out" error string.
 - LineEmitter buffers partial lines across byte-chunks so callbacks see
   whole lines (matching the E2B driver's per-line streaming contract).
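The line-buffering contract in the last bullet can be sketched as follows (illustrative names, not the driver's actual class):

```typescript
// Minimal sketch of the LineEmitter behaviour described above: byte chunks
// go in, whole lines come out, and a partial trailing line is buffered
// until a later chunk (or flush) completes it. Class and method names are
// illustrative.
class LineEmitter {
  private buffer = '';

  constructor(private onLine: (line: string) => void) {}

  push(chunk: string): void {
    this.buffer += chunk;
    let newlineIndex: number;
    while ((newlineIndex = this.buffer.indexOf('\n')) !== -1) {
      this.onLine(this.buffer.slice(0, newlineIndex));
      this.buffer = this.buffer.slice(newlineIndex + 1);
    }
  }

  // Emit any incomplete final line once the stream ends.
  flush(): void {
    if (this.buffer.length > 0) {
      this.onLine(this.buffer);
      this.buffer = '';
    }
  }
}

const lines: string[] = [];
const emitter = new LineEmitter((line) => lines.push(line));
emitter.push('hel');
emitter.push('lo\nwor');
emitter.push('ld\npart');
emitter.flush();
// lines is now ['hello', 'world', 'part']
```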

Configuration (all under ConfigVariablesGroup.CODE_INTERPRETER_CONFIG):
 - DOCKER_SANDBOX_IMAGE (default twentycrm/sandbox:latest)
 - DOCKER_SANDBOX_NETWORK (optional; intended for an internal compose
   network shared with the server container)
 - DOCKER_SANDBOX_WORK_DIR (default /var/run/twenty-sandbox; must resolve
   to the same path on the host Docker daemon and inside the server
   container — typically a named volume mounted at both sides)
 - DOCKER_SANDBOX_MEMORY_MB (default 512)
 - DOCKER_SANDBOX_PIDS_LIMIT (default 256)
 - Reuses existing CODE_INTERPRETER_TIMEOUT_MS.

The factory's buildConfigKey now hashes CODE_INTERPRETER_CONFIG for DOCKER
(same treatment as E2B) so config changes invalidate the cached driver.
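The cache-invalidation idea can be sketched like this (hypothetical function and field names; the factory's real buildConfigKey may differ):

```typescript
import { createHash } from 'node:crypto';

// Illustrative sketch: hash the driver-relevant config values so any
// change yields a different key and invalidates the cached driver.
// Names and the config shape are assumptions, not the factory's code.
const buildConfigKey = (
  driverType: string,
  config: Record<string, unknown>,
): string => {
  const digest = createHash('sha256')
    .update(JSON.stringify(config))
    .digest('hex');

  return `${driverType}:${digest}`;
};

const keyA = buildConfigKey('DOCKER', { image: 'twentycrm/sandbox:latest', memoryMb: 512 });
const keyB = buildConfigKey('DOCKER', { image: 'twentycrm/sandbox:latest', memoryMb: 1024 });
// keyA !== keyB: changing any config value produces a new cache key
```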
Commit 3: feat(docker): add twentycrm/sandbox image for DOCKER code interpreter

Minimal Python runtime the DOCKER code-interpreter driver boots per
request. Based on python:3.12-slim with pandas, numpy, matplotlib,
pillow, and the office-document libraries (python-docx, python-pptx,
openpyxl, pypdf, pdfplumber, reportlab) so typical user code and the
pre-seeded sandbox-scripts (docx/pdf/pptx/xlsx helpers) work without
pip installs at runtime.

Build from twenty-server as context so the sandbox-scripts COPY
resolves:

  cd packages/twenty-server
  docker build \
    -t twentycrm/sandbox:dev \
    -f ../twenty-docker/sandbox/Dockerfile \
    src/engine/core-modules/code-interpreter
Commit 4: feat(code-interpreter): add CODE_INTERPRETER_SERVER_URL override

The TwentyMCP helper injected into the sandbox calls back into Twenty's
/mcp endpoint at the URL stored in TWENTY_SERVER_URL, which was hard-coded
to the public SERVER_URL. That works for E2B (SaaS sandboxes with public
internet access) but breaks for the new DOCKER driver, where sandboxes
run on an internal-only network and can't resolve the public hostname.

Add a CODE_INTERPRETER_SERVER_URL config var that, when set, overrides
SERVER_URL for the sandbox's MCP callback. Operators running the DOCKER
driver set it to the internal server hostname (e.g. http://server:3000
under docker-compose). E2B / LOCAL deployments leave it unset and inherit
the existing behavior.
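The override precedence amounts to a one-line fallback; a sketch (the helper name is illustrative, not the actual code):

```typescript
// Sketch of the precedence described above: prefer
// CODE_INTERPRETER_SERVER_URL when set, otherwise fall back to the
// public SERVER_URL. Parameter names mirror the config vars.
const resolveMcpCallbackUrl = (
  codeInterpreterServerUrl: string | undefined,
  serverUrl: string,
): string => codeInterpreterServerUrl ?? serverUrl;

// E2B / LOCAL: override unset, the public URL is used.
resolveMcpCallbackUrl(undefined, 'https://crm.example.com'); // → 'https://crm.example.com'
// DOCKER on an internal network: the override wins.
resolveMcpCallbackUrl('http://server:3000', 'https://crm.example.com'); // → 'http://server:3000'
```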
Commit 5: feat(code-interpreter): allow selecting a non-default Docker runtime for sandboxes

Lets operators route DOCKER-driver sandbox containers through a Docker
runtime other than runc (e.g. gVisor's runsc, Sysbox) to get stronger
isolation without changing the rest of the driver. DOCKER_SANDBOX_RUNTIME
is optional; unset keeps the current behaviour.
@krzysztof7363 krzysztof7363 force-pushed the feat/docker-code-interpreter branch from 1963146 to b201da3 Compare April 28, 2026 04:24

socket-security Bot commented Apr 28, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff  | Package                 | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | @types/dockerode@3.3.47 | 100                   | 100           | 77      | 85          | 100
Added | dockerode@4.0.10        | 100                   | 100           | 100     | 85          | 100

View full report


@cubic-dev-ai cubic-dev-ai Bot left a comment


5 issues found across 9 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/twenty-docker/sandbox/README.md">

<violation number="1" location="packages/twenty-docker/sandbox/README.md:38">
P2: README incorrectly states bind-mounting `/home/user` keeps baked `/home/user/scripts` visible; bind mounts obscure image contents at the mount point.</violation>
</file>

<file name="packages/twenty-server/src/engine/core-modules/twenty-config/config-variables.ts">

<violation number="1" location="packages/twenty-server/src/engine/core-modules/twenty-config/config-variables.ts:675">
P1: `DOCKER_SANDBOX_NETWORK` is documented as required for Docker mode but is effectively optional, so sandbox containers can launch without an explicitly isolated network.</violation>
</file>

<file name="packages/twenty-server/src/engine/core-modules/code-interpreter/drivers/docker.driver.ts">

<violation number="1" location="packages/twenty-server/src/engine/core-modules/code-interpreter/drivers/docker.driver.ts:106">
P1: Unsanitized `file.filename` is written to host disk, allowing path traversal/arbitrary file write outside the temporary sandbox directory.</violation>

<violation number="2" location="packages/twenty-server/src/engine/core-modules/code-interpreter/drivers/docker.driver.ts:111">
P2: Directory permissions are set to `0o777` despite comment requiring sticky world-writable directories; use `0o1777` to enforce sticky-bit semantics.</violation>

<violation number="3" location="packages/twenty-server/src/engine/core-modules/code-interpreter/drivers/docker.driver.ts:160">
P2: Forced timeout kill can route execution to generic error handling, causing timeouts to return `exitCode: 1` instead of the intended timeout result.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


twenty-ci-bot-public Bot commented Apr 28, 2026

📊 API Changes Report

GraphQL Schema Changes

[error] Error: Unable to read JSON file: /home/runner/work/twenty/twenty/main-schema-introspection.json: Not valid JSON content
    at JsonFileLoader.handleFileContent (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:147:19)
    at /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:95:43
    at async Promise.all (index 0)
    at async JsonFileLoader.load (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:88:9)
    at async /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/load-file.js:15:39
    at async Promise.all (index 4)
    at async loadFile (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/load-file.js:13:9)
    at async /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/collect-sources.js:200:25

⚠️ Breaking changes or errors detected in GraphQL schema
Error generating diff

GraphQL Metadata Schema Changes

[error] Error: Unable to read JSON file: /home/runner/work/twenty/twenty/main-metadata-schema-introspection.json: Not valid JSON content
    at JsonFileLoader.handleFileContent (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:147:19)
    at /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:95:43
    at async Promise.all (index 0)
    at async JsonFileLoader.load (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/json-file-loader/cjs/index.js:88:9)
    at async /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/load-file.js:15:39
    at async Promise.all (index 4)
    at async loadFile (/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/load-file.js:13:9)
    at async /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@graphql-inspector/cli/node_modules/@graphql-tools/load/cjs/load-typedefs/collect-sources.js:200:25

⚠️ Breaking changes or errors detected in GraphQL metadata schema
Error generating diff

REST API Analysis Error

⚠️ Error occurred while analyzing REST API changes

Error Output

REST Metadata API Analysis Error

⚠️ Error occurred while analyzing REST Metadata API changes

Error Output

⚠️ Please review these API changes carefully before merging.

Commit 6: Five fixes flagged by review on PR twentyhq#20105:

 - Reject input filenames that contain path separators, traversal
   sequences, or NUL bytes before writing them to the host work dir.
   Closes a path-traversal write under operator-controlled hostDir
   (P1).

 - Make DOCKER_SANDBOX_NETWORK actually required when
   CODE_INTERPRETER_TYPE=DOCKER. The @ValidateIf + @IsOptional combo was
   self-cancelling; replace @IsOptional with @IsDefined so a sandbox
   never launches without an explicitly isolated network (P1).

 - Move the pre-seeded sandbox-scripts/ from /home/user/scripts to
   /opt/sandbox-scripts so the driver's bind mount on /home/user no
   longer shadows them. Add PYTHONPATH so user code can import them
   without juggling sys.path. Update the README to reflect the new
   layout.

 - Use 0o1777 (sticky) for the host work dir mode instead of 0o777, to
   match the comment that already promised sticky semantics.

 - Restructure the timeout flow so a kill-induced stream rejection
   doesn't escape into the generic catch and clobber the result with
   exitCode: 1 — timeouts now reliably return exitCode: 124 with the
   "Execution timed out" error. Also gracefully handle a failing
   exec.inspect() after the kill.

Plus Prettier formatting on the touched files.
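The first fix's filename check could look roughly like this (an illustrative sketch, not the driver's actual code):

```typescript
import { basename } from 'node:path';

// Illustrative sketch of the filename rejection described in the first
// fix: refuse any name containing path separators, traversal sequences,
// or NUL bytes before it is joined onto the host work dir. Function name
// is an assumption.
const isSafeFilename = (filename: string): boolean =>
  filename.length > 0 &&
  !filename.includes('\0') &&
  !filename.includes('/') &&
  !filename.includes('\\') &&
  filename !== '.' &&
  filename !== '..' &&
  basename(filename) === filename;

isSafeFilename('report.xlsx'); // → true
isSafeFilename('../../etc/passwd'); // → false (traversal)
isSafeFilename('ok\0.txt'); // → false (NUL byte)
```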
@krzysztof7363
Contributor Author

Heads-up on the three remaining red checks (test-compose, test-app-dev, ci-test-docker-status-check):

##[error]Username and password required

All three fail at the same step — the Login to Docker Hub action — because GitHub Actions does not expose repo secrets to workflows triggered from a forked PR. Logs (e.g. job 73327560002) confirm the failure is auth-only; no code or compose-config issue.

Two ways to resolve:

  1. A maintainer pushes the branch into a twentyhq/twenty topic branch and re-opens the PR from there, which gives the workflow access to DOCKERHUB_USERNAME / DOCKERHUB_TOKEN.
  2. Or guard the Docker Hub login step with if: github.event.pull_request.head.repo.full_name == github.repository so it's skipped (with a clean pass) on fork PRs.

I'm happy to adjust the workflow to do (2) if that direction sounds reasonable — let me know. Everything else (server-test, server-integration-test (1..10), server-build, server-lint-typecheck, front-build, sdk-e2e-test, Socket Security, all front-sb-test shards, cubic, etc.) is green on the current head a8fd1f08.
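For reference, option (2) could look like this in the workflow (a sketch; the step name, placement, and action version are assumptions about the existing workflow file):

```yaml
# Skip Docker Hub login on fork PRs, where repo secrets are unavailable.
- name: Login to Docker Hub
  if: github.event.pull_request.head.repo.full_name == github.repository
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKERHUB_USERNAME }}
    password: ${{ secrets.DOCKERHUB_TOKEN }}
```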

@FelixMalfait FelixMalfait self-assigned this Apr 28, 2026

FelixMalfait commented Apr 28, 2026

This is an interesting approach/contribution thanks. Before we get into this direction, which is a pretty significant choice for the project, I'd like to make sure we've explored all options!

Single-user threat model assumed; multi-tenant deployments should layer gVisor

Why not use the Local Driver in that case? I'd say the main benefit of introducing a new driver is to introduce a secure way for people to enable Code Execution. This should be considered not just in the context of the AI chat's code interpreter; ideally it would also be a solution that enables Apps and Workflows, which can currently only run safely on AWS Lambda. That's really not great for self-hosters, and it would be much nicer to find a great solution that tackles all of this at once (safely, for all use cases).

@FelixMalfait
Member

OK, I had a quick look and what you did looks promising. But I think we want to explore having gVisor as the default configuration (documented in the default docker-compose, etc.) and also see how your solution would apply to the code-execution driver (for the Apps / code workflow node). What you did is very interesting, but it needs to be pushed further to make it into core. I don't have much time to dig into it this week myself (and I'm not too versed on the DevOps side, so I'll need a second opinion from someone more qualified on our team), but we'll happily review. Thanks a lot for the contribution!

@krzysztof7363
Contributor Author

Thanks @FelixMalfait — both points make sense.

Why not LOCAL for self-hosters?

LOCAL is refused in production by code-interpreter-driver.factory.ts (intentional, since it spawns Python in the server process) — no isolation, no resource limits, no separate filesystem, and a runaway script OOM-kills the server. It works for dev only. So self-hosters in production today have exactly two options: pay for E2B, or DISABLED. The DOCKER driver is the missing third — local, resource-limited, and isolated, with hardening that scales by adding a runtime layer rather than by writing more code.

gVisor as the default

Agree. The branch already exposes the toggle (DOCKER_SANDBOX_RUNTIME); I'm happy to flip the recipe so the documented Docker Compose + image install ships with runsc as the default, with self-hosters opting out only in environments that can't run it (gVisor's default platform doesn't require /dev/kvm, so KVM-less Linux hosts aren't excluded). Concretely:

  • Document runsc install in packages/twenty-docker/sandbox/README.md (one binary curl + a /etc/docker/daemon.json fragment, ~2 minutes).
  • Set DOCKER_SANDBOX_RUNTIME=runsc in the recommended compose snippet, with a note for hosts that can't install it.
  • Add a startup probe that warns if the configured runtime isn't registered in the host daemon, so misconfiguration fails loudly.
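For context, registering runsc with the host daemon is a small /etc/docker/daemon.json fragment (the standard gVisor registration; the binary path assumes the usual install location):

```json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
```

After restarting the daemon, `docker run --runtime=runsc …` (or the driver's DOCKER_SANDBOX_RUNTIME=runsc) routes containers through gVisor.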

Happy to push this as a follow-up commit on this branch.

Apps / Workflow code-execution driver

I haven't dug into the LogicFunctionDriver (or whatever the Apps/Workflow code-execution path uses) yet, but the dockerode + bind-mount + exec pattern in docker.driver.ts is interface-agnostic — it just needs execute(code, files, context) → result semantics. If that other path has similar shape, the driver can be lifted into a shared module and both surfaces consume it. I'd want to read that codepath before promising specifics, but I think the answer is yes and the work is small.

Two reasonable shapes for that consolidation:

  1. Keep this PR scoped to code-interpreter (current state), and open a follow-up that extracts a shared DockerSandboxDriver consumed by both the AI-chat and Apps/Workflow paths. Less change at the edges, easier to revert.
  2. In this PR, refactor first into the shared form and have both consumers wire up. Fewer cross-PR dependencies but a bigger diff for review.

I lean toward (1) — keeps this PR's review surface tight, and the shared module is a natural follow-up that you can sequence after the gVisor-default work lands.

Let me know which direction you'd prefer; I can adjust scope. No hurry on the DevOps review — happy to wait for the second opinion before the next push.

@FelixMalfait
Member

When I said "then why not local for self-hosters" it was more a rhetorical question, since you said in a previous message that Docker without gVisor wasn't secure for a multi-tenant env. The real value-add for me here should be providing a good, simple, secure solution for all self-hosters.

For the question of mixing concerns, it's one of those cases where I'd rather have everything in the same PR so we can see the full picture and make the right architectural decision; I don't mind the big diff as long as it isn't slop. Let's try to do something elegant and easy/preconfigured the right way by default for self-hosters.

For the docs, we usually update the docs folder (twenty-docs) rather than create isolated READMEs.

Thanks a lot!

@FelixMalfait
Member

Ok, so we discussed this internally, and I wonder if the right direction (so that self-hosting is easy) isn't for us to implement something like https://github.com/rivet-dev/secure-exec and create a VM-like environment rather than a real env. We need to look at this option.

@krzysztof7363
Contributor Author

For Apps/Workflow (LogicFunctionDriver), secure-exec looks like a clean fit: the LAMBDA backend gets replaced by a library call. No container lifecycle.

For the AI code interpreter, the picture's more nuanced. secure-exec maps onto agent code that runs complex queries against the API without moving data through the context window — JS-as-glue-language. That's a real and valuable use case. But the data-analysis ecosystem (pandas, numpy, matplotlib, openpyxl, pypdf, pdfplumber, python-docx, python-pptx, reportlab — the libraries this PR's twentycrm/sandbox image bundles) is harder to put on V8: Pyodide (smolagents uses it) covers the numerics core but is uneven on the long tail; native CPython doesn't have that gap.

Context on this PR: I built it because I wanted to play with Twenty's AI features without paying E2B and without running Python on my host. It solves that problem today — the only other options are LOCAL (refused in production), E2B (paid), or DISABLED. Compared to those, a Docker driver (with gVisor as a hardening option) is a meaningful improvement for self-hosters who want the Python data-analysis path now.

For small deployments, Docker may be good enough. In a single-tenant environment, using the Docker executor (even without gVisor) is less about security and more about containerizing the execution environment. For a multi-tenant hosted environment, E2B (or a similar solution, e.g., CubeSandbox if one wants to self-host a compatible one) is the way to go.

@Qodo-Free-For-OSS

Hi, DockerDriver enables a read-only root filesystem but does not provide a writable /tmp (tmpfs/bind) or set TMPDIR, so Python tempfile usage will fail and bundled sandbox-scripts/user code will crash.

Severity: action required | Category: correctness

How to fix: Mount writable /tmp tmpfs

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

ReadonlyRootfs: true makes /tmp read-only unless explicitly mounted. Python and bundled sandbox scripts use tempfile which defaults to /tmp, so executions can fail.

Issue Context

Docker sandboxes currently only bind-mount /home/user and do not configure Tmpfs.

Fix Focus Areas

  • packages/twenty-server/src/engine/core-modules/code-interpreter/drivers/docker.driver.ts[143-153]

Suggested changes

  • Add HostConfig.Tmpfs: { '/tmp': 'rw,noexec,nosuid,nodev,size=64m' } (size configurable if needed).
  • Optionally set Env to include TMPDIR=/home/user/tmp and create that directory on the host mount, but tmpfs on /tmp is the most compatible.
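The suggested mount translates to a small HostConfig addition (a sketch mirroring the values above, not the driver's actual code):

```typescript
// Sketch of the suggested fix: a writable tmpfs at /tmp alongside the
// read-only rootfs, so Python tempfile-style usage works. Values mirror
// the suggestion above; the size would be configurable if needed.
const hostConfig = {
  ReadonlyRootfs: true,
  Tmpfs: { '/tmp': 'rw,noexec,nosuid,nodev,size=64m' },
};
```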

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo. Free code review for open-source maintainers.

@Qodo-Free-For-OSS

Hi, DOCKER_SANDBOX_NETWORK validation allows empty strings but the factory converts empty string to undefined, silently omitting NetworkMode and running sandboxes on the default Docker network (likely with egress).

Severity: action required | Category: security

How to fix: Reject empty network strings

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

DOCKER_SANDBOX_NETWORK is intended to be required for DOCKER mode, but @IsDefined() allows ''. The factory then drops '' via network || undefined, silently disabling sandbox network isolation.

Issue Context

The README expects an internal network (no egress) shared with the server container.

Fix Focus Areas

  • packages/twenty-server/src/engine/core-modules/twenty-config/config-variables.ts[677-688]
  • packages/twenty-server/src/engine/core-modules/code-interpreter/code-interpreter-driver.factory.ts[96-104]

Suggested changes

  • Add @IsNotEmpty() (or @MinLength(1)) to DOCKER_SANDBOX_NETWORK under the existing ValidateIf.
  • Alternatively (or additionally), remove the network || undefined coercion and let Dockerode receive the empty string as a misconfiguration that errors early, but config validation is preferred.
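The coercion gap can also be expressed as a pure function; a sketch of the intended behaviour (names illustrative, not the factory's actual code):

```typescript
// Sketch of the fix described above: an empty string passes @IsDefined()
// but `network || undefined` coerces it to undefined, silently dropping
// NetworkMode. Rejecting empty/whitespace values closes the gap.
const resolveNetworkMode = (network: string | undefined): string => {
  if (network === undefined || network.trim() === '') {
    throw new Error('DOCKER_SANDBOX_NETWORK must be a non-empty network name');
  }
  return network;
};

resolveNetworkMode('twenty_sandbox'); // → 'twenty_sandbox'
// resolveNetworkMode('') throws instead of silently disabling isolation
```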

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review
