Add cache mounts to docker files #1452

Merged
arkid15r merged 13 commits into OWASP:main from ahmedxgouda:feature/introduce-docker-cache-mounts
May 2, 2025

Conversation

@ahmedxgouda (Collaborator) commented May 1, 2025

Resolves #1449

The last build I ran was about 10x faster than a normal build (the frontend container built in milliseconds after the last package update). The cached packages persist across different builds (e.g. test and local).
Screencast from 2025-05-02 07-58-14.webm

@ahmedxgouda ahmedxgouda requested review from arkid15r and kasya as code owners May 1, 2025 16:30
@coderabbitai (Contributor, bot) commented May 1, 2025

Summary by CodeRabbit

  • Chores
    • Improved Docker build efficiency for backend, frontend, and documentation images by introducing build cache mounts for dependency managers (Poetry and pnpm).
    • Standardized environment variables and user/group ID management across Dockerfiles for consistency and better caching.
    • Enhanced Docker build reliability with explicit system updates and upgrades.
    • Added "pypoetry" to the custom dictionary for spell checking.
  • Style
    • Minor reordering and clarification of Dockerfile environment variable declarations for readability.

Summary by CodeRabbit

  • Chores
    • Improved Docker build efficiency for backend, frontend, and documentation environments by adding dependency cache mounts and explicit environment variables for user and group IDs.
    • Enhanced Dockerfiles to set up dedicated cache directories for Poetry (Python) and pnpm (Node.js), with improved ownership and permissions.
    • Updated package installation steps for better caching and reliability.
    • Added "pypoetry" to the custom spell-check dictionary.

Summary by CodeRabbit

  • Chores
    • Improved Docker build performance for backend, frontend, and documentation environments by adding cache mounts for dependency installation steps.
    • Adjusted Poetry and pnpm installation commands to optimize dependency caching and installation.
    • Ensured virtual environment directories are properly copied and permissions set in runtime images.
    • No changes to application features or user-facing functionality.

Walkthrough

This change updates several Dockerfiles across backend, frontend, and documentation components to optimize dependency installation and caching. For Python-based images, the installation of Poetry is modified to remove the --no-cache-dir flag, and build cache mounts are introduced targeting Poetry's cache directory to speed up dependency installation. The virtual environment directory is explicitly copied into runtime images with correct permissions and ownership. For Node.js-based images, cache mounts are added for pnpm's store directory during dependency installation, and some Dockerfiles separate pnpm installation and dependency installation into distinct RUN steps. No changes were made to exported or public code entities.
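
The Python side of the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's actual Dockerfile: the base image tag, the `owasp` user setup, and the dependency group names are assumptions, and the cache target assumes Poetry runs as root in the builder stage.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image, user setup, and paths are assumptions.
FROM python:3.11-slim AS builder

ENV POETRY_VIRTUALENVS_IN_PROJECT=true \
    PYTHONUNBUFFERED=1

WORKDIR /home/owasp

# Installed without --no-cache-dir, so pip's own cache stays enabled.
RUN python -m pip install poetry

COPY pyproject.toml poetry.lock ./

# The cache mount persists between builds but is not stored in the image layer.
# Assumes the "dev" and "test" dependency groups exist in pyproject.toml.
RUN --mount=type=cache,target=/root/.cache/pypoetry \
    poetry install --no-root --without dev --without test

FROM python:3.11-slim

RUN useradd --create-home owasp

# Copy the in-project virtual environment explicitly, with ownership and mode.
COPY --from=builder --chmod=555 --chown=owasp:owasp \
    /home/owasp/.venv /home/owasp/.venv

ENV PATH="/home/owasp/.venv/bin:$PATH"

USER owasp
WORKDIR /home/owasp
```

Because the cache mount lives outside the image layers, the download cache speeds up rebuilds without increasing the size of the runtime image.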

Changes

| Files | Change Summary |
|---|---|
| backend/docker/Dockerfile, backend/docker/Dockerfile.local, backend/docker/Dockerfile.test | Poetry installation switched to omit --no-cache-dir, build cache mounts added for Poetry cache directory during poetry install, and explicit copying of the virtual environment directory (.venv or /home/owasp/.venv) from builder to runtime stage with permissions and ownership. No changes to exported/public entities. |
| docs/docker/Dockerfile.local | Poetry installed without --no-cache-dir, build cache mount for Poetry cache directory added, and explicit copy of virtual environment directory from builder to runtime with correct permissions and ownership. No changes to exported/public entities. |
| frontend/docker/Dockerfile, frontend/docker/Dockerfile.local, frontend/docker/Dockerfile.e2e.test, frontend/docker/Dockerfile.unit.test | Added cache mounts for pnpm store during pnpm install in all relevant Dockerfiles. Some Dockerfiles split pnpm installation and dependency installation into separate RUN commands. In Dockerfile.local, the copy step for node_modules was moved but functionally unchanged. No changes to exported/public entities. |
| cspell/custom-dict.txt | Added the word "pypoetry" to the custom dictionary. |

Assessment against linked issues

| Objective | Addressed | Explanation |
|---|---|---|
| Use Docker cache mounts to cache packages between builds (#1449) | | |

Suggested reviewers

  • kasya
  • arkid15r

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3be4c59 and 9d46437.

📒 Files selected for processing (8)
  • backend/docker/Dockerfile (1 hunks)
  • backend/docker/Dockerfile.local (3 hunks)
  • backend/docker/Dockerfile.test (1 hunks)
  • cspell/Dockerfile (1 hunks)
  • docs/docker/Dockerfile.local (1 hunks)
  • frontend/docker/Dockerfile.e2e.test (1 hunks)
  • frontend/docker/Dockerfile.local (1 hunks)
  • frontend/docker/Dockerfile.unit.test (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • backend/docker/Dockerfile
  • frontend/docker/Dockerfile.unit.test
  • frontend/docker/Dockerfile.local
  • backend/docker/Dockerfile.local
  • frontend/docker/Dockerfile.e2e.test
  • cspell/Dockerfile
  • docs/docker/Dockerfile.local
  • backend/docker/Dockerfile.test
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript-typescript)

@github-actions bot added the backend, frontend, and docker (Pull requests that update Docker code) labels May 1, 2025
@coderabbitai (Contributor, bot) left a comment

Actionable comments posted: 7

🧹 Nitpick comments (10)
backend/docker/Dockerfile (2)

6-6: Pin Poetry version & cache pip for reproducibility
Rather than installing the latest Poetry arbitrarily, pin to a known version (e.g. poetry==1.6.1) and consider mounting pip’s cache directory to speed up pip install poetry across builds.

Example diff:

-    python -m pip install poetry
+RUN --mount=type=cache,target=/root/.cache/pip \
+    python -m pip install "poetry==1.6.1"

37-37: Avoid overwriting .venv permissions on second copy
The subsequent COPY /home/owasp will merge in .venv again, potentially resetting its mode to 555. Instead, copy application files explicitly to prevent re-copying the virtualenv:

- COPY --from=builder --chmod=555 --chown=owasp:owasp /home/owasp /home/owasp
+ COPY --from=builder --chmod=555 --chown=owasp:owasp \
+     /home/owasp/apps /home/owasp/manage.py /home/owasp/wsgi.py \
+     /home/owasp/settings /home/owasp/static /home/owasp/templates \
+     /home/owasp/entrypoint.sh \
+     /home/owasp/
backend/docker/Dockerfile.local (3)

7-7: Pin Poetry version & cache pip in builder
Consider pinning Poetry (poetry==1.6.1) for reproducible builds and mounting pip’s cache to speed up installs:

-    python -m pip install poetry
+RUN --mount=type=cache,target=/root/.cache/pip \
+    python -m pip install "poetry==1.6.1"

27-27: Installing Poetry in the runtime stage may be unnecessary
A local dev image may need Poetry, but installing it a second time inflates the final image size. Consider removing the install or copying the binary from the builder stage instead.


37-37: Prevent double-copy of .venv
After copying .venv, the full-directory copy will merge it again. Extract only code assets instead to preserve permissions and reduce layer size.

backend/docker/Dockerfile.test (2)

7-7: Pin Poetry version & cache pip
Pin Poetry to a specific version and put pip's cache directory on a BuildKit cache mount to speed up repeated builds.

-    python -m pip install poetry
+RUN --mount=type=cache,target=/root/.cache/pip \
+    python -m pip install "poetry==1.6.1"

42-42: Selective copy to avoid .venv overwrite
Rather than copying the entire /home/owasp, explicitly list code & config files to ensure .venv stays intact with correct permissions.

docs/docker/Dockerfile.local (2)

9-9: Pin Poetry version & cache pip
For deterministic docs builds, pin Poetry and mount pip’s cache:

-    python -m pip install poetry
+RUN --mount=type=cache,target=/root/.cache/pip \
+    python -m pip install "poetry==1.6.1"

40-40: Avoid double .venv copy in docs image
Copying the full /home/owasp after .venv merges the venv twice. Use a selective copy of docs sources to keep layers minimal and permissions correct.
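
The pin-and-cache suggestion repeated for the backend and docs images can be sketched as below. The `POETRY_VERSION` build argument and the base image tag are hypothetical, and 1.6.1 is only the example version used in this review; pick a release that supports the Python version in your base image.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image and build argument name are assumptions.
FROM python:3.11-slim AS builder

# 1.6.1 is the example version from the review, not a recommendation.
ARG POETRY_VERSION=1.6.1

# /root/.cache/pip is pip's default cache location on Linux when running as root.
RUN --mount=type=cache,target=/root/.cache/pip \
    python -m pip install "poetry==${POETRY_VERSION}"
```

The version can then be overridden at build time with `docker build --build-arg POETRY_VERSION=<version> .` without editing the Dockerfile.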

frontend/docker/Dockerfile.local (1)

13-15: Consider adding explicit cache IDs for stability
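Your cache mounts look correct for the local build. BuildKit uses the target path as the cache id by default, so an explicit id mainly documents intent and lets mounts with different target paths share one cache across stages and Dockerfiles.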

Example enhancement:

-RUN --mount=type=cache,target=/root/.pnpm-store \
-    --mount=type=cache,target=/home/owasp/node_modules \
-    pnpm install --frozen-lockfile --ignore-scripts
+RUN --mount=type=cache,id=pnpm-store,target=/root/.pnpm-store \
+    --mount=type=cache,id=node-modules,target=/home/owasp/node_modules \
+    pnpm install --frozen-lockfile --ignore-scripts
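
A minimal sketch of a named pnpm cache mount is shown below. The base image, working directory, and store path are assumptions; `--store-dir` is passed explicitly so that pnpm writes into the mounted directory regardless of its defaults.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image and paths are assumptions.
FROM node:22-alpine

WORKDIR /home/owasp

RUN npm install --ignore-scripts -g pnpm

COPY package.json pnpm-lock.yaml ./

# id names the cache so other stages or Dockerfiles can reuse it;
# sharing=locked makes concurrent builds wait instead of writing at once.
RUN --mount=type=cache,id=pnpm-store,target=/root/.pnpm-store,sharing=locked \
    pnpm install --frozen-lockfile --ignore-scripts --store-dir=/root/.pnpm-store
```

This sketch deliberately leaves node_modules off the cache mount: anything written to a cache mount is not stored in the image layer, so a node_modules directory mounted that way would be missing from the resulting image unless it is copied elsewhere in the same RUN step.
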
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00350d9 and 74a03e6.

📒 Files selected for processing (8)
  • backend/docker/Dockerfile (3 hunks)
  • backend/docker/Dockerfile.local (4 hunks)
  • backend/docker/Dockerfile.test (3 hunks)
  • docs/docker/Dockerfile.local (3 hunks)
  • frontend/docker/Dockerfile (1 hunks)
  • frontend/docker/Dockerfile.e2e.test (1 hunks)
  • frontend/docker/Dockerfile.local (1 hunks)
  • frontend/docker/Dockerfile.unit.test (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: CodeQL (javascript-typescript)

@ahmedxgouda (Collaborator, Author) commented:

Oops, something I missed in the e2e and unit files.

@ahmedxgouda ahmedxgouda marked this pull request as draft May 1, 2025 17:20
@coderabbitai (Contributor, bot) left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5023cc9 and eee5fc6.

📒 Files selected for processing (1)
  • cspell/Dockerfile (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: CodeQL (javascript-typescript)

Comment on lines +9 to 10
RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
pnpm install --frozen-lockfile --ignore-scripts

⚠️ Potential issue

Ensure cache mount aligns with pnpm’s actual store directory
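The build cache is mounted at /pnpm/store, but pnpm's default store path is typically under the user's home directory (for example ~/.local/share/pnpm/store on Linux with recent pnpm versions, or ~/.pnpm-store with older ones) unless PNPM_HOME or store-dir is set. Unless pnpm is explicitly configured to use /pnpm/store, the cache mount will not be used.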

Update the install command to direct pnpm’s store into your cache mount:

 RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
-    pnpm install --frozen-lockfile --ignore-scripts
+    pnpm install --frozen-lockfile --ignore-scripts --store-dir=/pnpm/store

Alternatively, set it globally before installation:

RUN pnpm config set store-dir /pnpm/store

This ensures the cache mount is effective and speeds up subsequent builds.

🤖 Prompt for AI Agents (early access)
In cspell/Dockerfile around lines 9 to 10, the pnpm cache mount is set to /pnpm/store but pnpm by default uses a different store directory, so the cache is not utilized. Fix this by configuring pnpm to use /pnpm/store as its store directory either by adding a command before installation to set the store-dir config globally or by passing the store-dir option directly in the install command. This ensures the cache mount is effective and speeds up builds.
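
One way to line the store up with the mount is sketched below. The base image and working directory are assumptions. As far as I know, pnpm defaults its store to `$PNPM_HOME/store` when `PNPM_HOME` is set, but the sketch also passes `--store-dir` so it does not rely on that default.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image and working directory are assumptions.
FROM node:22-alpine

# With PNPM_HOME=/pnpm, the store is expected to resolve to /pnpm/store.
ENV PNPM_HOME=/pnpm
ENV PATH="$PNPM_HOME:$PATH"

WORKDIR /app

COPY package.json pnpm-lock.yaml ./

RUN npm install --ignore-scripts -g pnpm

# Print the store path pnpm resolves, as a quick check in the build log.
RUN pnpm store path

RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
    pnpm install --frozen-lockfile --ignore-scripts --store-dir=/pnpm/store
```

If the path printed by `pnpm store path` differs from the mount target, the cache mount is not being used by a plain `pnpm install`.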

 COPY package.json pnpm-lock.yaml ./

-RUN npm install --ignore-scripts -g pnpm && \
+RUN npm install --ignore-scripts -g pnpm

🛠️ Refactor suggestion

Pin pnpm version for reproducible builds
Installing the latest pnpm without a version specifier can lead to unpredictable build failures when a new pnpm release is published.

Consider pinning pnpm to a known stable version, for example:

-RUN npm install --ignore-scripts -g pnpm
+RUN npm install --ignore-scripts -g pnpm@8.8.0

You can bump the @8.8.0 to whichever version aligns with your team’s compatibility requirements.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-RUN npm install --ignore-scripts -g pnpm
+RUN npm install --ignore-scripts -g pnpm@8.8.0
🤖 Prompt for AI Agents (early access)
In cspell/Dockerfile at line 7, the RUN command installs pnpm without specifying a version, which can cause unpredictable build failures. Modify the command to pin pnpm to a specific stable version by appending the version number after the package name, for example, use "pnpm@8.8.0" instead of just "pnpm" to ensure reproducible builds.
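
A small sketch of the pinning suggestion, using a build argument so the version lives in one place. The `PNPM_VERSION` name and the base image are hypothetical, and 8.8.0 is only the example version from the comment above.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image and build argument name are assumptions.
FROM node:22-alpine

# 8.8.0 is the example version from the review; choose one that matches
# the lockfile format used by the project.
ARG PNPM_VERSION=8.8.0

RUN npm install --ignore-scripts -g "pnpm@${PNPM_VERSION}"
```

The version can be overridden with `docker build --build-arg PNPM_VERSION=<version> .` when testing an upgrade.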

@ahmedxgouda (Collaborator, Author) commented:

@coderabbitai Do you see any issues? Is the cache dir for poetry specified correctly?

@coderabbitai (Contributor, bot) commented May 1, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check Poetry's documentation about cache directory
echo "Checking Poetry documentation for cache directory information..."
curl -s https://python-poetry.org/docs/configuration/ | grep -i cache

# Check backend Dockerfiles for the Poetry cache mount implementation
echo -e "\nChecking backend Dockerfiles for Poetry cache implementation:"
grep -r --include="*/docker/Dockerfile*" "poetry" backend/

# Check all Dockerfiles for cache mount directives
echo -e "\nChecking all Dockerfiles for cache mount directives:"
grep -r --include="*/docker/Dockerfile*" "cache" .

# Check if test files have proper cache mount configuration
echo -e "\nChecking e2e and unit test files specifically:"
grep -r --include="*e2e.test" "cache" .
grep -r --include="*unit.test" "cache" .

Length of output: 3504


🏁 Script executed:

#!/bin/bash
# Check actual Dockerfile contents to see the Poetry cache implementation
echo "Examining backend Dockerfile content:"
cat backend/docker/Dockerfile

echo -e "\nExamining backend Dockerfile.test content:"
cat backend/docker/Dockerfile.test

echo -e "\nExamining docs Dockerfile content:"
cat docs/docker/Dockerfile.local

# Look at PR diff for these files to see specifically what was changed
echo -e "\nChecking PR diff for backend Dockerfile:"
git diff origin/main -- backend/docker/Dockerfile

echo -e "\nChecking PR diff for test files that might be missing cache mounts:"
git diff origin/main -- frontend/docker/Dockerfile.e2e.test
git diff origin/main -- frontend/docker/Dockerfile.unit.test

Length of output: 6186


@ahmedxgouda Looking at the Dockerfiles, I spotted a potential issue with the Poetry cache directory. While the cache mounts have been successfully implemented, there's a user/permissions mismatch that might cause problems:

  1. You're setting the cache mount target to /root/.cache/pypoetry but running Poetry as the owasp user.

According to Poetry's documentation, the default cache directory on Linux is ~/.cache/pypoetry, which for the owasp user would translate to /home/owasp/.cache/pypoetry.

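RUN instructions execute as the user set by the most recent USER instruction, so if Poetry runs as owasp it reads and writes /home/owasp/.cache/pypoetry, and a mount at /root/.cache/pypoetry goes unused. BuildKit cache mounts are also owned by root by default (uid=0, gid=0), so a non-root user needs uid and gid set on the mount to be able to write to it.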

For better alignment with best practices, consider one of these approaches:

# Option 1: Set POETRY_CACHE_DIR to a location writable by owasp user
ENV POETRY_VIRTUALENVS_IN_PROJECT=true \
+   POETRY_CACHE_DIR=/tmp/poetry-cache \
    PYTHONUNBUFFERED=1

# Option 2: Use uid/gid in the mount specification
-RUN --mount=type=cache,target=/root/.cache/pypoetry \
+RUN --mount=type=cache,target=/home/owasp/.cache/pypoetry,uid=1000,gid=1000 \
    poetry install --no-root --without dev --without test

The implementation in frontend e2e and unit test files looks correct with the pnpm cache mounts properly configured.
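
Combining both options into one builder stage might look like the sketch below. The base image, the uid/gid value of 1000, and the user creation commands are assumptions; the IDs on the mount must match whatever the owasp user actually has in the real image.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: base image, uid/gid 1000, and paths are assumptions.
FROM python:3.11-slim AS builder

ENV POETRY_CACHE_DIR=/home/owasp/.cache/pypoetry \
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    PYTHONUNBUFFERED=1

RUN python -m pip install poetry && \
    groupadd --gid 1000 owasp && \
    useradd --uid 1000 --gid owasp --create-home owasp

USER owasp
WORKDIR /home/owasp

COPY --chown=owasp:owasp pyproject.toml poetry.lock ./

# Cache mounts are root-owned by default; uid/gid make this one writable
# by the non-root user that runs Poetry.
RUN --mount=type=cache,target=/home/owasp/.cache/pypoetry,uid=1000,gid=1000 \
    poetry install --no-root --without dev --without test
```

Setting POETRY_CACHE_DIR to the same path as the mount target keeps the two from drifting apart if the home directory or user name changes later.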

@ahmedxgouda ahmedxgouda marked this pull request as ready for review May 2, 2025 04:44
@ahmedxgouda (Collaborator, Author) commented:

@arkid15r Could you review it please?

@arkid15r arkid15r enabled auto-merge May 2, 2025 18:46
@sonarqubecloud bot commented May 2, 2025

@arkid15r (Collaborator) left a comment

Alright, let's try this 👍

@arkid15r arkid15r added this pull request to the merge queue May 2, 2025
Merged via the queue into OWASP:main with commit 56866ab May 2, 2025
22 checks passed
@ahmedxgouda ahmedxgouda deleted the feature/introduce-docker-cache-mounts branch May 21, 2025 23:23
shdwcodr pushed a commit to shdwcodr/Nest that referenced this pull request Jun 5, 2025
* Add cache mounts to docker files

* Fix docker e2e and unit test syntax

* Apply coderabbit suggestion

* Fix spelling in backend

* Refactor Dockerfiles to use consistent cache mount for pnpm

* Fix typo in cache mount path in docs Dockerfile.local

* Fix typo in cache mount path in backend Dockerfile

* Add cache mount to cspell

* Fix poetry cache

* Add PNPM Home environement variable

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
@coderabbitai coderabbitai bot mentioned this pull request Dec 31, 2025
3 tasks
Labels

backend, docker (Pull requests that update Docker code), frontend

Development

Successfully merging this pull request may close these issues.

Introduce Docker Cache Mounts

2 participants