Skip to content

batches: Mount paths into a container#770

Merged
Piszmog merged 37 commits intomainfrom
rc/batch-changes-mount
Jun 14, 2022
Merged

batches: Mount paths into a container#770
Piszmog merged 37 commits intomainfrom
rc/batch-changes-mount

Conversation

@Piszmog
Copy link
Copy Markdown
Contributor

@Piszmog Piszmog commented Jun 7, 2022

Closes #31790.

Add the ability to mount a path on the local machine into a container during a step.

A path can point to a file (./foo.sh or /bar/foo.sh) or a directory (./bar or /foo/bar/). The path can be an absolute path or a relative path (for example ./foo.sh). Regardless if the path is absolute or relative, the path must be within the same directory as the batch spec that is being ran (the batch spec directory is considered the "working directory").

Validation/handling of the mount field in a step is handled in the src-cli versus batcheslib in sourcegraph/sourcegraph. The reason for this was the avoid having conditional logic to figure out if parseBatchSpec is being called on a local machine vs behind an API. This also contains the file system logic within the context of src-cli which is dealing with the file system.

This PR also includes some general fixes I saw while in the code (for example, package and variable naming collision).

Test plan

Go Unit Tests and manual testing.

Examples

Below are examples of mounting a path that contains a script (python or Go binary in this case) that is ran in the run field of a step.

Python

Example of running a python script that is mounted into the container and ran in a step.

Batch Spec
name: python-script-test
description: Use a python script
on:
  - repositoriesMatchingQuery: file:README.md
steps:
  - run: python /tmp/updater.py
    container: python:latest
    mount:
      - path: ./updater.py
        mountpoint: /tmp/updater.py
changesetTemplate:
  title: Hello from Python
  body: Updated using a Python Script
  branch: python-script-test
  commit:
    message: Append Hello World to all README.md files
Script
#!/usr/bin/env python3
import os.path


def main():
    exists = os.path.exists('README.md')
    if exists:
        with open('README.md', 'a') as f:
            f.write('\nHello from python')


if __name__ == "__main__":
    main()
Changesets

Screen Shot 2022-06-07 at 14 43 47

Binary

Example of running a binary (written in Go) that is mounted into the container and ran in a step.

Batch Spec
name: binary-test
description: Use a binary
on:
  - repositoriesMatchingQuery: file:README.md
steps:
  - run: /tmp/updater
    container: alpine:latest
    mount:
      - path: ./updater # the linux binary
        mountpoint: /tmp/updater
changesetTemplate:
  title: Hello from a binary
  body: Updated using a binary
  branch: binary-test
  commit:
    message: Append Hello World to all README.md files
Binary

Binary was built from Go

package main

import "os"

func main() {
	fileInfo, err := os.Stat("README.md")
	if err != nil {
		panic(err)
	}
	if fileInfo.IsDir() {
		panic("file is a directory")
	}

	f, err := os.OpenFile("README.md", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err = f.WriteString("\nhello from go"); err != nil {
		panic(err)
	}
}

Built with the following command,

env GOOS=linux GOGOARCH=amd64 go build -ldflags="-s -w"

Changesets

Screen Shot 2022-06-07 at 14 44 50

Directory

The above examples have the path pointing to a specific file. Directories can also be mounted in the container. This example reuses the above Python Example but mounts a directory that contains the script.

Batch Spec
name: python-script-test
description: Use a python script
on:
  - repositoriesMatchingQuery: file:README.md
steps:
  - run: python /tmp/scripts/updater.py
    container: python:latest
    mount:
      - path: ./scripts
        mountpoint: /tmp/scripts
changesetTemplate:
  title: Hello from Python
  body: Updated using a Python Script
  branch: python-script-test
  commit:
    message: Append Hello World to all README.md files
Changesets

Screen Shot 2022-06-07 at 14 42 45

@Piszmog Piszmog marked this pull request as ready for review June 7, 2022 21:07
@Piszmog Piszmog requested a review from a team June 7, 2022 21:07
@courier-new
Copy link
Copy Markdown
Contributor

Thank you for such a thorough write-up of your testing examples!! 😄

Copy link
Copy Markdown
Contributor

@courier-new courier-new left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defer to others on the team for a heavier review of the docker-specific interactions, but the mounted file handling and other changes look great to me. 🙂

Comment thread internal/batches/service/service.go Outdated
Comment thread internal/batches/service/service_test.go
Copy link
Copy Markdown
Member

@eseliger eseliger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR description and testing! I left a few comments inline

Comment thread cmd/src/batch_common.go Outdated
Comment thread internal/batches/service/service.go Outdated
}
// Update the mount path to the absolute path so building the absolute path (above) does not need to be
// redone when adding the mount argument to the Docker container.
step.Mount[j].Path = p
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this affect caching somehow? Sorry, I feel like I should know, but sadly I don't 👎

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any issues with caching, but I am not super familiar with the cache yet.

Actually, I was having issues with the cache where I would only make a change to the script file and would rerun the batch spec and nothing would change because of the cache (for example changing the text to append to the README.md from Hello from python to Hello from python1).

I was getting slightly annoyed with this issue, and I am sure another user will too. Is this something we want to deal with in this Issue or in a follow up Issue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just saw the flag -clear-cache. So I guess, it is really not a problem? Maybe we should have some docs around the mount and cache and communicate that the -clear-cache flag is available when updating the script[s]?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually be inclined to go the other way, and start by not using or writing to the cache at all if mount is in use to avoid this problem.

In a follow up, I think we could start figuring out how to use the cache in some ways — we probably don't want to hash every file in mount (since some users will probably be mounting large binaries or other files), but maybe we could come up with a make style heuristic around last modification time.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LawnGnome what is the best way to go about creating a followup to "re-implement" caching for mount functionality? Create an Issue and talk about it? Design sesh? RFC?

Comment thread internal/batches/executor/run_steps.go Outdated
Comment thread internal/batches/executor/run_steps.go Outdated
Comment thread internal/batches/service/service_test.go
Comment thread internal/batches/service/service.go Outdated
Comment thread cmd/src/batch_common.go
Comment thread cmd/src/batch_common.go Outdated
Copy link
Copy Markdown
Contributor

@LawnGnome LawnGnome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! I have a few questions, but nothing fundamental to how this is implemented.

Comment thread cmd/src/batch_common.go Outdated
Comment thread internal/batches/executor/coordinator.go
Comment thread internal/batches/executor/executor.go
Comment thread internal/batches/service/service.go Outdated
Comment thread cmd/src/batch_common.go Outdated
Comment thread internal/batches/executor/run_steps.go Outdated
@LawnGnome
Copy link
Copy Markdown
Contributor

It would appear that Erik and my owls crossed in mid-air. Sorry!

@Piszmog Piszmog requested a review from eseliger June 13, 2022 16:21
Copy link
Copy Markdown
Member

@eseliger eseliger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, very clean :)

Comment thread internal/batches/service/service.go
Comment thread internal/batches/executor/executor.go
@Piszmog Piszmog merged commit 75fa7f1 into main Jun 14, 2022
@Piszmog Piszmog deleted the rc/batch-changes-mount branch June 14, 2022 14:59
@malomarrec
Copy link
Copy Markdown

Very cool! I like that you can mount an entire directory as well 💯 I can think of many users that are gonna love this right away.

scjohns pushed a commit that referenced this pull request Apr 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mount files on steps containers for local runs(minimal)

5 participants