I'm starting to think about what the deployment options are for cuPyNumeric, and I'd like to get your advice.
Here are the properties that I think I'd like to see in any potential solution:
- All dependencies are listed in one place. Note that my project depends on cuPyNumeric as well as other Python libraries, and also tools like formatters (ruff) and type checkers (basedpyright).
- Dependencies should be listed with approximate versions (e.g., `numpy>=1.25.2,<2` or similar).
- There should also be a way to record the exact versions of packages actually installed in my deployment. (A.k.a., a "lock" file in Cargo/uv parlance.)
- All of the above should be portable (e.g., I can install the same package versions on both x86 and ARM without losing the precise version information).
- Ideally, tools should interact with the above in a clean way. E.g., editing the NumPy dependency to `numpy>=2.0.0,<2.3` should result in a minimal re-solve of the environment. I should never have to throw my environment away completely and start from scratch.
- Everything above should be deployable even if all access to the actual installed environment is lost (i.e., everything important should be checked into Git).
- Ideally, all of the above should be the same deployment I use for actual performance runs. I.e., the same system should be used on laptops and supercomputers.
My impression is that for a pure-Python project, where all dependencies are available on PyPI, this is possible to achieve today with the uv package manager. You write a pyproject.toml file that tracks your dependencies. E.g.:
https://github.com/scipy/scipy/blob/main/pyproject.toml#L49
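For concreteness, here is a minimal sketch of such a pyproject.toml. The project name, version bounds, and group layout are illustrative, not taken from any real project:

```toml
# Hypothetical example; names and version bounds are illustrative only.
[project]
name = "my-simulation"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "numpy>=1.25.2,<2",
]

# PEP 735 dependency groups (supported by uv) keep dev tools in the same file.
[dependency-groups]
dev = [
    "ruff",
    "basedpyright",
]
```

This keeps runtime dependencies and developer tooling (formatters, type checkers) in one place, which is the first property listed above.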
Then when you install the project, uv automatically generates a `uv.lock` file that records what you actually installed. Notably, this is all automatic and mostly works seamlessly.
https://docs.astral.sh/uv/guides/projects/#project-structure
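As a rough sketch of that workflow (assuming a recent uv; the project name here is hypothetical):

```shell
uv init my-simulation            # scaffolds a pyproject.toml
cd my-simulation
uv add "numpy>=1.25.2,<2"        # records the constraint and resolves it
uv add --dev ruff basedpyright   # dev tools live in the same pyproject.toml
uv sync                          # installs exactly what uv.lock records
```

Editing a constraint and re-running `uv lock` re-solves only what the change requires, and `uv.lock` is a single cross-platform file that can be committed to Git, which covers the minimal-re-solve, portability, and durability properties above.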
By contrast, as far as I'm aware, Conda's support is extremely poor. You can create YAML files that correspond to an environment, but they carry only loose (approximate) dependencies. Conda has no native notion of a lock file. While you can generate one manually (by "freezing" the list of installed packages to a file), there are no tools to work with such files automatically. Conda also has no notion of a minimal re-solve based on an edit to an environment YAML file (or even any notion of updating an environment at all; once you create your environment, you either manually `conda install` new packages or you destroy it and start over from scratch).
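For illustration, an environment YAML of the kind described above might look like this (names and channels are hypothetical):

```yaml
# Hypothetical environment.yml; only loose constraints can be expressed here.
name: my-simulation
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy>=1.25.2,<2
```

"Freezing" then amounts to something like `conda env export -n my-simulation > environment.lock.yml`, which pins exact builds but is platform-specific and has no tooling support for incremental updates.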
I've heard that cuPyNumeric supports PyPI, but I'm not sure how high-quality that solution actually is (e.g., does it automatically pull in CUDA, NCCL, UCX, etc. as required?). Meanwhile, everyone I know deploys cuPyNumeric via Conda, but based on my understanding of the state of Conda-based tools, there really is no good solution to be found in that direction.
Also consider that I often require access to experimental builds, and if those are not published to PyPI then in many cases I'll be out of luck.
If there are any best practices, or thoughts on how to do this, I'd appreciate them. And perhaps it would be good to document this as well.