Closed
Labels
core-infra
Description
Use cases, pain points, and background
Gym should be able to use the Ray cluster to parallelize CPU-intensive tasks across multiple nodes. This is necessary to support faster training and inference for environments such as Comp coding or SWE agents.
Description:
Ray infra support in Gym
Design:
- Configure Ray in Gym to use the Ray cluster deployed by NeMo-RL
- Refactor Comp coding verification from subprocess calls to ray.remote calls
- Refactor Comp coding to use ray.remote calls over multiple GPU nodes
- Document how to use Ray in resource servers
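
As a rough illustration of the first two design items, here is a minimal sketch of how verification could be fanned out as Ray tasks. It is not the Gym implementation: `run_test_case`, `verify_on_cluster`, and the `(stdin, expected_stdout)` test-case shape are hypothetical names chosen for this example. It assumes a Ray cluster is already running and reachable, which is what `ray.init(address="auto")` attaches to. It also still launches a subprocess inside each task to isolate the candidate code, so only the scheduling moves to Ray.

```python
import subprocess
import sys


def run_test_case(solution_code: str, stdin_text: str, expected_stdout: str,
                  timeout_s: float = 10.0) -> bool:
    """Run one candidate solution against one test case in a fresh interpreter.

    Hypothetical helper: the real verification logic in Gym may differ.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", solution_code],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0 and proc.stdout.strip() == expected_stdout.strip()


def verify_on_cluster(solution_code: str, test_cases: list[tuple[str, str]]) -> list[bool]:
    """Schedule one Ray task per test case and collect pass/fail results."""
    import ray  # imported lazily so run_test_case stays usable without Ray

    if not ray.is_initialized():
        # Assumption: a cluster (e.g. the one started by NeMo-RL) is already up;
        # address="auto" attaches to it instead of starting a new local one.
        ray.init(address="auto")

    # Wrap the plain function as a remote task that reserves one CPU each.
    remote_case = ray.remote(num_cpus=1)(run_test_case)
    futures = [
        remote_case.remote(solution_code, stdin_text, expected)
        for stdin_text, expected in test_cases
    ]
    # ray.get blocks until every task has finished, preserving input order.
    return ray.get(futures)
```

Keeping `run_test_case` a plain function means it can be unit-tested locally without a cluster, and the Ray scheduler decides which node each test case runs on.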
Out of scope:
- Configure detected CPU nodes for Gym tasks. This will be part of NeMo-RL and will be worked on later.
Acceptance Criteria:
- Speed-up in training step time for the Comp coding environment