Closed
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Add copy-pr-bot
Add initial repo template
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Khushi Bhardwaj <kbhardwaj@nvidia.com> Co-authored-by: bxyu-nvidia <bxyu@nvidia.com>
Migrated over from GitLab:
- Display aggregate metrics
- Aggregate generic keys using multineedle
- Display other dynamic aggregations
- Count string totals and unique values
- Remove TrainDataProcessor dependency, add test
- Remove dupe file read, fix arg types hints
---------
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
…nfo (#27) Signed-off-by: Brian Yu <bxyu@nvidia.com>
Updated the log message printed when running ng_prepare_data from, for example, "Found 0 agent server instance configs withOUT datasets:" to "Found 0 agent server instance configs WITHOUT datasets:". This matches the format of the subsequent logs, for example: "Found 1 agent server instance configs WITH datasets:" Signed-off-by: chrismun <cmunley@nvidia.com>
update readme for resources servers for updated cli Signed-off-by: chrismun <cmunley@nvidia.com>
## DRAFT - Seeking Additional Input
### What's Complete
- Types of contributions and priorities
- Development setup and workflow
- DCO and commit signing (complete guide)
- CI/CD requirements and troubleshooting
- Quality control checklist for resource servers
- Common issues and troubleshooting
### What Needs Input
#### Resource Server Guidelines
- These need to be updated for OSS community users @banghuaz-nvidia
#### RL Framework Integrations
- I proposed a checklist of things, but need @bxyu-nvidia to help
---
Addresses #132
---------
Signed-off-by: Chris Wing <cwing@nvidia.com> Signed-off-by: Brian Yu <bxyu@nvidia.com> Co-authored-by: Brian Yu <bxyu@nvidia.com>
…id being mistaken as a secret Signed-off-by: Christopher Z. Cui <czcui@ucsd.edu>
…ing secret checker Signed-off-by: Christopher Z. Cui <czcui@ucsd.edu>
…omehow Signed-off-by: Christopher Z. Cui <czcui@ucsd.edu>
…atures (#155)
- Add per-record regex extraction from template_metadata.output_regex
- Add full generation rescue when regex extraction fails (partial credit)
- Add length-based threshold to skip regex for long answers (>120 chars)
- Add 3 new tests covering all new features (7/7 passing)
- Add example_openqa.jsonl with 5 diverse examples + rollouts + metrics
- Update README with new config fields and accurate defaults
- Optimize defaults for OpenQA while maintaining backward compatibility
All features only activate when template_metadata.output_regex is present, making them safe for existing datasets without template_metadata.
---------
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
…port STEM MCQA dataset (#128) Adds support for custom answer extraction in MCQA resources server via the optional `template_metadata.output_regex` field. This enables handling STEM datasets with custom prompt formats that don't match the standard grading modes. --------- Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
#193 --------- Signed-off-by: Sugam Devare <sdevare@nvidia.com>
Signed-off-by: Sugam Devare <sdevare@nvidia.com>
This was referenced Oct 27, 2025
…d in pull request Signed-off-by: Christopher Zhang Cui <czcui@ucsd.edu>
Signed-off-by: Christopher Zhang Cui <czcui@ucsd.edu>
…ating README Signed-off-by: Christopher Zhang Cui <czcui@ucsd.edu>
98c6cb9 to 96f1854
Sorry folks, this PR was mistakenly closed when one of our team force-pushed diverging refs to GitHub. We are looking to remedy this and re-open the PR.
Replacement PR opened here: #874
Contributing To NeMo-Gym — PR Answers (TALES Resource Server)
1) Necessary information
i. Corresponding dataset on the spreadsheet: N/A
ii. Description of the prompt (source + domain):
iii. Description of the environment:
iv. Description of the verifier:
v. Legal approval status:
N/A
2) Simple correctness check
i. Commands used to run the server for the uploaded data:
ii. Resulting rollout and judges (5 examples):
Please see examples under data/gpt4o_single_turn_examples
iii. Additional notes for running the server properly:
Please see the README.md under example_scripts/single_turn/ for more details
examples_clean are the stripped down input-output for ease of viewing.
examples_full contain the entire input-output. examples_full have been removed due to the response id triggering the secret detector.
3) Tests
Test files / command to run tests:
Notes on coverage / responsibilities:
The walkthroughs used for Step 4 implicitly act as a unit test for the environments. If needed, a test can be added that verifies the outcome of the walkthrough. (I want to wait until the actual multi-turn setup is working before adding this.)
4) Reward profiling
Models:
We generate 500+ prompt-response pairs for the specified model. As TALES is inherently multi-turn, not every step on a correct trajectory will return a reward. We do the following to emulate the reward distribution for single-turn domains.
Method (from README):
/no_think to the user input.
Command used:
Report the reward distribution (percent all-correct / all-incorrect / mixture):
See the outputs under data/single_turn_rollouts
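For reviewers who want to recompute the requested breakdown themselves, here is a minimal sketch of how the all-correct / all-incorrect / mixture percentages could be derived. It assumes the rewards have already been grouped per prompt into a plain dictionary; the function name `reward_distribution`, the `correct_threshold` parameter, and the example data are hypothetical and are not taken from the TALES scripts or the rollout file format.

```python
from collections import Counter


def reward_distribution(rewards_by_prompt, correct_threshold=1.0):
    """Return the percentage of prompts that are all-correct, all-incorrect, or a mixture.

    rewards_by_prompt: hypothetical mapping of prompt id -> list of rewards,
    one reward per sampled response for that prompt.
    """
    counts = Counter()
    for rewards in rewards_by_prompt.values():
        if not rewards:
            continue  # ignore prompts that have no rollouts
        correct = [r >= correct_threshold for r in rewards]
        if all(correct):
            counts["all_correct"] += 1
        elif not any(correct):
            counts["all_incorrect"] += 1
        else:
            counts["mixture"] += 1
    total = sum(counts.values())
    buckets = ("all_correct", "all_incorrect", "mixture")
    return {b: (100.0 * counts[b] / total if total else 0.0) for b in buckets}


# Made-up rewards for four prompts, two responses each (illustration only).
example = {
    "p1": [1.0, 1.0],
    "p2": [0.0, 0.0],
    "p3": [1.0, 0.0],
    "p4": [0.0, 0.0],
}
print(reward_distribution(example))
```

A prompt counts toward the mixture bucket only when its responses disagree, which is the bucket that typically carries the most training signal, so it is worth reporting separately from the two uniform buckets.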
5) Training-based correctness check (after NeMo Gym + NeMo RL integration)
N/A (Was told this isn't ready yet)
6) PR Check and Review
Reviewer (independent reproduction):
Prithviraj Ammanabrolu (pammanabrolu@nvidia.com)
Reviewer checklist:
Signing Your Work
All commits include a DCO sign-off:
git commit -s -m "Add TALES resource server integration and examples"
Pointers to examples & docs (from repo layout)
Single-turn examples & generator scripts:
resources_servers/tales/example_scripts/single_turn/
generate_single_turn_gpt_rollouts.py (5 GPT-4o examples)
generate_single_turn_rollouts.py (~500+ prompts for reward profiling)
Multi-turn drafts & notes: see example_scripts/multi_turn/ and notes in README
Sample data:
data/single_turn_rollouts/example_clean.jsonl (referenced in README)
Environment / Setup Recap
Java required (ScienceWorld only):
sudo apt-get update && sudo apt-get install -y default-jre default-jdk
Start vLLM (example):
Start NeMo Gym server for TALES: