feat (OpenQA): Add OpenQA support with per-record regex and rescue features by psgundecha-nv · Pull Request #155 · NVIDIA-NeMo/Gym

psgundecha-nv · 2025-10-14T17:34:24Z

- Add per-record regex extraction from template_metadata.output_regex
- Add full generation rescue when regex extraction fails (partial credit)
- Add length-based threshold to skip regex for long answers (>120 chars)
- Add 3 new tests covering all new features (7/7 passing)
- Add example_openqa.jsonl with 5 diverse examples + rollouts + metrics
- Update README with new config fields and accurate defaults
- Optimize defaults for OpenQA while maintaining backward compatibility

All features only activate when template_metadata.output_regex is present,
making them safe for existing datasets without template_metadata.
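
A minimal sketch of how the per-record extraction and the length threshold could fit together, for readers who have not opened the diff. This is not the code from this PR: the function name `extract_answer`, its parameters, the choice to apply the length check to the expected answer, and the choice to keep the last regex match are all assumptions made for illustration. Only `template_metadata.output_regex` and the 120-character figure come from the description above.

```python
import re
from typing import Optional


def extract_answer(
    generation: str,
    template_metadata: Optional[dict],
    expected_answer: str,
    length_threshold: int = 120,  # assumed to mirror the ">120 chars" rule
) -> Optional[str]:
    """Return the text to hand to the judge, or None if extraction failed."""
    pattern = (template_metadata or {}).get("output_regex")

    # No per-record regex: keep the pre-existing behavior and judge the
    # full generation unchanged.
    if not pattern:
        return generation

    # Assumption: the threshold is checked against the expected answer, so
    # long free-form answers are compared without regex extraction.
    if len(expected_answer) > length_threshold:
        return generation

    # Assumption: when the pattern matches several times, the last match is
    # treated as the final answer.
    matches = list(re.finditer(pattern, generation, flags=re.DOTALL))
    if not matches:
        return None  # the caller decides whether to attempt a rescue

    last = matches[-1]
    text = last.group(1) if last.re.groups else last.group(0)
    return (text or "").strip()
```

With the `output_regex` from the example record, `r"\(\((.*?)\)\)"`, a generation ending in `((42))` would yield `42`, while a record without `template_metadata` is passed through untouched.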

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

- Add check_full_generation_on_fail config to compare expected vs full generation when first pass fails
- Add reward_if_full_generation_succeeds config for partial credit (default 0.5)
- Implement fallback logic: use full generation when use_per_record_regex=true, swap otherwise
- Add length threshold optimization to skip swap check for very long answers
- Maintain 100% backward compatibility with existing behavior
- All existing tests pass

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
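
The rescue path in this commit message can be sketched roughly as follows. This is a hedged illustration rather than the server's implementation: `score_response` and the `judge_equal` callable are hypothetical stand-ins for the LLM judge call, the default of `True` for `check_full_generation_on_fail` is an assumption, and the swap branch used when `use_per_record_regex` is false is omitted. The config names and the 0.5 partial-credit default are taken from the message.

```python
from typing import Callable, Optional


def score_response(
    expected_answer: str,
    extracted: Optional[str],
    full_generation: str,
    judge_equal: Callable[[str, str], bool],  # hypothetical judge wrapper
    check_full_generation_on_fail: bool = True,  # assumed default
    reward_if_full_generation_succeeds: float = 0.5,
) -> float:
    """Full credit on the first pass, partial credit on a successful rescue."""
    # First pass: judge the extracted answer, if extraction produced one.
    if extracted is not None and judge_equal(expected_answer, extracted):
        return 1.0

    # Rescue: compare the expected answer against the whole generation and
    # award partial credit if the judge accepts it.
    if check_full_generation_on_fail and judge_equal(expected_answer, full_generation):
        return reward_if_full_generation_succeeds

    return 0.0
```

A toy judge such as `lambda expected, candidate: expected in candidate` is enough to exercise the three outcomes (1.0, 0.5, 0.0) without calling a model.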



copy-pr-bot · 2025-10-14T17:34:28Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

banghuaz-nvidia · 2025-10-17T18:16:57Z

Do we already have a ready-to-train dataset on HF? If so, can we add the dataset link and instructions to both config.yaml and the README? This applies to both MCQA and OpenQA.

banghuaz-nvidia · 2025-10-17T17:50:18Z

resources_servers/equivalence_llm_judge/data/example_openqa.jsonl

@@ -0,0 +1,5 @@
+{"responses_create_params": {"input": [{"role": "user", "content": "Your final answer (and only the answer) must be enclosed in double parentheses. Solve the problem and include necessary explanations.\n\nConsider\n\\[\n\\frac{dx}{dt}=-x-e^{-rx}+\\sqrt{2}\\alpha+\\frac{3}{2},\n\\]\nwhere \\(r>0\\).\n\nWhat is the nearest bifurcation point to 0?"}]}, "expected_answer": "\\[-\\frac{\\sqrt{2}}{4}\\]", "uuid": "6279204c-9a94-5755-96be-5c313796d3b0", "reward_profiles": [{"model_hf_path": "Qwen/Qwen3-30B-A3B", "num_generations": 3, "pass_rate": 1.0}], "template_metadata": {"template_id": "openqa_generated_153", "template_prompt": "Your final answer (and only the answer) must be enclosed in double parentheses. Solve the problem and include necessary explanations.\n\n{problem}", "output_regex": "\\(\\((.*?)\\)\\)", "weight": 0.004310344827586207, "prompt_type": "generated", "format_type": "openqa"}}


What is the weight here?

banghuaz-nvidia

Synced with Pritam on the questions there. Can confirm this works fine and is backward compatible.


psgundecha-nv added 7 commits October 7, 2025 16:03

Add per-record regex support and update judge configuration

c4ba561

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

expected_answer support

62870ff

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

Add extraction length threshold config for equivalence LLM judge

f326197

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

chore: Add *.backup to gitignore

7c2679a

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

fix: allow flexible UUID types and extra fields in LLMJudgeRunRequest

2b2d2d0

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>

psgundecha-nv force-pushed the psgundecha/rl-templates-openqa branch 2 times, most recently from ad6f55f to ec29578 Compare October 14, 2025 17:38

psgundecha-nv requested review from banghuaz-nvidia, bxyu-nvidia and soares-f October 14, 2025 17:39

psgundecha-nv changed the title ~~feat: add OpenQA support with per-record regex and rescue features~~ feat (OpenQA): Add OpenQA support with per-record regex and rescue features Oct 14, 2025

banghuaz-nvidia reviewed Oct 17, 2025


banghuaz-nvidia approved these changes Oct 21, 2025


psgundecha-nv force-pushed the psgundecha/rl-templates-openqa branch from df8fc0f to 9d4cbcf Compare October 21, 2025 21:05

psgundecha-nv merged commit eb676a5 into main Oct 21, 2025
5 checks passed

psgundecha-nv deleted the psgundecha/rl-templates-openqa branch October 21, 2025 21:17
