feat (OpenQA): Add OpenQA support with per-record regex and rescue features #155
Merged
psgundecha-nv merged 7 commits into main on Oct 21, 2025
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
- Add check_full_generation_on_fail config to compare expected vs full generation when first pass fails
- Add reward_if_full_generation_succeeds config for partial credit (default 0.5)
- Implement fallback logic: use full generation when use_per_record_regex=true, swap otherwise
- Add length threshold optimization to skip swap check for very long answers
- Maintain 100% backward compatibility with existing behavior
- All existing tests pass

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
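A minimal sketch of how the rescue reward described in this commit could be computed. This is not the code from the PR: the function name `score_with_rescue`, the `is_equivalent` callable, and the full reward of 1.0 for a first-pass success are assumptions made for illustration. Only the two config names and the 0.5 default come from the commit message.

```python
from typing import Callable, Optional


def score_with_rescue(
    expected: str,
    extracted: Optional[str],
    full_generation: str,
    is_equivalent: Callable[[str, str], bool],
    *,
    check_full_generation_on_fail: bool,
    reward_if_full_generation_succeeds: float = 0.5,
) -> float:
    """Hypothetical reward function illustrating the rescue path.

    `is_equivalent(expected, candidate)` stands in for whatever comparison
    the verifier really uses (an assumption, not the PR's API).
    """
    # First pass: compare the expected answer with the extracted answer.
    if extracted is not None and is_equivalent(expected, extracted):
        return 1.0  # assumed full reward

    # Rescue: the first pass failed, so optionally compare against the
    # full generation and grant partial credit on success.
    if check_full_generation_on_fail and is_equivalent(expected, full_generation):
        return reward_if_full_generation_succeeds

    return 0.0
```

With a toy comparison such as `lambda expected, candidate: expected in candidate`, a generation whose extraction failed but whose full text contains the expected answer would score 0.5 when the rescue is enabled and 0.0 when it is disabled.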
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
- Add per-record regex extraction from template_metadata.output_regex
- Add full generation rescue when regex extraction fails (partial credit)
- Add length-based threshold to skip regex for long answers (>120 chars)
- Add 3 new tests covering all new features (7/7 passing)
- Add example_openqa.jsonl with 5 diverse examples + rollouts + metrics
- Update README with new config fields and accurate defaults
- Optimize defaults for OpenQA while maintaining backward compatibility

All features only activate when template_metadata.output_regex is present, making them safe for existing datasets without template_metadata.

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
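The per-record extraction and length threshold could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `extract_answer` and `LENGTH_THRESHOLD` are hypothetical names, the threshold is assumed to apply to the length of `expected_answer`, and taking the last regex match is an assumed tie-break. The field names `template_metadata.output_regex` and `expected_answer` and the 120-character limit come from the commit message and the example record.

```python
import re
from typing import Optional

LENGTH_THRESHOLD = 120  # commit message: skip regex for long answers (>120 chars)


def extract_answer(record: dict, generation: str) -> Optional[str]:
    """Hypothetical helper: apply the record's own output_regex, if any."""
    metadata = record.get("template_metadata") or {}
    pattern = metadata.get("output_regex")
    if pattern is None:
        # No per-record regex: the feature stays inactive for this record.
        return None

    # Assumption: the threshold is measured on the expected answer.
    if len(record.get("expected_answer", "")) > LENGTH_THRESHOLD:
        return None

    # Assumption: when several matches exist, the last one is the final answer.
    matches = re.findall(pattern, generation, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


record = {
    "expected_answer": "42",
    "template_metadata": {"output_regex": r"\(\((.*?)\)\)"},
}
answer = extract_answer(record, "Working it out... the result is ((42)).")
```

A record without `template_metadata` returns `None` here, which is where a caller would keep its existing behavior.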
ad6f55f to
ec29578
Compare
Contributor
Do we already have a ready-to-train dataset on HF? If so, can we put the dataset link and instructions in both config.yaml and the README? This applies to both MCQA and OpenQA.
@@ -0,0 +1,5 @@
{"responses_create_params": {"input": [{"role": "user", "content": "Your final answer (and only the answer) must be enclosed in double parentheses. Solve the problem and include necessary explanations.\n\nConsider\n\\[\n\\frac{dx}{dt}=-x-e^{-rx}+\\sqrt{2}\\alpha+\\frac{3}{2},\n\\]\nwhere \\(r>0\\).\n\nWhat is the nearest bifurcation point to 0?"}]}, "expected_answer": "\\[-\\frac{\\sqrt{2}}{4}\\]", "uuid": "6279204c-9a94-5755-96be-5c313796d3b0", "reward_profiles": [{"model_hf_path": "Qwen/Qwen3-30B-A3B", "num_generations": 3, "pass_rate": 1.0}], "template_metadata": {"template_id": "openqa_generated_153", "template_prompt": "Your final answer (and only the answer) must be enclosed in double parentheses. Solve the problem and include necessary explanations.\n\n{problem}", "output_regex": "\\(\\((.*?)\\)\\)", "weight": 0.004310344827586207, "prompt_type": "generated", "format_type": "openqa"}}
Contributor
What is the weight here?
banghuaz-nvidia
approved these changes
Oct 21, 2025
Contributor
banghuaz-nvidia
left a comment
Synced with Pritam on the questions there. Can confirm this works fine and is backward compatible.
df8fc0f to
9d4cbcf
Compare
abubakaria56
pushed a commit
to abubakaria56/Gym
that referenced
this pull request
Mar 2, 2026
…atures (NVIDIA-NeMo#155)

- Add per-record regex extraction from template_metadata.output_regex
- Add full generation rescue when regex extraction fails (partial credit)
- Add length-based threshold to skip regex for long answers (>120 chars)
- Add 3 new tests covering all new features (7/7 passing)
- Add example_openqa.jsonl with 5 diverse examples + rollouts + metrics
- Update README with new config fields and accurate defaults
- Optimize defaults for OpenQA while maintaining backward compatibility

All features only activate when template_metadata.output_regex is present, making them safe for existing datasets without template_metadata.

---------

Signed-off-by: Pritam Gundecha <pgundecha@nvidia.com>
All features only activate when template_metadata.output_regex is present,
making them safe for existing datasets without template_metadata.