Closed
31 commits
72d8eaf
Add chemistry_direct RL environment
danecor Mar 13, 2026
c348fd0
Use negative absolute error reward for float properties
danecor Mar 15, 2026
616abcf
Rename chemistry_direct → rdkit_chemistry, branch → dane/rdkit-chemistry
danecor Mar 15, 2026
11d8b95
Shorten license.
danecor Mar 16, 2026
bf7af8b
Config for gpt oss 20b low.
danecor Mar 16, 2026
6efa376
Updates to include mcp tools.
danecor Mar 17, 2026
9139a43
Remove gpt-oss-20b-reasoning-low.yaml from branch
danecor Mar 17, 2026
9c678e3
Update rdkit_chemistry config and README for mcp-python tool-use support
danecor Mar 18, 2026
1ad524d
Update example data with new samples and use_box_format field
danecor Mar 18, 2026
84f3962
Add \boxed{} answer extraction and compute_metrics
danecor Mar 18, 2026
41eb022
New prompt format, strict answer extraction.
danecor Mar 18, 2026
401d964
Updates example jsonl.
danecor Mar 18, 2026
81e2d98
Update examples.
danecor Mar 19, 2026
f829caf
Parse both content and tool calls correctly.
danecor Mar 19, 2026
a4c17ea
Add sandbox_launcher for auto-starting sandbox from rdkit_chemistry s…
danecor Mar 19, 2026
c73befc
Update rdkit-chemistry-gym submodule to array-based rollout workflow
danecor Mar 19, 2026
adac43b
Update to RDKit reward function (optional)
mlgill Mar 20, 2026
dbb1c60
Merge branch 'michelle/rdkit-chemistry_new_rewards' into 'dane/rdkit-…
danecor Mar 20, 2026
d53370a
Revert defensive behavior in app.py
danecor Mar 20, 2026
f54cdab
Revert app.py
danecor Mar 20, 2026
100d7e8
New example file.
danecor Mar 21, 2026
315d85a
Merge branch 'dane/rdkit-chemistry' of https://gitlab-master.nvidia.c…
danecor Mar 23, 2026
44c6ab7
Fix closing parentheses.
danecor Mar 23, 2026
1c6b6b8
Temp: retry-then-continue logic in core nemo-gym code.
danecor Mar 23, 2026
63c184e
try fix duplicated usage counting and
bxyu-nvidia Mar 23, 2026
3b63617
empty commit for qa
bxyu-nvidia Mar 23, 2026
8980bba
Revert rollout retry experiment.
danecor Mar 24, 2026
8ceee69
Merge branch 'bxyu/fix-938' into dane/rdkit-chemistry
danecor Mar 24, 2026
c2b3343
Revert "Merge branch 'bxyu/fix-938' into dane/rdkit-chemistry"
danecor Mar 24, 2026
3b46072
try fix duplicated usage counting and
bxyu-nvidia Mar 23, 2026
aae3072
Update rdkit-chemistry-gym submodule.
danecor Mar 24, 2026
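
Two of the commits above (c348fd0 and 84f3962) describe a negative absolute error reward for float properties and \boxed{} answer extraction. The PR's implementation is not rendered in this view, so the following is only a minimal sketch of what those two ideas might look like together. The names `extract_boxed` and `float_reward`, the regex, and the choice to return `None` for unparseable answers are all assumptions, not code from the branch.

```python
import re
from typing import Optional

# Hypothetical pattern: matches \boxed{...} with no nested braces inside.
BOXED_RE = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last boxed answer in `text`, or None."""
    matches = BOXED_RE.findall(text)
    return matches[-1].strip() if matches else None


def float_reward(response: str, target: float) -> Optional[float]:
    """Negative absolute error between the boxed answer and the target.

    Returns None when no boxed answer is found or it is not a float;
    how the real environment scores that case is not shown in the PR view.
    """
    answer = extract_boxed(response)
    if answer is None:
        return None
    try:
        return -abs(float(answer) - target)
    except ValueError:
        return None


reward = float_reward(r"The predicted value is \boxed{2.5}", target=3.0)
```

With this shape, a perfect prediction scores 0 and the reward decreases linearly as the prediction moves away from the target.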
122 changes: 55 additions & 67 deletions README.md

Large diffs are not rendered by default.

21 changes: 4 additions & 17 deletions benchmarks/aime24/config.yaml
@@ -2,20 +2,7 @@
 config_paths:
 - resources_servers/math_with_judge/configs/math_with_judge.yaml
 
-# We use `_inherit_from` directives to inherit from and not use the generic config above to ensure this benchmark config is isolated.
-aime24_math_with_judge_resources_server:
-  _inherit_from: math_with_judge
-
-aime24_math_with_judge_simple_agent:
-  _inherit_from: math_with_judge_simple_agent
-  responses_api_agents:
-    simple_agent:
-      resources_server:
-        name: aime24_math_with_judge_resources_server
-      datasets:
-      - name: aime24
-        type: benchmark
-        jsonl_fpath: benchmarks/aime24/data/aime24_benchmark.jsonl
-        prompt_config: benchmarks/aime24/prompts/default.yaml
-        prepare_script: benchmarks/aime24/prepare.py
-        num_repeats: 32
+# Rollout collection defaults — picked up directly by RolloutCollectionConfig
+agent_name: math_with_judge_simple_agent
+input_jsonl_fpath: benchmarks/aime24/data/aime24_validation.jsonl
+num_repeats: 32
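
The new config replaces the inherited agent and dataset blocks with three flat keys. As a rough illustration of how such keys could drive rollout collection, the sketch below expands each input row `num_repeats` times. `RolloutDefaults` and `expand_rows` are hypothetical stand-ins written for this example; the real `RolloutCollectionConfig` is not shown in this diff and may behave differently.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class RolloutDefaults:
    """Hypothetical mirror of the three top-level keys in config.yaml.

    This is NOT the real RolloutCollectionConfig, only an illustration.
    """
    agent_name: str
    input_jsonl_fpath: str
    num_repeats: int = 1


def expand_rows(rows: list[dict], cfg: RolloutDefaults) -> Iterator[dict]:
    """Yield each input row cfg.num_repeats times, tagged with a repeat index."""
    for row in rows:
        for repeat_idx in range(cfg.num_repeats):
            yield {**row, "agent_name": cfg.agent_name, "repeat_idx": repeat_idx}


cfg = RolloutDefaults(
    agent_name="math_with_judge_simple_agent",
    input_jsonl_fpath="benchmarks/aime24/data/aime24_validation.jsonl",
    num_repeats=32,
)

# Two placeholder rows stand in for the contents of the JSONL file.
tasks = list(expand_rows([{"question": "q1"}, {"question": "q2"}], cfg))
```

Under this assumption, two input rows with `num_repeats: 32` produce 64 rollout tasks.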
37 changes: 0 additions & 37 deletions benchmarks/aime24/data/aime24_benchmark_metrics.json

This file was deleted.

2 changes: 1 addition & 1 deletion benchmarks/aime24/prepare.py
@@ -26,7 +26,7 @@
 
 BENCHMARK_DIR = Path(__file__).parent
 DATA_DIR = BENCHMARK_DIR / "data"
-OUTPUT_FPATH = DATA_DIR / "aime24_benchmark.jsonl"
+OUTPUT_FPATH = DATA_DIR / "aime24_validation.jsonl"
 
 # HuggingFace dataset for AIME 2024
 HF_REPO_ID = "HuggingFaceH4/aime_2024"
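
The rest of prepare.py is collapsed in this view. As a sketch of the pattern the renamed constant implies, the example below writes rows to a file named `aime24_validation.jsonl` under a `data` directory. The `write_jsonl` helper, the temporary directory, and the placeholder row (including its field names) are assumptions made for illustration; they are not taken from the script.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(rows: list[dict], out_fpath: Path) -> int:
    """Write one JSON object per line and return the number of rows written."""
    out_fpath.parent.mkdir(parents=True, exist_ok=True)
    with out_fpath.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return len(rows)


# A temporary directory stands in for BENCHMARK_DIR so the sketch has no side effects.
data_dir = Path(tempfile.mkdtemp()) / "data"
output_fpath = data_dir / "aime24_validation.jsonl"

# Placeholder row; the real script would build rows from the HuggingFace dataset.
n_written = write_jsonl([{"question": "placeholder", "expected_answer": "0"}], output_fpath)
```

Because the config's `input_jsonl_fpath` now points at `aime24_validation.jsonl`, the output name in the prepare script and the path in the config have to change together.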
2 changes: 0 additions & 2 deletions benchmarks/aime24/prompts/default.yaml

This file was deleted.

14 changes: 0 additions & 14 deletions benchmarks/aime25/__init__.py

This file was deleted.

21 changes: 0 additions & 21 deletions benchmarks/aime25/config.yaml

This file was deleted.

1 change: 0 additions & 1 deletion benchmarks/aime25/data/.gitignore

This file was deleted.

37 changes: 0 additions & 37 deletions benchmarks/aime25/data/aime25_benchmark_metrics.json

This file was deleted.

57 changes: 0 additions & 57 deletions benchmarks/aime25/prepare.py

This file was deleted.

2 changes: 0 additions & 2 deletions benchmarks/aime25/prompts/default.yaml

This file was deleted.

25 changes: 0 additions & 25 deletions benchmarks/gpqa/README.md

This file was deleted.

14 changes: 0 additions & 14 deletions benchmarks/gpqa/__init__.py

This file was deleted.

25 changes: 0 additions & 25 deletions benchmarks/gpqa/config.yaml

This file was deleted.

1 change: 0 additions & 1 deletion benchmarks/gpqa/data/.gitignore

This file was deleted.

37 changes: 0 additions & 37 deletions benchmarks/gpqa/data/gpqa_diamond_benchmark_metrics.json

This file was deleted.
