
Normalize checkpoint success flags and identifier handling to fix resume behavior #57

Merged
hemanth-asirvatham merged 1 commit into main from investigate-reset_files-behavior-in-gabriel
Feb 10, 2026


@hemanth-asirvatham
Collaborator

Motivation

  • Resuming long runs could fail to detect completed rows when the Successful column was stored as strings (e.g. "True", "true", "1"), causing already-done rows to be reprocessed.
  • Identifier type mismatches (numeric vs string) also prevented correct deduplication and resume logic.
  • Make resume/checkpoint detection robust to hand-edited or legacy CSVs so partially completed runs reliably continue from checkpoints.

Description

  • Normalize the Successful column when loading a saved CSV by coercing boolean-like values and matching common success strings ("true", "1", "yes", "completed", etc.) before building the done set; change located in src/gabriel/utils/openai_utils.py.
  • Normalize identifier handling to string form when computing done, filtering todo_pairs, and during write-time deduplication so comparisons are type-consistent; updates in src/gabriel/utils/openai_utils.py.
  • Ensure written_identifiers stores string identifiers and use astype(str) when checking batch_df to avoid reappends of already-saved rows; updates in src/gabriel/utils/openai_utils.py.
  • Add a regression test test_resume_treats_string_success_values_as_completed to tests/test_reset_files.py that seeds a checkpoint CSV with string-valued Successful entries and verifies reset_files=False resumes without reprocessing those rows.
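The normalization steps above can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in src/gabriel/utils/openai_utils.py (which, per the description, works on DataFrames via astype(str)). The helper names `is_success`, `load_done`, `filter_todo`, and `append_new_rows`, the `Identifier` column name, and the exact set of accepted success strings are all assumptions made for the example.

```python
import csv
import io

# Assumed set of accepted markers; the PR lists "true", "1", "yes",
# "completed", "etc." without giving the full list.
SUCCESS_STRINGS = {"true", "1", "yes", "completed"}


def is_success(value):
    """Return True for real booleans and for common success strings."""
    if isinstance(value, bool):
        return value
    if value is None:
        return False
    return str(value).strip().lower() in SUCCESS_STRINGS


def load_done(csv_text, id_col="Identifier", ok_col="Successful"):
    """Build the set of completed identifiers, always as strings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        str(row[id_col]).strip()
        for row in reader
        if is_success(row.get(ok_col))
    }


def filter_todo(identifiers, done):
    """Keep only identifiers not yet done, comparing in string form."""
    return [i for i in identifiers if str(i) not in done]


def append_new_rows(batch, written_identifiers, id_col="Identifier"):
    """Write-time dedup: skip rows whose string identifier was already saved."""
    fresh = []
    for row in batch:
        key = str(row[id_col])
        if key not in written_identifiers:
            written_identifiers.add(key)
            fresh.append(row)
    return fresh


# A hand-edited checkpoint with mixed-case and numeric success flags.
checkpoint = "Identifier,Successful\n1,True\n2,true\n3,False\n4,1\n5,\n"
done = load_done(checkpoint)
todo = filter_todo([1, 2, 3, 4, 5], done)  # integer ids from the caller
```

Because both sides of every comparison are converted with `str`, an integer identifier supplied by the caller matches the text read back from the CSV, which is the type mismatch the description refers to.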

Testing

  • Ran pytest -q tests/test_reset_files.py tests/test_openai_dummy.py and both test suites passed locally (6 passed across the two test files).
  • The new regression test confirms that the previously failing behavior (string success flags treated as incomplete) is resolved.


@github-actions


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@hemanth-asirvatham hemanth-asirvatham merged commit 3b0300b into main Feb 10, 2026
1 check failed
@github-actions github-actions bot locked and limited conversation to collaborators Feb 10, 2026
