How can we resume a run from an existing output directory, so that a rollout over a large dataset can continue even if some errors occur during the run?
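
To make the question concrete, here is a minimal sketch of the kind of resume behavior being asked about. It is not this project's API. It assumes each dataset item has a unique id and that each result is written as one JSON file into the output directory. The names `rollout_with_resume`, `run_one`, and the `<item_id>.json` layout are hypothetical.

```python
import json
from pathlib import Path


def rollout_with_resume(items, run_one, output_dir):
    """Process (item_id, payload) pairs, skipping ids already saved in output_dir.

    Hypothetical helper: `run_one` stands in for whatever performs a single
    rollout and returns a JSON-serializable result.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for item_id, payload in items:
        target = out / f"{item_id}.json"
        if target.exists():
            continue  # already finished in an earlier run
        try:
            result = run_one(payload)
        except Exception as exc:
            # Record the failure and keep going; the item is retried on the next run.
            failed.append((item_id, repr(exc)))
            continue
        # Write to a temporary file first, then rename, so an interrupted
        # write never leaves a partial file that looks like a finished item.
        tmp = out / f"{item_id}.json.tmp"
        tmp.write_text(json.dumps(result))
        tmp.replace(target)
    return failed
```

Calling the function again with the same `output_dir` only processes the items that have no result file yet, so a crashed or partially failed run can be continued by simply rerunning it.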