Problem Statement
In sdk-typescript, we allow model retries - that was implemented as part of strands-agents/sdk-typescript#222
Folks would like to retry on arbitrary exceptions (see strands-agents#370) and I think we should let them.
Proposed Solution
Allow AfterModelCall hook callbacks to set a field to retry model invocation. For now it should only be allowed if an exception is thrown.
This does not replace our existing retry strategy, but makes it more flexible.
Use Case
Retrying model calls on exceptions
Implementation Requirements
Based on repository analysis and clarification discussion, here are the detailed requirements:
Technical Approach
Hook System Enhancement:
- Add writable retry_model field to AfterModelCallEvent (boolean, default False)
- Field should only be checked when exception attribute is present (not on successful calls)
- Implement _can_write() method to allow modification of retry_model field
- Multiple hooks can set/unset the field; final value is read after all callbacks execute
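The hook-side changes listed above might look roughly like this. It is a self-contained sketch, not the SDK source: HookEvent here is a simplified stand-in that rejects attribute writes unless _can_write() allows them, and stop_response is a placeholder for whatever the event carries on success.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class HookEvent:
    """Simplified stand-in for the SDK base event (an assumption of this sketch)."""

    agent: Any

    def _can_write(self, name: str) -> bool:
        # By default, callbacks may not modify any field.
        return False

    def __post_init__(self) -> None:
        # Once construction is done, every write goes through _can_write().
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name: str, value: Any) -> None:
        if getattr(self, "_frozen", False) and not self._can_write(name):
            raise AttributeError(f"Property {name} is not writable")
        object.__setattr__(self, name, value)


@dataclass
class AfterModelCallEvent(HookEvent):
    stop_response: Optional[Any] = None  # placeholder for the success payload
    exception: Optional[Exception] = None
    retry_model: bool = False  # new field; only consulted when exception is set

    def _can_write(self, name: str) -> bool:
        # retry_model is the only field callbacks are allowed to change.
        return name == "retry_model"
```

With this shape, a callback can flip event.retry_model freely, while an attempt to overwrite event.exception raises AttributeError.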
Retry Logic:
- Hooks determine their own retry parameters (count, delay, conditions)
- Hook-initiated retries are independent from existing throttle retry logic
- Existing throttle retry can be conceptually viewed as a built-in retry mechanism
- No validation on exception types - hooks decide what to retry
- No maximum retry limit enforced by framework (hooks manage their own limits)
- Hooks should implement their own delay logic (no delay parameter on event)
Integration with Existing Code:
- Modify _handle_model_execution() in src/strands/event_loop/event_loop.py
- Check retry_model field after invoking AfterModelCallEvent callbacks
- If retry_model=True and exception exists, continue retry loop
- Existing throttle retry logic should remain unchanged
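One way the integration points above could fit together is sketched below. Everything here is a stand-in written for illustration: the real _handle_model_execution() has a different signature, and the sketch assumes that a hook-requested retry takes precedence over, and does not count toward, the throttle attempts. The sleep parameter is hypothetical and exists only so the loop can be exercised without waiting.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


class ModelThrottledException(Exception):
    """Stand-in for the SDK exception of the same name."""


@dataclass
class AfterModelCallEvent:
    """Stand-in event carrying only the fields this sketch needs."""

    exception: Optional[Exception] = None
    retry_model: bool = False


Callback = Callable[[AfterModelCallEvent], Awaitable[None]]

MAX_ATTEMPTS = 6
INITIAL_DELAY = 4  # seconds
MAX_DELAY = 240  # seconds


async def handle_model_execution(
    call_model: Callable[[], Awaitable[str]],
    callbacks: List[Callback],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> str:
    throttle_attempts = 0
    delay = INITIAL_DELAY
    while True:
        try:
            result = await call_model()
        except Exception as exc:
            event = AfterModelCallEvent(exception=exc)
            for callback in callbacks:
                await callback(event)
            # Read the flag once, after every callback has run.
            if event.retry_model:
                continue
            # Built-in throttle retry, kept separate from hook retries.
            if isinstance(exc, ModelThrottledException):
                throttle_attempts += 1
                if throttle_attempts < MAX_ATTEMPTS:
                    await sleep(delay)
                    delay = min(delay * 2, MAX_DELAY)
                    continue
            raise
        # Success: callbacks still fire, but retry_model is not consulted.
        event = AfterModelCallEvent()
        for callback in callbacks:
            await callback(event)
        return result
```

Because the flag is read only inside the exception branch, a hook that sets retry_model=True after a successful call has no effect, which matches the "only when an exception is present" requirement.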
Files to Modify
src/strands/hooks/events.py - Add retry_model field and _can_write() to AfterModelCallEvent
src/strands/event_loop/event_loop.py - Integrate hook-initiated retries into _handle_model_execution()
tests/strands/agent/hooks/test_agent_events.py - Add unit tests for retry functionality
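Acceptance Criteria
- AfterModelCallEvent has a writable retry_model: bool = False field
- retry_model is only checked when exception is present (not on successful calls)
- Hooks can set retry_model=True to retry the model call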
Scope
- In Scope: Regular model calls via Agent.__call__() and Agent.stream_async()
- Out of Scope: structured_output invocations (per existing AfterModelCallEvent behavior)
Example Usage
A hook provider might implement retry logic like this:
    import asyncio

    from strands.hooks import HookProvider, HookRegistry
    from strands.hooks.events import AfterModelCallEvent


    class RetryOnServiceUnavailable(HookProvider):
        def __init__(self, max_retries=3):
            self.max_retries = max_retries
            self.retry_counts = {}

        def register_hooks(self, registry: HookRegistry) -> None:
            registry.add_callback(AfterModelCallEvent, self.handle_retry)

        async def handle_retry(self, event: AfterModelCallEvent) -> None:
            # Use an identifier that stays the same across attempts. id(event)
            # would not work if each attempt fires a fresh event object, so
            # this keys on the agent instead (one possible choice).
            request_id = id(event.agent)
            if event.exception and "ServiceUnavailable" in str(event.exception):
                count = self.retry_counts.get(request_id, 0)
                if count < self.max_retries:
                    self.retry_counts[request_id] = count + 1
                    await asyncio.sleep(2 ** count)  # Exponential backoff
                    event.retry_model = True
                    return
            # Success, an unrelated exception, or max retries reached: reset
            # the counter and let any exception propagate.
            self.retry_counts.pop(request_id, None)
Related Issues
Additional Context
The existing retry mechanism only handles ModelThrottledException with exponential backoff (MAX_ATTEMPTS=6, INITIAL_DELAY=4s, MAX_DELAY=240s). This feature enables users to implement custom retry logic for any exception type via hooks, providing the flexibility requested in issue strands-agents#370 without hardcoding specific exception types into the framework.
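For reference, the constants above imply the wait times computed below, assuming the delay starts at INITIAL_DELAY, doubles after each throttled attempt, is capped at MAX_DELAY, and that no sleep follows the final attempt. The helper name throttle_delays is hypothetical and not part of the SDK.

```python
MAX_ATTEMPTS = 6
INITIAL_DELAY = 4  # seconds
MAX_DELAY = 240  # seconds


def throttle_delays(max_attempts=MAX_ATTEMPTS, initial=INITIAL_DELAY, cap=MAX_DELAY):
    """Return the sleep before each retry, assuming doubling with a cap."""
    delays = []
    delay = initial
    for _ in range(max_attempts - 1):  # no sleep after the last attempt
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


print(throttle_delays())  # [4, 8, 16, 32, 64]
```

Under these assumptions the built-in mechanism waits at most 124 seconds in total, and the 240 second cap is never reached with six attempts. A hook that wants a different schedule has to implement its own delay, as noted under Retry Logic.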