⚡️ Speed up method `BRD.time_series` by 23% #114
Open
codeflash-ai[bot] wants to merge 1 commit into `main`
The optimized code achieves a **23% speedup** through three key optimizations:
### 1. **Vectorized `_set_action_dist` using `np.bincount`** (58% faster)
The original implementation uses a Python loop to populate the action distribution array:
```python
for i in range(self.N):
    action_dist[actions[i]] += 1
```
The optimized version replaces this with NumPy's vectorized `bincount`:
```python
counts = np.bincount(np.asarray(actions, dtype=np.intp), minlength=self.num_actions)
action_dist[:counts.shape[0]] = counts
```
**Why it's faster**: `np.bincount` is implemented in optimized C code and processes the entire array in one operation, avoiding Python loop overhead. The line profiler shows this function improved from 1.23ms to 0.515ms.
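The equivalence of the two approaches can be checked with a small standalone sketch. The function names `action_dist_loop` and `action_dist_bincount` are hypothetical and exist only for this illustration; they are not part of `brd.py`.

```python
import numpy as np

def action_dist_loop(actions, num_actions):
    """Count how many players take each action using a Python loop."""
    action_dist = np.zeros(num_actions, dtype=int)
    for a in actions:
        action_dist[a] += 1
    return action_dist

def action_dist_bincount(actions, num_actions):
    """Same counts via np.bincount; minlength pads unused actions with 0."""
    counts = np.bincount(np.asarray(actions, dtype=np.intp),
                         minlength=num_actions)
    action_dist = np.zeros(num_actions, dtype=int)
    action_dist[:counts.shape[0]] = counts
    return action_dist

actions = [0, 2, 2, 1, 2]  # action taken by each of 5 players
result_loop = action_dist_loop(actions, 4)
result_vec = action_dist_bincount(actions, 4)
print(result_vec)  # [1 1 3 0]
```

Action 3 is never played, so `minlength` is what guarantees that the result still has one entry per action.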
### 2. **Inlined `play()` logic in `time_series` loop** (Eliminates function call overhead)
The original code calls `self.play()` inside the main loop, which:
- Incurs function call overhead for 2,881 iterations
- Performs redundant `check_random_state()` calls (taking 13.3% of total time)
- Creates unnecessary intermediate variables
The optimized version inlines the critical operations directly:
```python
action_dist[action] -= 1
next_action = self.player.best_response(...)
action_dist[next_action] += 1
```
**Why it's faster**: Line profiler shows the redundant `check_random_state` calls in the original loop consumed 27.9ms (13.3% of total time). Eliminating these calls and function overhead provides significant savings.
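As a rough illustration of the inlining idea, the sketch below runs the same revision step once through a per-iteration helper call and once inlined in the loop body. Everything here is an assumption made for the example: `best_response_sketch` is a toy stand-in (argmax of expected payoffs, ties to the lowest index), not the library's `best_response`, and the payoff matrix is arbitrary.

```python
import numpy as np

def best_response_sketch(payoff_matrix, action_dist):
    # Hypothetical stand-in: action with the highest payoff against
    # the current distribution of opponents' actions.
    return int(np.argmax(payoff_matrix @ action_dist))

def revise_step(payoff_matrix, action_dist, action):
    """One revision wrapped in a function (called once per iteration)."""
    action_dist[action] -= 1
    next_action = best_response_sketch(payoff_matrix, action_dist)
    action_dist[next_action] += 1
    return action_dist

def run_with_calls(payoff_matrix, init_dist, revising_actions):
    dist = init_dist.copy()
    for action in revising_actions:
        dist = revise_step(payoff_matrix, dist, action)
    return dist

def run_inlined(payoff_matrix, init_dist, revising_actions):
    dist = init_dist.copy()
    for action in revising_actions:
        # Same three operations, written directly in the loop body.
        dist[action] -= 1
        next_action = int(np.argmax(payoff_matrix @ dist))
        dist[next_action] += 1
    return dist

payoffs = np.array([[4, 0], [3, 2]])
init = np.array([1, 3])          # 1 player on action 0, 3 on action 1
revising = [1, 1, 0]             # current action of each revising player
via_calls = run_with_calls(payoffs, init, revising)
inlined = run_inlined(payoffs, init, revising)
print(via_calls, inlined)
```

Both variants produce the same final distribution; the inlined one simply skips the extra call and any per-call setup work.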
### 3. **Incremental cumulative sum updates** (Avoids redundant recomputation)
The original code recomputes the cumulative sum on every iteration:
```python
action = np.searchsorted(action_dist.cumsum(), player_ind_seq[t], side='right')
```
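To make the role of this line concrete, here is a minimal sketch with made-up numbers showing how `np.searchsorted` on the cumulative sum maps a player index to that player's current action:

```python
import numpy as np

# 2 players on action 0, none on action 1, 3 on action 2.
action_dist = np.array([2, 0, 3])
cum = action_dist.cumsum()  # [2, 2, 5]

# side='right' returns the first position whose cumulative count
# exceeds the player index, i.e. that player's action.
chosen = [int(np.searchsorted(cum, idx, side='right')) for idx in range(5)]
print(chosen)  # [0, 0, 2, 2, 2]
```

Players 0 and 1 map to action 0 and players 2 to 4 map to action 2, so the empty action 1 is never selected.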
The optimized version maintains the cumulative sum and updates it incrementally:
```python
cum = action_dist.cumsum()  # Once before loop
# In loop:
action = np.searchsorted(cum, idx, side='right')
# Update cum based on what changed
if action != next_action:
    if action < next_action:
        cum[action:next_action] -= 1
    else:
        cum[next_action:action] += 1
```
**Why it's faster**: Computing `cumsum()` takes 13.5% of loop time in the original code. The incremental update only modifies the affected range, which is cheaper than full recomputation, especially when `action == next_action` (no update needed).
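The incremental rule can be sanity-checked against full recomputation. The sketch below is a standalone check under assumed inputs (random moves, a hypothetical helper named `update_cum`), not code from the PR: moving one player from `action` to `next_action` changes only the cumulative counts at indices between the two.

```python
import numpy as np

def update_cum(cum, action, next_action):
    """Adjust cum in place after one player moves action -> next_action."""
    if action < next_action:
        cum[action:next_action] -= 1
    elif next_action < action:
        cum[next_action:action] += 1
    # action == next_action: nothing changes

rng = np.random.default_rng(0)
action_dist = np.array([3, 1, 0, 4, 2])
cum = action_dist.cumsum()
matches = True
for _ in range(200):
    # Pick an action that at least one player currently uses.
    action = int(rng.choice(np.flatnonzero(action_dist)))
    next_action = int(rng.integers(action_dist.size))
    action_dist[action] -= 1
    action_dist[next_action] += 1
    update_cum(cum, action, next_action)
    # Compare against a full recomputation after every move.
    matches = matches and bool(np.array_equal(cum, action_dist.cumsum()))
print(matches)
```

The last entry of `cum` never changes, since the total number of players is constant.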
### Impact on Workloads
The test results show consistent speedups across all scenarios:
- **Small games** (N=2-5, ts_length=5-30): 8-21% faster
- **Large-scale** (N=100-500, ts_length=100-500): 16-33% faster, with the best gains on long time series (32.8% for ts_length=500)
The optimizations are particularly effective for:
- Long time series where loop overhead and redundant cumsum computations accumulate
- Games with many players where `_set_action_dist` is more expensive
- Scenarios where actions frequently don't change (`action == next_action`), avoiding cumsum updates entirely
📄 23% (0.23x) speedup for `BRD.time_series` in `quantecon/game_theory/brd.py`
⏱️ Runtime: 47.9 milliseconds → 38.9 milliseconds (best of 137 runs)
To edit these changes, run `git checkout codeflash/optimize-BRD.time_series-mkp5t5k4` and push.