⚡️ Speed up method `LocalInteraction.play` by 137% #117
Open
codeflash-ai[bot] wants to merge 1 commit into main from
The optimized code achieves a **137% speedup** (from 280ms to 118ms) by eliminating expensive sparse matrix operations in the hot path of the `_play` method.

## Key Optimization

**What changed:** The original code constructs a temporary sparse CSR matrix (`actions_matrix`) on every call to `_play`, then performs a sparse matrix multiplication (`adj_matrix @ actions_matrix`) followed by a `.toarray()` conversion to dense format. The optimized version directly traverses the CSR matrix internals using the `indptr`, `indices`, and `data` arrays, and uses `np.bincount` to aggregate weighted neighbor actions.

**Why it's faster:**

1. **Eliminates per-iteration allocations:** The original creates two temporary sparse matrices per update (one for actions, one for the product), plus a dense array conversion. Line profiler shows the sparse matrix construction takes ~15% of runtime and the matrix multiplication takes ~57%.
2. **Reduces memory operations:** The optimized version preallocates a single `opponent_act` array and fills it incrementally using fast NumPy operations (`np.bincount` with weights), avoiding the overhead of sparse matrix arithmetic and format conversions.
3. **Better cache locality:** Direct array indexing (`actions_arr[neigh]`) followed by `np.bincount` operates on contiguous memory, whereas sparse matrix multiplication involves pointer chasing through CSR structures.
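To make the change above concrete, here is a minimal sketch of the two approaches side by side. It is not the code from this PR: the function names `opponent_actions_sparse` and `opponent_actions_bincount` and their signatures are hypothetical, and the sketch assumes `adj_matrix` is a SciPy CSR matrix and `actions` holds one non-negative integer action per player.

```python
import numpy as np
from scipy import sparse


def opponent_actions_sparse(adj_matrix, actions, player_ind, num_actions):
    """Hypothetical stand-in for the original approach: build a temporary
    one-hot CSR matrix of actions, multiply, then densify."""
    n = adj_matrix.shape[0]
    # Row i has a single 1 in column actions[i].
    actions_matrix = sparse.csr_matrix(
        (np.ones(n), actions, np.arange(n + 1)),
        shape=(n, num_actions),
    )
    return (adj_matrix[player_ind] @ actions_matrix).toarray()


def opponent_actions_bincount(adj_matrix, actions, player_ind, num_actions):
    """Hypothetical stand-in for the optimized approach: walk the CSR
    arrays directly and aggregate neighbor weights per action."""
    indptr, indices, data = adj_matrix.indptr, adj_matrix.indices, adj_matrix.data
    actions_arr = np.asarray(actions)
    # Preallocate the dense result once.
    opponent_act = np.zeros((len(player_ind), num_actions))
    for k, i in enumerate(player_ind):
        start, end = indptr[i], indptr[i + 1]
        neigh = indices[start:end]  # column indices of player i's neighbors
        # Sum edge weights grouped by each neighbor's current action.
        opponent_act[k] = np.bincount(
            actions_arr[neigh], weights=data[start:end], minlength=num_actions
        )
    return opponent_act


# Small weighted 3-player network, 2 actions.
adj = sparse.csr_matrix(np.array([[0, 1, 3], [2, 0, 1], [3, 2, 0]]))
actions = [0, 1, 1]
players = [0, 1, 2]
dense_a = opponent_actions_sparse(adj, actions, players, 2)
dense_b = opponent_actions_bincount(adj, actions, players, 2)
```

Both functions should return the same dense array of weighted opponent action counts; the second avoids creating any intermediate sparse matrices.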
## Performance Characteristics

From the annotated tests, the optimization shows:

- **3-5x speedup** on small simultaneous updates (e.g., 223μs → 50μs)
- **8-9x speedup** on large asynchronous workloads (e.g., 17.2ms → 1.8ms, 33.5ms → 3.2ms)
- **Scales better with iterations:** the 500-rep test shows a 350% speedup (95.7ms → 21.3ms)

The gains are most pronounced when:

- Many iterations occur (asynchronous revision with large `num_reps`)
- Networks are sparse (fewer neighbors = less bincount overhead)
- Player updates are frequent relative to setup cost

One test case shows a 12.5% slowdown for a large N=200 simultaneous update, likely because the overhead of bincount setup doesn't amortize well when all players update once in a dense configuration, but this is an outlier among overwhelmingly positive results.

## Impact on Workloads

Since `LocalInteraction.play` is a core simulation method for evolutionary game dynamics on networks, this optimization significantly accelerates:

- Monte Carlo simulations requiring thousands of game iterations
- Parameter sweeps over network structures
- Asynchronous best-response dynamics (where individual players update sequentially)

The optimization preserves exact numerical behavior: all tests pass with identical outputs.
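For readers checking the figures, the reported percentages appear consistent with measuring speedup as `(old - new) / new`. That formula is an assumption inferred from the numbers in this description, not something stated by the tool. The helper name `speedup_percent` below is hypothetical.

```python
def speedup_percent(old_time, new_time):
    """Speedup as a percentage, assuming the (old - new) / new convention."""
    return (old_time - new_time) / new_time * 100.0


# Overall runtime reported in this PR: 280 ms -> 118 ms.
overall = speedup_percent(280.0, 118.0)

# 500-rep test: 95.7 ms -> 21.3 ms.
reps_500 = speedup_percent(95.7, 21.3)

print(f"overall: {overall:.0f}%, 500 reps: {reps_500:.0f}%")
```

Under this convention a 137% speedup means the new code takes roughly 1/2.37 of the original time. Small differences from the quoted percentages are expected because the timings shown here are rounded.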
📄 137% (1.37x) speedup for `LocalInteraction.play` in `quantecon/game_theory/localint.py`

⏱️ Runtime: 280 milliseconds → 118 milliseconds (best of 47 runs)
To edit these changes, `git checkout codeflash/optimize-LocalInteraction.play-mkp7k202` and push.