⚡️ Speed up method GAMWriter.to_string by 35% #111

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-GAMWriter.to_string-mkp4ec11

codeflash-ai bot commented on Jan 22, 2026

📄 35% (0.35x) speedup for GAMWriter.to_string in quantecon/game_theory/game_converters.py

⏱️ Runtime: 3.70 milliseconds → 2.74 milliseconds (best of 26 runs)
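
The headline figure is consistent with measuring the speedup relative to the optimized runtime rather than the original one. A minimal sketch of that arithmetic, assuming this definition (the helper name `speedup_pct` is illustrative and not part of Codeflash):

```python
def speedup_pct(original_ms: float, optimized_ms: float) -> float:
    """Percent speedup relative to the optimized runtime (assumed definition)."""
    return (original_ms - optimized_ms) / optimized_ms * 100.0


# Runtimes reported above, in milliseconds
print(round(speedup_pct(3.70, 2.74)))  # 35
```

Under the other common convention, (original - optimized) / original, the same runtimes would give roughly 26%.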

📝 Explanation and details

The optimized code achieves a 35% speedup by reducing redundant calls to NumPy's expensive array2string formatting function.

Key Optimization

Original approach: Called np.array2string() separately for each player's payoff array (typically 2-54 times per game based on test data).

Optimized approach:

  1. First collects all flattened payoff arrays and checks if they share the same dtype
  2. When dtypes are homogeneous (the common case), concatenates all arrays into a single large array
  3. Calls np.array2string() just once on the concatenated result instead of N times (where N = number of players)
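
The three steps above can be sketched as follows. This is a minimal illustration and not the code in this PR: the helper names `flatten_for_gam`, `format_per_player`, and `format_batched` are hypothetical, and the axis ordering is the one described in the generated tests.

```python
import numpy as np


def flatten_for_gam(payoff_arrays):
    """Flatten each player's payoff array (axis rotation, then Fortran-order ravel)."""
    n = len(payoff_arrays)
    return [
        arr.transpose((*range(n - i, n), *range(n - i))).ravel(order="F")
        for i, arr in enumerate(payoff_arrays)
    ]


def format_per_player(flat_arrays):
    """One np.array2string call per player (the original approach)."""
    parts = []
    for flat in flat_arrays:
        text = np.array2string(flat)[1:-1]  # strip the surrounding brackets
        parts.append(" ".join(text.split()))
    return " ".join(parts)


def format_batched(flat_arrays):
    """A single np.array2string call when all dtypes match, else fall back."""
    if len({flat.dtype for flat in flat_arrays}) != 1:
        return format_per_player(flat_arrays)
    merged = np.concatenate(flat_arrays)
    return " ".join(np.array2string(merged)[1:-1].split())


p0 = np.array([[1, 2], [3, 4]])
p1 = np.array([[10, 20], [30, 40]])
flat = flatten_for_gam([p0, p1])
print(format_batched(flat))  # 1 3 2 4 10 20 30 40
```

Because `np.array2string` chooses one format for the whole array it receives and abbreviates arrays longer than its `threshold` print option (1000 elements by default), a batched call is not guaranteed to match the per-player output for every input, so comparing the two paths on representative games is a sensible check.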

Why This Works

np.array2string() has significant per-call overhead for formatting, string allocation, and type inspection. The line profiler shows the original code spending 95.4% of its time in np.array2string() calls. Batching these calls when it is safe to do so lowers that share to 91.9% of total time, and the absolute time falls as well (overall runtime drops from 3.70 ms to 2.74 ms).

The dtype homogeneity check is meant to keep formatting semantics identical: different dtypes can format differently (e.g., int vs. float precision), so the code falls back to the original per-player behavior when dtypes differ.
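
A small example of why mixing dtypes matters, using only standard NumPy calls. Concatenating integer and float arrays upcasts the integers, so integer payoffs would be printed with a trailing decimal point:

```python
import numpy as np

ints = np.array([1, 2])
floats = np.array([0.5, 4.0])

print(np.array2string(ints))  # [1 2]

# Concatenation promotes the integer values to float64
merged = np.concatenate([ints, floats])
print(merged.dtype)  # float64

tokens = np.array2string(merged)[1:-1].split()
print(tokens)  # ['1.', '2.', '0.5', '4.']
```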

Performance Characteristics

Test results show the optimization is most effective for:

  • Multi-player games: 46% faster for 3-player games (168μs → 114μs)
  • Float payoffs: 39-43% faster with floating-point values
  • Larger games: 21% faster for 6-player games with 384 total values

The single-player test shows a slight regression (4.6% slower, 73.8μs → 77.3μs): with only one payoff array there are no calls to batch, so the concatenation overhead outweighs any benefit.
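
Codeflash's own benchmarking harness is not shown in this PR. A rough way to take comparable best-of-N measurements locally, assuming the standard-library `timeit` module is adequate for the purpose (the helper `best_of` and the stand-in workload are hypothetical):

```python
import timeit


def best_of(func, runs=26, number=100):
    """Return the best per-call time in microseconds over `runs` repetitions."""
    totals = timeit.repeat(func, repeat=runs, number=number)
    return min(totals) / number * 1e6


def sample_workload():
    # Stand-in for a call such as GAMWriter.to_string(game)
    return " ".join(str(i) for i in range(64))


print(f"{best_of(sample_workload):.2f} μs per call")
```

Taking the minimum over repetitions reduces noise from other processes, which matters when the differences being compared are tens of microseconds.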

Impact on Workloads

Based on function_references, this function is called from:

  • to_gam() - The main export function for game serialization
  • Test suites validating game format conversion

Since game serialization can be performed repeatedly (e.g., batch exports, simulation loops), and multi-player games are the primary use case, this optimization meaningfully reduces latency in typical workflows where games with 2+ players are converted to GAM format.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 25 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import numpy as np  # used for constructing payoff arrays and numeric checks
# imports
import pytest  # used for our unit tests
from quantecon.game_theory.game_converters import GAMWriter

# Helper concrete classes used in tests to construct minimal game-like objects.
# These are not meant to be "fakes" of complex domain classes; they simply
# provide the attributes that GAMWriter expects (N, nums_actions, players,
# and player.payoff_array). Names avoid the banned patterns such as FakeX/StubX.
class SimplePlayer:
    def __init__(self, payoff_array):
        # payoff_array should be a numpy ndarray with appropriate shape
        self.payoff_array = payoff_array

class SimpleGame:
    def __init__(self, nums_actions, players):
        # Number of players inferred from length of nums_actions
        self.N = len(nums_actions)
        self.nums_actions = list(nums_actions)
        self.players = list(players)

def test_basic_two_player_integer_payoffs():
    # Two players, each with 2 actions -> payoff arrays are 2x2.
    # Define payoff arrays with small integers for clarity.
    p0 = SimplePlayer(np.array([[1, 2],
                                [3, 4]]))  # shape (2,2)
    p1 = SimplePlayer(np.array([[10, 20],
                                [30, 40]]))  # shape (2,2)

    # Create a simple NormalFormGame-like object
    g = SimpleGame(nums_actions=[2, 2], players=[p0, p1])

    # Produce string using the function under test
    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 118μs -> 92.4μs (28.4% faster)

    # Manually construct expected string using the exact format described:
    # First line: number of players
    # Second line: space-separated number of actions per player
    # Blank line, then payoffs for each player concatenated with single spaces
    # The payoffs ordering is determined by the code's transpose/ravel logic:
    # For player 0 (i=0): transpose(0,1) then Fortran ravel -> column-major flatten
    # For player 1 (i=1): transpose(1,0) then Fortran ravel -> effectively .T then column-major
    expected_numbers_p0 = [1, 3, 2, 4]  # column-major flatten of p0
    expected_numbers_p1 = [10, 20, 30, 40]  # result after transpose and column-major flatten of p1
    expected_tail = ' '.join(map(str, expected_numbers_p0 + expected_numbers_p1))
    expected = "2\n2 2\n\n" + expected_tail
    assert s == expected

def test_single_player_one_dimensional_array():
    # Single-player game with 3 actions and 1D payoff array.
    p0 = SimplePlayer(np.array([5, 6, 7]))  # shape (3,)
    g = SimpleGame(nums_actions=[3], players=[p0])

    # Call the function
    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 73.8μs -> 77.3μs (4.63% slower)

    # Expected string for single player: header then numbers
    expected = "1\n3\n\n5 6 7"
    assert s == expected

def test_three_player_small_multidimensional_arrays():
    # Three players, each with 2 actions -> payoff arrays of shape (2,2,2)
    # Use small distinct values to make manual verification straightforward.
    total = 8  # 2*2*2
    # Player 0 values: 0..7
    arr0 = np.arange(total).reshape((2, 2, 2))
    # Player 1 values offset by +10
    arr1 = (np.arange(total) + 10).reshape((2, 2, 2))
    # Player 2 values offset by +100
    arr2 = (np.arange(total) + 100).reshape((2, 2, 2))

    p0 = SimplePlayer(arr0)
    p1 = SimplePlayer(arr1)
    p2 = SimplePlayer(arr2)

    g = SimpleGame(nums_actions=[2, 2, 2], players=[p0, p1, p2])

    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 168μs -> 114μs (46.1% faster)

    # Extract numeric tokens from the output (everything after the blank line)
    parts = s.split("\n\n", 1)
    tail = parts[1].strip()
    tokens = tail.split()

    # Convert tokens to integers and verify that they match expected per-player flattened sequences.
    nums = list(map(int, tokens))
    # Recreate expected concatenation using the same transform but computed here in the test.
    # This is a direct computation of the intended correct ordering:
    expected_nums = []
    N = 3
    players_arrays = [arr0, arr1, arr2]
    for i, arr in enumerate(players_arrays):
        # The axes ordering used by GAMWriter:
        axes = tuple((*range(N - i, N), *range(N - i)))
        # Apply transpose and Fortran-order ravel as the writer does
        block = list(arr.transpose(axes).ravel(order='F'))
        expected_nums.extend([int(x) for x in block])

    # The parsed output must match the independently computed ordering
    assert nums == expected_nums

def test_trailing_whitespace_removed():
    # Ensure the string does not end with an extra space (the writer uses rstrip()).
    p0 = SimplePlayer(np.array([[1, 2], [3, 4]]))
    p1 = SimplePlayer(np.array([[5, 6], [7, 8]]))
    g = SimpleGame(nums_actions=[2, 2], players=[p0, p1])

    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 113μs -> 86.7μs (31.2% faster)

    # No trailing whitespace should remain
    assert s == s.rstrip()

def test_float_payoffs_preserved_and_parsable():
    # Use floating point numbers to verify formatting preserves decimal points
    p0 = SimplePlayer(np.array([[1.5, 2.5], [3.75, 4.0]]))
    p1 = SimplePlayer(np.array([[10.0, 20.25], [30.5, 40.125]]))
    g = SimpleGame(nums_actions=[2, 2], players=[p0, p1])

    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 201μs -> 144μs (39.1% faster)

    # Extract numeric tokens and convert to float for comparison
    tail = s.split("\n\n", 1)[1].strip()
    token_strings = tail.split()
    token_floats = list(map(float, token_strings))

    # Expected ordering as in the integer basic test but with floats
    expected_numbers_p0 = [1.5, 3.75, 2.5, 4.0]  # column-major flatten of p0
    expected_numbers_p1 = [10.0, 20.25, 30.5, 40.125]  # after transpose and Fortran flatten
    expected_floats = expected_numbers_p0 + expected_numbers_p1
    assert token_floats == expected_floats

def test_large_scale_counts_and_sum():
    # Large but bounded case:
    # Use N=6 players, each with 2 actions -> total per player = 2^6 = 64 values
    # Total tokens = 6 * 64 = 384 (< 1000 as required)
    N = 6
    actions = [2] * N
    total_per_player = 1
    for a in actions:
        total_per_player *= a

    players = []
    for p_idx in range(N):
        # Each player's payoff array contains a distinct range so we can detect ordering errors
        base = p_idx * total_per_player
        arr = (np.arange(total_per_player) + base).reshape(tuple(actions))
        players.append(SimplePlayer(arr))

    g = SimpleGame(nums_actions=actions, players=players)

    codeflash_output = GAMWriter.to_string(g); s = codeflash_output # 731μs -> 602μs (21.4% faster)

    # Verify headers are correct
    first_line, second_line = s.splitlines()[0], s.splitlines()[1]

    # Extract numeric tokens
    tail = s.split("\n\n", 1)[1].strip()
    tokens = tail.split()

    # Convert to integers and compute sum - this is robust to permutations of ordering per block,
    # because we know exactly how we constructed the arrays and ordering is deterministic.
    int_tokens = list(map(int, tokens))
    total_sum_tokens = sum(int_tokens)

    # Compute expected sum using arithmetic (avoid numpy sum in final assertion)
    # For each player p, the player's values are base + [0 .. total_per_player-1]
    # base = p * total_per_player
    # sum per player = total_per_player * base + sum(0..total_per_player-1)
    n = total_per_player
    sum_0_to_n_minus_1 = (n - 1) * n // 2  # arithmetic series sum
    expected_total = 0
    for p in range(N):
        base = p * n
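        expected_total += n * base + sum_0_to_n_minus_1

    # Header lines, token count, and checksum must all agree
    assert first_line == str(N)
    assert second_line == ' '.join(map(str, actions))
    assert len(int_tokens) == N * n
    assert total_sum_tokens == expected_total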
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import numpy as np
import pytest
from quantecon.game_theory import NormalFormGame
from quantecon.game_theory.game_converters import GAMWriter

def test_basic_2x2_game():
    """Test basic functionality with a simple 2x2 game."""
    # Create a 2-player game with 2 actions each
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    # Convert to string format
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 101μs -> 68.3μs (48.5% faster)

def test_output_format_structure():
    """Test that output has correct structural format."""
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 103μs -> 78.3μs (32.3% faster)
    lines = result.split('\n')

def test_negative_payoffs():
    """Test game with negative payoffs."""
    payoff_1 = np.array([[-1, -2], [-3, -4]])
    payoff_2 = np.array([[-5, -6], [-7, -8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 101μs -> 77.2μs (31.0% faster)

def test_zero_payoffs():
    """Test game with all zero payoffs."""
    payoff_1 = np.zeros((2, 2))
    payoff_2 = np.zeros((2, 2))
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 134μs -> 100μs (33.6% faster)

def test_float_payoffs():
    """Test game with floating point payoffs."""
    payoff_1 = np.array([[1.5, 2.5], [3.5, 4.5]])
    payoff_2 = np.array([[5.5, 6.5], [7.5, 8.5]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 178μs -> 124μs (43.4% faster)

def test_large_payoff_values():
    """Test game with very large payoff values."""
    payoff_1 = np.array([[1e6, 2e6], [3e6, 4e6]])
    payoff_2 = np.array([[5e6, 6e6], [7e6, 8e6]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 178μs -> 126μs (40.9% faster)

def test_small_payoff_values():
    """Test game with very small payoff values."""
    payoff_1 = np.array([[1e-6, 2e-6], [3e-6, 4e-6]])
    payoff_2 = np.array([[5e-6, 6e-6], [7e-6, 8e-6]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 183μs -> 128μs (42.3% faster)

def test_extreme_negative_payoffs():
    """Test with extremely negative payoffs."""
    payoff_1 = np.array([[-1e10, -2e10], [-3e10, -4e10]])
    payoff_2 = np.array([[-5e10, -6e10], [-7e10, -8e10]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 183μs -> 128μs (42.7% faster)

def test_mixed_sign_payoffs():
    """Test game with mixed positive and negative payoffs."""
    payoff_1 = np.array([[1, -2], [-3, 4]])
    payoff_2 = np.array([[-5, 6], [7, -8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 102μs -> 67.3μs (52.0% faster)

def test_identical_payoffs_both_players():
    """Test symmetric game where both players have identical payoffs."""
    payoff_matrix = np.array([[1, 2], [3, 4]])
    game = NormalFormGame((payoff_matrix, payoff_matrix))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 100μs -> 65.4μs (54.3% faster)

def test_no_trailing_whitespace():
    """Test that output does not have trailing whitespace."""
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 101μs -> 77.4μs (31.1% faster)

def test_no_extra_blank_lines():
    """Test that output doesn't have extra blank lines at end."""
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 100μs -> 66.0μs (53.0% faster)

def test_consistency_multiple_calls():
    """Test that calling to_string multiple times produces identical results."""
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result1 = codeflash_output # 111μs -> 73.4μs (51.4% faster)
    codeflash_output = GAMWriter.to_string(game); result2 = codeflash_output # 69.0μs -> 53.5μs (28.9% faster)
    codeflash_output = GAMWriter.to_string(game); result3 = codeflash_output # 62.3μs -> 47.5μs (31.0% faster)

def test_payoff_values_preserved():
    """Test that specific payoff values appear in output."""
    payoff_1 = np.array([[1, 2], [3, 4]])
    payoff_2 = np.array([[5, 6], [7, 8]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 106μs -> 81.3μs (31.2% faster)

def test_game_with_integer_payoffs():
    """Test that integer payoffs are handled correctly."""
    payoff_1 = np.array([[10, 20], [30, 40]], dtype=int)
    payoff_2 = np.array([[50, 60], [70, 80]], dtype=int)
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 88.7μs -> 66.4μs (33.7% faster)

def test_game_with_boolean_like_values():
    """Test game with payoffs that are 0 and 1."""
    payoff_1 = np.array([[0, 1], [1, 0]])
    payoff_2 = np.array([[1, 0], [0, 1]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 101μs -> 66.1μs (53.9% faster)

def test_scientific_notation_payoffs():
    """Test game with payoffs that may be in scientific notation."""
    payoff_1 = np.array([[1e-3, 2e-3], [3e-3, 4e-3]])
    payoff_2 = np.array([[5e-3, 6e-3], [7e-3, 8e-3]])
    game = NormalFormGame((payoff_1, payoff_2))
    
    codeflash_output = GAMWriter.to_string(game); result = codeflash_output # 180μs -> 128μs (41.2% faster)
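
The tests above rely on two conventions described in their comments: a header giving the player count and the action counts, followed by a blank line, and a per-player axis rotation applied before a Fortran-order flatten. A small pure-Python sketch of both, with hypothetical helper names `gam_axes` and `gam_header`:

```python
def gam_axes(n_players, i):
    """Axis order applied to player i's payoff array before flattening."""
    return (*range(n_players - i, n_players), *range(n_players - i))


def gam_header(nums_actions):
    """Player count, action counts, then a blank line."""
    counts = " ".join(map(str, nums_actions))
    return str(len(nums_actions)) + "\n" + counts + "\n\n"


for i in range(3):
    print(i, gam_axes(3, i))
# 0 (0, 1, 2)
# 1 (2, 0, 1)
# 2 (1, 2, 0)

print(repr(gam_header([2, 2]) + "1 3 2 4 10 20 30 40"))
# '2\n2 2\n\n1 3 2 4 10 20 30 40'
```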

To edit these changes, run `git checkout codeflash/optimize-GAMWriter.to_string-mkp4ec11`, commit, and push.

codeflash-ai bot requested a review from aseembits93 on January 22, 2026, 07:18
codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) on Jan 22, 2026