Skip to content

Fix card server crash due to StopIteration in cards_for_run#2952

Open
Aryanonghub wants to merge 2 commits intoNetflix:masterfrom
Aryanonghub:fix-card-server-stopiteration
Open

Fix card server crash due to StopIteration in cards_for_run#2952
Aryanonghub wants to merge 2 commits intoNetflix:masterfrom
Aryanonghub:fix-card-server-stopiteration

Conversation

@Aryanonghub
Copy link

PR Type

  • Bug fix
  • New feature
  • Core Runtime change
  • Docs / tooling
  • Refactoring

Summary

When a run contains more than max_cards (default: 20), cards_for_run raises StopIteration inside a generator.

Under Python 3.7+ (PEP 479), explicitly raising StopIteration inside a generator is converted into:

RuntimeError: generator raised StopIteration

This causes the card server to crash instead of terminating cleanly.

This PR replaces the explicit raise StopIteration with return, allowing the generator to terminate properly without raising RuntimeError.


Issue

Fixes #2946


Reproduction

Runtime: local

Commands to run:

metaflow card server <flow_name>/<run_id>

Where the run contains more than 20 cards (default max_cards).


Where evidence shows up

Card server console output.


Before (error snippet)
RuntimeError: generator raised StopIteration

Originating from:

metaflow/plugins/cards/card_server.py:197

After (evidence that fix works)
Card server runs successfully.
Stops cleanly after reaching max_cards.
No RuntimeError is raised.

Root Cause

cards_for_run is implemented as a generator:

def cards_for_run(...):
    ...
    for card in card_generator:
        curr_idx += 1
        if curr_idx >= max_cards:
            raise StopIteration
        yield task.pathspec, card

Raising StopIteration manually inside a generator violates PEP 479 semantics in Python 3.7+, which converts it into:

RuntimeError: generator raised StopIteration

This results in unintended process termination rather than graceful generator completion.


Why This Fix Is Correct

The fix replaces:

raise StopIteration

with:

return

This:

  • Properly terminates the generator
  • Preserves existing max_cards behavior
  • Avoids triggering PEP 479 conversion
  • Is minimal and scoped to the faulty logic

No external behavior changes beyond preventing the crash.


Failure Modes Considered

  1. Off-by-one behavior change
    Verified that the generator still stops at the intended max_cards limit.

  2. Caller interaction
    Callers consume via for loop, which handles generator termination correctly when return is used.

No concurrency or shared-state concerns exist in this function.


Tests

  • Unit tests added
  • Reproduction script provided (not required for this bug fix)
  • CI passes (awaiting CI)

Added:

test/unit/test_card_server.py

Test verifies:

  • cards_for_run respects max_cards
  • No RuntimeError is raised
  • Generator terminates cleanly

Non-Goals

  • No changes to card generation logic
  • No changes to max_cards semantics
  • No changes to card server API
  • No refactoring

This PR strictly fixes generator termination semantics.


AI Tool Usage

  • AI tools were used

AI tools were used for PR structure guidance and explanation drafting.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR fixes a RuntimeError: generator raised StopIteration crash in the card server when a run contains more than max_cards cards. The fix replaces raise StopIteration with return in the cards_for_run generator to comply with PEP 479, and reorders the yield/check logic to also fix a pre-existing off-by-one bug where the card at the limit boundary was dropped.

  • PEP 479 fix: raise StopIterationreturn in cards_for_run generator (card_server.py:198)
  • Off-by-one fix: yield moved before the max_cards check so the card at the boundary is included
  • New test: test_card_server.py validates max_cards is respected without RuntimeError
  • Minor style issues: Unused pytest import in test file, trailing whitespace on blank line in card_server.py

Confidence Score: 4/5

  • This PR is safe to merge — it's a targeted, correct fix for a real runtime crash with no behavioral regressions.
  • The core fix (StopIteration → return) is correct and well-understood. The yield reordering also fixes a legitimate off-by-one. The change is minimal and scoped. Minor style issues (unused import, trailing whitespace, missing EOF newline) are the only concerns preventing a 5.
  • No files require special attention — both changes are straightforward and correct.

Important Files Changed

Filename Overview
metaflow/plugins/cards/card_server.py Replaces raise StopIteration with return in the cards_for_run generator (PEP 479 fix) and reorders to yield before checking the limit, fixing an off-by-one. Minor trailing whitespace on blank line.
test/unit/test_card_server.py New unit test for cards_for_run verifying max_cards is respected without RuntimeError. Has an unused pytest import and missing trailing newline.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["cards_for_run(max_cards=N)"] --> B["Iterate steps & tasks"]
    B --> C{"card_generator is None?"}
    C -- Yes --> B
    C -- No --> D["For each card in card_generator"]
    D --> E["curr_idx += 1"]
    E --> F["yield (pathspec, card)"]
    F --> G{"curr_idx >= max_cards?"}
    G -- Yes --> H["return (generator ends cleanly)"]
    G -- No --> D
    D -- No more cards --> B
    B -- No more tasks --> I["Generator exhausted"]

    style H fill:#4CAF50,color:#fff
    style F fill:#2196F3,color:#fff
Loading

Last reviewed commit: 99b91dd

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 194 to 198
for card in card_generator:
curr_idx += 1
if curr_idx >= max_cards:
raise StopIteration
return
yield task.pathspec, card
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing off-by-one: last card is never yielded

The return (previously raise StopIteration) fires before the yield on line 198, so the card that causes curr_idx to reach max_cards is silently dropped. For example, with max_cards=20, only 19 cards are ever yielded; with max_cards=1, zero cards are yielded.

This existed before this PR too, so it's not a regression, but since you're already touching this line it may be worth fixing the ordering so the card is yielded first:

Suggested change
for card in card_generator:
curr_idx += 1
if curr_idx >= max_cards:
raise StopIteration
return
yield task.pathspec, card
for card in card_generator:
curr_idx += 1
yield task.pathspec, card
if curr_idx >= max_cards:
return

Comment on lines +50 to +61
# This should NOT raise RuntimeError
try:
list(
cards_for_run(
flow_datastore=None,
run_object=run,
only_running=False,
max_cards=1,
)
)
except RuntimeError:
pytest.fail("cards_for_run raised RuntimeError due to StopIteration") No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test doesn't assert on the returned card count

The test only verifies no RuntimeError is raised, but doesn't assert how many cards were actually returned. This means it can't catch off-by-one issues or regressions in the max_cards limit. Consider adding an assertion on the result length:

Suggested change
# This should NOT raise RuntimeError
try:
list(
cards_for_run(
flow_datastore=None,
run_object=run,
only_running=False,
max_cards=1,
)
)
except RuntimeError:
pytest.fail("cards_for_run raised RuntimeError due to StopIteration")
# This should NOT raise RuntimeError
result = list(
cards_for_run(
flow_datastore=None,
run_object=run,
only_running=False,
max_cards=1,
)
)
assert len(result) == 1, f"Expected 1 card, got {len(result)}"

)
)
except RuntimeError:
pytest.fail("cards_for_run raised RuntimeError due to StopIteration") No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline at end of file

The file is missing a trailing newline, which is a common convention and can cause noise in future diffs.

Suggested change
pytest.fail("cards_for_run raised RuntimeError due to StopIteration")
pytest.fail("cards_for_run raised RuntimeError due to StopIteration")

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@@ -0,0 +1,49 @@
import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused pytest import

pytest is imported but never used in this file. It was used in the first commit (pytest.fail(...)) but that was removed in the second commit. This will trigger lint warnings (e.g., F401 in flake8).

Suggested change
import pytest
from metaflow.plugins.cards.card_server import cards_for_run


if curr_idx >= max_cards:
return

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace on blank line

This line contains trailing whitespace (16 spaces). Minor style nit — it should be a completely blank line or removed entirely.

Suggested change

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@npow npow added the gsoc Google Summer of Code label Mar 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gsoc Google Summer of Code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Card server crashes with RuntimeError when run has 20+ cards

2 participants