feat: Improve coordination with improvements; improve and expand multimedia generation (#964)
Henry-811 merged 11 commits into dev/v0.1.58 from
Conversation
… add continuation support for generate img; fix eval criteria generation to include correctness hopefully fixing rendering
…re than once if the first time was an error; eval criteria improvement; grok image/video gen
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
📝 Walkthrough
This PR introduces session-scoped logging architecture with ContextVar isolation for concurrent runs, new CLI mode flags (single-agent, coordination-mode, quick, personas), enhanced evaluation criteria with verification guidance, improved subagent orchestration with context path handling and parse_at_references support, new TUI modals for evaluation criteria and subagent context viewing, plateau detection for checklist criteria, and extensive skill documentation for multimedia capabilities.

Changes
Sequence Diagram(s): omitted — changes are broad document/code additions; no single new multi-component sequential control flow requires a diagram per generation rules.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 4
❌ Failed checks (3 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
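The session-scoped logging with ContextVar isolation mentioned in the walkthrough can be sketched as follows. This is a minimal illustration of the general pattern, not MassGen's actual API: the names `session_id_var`, `get_session_logger`, and `run_session` are hypothetical.

```python
# Sketch: per-session logging context that stays isolated across concurrent runs.
# All names here are illustrative, not MassGen's real identifiers.
import asyncio
import logging
from contextvars import ContextVar

# Each asyncio task gets its own copy of the context, so concurrent runs
# cannot see each other's session id.
session_id_var: ContextVar[str] = ContextVar("session_id", default="no-session")


def get_session_logger() -> logging.LoggerAdapter:
    """Return a logger whose records carry the current session id."""
    base = logging.getLogger("massgen")
    return logging.LoggerAdapter(base, {"session": session_id_var.get()})


async def run_session(session_id: str) -> str:
    # Setting the ContextVar inside the task affects only this task's context.
    session_id_var.set(session_id)
    await asyncio.sleep(0)  # yield so the two sessions interleave
    return get_session_logger().extra["session"]


async def main() -> list[str]:
    # Two concurrent "runs"; each sees only its own session id.
    return await asyncio.gather(run_session("run-a"), run_session("run-b"))


print(asyncio.run(main()))  # ['run-a', 'run-b']
```

Because `asyncio.gather` wraps each coroutine in a task with a copied context, the `set()` calls do not leak between runs, and the module-level default is untouched after the event loop exits.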
Actionable comments posted: 19
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
massgen/filesystem_manager/_filesystem_manager.py (1)
1897-1906: ⚠️ Potential issue | 🟡 Minor
Update `save_snapshot` docstring for the new parameter.
`preserve_existing_snapshot` was added to the signature but is missing from `Args`. As per coding guidelines, `**/*.py`: For new or changed functions, include Google-style docstrings.
📝 Minimal docstring fix
```diff
         Args:
             timestamp: Optional timestamp to use for the snapshot directory (if not provided, generates one)
             is_final: If True, save as final snapshot for presentation
+            preserve_existing_snapshot: If True, do not overwrite non-empty snapshot_storage
+                (used for interrupted saves to preserve the last submitted answer snapshot).
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/filesystem_manager/_filesystem_manager.py` around lines 1897-1906: The docstring for save_snapshot is missing the new parameter preserve_existing_snapshot; update the Google-style Args section of the save_snapshot method to document preserve_existing_snapshot (type: bool) and describe its behavior (when True, do not overwrite or rotate existing snapshot files in snapshot_storage/log dirs; when False, proceed with normal pruning/rotation), and include default value and effect on is_final if applicable so the docstring fully matches the function signature and behavior.

massgen/config_builder.py (1)
4004-4013: ⚠️ Potential issue | 🟡 Minor
Document the new `quickstart_config_filename` parameter in `run_quickstart`.
quickstart_config_filenameparameter inrun_quickstart.The function signature changed, but the docstring does not describe the new argument.
📝 Suggested docstring update
```diff
 def run_quickstart(self, quickstart_config_filename: str | None = None) -> tuple | None:
     """Run simplified quickstart flow - just agents count and backend/model for each.
@@
     their models.
+
+    Args:
+        quickstart_config_filename: Optional quickstart config filename. When set,
+            the config is saved under the project `.massgen/` directory.

     Returns:
         Tuple of (filepath, question, interface_choice, install_skills_now) or None if cancelled
     """
```

As per coding guidelines:
`**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/config_builder.py` around lines 4004-4013: Add a Google-style docstring entry for the new quickstart_config_filename parameter on run_quickstart: describe its type (str | None), default (None), what it represents (path/filename of an existing quickstart config to load or pre-fill fields), and the behavior when provided (how it alters the quickstart flow / whether it is validated or causes an early return); keep the existing Returns section unchanged but ensure the Args section lists quickstart_config_filename alongside any other parameters and documents expected formats and side-effects.

massgen/tests/integration/test_subagent_message_e2e.py (1)
1-16: ⚠️ Potential issue | 🟡 Minor
Add the integration marker for this module.
This file is in `massgen/tests/integration/` but is not marked `@pytest.mark.integration`, which weakens opt-in gating for integration runs.
✅ Minimal marker addition
```diff
 import pytest

+pytestmark = pytest.mark.integration
+
 from massgen.mcp_tools.hooks import HumanInputHook, RuntimeInboxPoller
```

As per coding guidelines
`massgen/tests/**/*.py`: “Use pytest markers: `@pytest.mark.integration` (opt-in: --run-integration), `@pytest.mark.live_api`, `@pytest.mark.expensive`, `@pytest.mark.docker`, `@pytest.mark.asyncio`”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/integration/test_subagent_message_e2e.py` around lines 1 - 16, This test module lacks the integration marker; add a module-level pytest integration marker by defining pytestmark = pytest.mark.integration (or decorating tests with `@pytest.mark.integration`) at top of massgen/tests/integration/test_subagent_message_e2e.py so the test is opt-in for integration runs; reference the module-level variable name pytestmark or the decorator form and place it near the existing imports before the test definitions that use HumanInputHook, RuntimeInboxPoller, SubagentManager, SubagentConfig, and SubagentState.massgen/system_prompt_sections.py (1)
3259-3289: ⚠️ Potential issue | 🔴 Critical
Initialize `self.builder_enabled` before checklist-gated use.
`DecompositionSection._build_decision_block()` passes `builder_enabled=self.builder_enabled` (line 3340), but `DecompositionSection.__init__` never defines `self.builder_enabled`. This will raise `AttributeError` at runtime on the checklist-gated decomposition path.
🛠️ Proposed fix
```diff
 class DecompositionSection(SystemPromptSection):
@@
     def __init__(
         self,
         subtask: str | None = None,
         voting_threshold: int | None = None,
         voting_sensitivity: str = "roi",
         answers_used: int = 0,
         answer_cap: int | None = None,
         checklist_require_gap_report: bool = True,
         gap_report_mode: str = "changedoc",
         has_changedoc: bool = False,
         custom_checklist_items: list[str] | None = None,
         item_categories: dict[str, str] | None = None,
         item_verify_by: dict[str, str] | None = None,
+        builder_enabled: bool = True,
     ):
@@
         self.custom_checklist_items = custom_checklist_items
         self.item_categories = item_categories
         self.item_verify_by = item_verify_by
+        self.builder_enabled = builder_enabled
```

Also applies to: 3334-3341
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/system_prompt_sections.py` around lines 3259 - 3289, DecompositionSection.__init__ never sets self.builder_enabled but DecompositionSection._build_decision_block passes builder_enabled=self.builder_enabled, causing an AttributeError; add initialization of self.builder_enabled (e.g., default False or configurable via an __init__ parameter) inside DecompositionSection.__init__ so the attribute always exists, and if desired add an __init__ arg (builder_enabled) and assign it to self.builder_enabled to allow configuration.
🟡 Minor comments (27)
AGENTS.md-185-185 (1)
185-185: ⚠️ Potential issue | 🟡 Minor
Update stale identifier and clarify function location.
Line 185 references `_metadata_dirs` inside `FilesystemManager.save_snapshot()`'s helper, but the actual implementation uses the module-level `_WORKSPACE_METADATA_DIRS` constant (defined at line 126 in `massgen/filesystem_manager/_filesystem_manager.py`) and a module-level `has_meaningful_content()` function (defined at line 129) called from within `save_snapshot()`. Update both AGENTS.md and CLAUDE.md to reference the correct identifiers and clarify that `has_meaningful_content()` is module-level.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AGENTS.md` at line 185: Update the docs to reference the actual module-level identifiers used in massgen/filesystem_manager/_filesystem_manager.py: replace `_metadata_dirs` with `_WORKSPACE_METADATA_DIRS` and state that `has_meaningful_content()` is a module-level function (called from `FilesystemManager.save_snapshot()`), and adjust the explanatory text in AGENTS.md and CLAUDE.md to mention the constant `_WORKSPACE_METADATA_DIRS` and module-level `has_meaningful_content()` so readers can find them in that file.

massgen/tests/test_evaluation_criteria_generator.py-329-333 (1)
329-333: ⚠️ Potential issue | 🟡 Minor
Silence ARG002 in fake subagent manager methods.
`*args`/`**kwargs` are intentionally broad in mocks, but currently produce lint warnings for unused parameters. Rename unused bindings to underscored names.
🧹 Suggested fix
```diff
-    def __init__(self, *args, **kwargs):
+    def __init__(self, *_args, **kwargs):
         captured["coordination"] = kwargs["subagent_orchestrator_config"].coordination

-    async def spawn_subagent(self, **kwargs):
+    async def spawn_subagent(self, **_kwargs):
         return SimpleNamespace(
             success=True,
             answer=json.dumps(
```

Also applies to: 376-380
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_evaluation_criteria_generator.py` around lines 329-333: Rename unused broad mock parameters to underscored names to silence ARG002: change the __init__ signature from def __init__(self, *args, **kwargs) to use def __init__(self, *_args, **_kwargs) (and similarly rename def spawn_subagent(self, **kwargs) to def spawn_subagent(self, **_kwargs)) in the fake subagent manager used in test_evaluation_criteria_generator (references: the __init__ method that sets captured["coordination"] and the spawn_subagent method returning SimpleNamespace), and apply the same underscore-renaming for the other mock methods at the 376-380 region.

massgen/skills/video-generation/references/production.md-22-28 (1)
22-28: ⚠️ Potential issue | 🟡 Minor
Specify a language for the fenced shot-list example.
The code fence is missing a language identifier (MD040). Use `text` for plain shot plans.
📝 Suggested fix
````diff
-```
+```text
 Shot 1 (0-6s): Wide establishing shot — hydrothermal vents, bioluminescent plankton, slow camera glide
 Shot 2 (6-12s): Close-medium tracking — comb jelly, rainbow cilia, soft bioluminescent light
 Shot 3 (12-18s): Medium dolly — dumbo octopus flapping fins over seafloor
 Shot 4 (18-24s): Dramatic reveal — anglerfish lure emerging from darkness
 Shot 5 (24-30s): Wide pullback — multiple bioluminescent organisms, awe-inspiring closing
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/skills/video-generation/references/production.md` around lines 22-28: The fenced shot-list block containing "Shot 1...Shot 5" is missing a language identifier; update the triple-backtick fence for that shot-list example to use the `text` language so the markdown linter MD040 is satisfied while preserving the existing shot lines (Shot 1, Shot 2, Shot 3, Shot 4, Shot 5).

massgen/tests/test_evaluation_criteria_generator.py-258-267 (1)
258-267: ⚠️ Potential issue | 🟡 Minor
Add a Google-style docstring to the new prompt helper.
The helper is newly introduced/changed and should be documented with Args/Returns to match repository docstring standards.
📝 Suggested fix

```diff
 def _make_prompt(self, task="Build a website", has_changedoc=False, min_criteria=4, max_criteria=10):
+    """Build a criteria-generation prompt for test assertions.
+
+    Args:
+        task: Task text to embed in the prompt.
+        has_changedoc: Whether changedoc instructions should be included.
+        min_criteria: Minimum criteria count.
+        max_criteria: Maximum criteria count.
+
+    Returns:
+        The generated prompt string.
+    """
     from massgen.evaluation_criteria_generator import EvaluationCriteriaGenerator
```

As per coding guidelines `**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_evaluation_criteria_generator.py` around lines 258-267: Add a Google-style docstring to the test helper function _make_prompt describing its purpose, parameters, and return value: document Args for task (str), has_changedoc (bool), min_criteria (int), max_criteria (int) and what type/value is returned (the prompt string returned by EvaluationCriteriaGenerator._build_generation_prompt); place the docstring immediately under the def _make_prompt(...) declaration so it follows repository docstring standards.

massgen/skills/audio-generation/SKILL.md-63-66 (1)
63-66: ⚠️ Potential issue | 🟡 Minor
Add blank lines around the ElevenLabs voice table.
This table currently violates markdownlint MD058 (`blanks-around-tables`).
📝 Suggested fix

```diff
 **ElevenLabs** (top voices):
+
 | Voice | Character |
 |-------|-----------|
 | Rachel | Warm, conversational female |
 | Sarah | Clear, professional female |
 | Josh | Friendly male |
 | Adam | Deep, authoritative male |
 | Emily | Bright, energetic female |
+
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/skills/audio-generation/SKILL.md` around lines 63-66: The markdown table under the "ElevenLabs (top voices):" heading (the table starting with "| Voice | Character |") needs blank lines inserted before and after it to satisfy markdownlint MD058; update SKILL.md by adding an empty line above the "**ElevenLabs** (top voices):" table block (between the preceding paragraph/heading and the table) and another empty line after the table to separate it from the following content.

massgen/tests/test_claude_code_context_sharing.py-50-51 (1)
50-51: ⚠️ Potential issue | 🟡 Minor
Make the mock snapshot API fully observable and documented.
The new `preserve_existing_snapshot` parameter is currently ignored in recorded calls, so tests can’t assert propagation of that flag. Also, the changed function should include a Google-style docstring.
♻️ Suggested update

```diff
-        self.save_snapshot_calls: list[tuple[str | None, bool]] = []
+        self.save_snapshot_calls: list[tuple[str | None, bool, bool]] = []

-    async def save_snapshot(self, timestamp=None, is_final=False, preserve_existing_snapshot=False) -> None:
-        self.save_snapshot_calls.append((timestamp, is_final))
+    async def save_snapshot(
+        self,
+        timestamp: str | None = None,
+        is_final: bool = False,
+        preserve_existing_snapshot: bool = False,
+    ) -> None:
+        """Save a snapshot of the mock workspace.
+
+        Args:
+            timestamp: Optional snapshot timestamp directory.
+            is_final: Whether this is a final snapshot.
+            preserve_existing_snapshot: Whether existing snapshot content should be preserved.
+        """
+        self.save_snapshot_calls.append((timestamp, is_final, preserve_existing_snapshot))
```

As per coding guidelines `**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_claude_code_context_sharing.py` around lines 50-51: The mock method save_snapshot currently ignores the new preserve_existing_snapshot parameter and lacks documentation; update the save_snapshot method to record all parameters by appending (timestamp, is_final, preserve_existing_snapshot) to self.save_snapshot_calls so tests can assert the flag propagation, and add a Google-style docstring above save_snapshot describing parameters (timestamp, is_final, preserve_existing_snapshot) and return type for clarity and adherence to guidelines.

massgen/agent_config.py-289-297 (1)
289-297: ⚠️ Potential issue | 🟡 Minor
Use a Google-style docstring for the new validator.
Please expand this new method docstring to include `Args` and `Raises` sections for consistency with repo docstring standards.
📝 Suggested fix

```diff
 def _validate_pre_collab_voting_threshold(self):
-    """Validate optional pre-collab checklist threshold override."""
+    """Validate optional pre-collab checklist threshold override.
+
+    Args:
+        None.
+
+    Raises:
+        ValueError: If `pre_collab_voting_threshold` is not None and not a positive integer.
+    """
     threshold = self.pre_collab_voting_threshold
```

As per coding guidelines `**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/agent_config.py` around lines 289-297: Update the docstring for the _validate_pre_collab_voting_threshold method to a Google-style docstring that includes an Args section describing the parameter (self.pre_collab_voting_threshold: optional int or None) and a Raises section documenting the ValueError raised when the threshold is not a positive integer; keep the one-line summary and ensure the Args and Raises clearly state accepted types and the error condition so reviewers can see the contract for _validate_pre_collab_voting_threshold.

massgen/skills/video-generation/references/editing.md-68-75 (1)
68-75: ⚠️ Potential issue | 🟡 Minor
Clarify the “8 seconds” vs “up to 7 seconds” extension behavior.
These two bullets currently read as conflicting. Add one short explanation (for example, net timeline growth vs generated clip length) so users can reason about expected output duration. Based on learnings: Documentation must be consistent with implementation, concise, and usable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/skills/video-generation/references/editing.md` around lines 68-75: The two bullets "Duration is always 8 seconds per extension" and "Each extension appends up to 7 seconds to the timeline" conflict; update the doc to clarify that each generated extension clip is 8 seconds long but the timeline only grows by up to 7 seconds because extensions overlap the last second of the previous clip (so generated clip length = 8s, net timeline growth per extension = +7s); replace or append a single short sentence explaining this net-growth vs clip-length behavior so readers can predict final duration.

massgen/tests/test_grok_multimedia_generation.py-23-758 (1)
23-758: ⚠️ Potential issue | 🟡 Minor
Apply Google-style docstrings for new test functions.
The new test methods have summary-only docstrings; repository rules require Google-style docstrings for changed Python functions. As per coding guidelines: `**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_grok_multimedia_generation.py` around lines 23-758: The test methods in massgen/tests/test_grok_multimedia_generation.py (e.g., test_grok_image_returns_error_when_no_api_key, test_grok_image_calls_xai_sdk_image_sample, test_grok_image_saves_base64_to_output_path, test_grok_image_returns_continuation_id, test_grok_image_continuation_passes_image_url, test_grok_image_continuation_not_found_returns_error, test_grok_image_aspect_ratio_passed_through, test_grok_image_size_always_maps_to_1k, test_grok_image_to_image_passes_image_url, and the Grok video tests like test_grok_video_returns_error_when_no_api_key, test_grok_video_calls_xai_sdk_video_generate, etc.) use summary-only docstrings; update each of these new/changed test functions to use Google-style docstrings (one-line summary, blank line, and explicit sections such as Args: and Returns: or Raises: as appropriate) following the repo guideline for Python files so they conform to the required docstring format.

massgen/config_builder.py-56-87 (1)
56-87: ⚠️ Potential issue | 🟡 Minor
Use Google-style docstrings for newly added helper functions.
`normalize_quickstart_config_filename` and `build_quickstart_config_path` are new functions, but their docstrings currently omit structured `Args`/`Returns` sections.
📝 Suggested docstring update

```diff
 def normalize_quickstart_config_filename(
     filename: str | None,
     default: str = DEFAULT_QUICKSTART_CONFIG_FILENAME,
 ) -> str:
-    """Normalize a quickstart config filename.
-
-    - Keeps only the basename so quickstart controls the target directory.
-    - Falls back to ``default`` when value is empty/invalid.
-    - Appends ``.yaml`` when no extension is provided.
-    """
+    """Normalize a quickstart config filename.
+
+    Args:
+        filename: Candidate filename from user or CLI input.
+        default: Fallback filename when the candidate is empty or invalid.
+
+    Returns:
+        A sanitized basename. Adds ``.yaml`` when no extension is provided.
+    """
@@
 def build_quickstart_config_path(
@@
 ) -> Path:
-    """Build a quickstart config output path from location + filename."""
+    """Build the output path for a quickstart config file.
+
+    Args:
+        location: Target scope for config storage (`"project"` or `"global"`).
+        filename: Optional filename to normalize before path construction.
+        cwd: Optional project-directory override (primarily for tests).
+        home_dir: Optional home-directory override (primarily for tests).
+
+    Returns:
+        Full filesystem path to the quickstart config file.
+    """
```

As per coding guidelines: `**/*.py`: For new or changed functions, include Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/config_builder.py` around lines 56-87: The docstrings for normalize_quickstart_config_filename and build_quickstart_config_path lack Google-style sections; update each function docstring to include an Args: section listing parameters (filename, default for normalize_quickstart_config_filename; location, filename, cwd, home_dir for build_quickstart_config_path) with types and default values, a Returns: section describing the return type (str for normalize_quickstart_config_filename and Path for build_quickstart_config_path), and any Notes/Raises if applicable, keeping the existing short description intact and using the Google docstring format throughout.

massgen/tests/test_cli_mode_flags.py-26-26 (1)
26-26: ⚠️ Potential issue | 🟡 Minor
Typo in test class names: "Flarg" should be "Flag".
Two test class names contain a typo:
- Line 26: `TestModeFlargParsing` → `TestModeFlagParsing`
- Line 137: `TestModeFlargValidation` → `TestModeFlagValidation`
This doesn't affect test execution but reduces discoverability and readability.
🔧 Proposed fix

```diff
-class TestModeFlargParsing:
+class TestModeFlagParsing:
     """Test that mode flags are accepted and parsed correctly."""
```

```diff
-class TestModeFlargValidation:
+class TestModeFlagValidation:
     """Test validation of incompatible flag combinations."""
```

Also applies to: 137-137
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_cli_mode_flags.py` at line 26: The test class names contain a typo: rename TestModeFlargParsing to TestModeFlagParsing and TestModeFlargValidation to TestModeFlagValidation; update the class declarations (and any other occurrences or references to those symbols in the test file, e.g., test discovery, imports or uses) so the new identifiers are consistent and run correctly.

massgen/skills/image-generation/references/extra_params.md-11-11 (1)
11-11: ⚠️ Potential issue | 🟡 Minor
Align backend support claims for `seed` across this page.
Line 11 says `seed` is supported by ElevenLabs, but the backend matrix omits ElevenLabs entirely. Please reconcile these so users get one consistent support contract. As per coding guidelines: Ensure reference files are accurate.
Also applies to: 67-82
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/skills/image-generation/references/extra_params.md` at line 11: The `seed` parameter is inconsistently documented: it is listed as supported by ElevenLabs on the `seed` row but ElevenLabs is missing from the backend support matrix; update the `seed` support claims so they match by either adding ElevenLabs to the backend matrix or removing ElevenLabs from the `seed` providers list in this file (`extra_params.md`), and apply the same reconciliation to the similar entries in the later block (lines referenced 67-82) so both the parameter row and the backend matrix present a single consistent support contract for `seed`.

massgen/tests/frontend/test_subagent_modal_index.py-211-211 (1)
211-211: ⚠️ Potential issue | 🟡 Minor
Unused callback parameter will trigger ARG001.
Line 211 defines `all_subagents` but never uses it in the test callback.
Proposed fix

```diff
-    def _callback(subagent, all_subagents, index):
+    def _callback(subagent, _all_subagents, index):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/frontend/test_subagent_modal_index.py` at line 211: The test callback _callback currently declares a parameter named all_subagents that is never used and triggers ARG001; fix by either removing the unused parameter from the _callback signature or by renaming it to _all_subagents (or prefixing with an underscore) so the linter ignores it, or alternatively use the parameter in the function body if it was intended to be used; update the definition of _callback accordingly to eliminate the unused-argument warning.

massgen/backend/docs/permissions_and_context_files.md-384-389 (1)
384-389: ⚠️ Potential issue | 🟡 Minor
Align peer-update semantics with the surrounding lifecycle narrative.
This section says peers update agents mid-stream *without* restart, but the nearby “Partial Work on Restart” narrative still describes restart-based interruption. The mixed model is confusing for backend implementers and should be reconciled in one direction.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/backend/docs/permissions_and_context_files.md` around lines 384-389: The "Peer Update Delivery" section conflicts with the "Partial Work on Restart" narrative by describing mid-stream hook injection while the other text implies restart-based interruption; pick one canonical semantics (either true mid-stream hook delivery or always-restart delivery) and make the document consistent: update the "Peer Update Delivery" header and paragraph and any fallback text (e.g., the sentence about backends without hook support) to match the chosen model, and update the "Partial Work on Restart" section to reflect that behavior; search for the section titles "Peer Update Delivery" and "Partial Work on Restart" to locate all related paragraphs, examples and any mentions of hooks/enforcement message injection and align their wording to one coherent lifecycle model.

massgen/tests/test_multimedia_skills.py-137-147 (1)
137-147: ⚠️ Potential issue | 🟡 Minor
Strengthen symlink validation to assert symlink type, not just path existence.
`test_symlink_exists` currently passes if a regular file exists at that path. That can hide broken symlink setups.
Suggested test hardening

```diff
 def test_symlink_exists(self, skill_name: str) -> None:
     symlink = self.CLAUDE_SKILLS_DIR / skill_name
     assert symlink.exists(), f"Symlink missing: {symlink}. " f"Run: ln -sf '../../massgen/skills/{skill_name}' " f"'.claude/skills/{skill_name}'"
+    assert symlink.is_symlink(), f"Expected symlink, found non-symlink path: {symlink}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_multimedia_skills.py` around lines 137-147: test_symlink_exists currently only asserts path existence which allows regular files; change it to assert the path is a symlink by using symlink.is_symlink() (you can also keep symlink.exists() for clearer messaging) inside the test function test_symlink_exists that references CLAUDE_SKILLS_DIR and EXPECTED_SKILLS; keep test_symlink_target_resolves as-is but ensure it still uses symlink.resolve() and target.is_dir() to validate the link target.

docs/source/reference/cli.rst-53-55 (1)
53-55: ⚠️ Potential issue | 🟡 Minor
Document `--quickstart` as a first-class CLI option in this reference section.
`--config` now has special behavior when `--quickstart` is used, but this block doesn’t introduce `--quickstart` itself. Adding a dedicated row (or direct cross-reference) would make this behavior discoverable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/reference/cli.rst` around lines 53-55: Add a dedicated entry documenting the --quickstart CLI option and its interaction with --config: describe that --quickstart is a first-class flag, when supplied the --config value is treated as an output filename under .massgen/ (e.g., --quickstart --config team-config writes .massgen/team-config.yaml), note any defaults and whether the flag can be combined with other options, and add a cross-reference from the existing --config paragraph to the new --quickstart entry so readers can discover the special behavior easily.

massgen/tests/test_checklist_criteria_presets.py-481-481 (1)
481-481: ⚠️ Potential issue | 🟡 Minor
Drop the unused `verify_by` binding to satisfy lint.
`verify_by` is unpacked but never read in this test.
<details> <summary>🔧 Small cleanup</summary> ```diff - items, categories, verify_by = orch._get_active_criteria() + items, categories, _verify_by = orch._get_active_criteria() ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_checklist_criteria_presets.py` at line 481, The test unpacks a third value from orch._get_active_criteria into verify_by but never uses it, triggering a lint warning; change the unpacking to drop the unused value by replacing the third target with an underscore (e.g., items, categories, _ = orch._get_active_criteria()) so only items and categories are bound and verify_by is not created. ``` </details> </blockquote></details> <details> <summary>massgen/tests/test_injection_count_removal.py-146-146 (1)</summary><blockquote> `146-146`: _⚠️ Potential issue_ | _🟡 Minor_ **Remove the unused unpacked variable.** Line 146 unpacks `had_unseen_updates` but never uses it. <details> <summary>💡 Proposed fix</summary> ```diff - selected_answers, had_unseen_updates = orch._select_midstream_answer_updates( + selected_answers, _ = orch._select_midstream_answer_updates( ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_injection_count_removal.py` at line 146, The call to orch._select_midstream_answer_updates currently unpacks two values into selected_answers and had_unseen_updates but had_unseen_updates is never used; change the call to only capture the needed value (e.g., assign the single returned value to selected_answers) or replace had_unseen_updates with a discard name (_) so the unused value is not kept—update the call site where orch._select_midstream_answer_updates is invoked and ensure any downstream code expects only selected_answers. 
``` </details> </blockquote></details> <details> <summary>massgen/mcp_tools/checklist_tools_server.py-803-803 (1)</summary><blockquote> `803-803`: _⚠️ Potential issue_ | _🟡 Minor_ **Use explicit `T | None` instead of implicit `Optional`.** The `preserve` parameter defaults to `None` but lacks type annotation indicating it's optional. Per PEP 484 and static analysis (RUF013), this should be explicit. <details> <summary>🐛 Proposed fix</summary> ```diff - async def propose_improvements(improvements: dict, preserve: dict = None) -> str: + async def propose_improvements(improvements: dict, preserve: dict | None = None) -> str: ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/mcp_tools/checklist_tools_server.py` at line 803, The function propose_improvements has a parameter preserve that defaults to None but lacks an explicit optional type; update its annotation to use the explicit PEP 604 form (dict | None) so the signature reads propose_improvements(improvements: dict, preserve: dict | None = None) -> str and adjust any related type hints/usages in the function body if needed (reference: propose_improvements). ``` </details> </blockquote></details> <details> <summary>massgen/tests/test_concurrent_logging_sessions.py-750-754 (1)</summary><blockquote> `750-754`: _⚠️ Potential issue_ | _🟡 Minor_ **Catch `ValueError` instead of generic `Exception` to make the test more precise and eliminate the B017 static analysis warning.** The test passes `config_dict={"agents": []}`, which triggers `massgen.run()` to raise `ValueError("No agents configured")` at line 509 of massgen/__init__.py. Use `pytest.raises(ValueError)` to catch this specific exception—it's more self-documenting, resolves the linting issue, and still verifies that the ContextVar is properly reset even when an error occurs. 
<details> <summary>Current code</summary> ```python with pytest.raises(Exception): await massgen.run( query="test", config_dict={"agents": []}, # Invalid: no agents ) ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/tests/test_concurrent_logging_sessions.py` around lines 750 - 754, The test currently catches a broad Exception; change the context manager to catch the specific ValueError raised by massgen.run (which raises ValueError("No agents configured") in massgen.__init__), i.e. replace pytest.raises(Exception) with pytest.raises(ValueError) in test_concurrent_logging_sessions.py so the test remains the same but is more precise and removes the B017 lint warning while still awaiting massgen.run(...) to exercise the ContextVar reset. ``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual_terminal_display.py-3686-3692 (1)</summary><blockquote> `3686-3692`: _⚠️ Potential issue_ | _🟡 Minor_ **Validate `default_selected_agent` against known agent IDs.** Current logic accepts any non-empty value, which can leave single-agent mode pointing to a non-existent agent. <details> <summary>Suggested patch</summary> ```diff if self.coordination_display.default_agent_mode == "single": self._mode_state.agent_mode = "single" - if self.coordination_display.default_selected_agent: - self._mode_state.selected_single_agent = self.coordination_display.default_selected_agent + selected = self.coordination_display.default_selected_agent + if selected in self.coordination_display.agent_ids: + self._mode_state.selected_single_agent = selected elif self.coordination_display.agent_ids: self._mode_state.selected_single_agent = self.coordination_display.agent_ids[0] ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. 
In `@massgen/frontend/displays/textual_terminal_display.py` around lines 3686 - 3692, Validate that coordination_display.default_selected_agent is one of the known IDs before assigning it to _mode_state.selected_single_agent: in the block that sets self._mode_state.agent_mode = "single" check whether self.coordination_display.default_selected_agent is not only truthy but also contained in self.coordination_display.agent_ids (or matches an element of that list); if it is, assign it, otherwise fall back to using the first entry of self.coordination_display.agent_ids (if present) or leave selected_single_agent unset/None to avoid pointing to a non-existent agent. ``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual_terminal_display.py-7674-7682 (1)</summary><blockquote> `7674-7682`: _⚠️ Potential issue_ | _🟡 Minor_ **Use a full Google-style docstring for this changed method.** The method is changed but the docstring omits `Args`. <details> <summary>Suggested patch</summary> ```diff def set_evaluation_criteria( self, criteria: list[dict], source: str = "default", ) -> None: - """Store the active evaluation criteria for display via Ctrl+E.""" + """Store active evaluation criteria for display via Ctrl+E. + + Args: + criteria: Active evaluation criteria entries. + source: Criteria source label shown in the modal. + """ self._runtime_evaluation_criteria = list(criteria) if criteria else [] self._runtime_evaluation_criteria_source = source ``` </details> As per coding guidelines `**/*.py`: For new or changed functions, include Google-style docstrings. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. 
In `@massgen/frontend/displays/textual_terminal_display.py` around lines 7674 - 7682, Update the docstring for set_evaluation_criteria to a full Google-style docstring: describe the method purpose, add an Args section documenting `criteria: list[dict]` (what list entries represent and that empty/None clears criteria) and `source: str` (default value and meaning), add a Returns section stating it returns None, and mention any side-effects (it sets `_runtime_evaluation_criteria` and `_runtime_evaluation_criteria_source`); keep the docstring inside the set_evaluation_criteria method and preserve existing behavior. ``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual_terminal_display.py-438-460 (1)</summary><blockquote> `438-460`: _⚠️ Potential issue_ | _🟡 Minor_ **Drop unused `sa_data` and make this new helper docstring Google-style.** `sa_data` is currently unused, and this new function’s docstring is missing structured `Args`/`Returns`. <details> <summary>Suggested patch</summary> ```diff def _build_context_paths_labeled( context_paths: list[str], workspace_path: str, - sa_data: dict[str, Any], existing: SubagentDisplayData | None, ) -> list[dict[str, str]]: - """Build labeled context paths for the SubagentContextModal.""" + """Build labeled context paths for the SubagentContextModal. + + Args: + context_paths: Normalized context paths attached to the subagent. + workspace_path: Subagent workspace path used for label inference. + existing: Existing display snapshot used for fallback labels. + + Returns: + List of dictionaries containing `path`, `label`, and `permission`. + """ ``` ```diff context_paths_labeled = _build_context_paths_labeled( context_paths, workspace_path, - sa_data, existing, ) ``` </details> As per coding guidelines `**/*.py`: For new or changed functions, include Google-style docstrings. 
<details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/frontend/displays/textual_terminal_display.py` around lines 438 - 460, The function _build_context_paths_labeled currently accepts an unused parameter sa_data and has a short non-Google-style docstring; remove the unused parameter (sa_data) from the signature and all callers, and replace the docstring with a Google-style docstring that documents Args (context_paths: list[str], workspace_path: str, existing: SubagentDisplayData | None) and Returns (list[dict[str, str]]), including a brief description of behavior and returned structure; update any references to _build_context_paths_labeled to match the new signature. ``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual/widgets/modals/content_modals.py-56-60 (1)</summary><blockquote> `56-60`: _⚠️ Potential issue_ | _🟡 Minor_ **Add Google-style docstrings for newly introduced methods in `EvaluationCriteriaModal`.** `__init__`, `compose`, and `on_button_pressed` are new/changed methods but do not include Google-style docstrings. As per coding guidelines, `**/*.py`: For new or changed functions, include Google-style docstrings. Also applies to: 115-131 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/frontend/displays/textual/widgets/modals/content_modals.py` around lines 56 - 60, Add Google-style docstrings to the new/modified methods of EvaluationCriteriaModal: provide a short summary, Args, Returns (if any), and Raises (if applicable) for __init__, compose, and on_button_pressed. 
For __init__, document parameters criteria: list[dict] | None and source: str with default, and describe instance attributes set; for compose, describe what UI components it yields/returns and any side effects; for on_button_pressed, document the event parameter, what button actions are handled, and any state changes or exceptions. Place the docstrings immediately under each method definition following Google style (triple-quoted) so linters and docs pick them up. ``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual/widgets/modals/content_modals.py-662-666 (1)</summary><blockquote> `662-666`: _⚠️ Potential issue_ | _🟡 Minor_ **Add Google-style docstrings for new `SubagentContextModal` methods.** `__init__`, `compose`, and `_build_path_rows` are new methods and should include Google-style docstrings. As per coding guidelines, `**/*.py`: For new or changed functions, include Google-style docstrings. Also applies to: 677-705 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/frontend/displays/textual/widgets/modals/content_modals.py` around lines 662 - 666, Add Google-style docstrings to the new methods on SubagentContextModal: __init__, compose, and _build_path_rows. For __init__, document the parameters (context_paths_labeled: list[dict[str, str]]) and any instance attributes set (e.g., self._paths); for compose, include return type info (ComposeResult) and a brief description of what UI elements it composes; for _build_path_rows, describe its purpose, parameters, return value (e.g., list of row widgets or similar), and any side effects. Keep each docstring concise, use Args:/Returns:/Raises: sections where appropriate, and place them directly above the respective method definitions. 
``` </details> </blockquote></details> <details> <summary>massgen/frontend/displays/textual/widgets/modals/content_modals.py-50-54 (1)</summary><blockquote> `50-54`: _⚠️ Potential issue_ | _🟡 Minor_ **Use `ClassVar` to indicate intentional shared class-level state (RUF012).** This mutable class attribute triggers RUF012. Since `_CATEGORY_COLORS` is intentionally shared across instances (not a mutable default), annotate it with `ClassVar` to make that intent explicit and satisfy the linter: <details> <summary>Proposed fix</summary> ```diff from typing import TYPE_CHECKING, Any, ClassVar ... _CATEGORY_COLORS: ClassVar[dict[str, str]] = { "must": "[bold red]MUST[/]", "should": "[yellow]SHOULD[/]", "could": "[green]COULD[/]", } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/frontend/displays/textual/widgets/modals/content_modals.py` around lines 50 - 54, The class-level mutable dictionary _CATEGORY_COLORS is intended to be shared across instances but is not annotated, triggering RUF012; update its annotation to ClassVar[dict[str, str]] and add the corresponding import from typing (ClassVar) so the intent is explicit (locate the _CATEGORY_COLORS symbol in the class in content_modals.py and change its type hint to use ClassVar). ``` </details> </blockquote></details> <details> <summary>massgen/orchestrator.py-796-803 (1)</summary><blockquote> `796-803`: _⚠️ Potential issue_ | _🟡 Minor_ **Avoid silent failure on criteria-display pushes.** Line 801 and Line 10032 swallow broad exceptions. Keep display updates non-fatal, but log failures with traceback so regressions are diagnosable. 
<details> <summary>💡 Proposed fix</summary> ```diff - except Exception: - pass # TUI notification is non-critical + except Exception: + logger.debug( + "[Orchestrator] Non-critical: failed to push evaluation criteria to display", + exc_info=True, + ) ``` </details> Also applies to: 10032-10033 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@massgen/orchestrator.py` around lines 796 - 803, Replace the broad silent except around the display update so failures are logged with a traceback: in the try block that calls coordination_ui.display.set_evaluation_criteria and sets self._criteria_pushed_to_display, catch Exception as e and log the error (e.g., logger.exception or process_logger.error with traceback) instead of just passing; do the same for the other symmetric swallow (the similar block around lines handling display updates mentioned in the comment) so display-update failures remain non-fatal but are diagnosable. ``` </details> </blockquote></details> </blockquote></details> --- <details> <summary>ℹ️ Review info</summary> **Configuration used**: Path: .coderabbit.yaml **Review profile**: CHILL **Plan**: Pro <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 70a8784357cf3ef9e0667ec4713c0548f65641e8 and 66822a82203cd475e2e5371acf121718d5d40af7. 
</details> <details> <summary>⛔ Files ignored due to path filters (1)</summary> * `massgen/tests/frontend/__snapshots__/test_timeline_snapshot_scaffold/test_subagent_card_snapshot_option_a.svg` is excluded by `!**/*.svg`, `!**/*.svg` </details> <details> <summary>📒 Files selected for processing (133)</summary> * `.gitignore` * `AGENTS.md` * `CLAUDE.md` * `docs/modules/architecture.md` * `docs/modules/coordination_workflow.md` * `docs/modules/subagents.md` * `docs/source/quickstart/configuration.rst` * `docs/source/quickstart/installation.rst` * `docs/source/quickstart/running-massgen.rst` * `docs/source/reference/cli.rst` * `docs/source/reference/yaml_schema.rst` * `docs/source/user_guide/advanced/subagents.rst` * `docs/source/user_guide/multimodal.rst` * `massgen/__init__.py` * `massgen/agent_config.py` * `massgen/backend/docs/permissions_and_context_files.md` * `massgen/cli.py` * `massgen/config_builder.py` * `massgen/config_validator.py` * `massgen/configs/basic/multi/persona_diversity_example.yaml` * `massgen/coordination_tracker.py` * `massgen/evaluation_criteria_generator.py` * `massgen/events.py` * `massgen/filesystem_manager/__init__.py` * `massgen/filesystem_manager/_docker_manager.py` * `massgen/filesystem_manager/_filesystem_manager.py` * `massgen/frontend/displays/textual/__init__.py` * `massgen/frontend/displays/textual/widgets/__init__.py` * `massgen/frontend/displays/textual/widgets/modals/__init__.py` * `massgen/frontend/displays/textual/widgets/modals/content_modals.py` * `massgen/frontend/displays/textual_terminal_display.py` * `massgen/frontend/displays/textual_widgets/plan_options.py` * `massgen/frontend/displays/textual_widgets/quickstart_wizard.py` * `massgen/frontend/displays/textual_widgets/subagent_card.py` * `massgen/frontend/displays/textual_widgets/subagent_modal.py` * `massgen/frontend/displays/textual_widgets/subagent_screen.py` * `massgen/frontend/displays/textual_widgets/subagent_tui_modal.py` * 
`massgen/frontend/displays/textual_widgets/tool_card.py` * `massgen/frontend/displays/tui_modes.py` * `massgen/frontend/web/server.py` * `massgen/logger_config.py` * `massgen/mcp_tools/checklist_tools_server.py` * `massgen/mcp_tools/subagent/_subagent_mcp_server.py` * `massgen/orchestrator.py` * `massgen/persona_generator.py` * `massgen/session/_registry.py` * `massgen/shadow_agent.py` * `massgen/skills/audio-generation/SKILL.md` * `massgen/skills/audio-generation/references/advanced.md` * `massgen/skills/audio-generation/references/music_and_sfx.md` * `massgen/skills/audio-generation/references/voices.md` * `massgen/skills/image-generation/SKILL.md` * `massgen/skills/image-generation/references/backends.md` * `massgen/skills/image-generation/references/editing.md` * `massgen/skills/image-generation/references/extra_params.md` * `massgen/skills/multimedia-backend-integrator/SKILL.md` * `massgen/skills/video-generation/SKILL.md` * `massgen/skills/video-generation/references/backends.md` * `massgen/skills/video-generation/references/editing.md` * `massgen/skills/video-generation/references/production.md` * `massgen/subagent/manager.py` * `massgen/subagent/models.py` * `massgen/subagent_types/evaluator/SUBAGENT.md` * `massgen/subagent_types/quality_rethinking/SUBAGENT.md` * `massgen/system_message_builder.py` * `massgen/system_prompt_sections.py` * `massgen/task_decomposer.py` * `massgen/tests/conftest.py` * `massgen/tests/frontend/test_evaluation_criteria_modal.py` * `massgen/tests/frontend/test_quickstart_wizard_config_filename.py` * `massgen/tests/frontend/test_subagent_modal_index.py` * `massgen/tests/frontend/test_subagent_screen_context_paths.py` * `massgen/tests/frontend/test_subagent_screen_input.py` * `massgen/tests/frontend/test_timeline_snapshot_scaffold.py` * `massgen/tests/frontend/test_tool_card_click_behavior.py` * `massgen/tests/integration/test_subagent_message_e2e.py` * `massgen/tests/test_answer_count_increment.py` * 
`massgen/tests/test_checklist_criteria_presets.py` * `massgen/tests/test_checklist_label_refresh_ordering.py` * `massgen/tests/test_checklist_tools_server.py` * `massgen/tests/test_claude_code_context_sharing.py` * `massgen/tests/test_cli_mode_flags.py` * `massgen/tests/test_cli_plan_mode_eval_criteria.py` * `massgen/tests/test_cli_prompt_context_paths_toggle.py` * `massgen/tests/test_concurrent_logging_sessions.py` * `massgen/tests/test_config_builder.py` * `massgen/tests/test_convergence_novelty.py` * `massgen/tests/test_evaluation_criteria_generator.py` * `massgen/tests/test_final_presentation_fallback.py` * `massgen/tests/test_generate_media_live.py` * `massgen/tests/test_gepa_evaluation_flow.py` * `massgen/tests/test_grok_multimedia_backend_selection.py` * `massgen/tests/test_grok_multimedia_generation.py` * `massgen/tests/test_grok_multimedia_live.py` * `massgen/tests/test_injection_count_removal.py` * `massgen/tests/test_multimedia_skills.py` * `massgen/tests/test_multimodal_audio_backend_selection.py` * `massgen/tests/test_multimodal_audio_types.py` * `massgen/tests/test_multimodal_image_backend_selection.py` * `massgen/tests/test_persona_easing.py` * `massgen/tests/test_persona_generator.py` * `massgen/tests/test_plan_execution_chunked.py` * `massgen/tests/test_plan_mode_chunk_defaults.py` * `massgen/tests/test_quickstart_skills_setup.py` * `massgen/tests/test_refinement_quality.py` * `massgen/tests/test_snapshot_preservation.py` * `massgen/tests/test_spec_prompt_prefix.py` * `massgen/tests/test_specialized_subagents.py` * `massgen/tests/test_subagent.py` * `massgen/tests/test_subagent_manager.py` * `massgen/tests/test_subagent_mcp_server.py` * `massgen/tests/test_task_decomposer_subagent.py` * `massgen/tests/test_tui_spec_mode_state.py` * `massgen/tests/test_v3_three_agents.py` * `massgen/tests/unit/test_coordination_tracker.py` * `massgen/tool/_multimodal_tools/TOOL.md` * `massgen/tool/_multimodal_tools/__init__.py` * 
`massgen/tool/_multimodal_tools/generation/_audio.py` * `massgen/tool/_multimodal_tools/generation/_base.py` * `massgen/tool/_multimodal_tools/generation/_image.py` * `massgen/tool/_multimodal_tools/generation/_video.py` * `massgen/tool/_multimodal_tools/generation/generate_media.py` * `massgen/tool/_multimodal_tools/understand_audio.py` * `openspec/changes/add-audio-understanding-and-editing/design.md` * `openspec/changes/add-audio-understanding-and-editing/proposal.md` * `openspec/changes/add-audio-understanding-and-editing/specs/multimedia-generation/spec.md` * `openspec/changes/add-audio-understanding-and-editing/tasks.md` * `openspec/changes/add-multimedia-editing-capabilities/design.md` * `openspec/changes/add-multimedia-editing-capabilities/proposal.md` * `openspec/changes/add-multimedia-editing-capabilities/specs/multimedia-generation/spec.md` * `openspec/changes/add-multimedia-editing-capabilities/tasks.md` * `pyproject.toml` * `requirements.txt` </details> <details> <summary>💤 Files with no reviewable changes (3)</summary> * massgen/tests/frontend/test_timeline_snapshot_scaffold.py * massgen/tests/test_gepa_evaluation_flow.py * massgen/tests/test_convergence_novelty.py </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
```diff
 # Create an isolated logging session for this run so concurrent massgen.run()
 # calls don't share globals (see MAS-274).
 _run_session = LoggingSession.create()
 _session_token = set_current_session(_run_session)

 def _cleanup_session():
     """Ensure session handlers are closed and ContextVar is restored."""
     _run_session.close()
     _current_session.reset(_session_token)

 from .utils import get_backend_type_from_model

 # Initialize logging for programmatic API
 # This ensures massgen.log is created and captures INFO+ messages
-setup_logging(debug=False)
+setup_logging(debug=False, session=_run_session)

 # Generate session ID
 session_id = f"api_session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

 # Determine config to use (priority order)
 final_config_dict = None
 raw_config_for_metadata = None  # Raw config (unexpanded env vars) for safe logging
 config_path_used = None

 if config_dict:
     # 1. Pre-built config dict provided directly
     final_config_dict = config_dict
     raw_config_for_metadata = config_dict  # No env expansion for dict input
     config_path_used = "config_dict"

 elif models:
     # 2. Multiple models specified - build multi-agent config
     final_config_dict = build_config(
         models=models,
         use_docker=use_docker,
         context_paths=kwargs.get("context_paths"),
     )
     raw_config_for_metadata = final_config_dict  # No env expansion for built config
     config_path_used = f"multi-agent:{','.join(models)}"

 elif model and enable_filesystem:
     # 3. Model with filesystem support (default) - use full config
     final_config_dict = build_config(
         num_agents=num_agents or 1,
         model=model,
         use_docker=use_docker,
         context_paths=kwargs.get("context_paths"),
     )
     raw_config_for_metadata = final_config_dict  # No env expansion for built config
     config_path_used = f"agent:{model}x{num_agents or 1}"

 try:
```
**Ensure session cleanup also runs when logging setup fails.**

Lines 374-388 create/bind a session before entering the try block. If `setup_logging(...)` raises, `_cleanup_session()` is skipped and session state can leak.
💡 Proposed fix

```diff
-    # Initialize logging for programmatic API
-    # This ensures massgen.log is created and captures INFO+ messages
-    setup_logging(debug=False, session=_run_session)
-
-    # Generate session ID
-    session_id = f"api_session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
-
-    try:
+    try:
+        # Initialize logging for programmatic API
+        # This ensures massgen.log is created and captures INFO+ messages
+        setup_logging(debug=False, session=_run_session)
+
+        # Generate session ID
+        session_id = f"api_session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
```

🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.
In `@massgen/__init__.py` around lines 374 - 393, The session created by
LoggingSession.create() and bound via set_current_session(_run_session) can leak
if setup_logging(...) raises because _cleanup_session() is only called in the
later finally block; wrap the logging setup (setup_logging) and any code that
depends on it in a try/finally or call _cleanup_session() immediately on
exceptions so the ContextVar and session handlers are always restored/closed.
Specifically, ensure any exception raised by setup_logging(...) triggers
_cleanup_session() (referencing LoggingSession.create, set_current_session,
_run_session, _cleanup_session, _current_session.reset) before re-raising the
error so no session state persists.
```
```python
existing_paths = {p.get("path") for p in config_dict["orchestrator"]["context_paths"]}
for ctx in parsed.context_paths:
    if ctx["path"] not in existing_paths:
        config_dict["orchestrator"]["context_paths"].append(ctx)
        existing_paths.add(ctx["path"])
```
**Handle mixed `context_paths` entry types before dedupe.**

Line 482 currently assumes every entry has `.get("path")`. The same function's API allows string entries, so this can crash with `AttributeError` when `parse_at_references=True`.
💡 Proposed fix

```diff
-existing_paths = {p.get("path") for p in config_dict["orchestrator"]["context_paths"]}
+existing_paths: set[str] = set()
+for p in config_dict["orchestrator"]["context_paths"]:
+    if isinstance(p, dict):
+        path_value = p.get("path")
+    else:
+        path_value = p
+    if path_value:
+        existing_paths.add(str(path_value))
```

🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.
In `@massgen/__init__.py` around lines 482 - 486, existing_paths is built assuming
each orchestrator.context_paths entry is a dict with .get("path"), which breaks
when entries are strings; change the dedupe logic to normalize path strings
before comparing: build existing_paths by iterating
config_dict["orchestrator"]["context_paths"] and using entry if
isinstance(entry, str) else entry.get("path"), then when iterating
parsed.context_paths extract path the same way (path = ctx if isinstance(ctx,
str) else ctx.get("path")) and only append the original ctx to
config_dict["orchestrator"]["context_paths"] if path not in existing_paths,
adding path to existing_paths afterward; update code references to
existing_paths, parsed.context_paths, and
config_dict["orchestrator"]["context_paths"] accordingly.
```
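The normalization step the prompt describes can be sketched as a small helper; the name `merge_context_paths` is illustrative, not the project's actual API. Each entry is reduced to its path string before the membership check, so string and dict entries coexist safely:

```python
def merge_context_paths(existing: list, parsed: list) -> list:
    """Append parsed entries to `existing`, deduplicating by path string.

    Entries may be plain strings or dicts carrying a "path" key; both are
    normalized before comparison, so mixed lists never raise AttributeError.
    """

    def _path_of(entry):
        if isinstance(entry, str):
            return entry
        if isinstance(entry, dict):
            return entry.get("path")
        return None

    seen = {p for p in (_path_of(e) for e in existing) if p}
    for entry in parsed:
        path = _path_of(entry)
        if path and path not in seen:
            existing.append(entry)  # keep the original entry shape
            seen.add(path)
    return existing


merged = merge_context_paths(
    ["/docs", {"path": "/src", "permission": "read"}],
    [{"path": "/docs"}, "/assets"],
)
assert merged == ["/docs", {"path": "/src", "permission": "read"}, "/assets"]
```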
```python
try:
    session_root_name = _get_root().name
    return str(_Path(base) / session_root_name)
except Exception:
    return base  # Fallback: return unchanged if session root unavailable
```
**Silent fallback can break workspace isolation under failure.**

When scoping fails, both helpers silently return the unscoped base path. That can reintroduce cross-run path sharing and cleanup collisions.
💡 Suggested hardening

```diff
 def _scope_snapshot_storage(base: str | None) -> str | None:
@@
-    except Exception:
-        return base  # Fallback: return unchanged if session root unavailable
+    except Exception as exc:
+        logger.warning(
+            f"[CLI] Failed to scope snapshot_storage, using timestamp fallback: {exc}",
+        )
+        fallback_suffix = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+        return str(_Path(base) / f"fallback_{fallback_suffix}")
@@
 def _scope_agent_temporary_workspace(base: str | None) -> str | None:
@@
-    except Exception:
-        return base  # Fallback: return unchanged if session root unavailable
+    except Exception as exc:
+        logger.warning(
+            f"[CLI] Failed to scope agent_temporary_workspace, using timestamp fallback: {exc}",
+        )
+        fallback_suffix = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+        return str(_Path(base) / f"fallback_{fallback_suffix}")
```

Also applies to: 2677-2681

🧰 Tools

🪛 Ruff (0.15.2)

[warning] 2652-2652: Do not catch blind exception: `Exception`

(BLE001)

🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.
In `@massgen/cli.py` around lines 2649 - 2653, The helpers currently swallow
exceptions from _get_root() and return the unscoped base, risking cross-run
collisions; change both places that call _get_root() (the blocks that compute
session_root_name using _get_root().name and build a path with _Path(base) /
session_root_name) to NOT silently return base on error — instead surface the
failure by raising a clear exception (e.g., RuntimeError) with context including
the caught exception and the base value, or if you must continue, produce a
safe, isolated fallback (e.g., append a deterministic unique suffix/UUID) so
that the unscoped base is never reused; update both occurrences (the shown block
and the similar block around the other helper) accordingly.
```
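The timestamp-fallback idea from the suggested hardening can be sketched as follows; `scope_to_session` is a hypothetical standalone helper, not the CLI's real function, and it folds the error case into a `None` session root for simplicity:

```python
from datetime import datetime
from pathlib import Path


def scope_to_session(base, session_root_name):
    """Scope `base` under a per-session directory.

    When the session root name is unavailable, fall back to a timestamped
    subdirectory rather than returning the bare base path, so concurrent
    runs never share (and later clean up) the same workspace directory.
    """
    if base is None:
        return None
    if session_root_name:
        return str(Path(base) / session_root_name)
    fallback = datetime.now().strftime("fallback_%Y%m%d_%H%M%S_%f")
    return str(Path(base) / fallback)


assert scope_to_session(None, "log_20250101") is None
assert scope_to_session("snapshots", "log_20250101").endswith("log_20250101")
fallback_path = scope_to_session("snapshots", None)
assert fallback_path != "snapshots"  # never the unscoped base
```

The key property is the last assertion: whatever goes wrong, the caller never receives the shared, unscoped base path back.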
```python
def validate_mode_flag_combinations(args: argparse.Namespace) -> list[str]:
    """Validate that CLI mode flag combinations are compatible.

    Returns a list of error messages (empty if valid).
    """
    errors: list[str] = []

    if getattr(args, "single_agent", None) is not None and getattr(args, "coordination_mode", None) == "decomposition":
        errors.append(
            "--single-agent and --coordination-mode decomposition are incompatible. " "Decomposition requires multiple agents.",
        )

    personas = getattr(args, "personas", None)
    if personas is not None and personas != "off" and getattr(args, "coordination_mode", None) == "decomposition":
        errors.append(
            "--personas requires parallel coordination mode, not decomposition.",
        )

    return errors
```
**`--single-agent` validation misses config-driven decomposition mode.**

Current validation only checks `args.coordination_mode`. If config already sets decomposition, line 9504 still filters to one agent and can produce an invalid runtime combination.
💡 Suggested fix (validate effective config, not only raw args)
# Apply CLI mode flags (--quick, --coordination-mode, --personas, --single-agent)
apply_mode_flags_to_config(config, args)
+
+ effective_coordination_mode = config.get("orchestrator", {}).get("coordination_mode")
+ if args.single_agent is not None and effective_coordination_mode == "decomposition":
+ raise ConfigurationError(
+ "--single-agent is incompatible with decomposition coordination mode."
+ )
+ if args.personas not in (None, "off") and effective_coordination_mode == "decomposition":
+ raise ConfigurationError(
+ "--personas requires parallel coordination mode, not decomposition."
+ )

Also applies to: 9270-9272, 9504-9512
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/cli.py` around lines 2856 - 2874, The validation in
validate_mode_flag_combinations only inspects raw args.coordination_mode and
misses when a loaded config forces decomposition; update
validate_mode_flag_combinations to validate against the effective coordination
mode and persona settings (derive effective_mode = args.coordination_mode if
present else config.get("coordination_mode") or via the project’s existing
effective-config resolver), and likewise derive effective_personas and
effective_single_agent from args fallback to config; then use those effective
values when adding the same error messages; also update callers (the places
around the other checks you noted) to pass or compute the effective config
before calling validate_mode_flag_combinations so runtime filtering (lines
referenced) cannot produce an invalid single-agent + decomposition combination.
```diff
 if use_interactive_planning_subprocess:
     # Interactive planning (Textual) needs direct terminal ownership.
     log_base_dir = Path(os.getenv("MASSGEN_LOG_BASE_DIR", ".massgen/massgen_logs"))
     existing_log_dirs = {p.name for p in log_base_dir.glob("log_*")} if log_base_dir.exists() else set()

     result = subprocess.run(cmd)
     if result.returncode != 0:
         raise RuntimeError(f"Planning subprocess failed with exit code {result.returncode}")

     if not log_base_dir.exists():
         raise RuntimeError("Planning completed but no log directory base was found")

-    # Wait for process to complete
-    process.wait()
     created_logs = [p for p in log_base_dir.glob("log_*") if p.name not in existing_log_dirs]
     if created_logs:
         log_root = max(created_logs, key=lambda p: p.stat().st_mtime)
     else:
         all_logs = list(log_base_dir.glob("log_*"))
         if not all_logs:
             raise RuntimeError("Planning completed but no log session directory was found")
         log_root = max(all_logs, key=lambda p: p.stat().st_mtime)
```
Interactive planning log discovery is race-prone with concurrent runs.
Selecting the “newest” log_* directory from a shared base can bind to another process’s run. This can execute the wrong plan session.
💡 Suggested fix (use a dedicated log base per subprocess run)
- log_base_dir = Path(os.getenv("MASSGEN_LOG_BASE_DIR", ".massgen/massgen_logs"))
- existing_log_dirs = {p.name for p in log_base_dir.glob("log_*")} if log_base_dir.exists() else set()
+ log_base_dir = Path(os.getenv("MASSGEN_LOG_BASE_DIR", ".massgen/massgen_logs"))
+ run_scope = f"plan_subprocess_{datetime.now().strftime('%Y%m%d_%H%M%S_%f')}"
+ scoped_log_base = log_base_dir / run_scope
+ env = os.environ.copy()
+ env["MASSGEN_LOG_BASE_DIR"] = str(scoped_log_base)
- result = subprocess.run(cmd)
+ result = subprocess.run(cmd, env=env)
if result.returncode != 0:
raise RuntimeError(f"Planning subprocess failed with exit code {result.returncode}")
- if not log_base_dir.exists():
+ if not scoped_log_base.exists():
raise RuntimeError("Planning completed but no log directory base was found")
-
- created_logs = [p for p in log_base_dir.glob("log_*") if p.name not in existing_log_dirs]
- if created_logs:
- log_root = max(created_logs, key=lambda p: p.stat().st_mtime)
- else:
- all_logs = list(log_base_dir.glob("log_*"))
- if not all_logs:
- raise RuntimeError("Planning completed but no log session directory was found")
- log_root = max(all_logs, key=lambda p: p.stat().st_mtime)
+ all_logs = sorted(scoped_log_base.glob("log_*"), key=lambda p: p.stat().st_mtime)
+ if not all_logs:
+ raise RuntimeError("Planning completed but no log session directory was found")
+ log_root = all_logs[-1]
🧰 Tools
🪛 Ruff (0.15.2)
[error] 8755-8755: subprocess call: check for execution of untrusted input
(S603)
[warning] 8757-8757: Abstract raise to an inner function
(TRY301)
[warning] 8757-8757: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 8760-8760: Abstract raise to an inner function
(TRY301)
[warning] 8760-8760: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 8768-8768: Abstract raise to an inner function
(TRY301)
[warning] 8768-8768: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/cli.py` around lines 8750 - 8770, Interactive planning log discovery
is race-prone: when use_interactive_planning_subprocess is true the code scans
log_base_dir for the newest log_* and may pick another process's directory; to
fix, create a unique per-run log base and ensure the subprocess writes into it
(e.g., generate a unique directory name before calling subprocess.run(cmd), set
that path as log_base_dir for the child via environment or an argument, and pass
that same path into the post-run discovery logic) — update references in this
block (use_interactive_planning_subprocess, log_base_dir, existing_log_dirs,
subprocess.run(cmd), created_logs, log_root) so the subprocess is guaranteed to
produce logs under the dedicated directory and then pick the newest log only
within that dedicated base.
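The dedicated-directory idea can be exercised without spawning MassGen itself; the `MASSGEN_LOG_BASE_DIR` variable matches the review above, while `scoped_env_for_run` and `newest_log` are hypothetical helpers:

```python
import os
from datetime import datetime
from pathlib import Path


def scoped_env_for_run(base: Path) -> tuple[dict, Path]:
    """Return a child env whose MASSGEN_LOG_BASE_DIR is unique to this run."""
    scope = base / f"plan_subprocess_{datetime.now().strftime('%Y%m%d_%H%M%S_%f')}"
    env = os.environ.copy()
    env["MASSGEN_LOG_BASE_DIR"] = str(scope)
    return env, scope


def newest_log(scope: Path) -> Path:
    """Pick the newest log session, but only within this run's scoped base."""
    logs = sorted(scope.glob("log_*"), key=lambda p: p.stat().st_mtime)
    if not logs:
        raise RuntimeError("no log session directory found")
    return logs[-1]
```

Because discovery is confined to the scoped base, a concurrently running sibling process can never be picked up by mistake.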
```python
try:
    process.wait(timeout=3)
except subprocess.TimeoutExpired:
    logger.warning(f"[SubagentMCP] Force killing {subagent_id}")
    process.kill()
except Exception as e:
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Verify active process type in manager
echo "=== Checking _active_processes type declaration ==="
rg -n "_active_processes" massgen/mcp_tools/subagent/_subagent_mcp_server.py -A 2 | head -20
echo "=== Checking create_subprocess_exec usage ==="
rg -n "create_subprocess_exec" massgen/mcp_tools/subagent/_subagent_mcp_server.py -B 2 -A 2
# 2) Verify timeout-style wait usage in cleanup
echo "=== Checking process.wait(timeout=...) usage ==="
rg -n "process\.wait\(timeout=" massgen/mcp_tools/subagent/_subagent_mcp_server.py -B 3 -A 3
# 3) Show asyncio Process.wait signature
echo "=== asyncio.subprocess.Process.wait signature ==="
python3 <<'PY'
import asyncio
import inspect
# Get the signature
sig = inspect.signature(asyncio.subprocess.Process.wait)
print(f"asyncio.subprocess.Process.wait signature: {sig}")
# Show the docstring to confirm it's async-only
print(f"\nDocstring: {asyncio.subprocess.Process.wait.__doc__}")
PY
Repository: massgen/MassGen
Length of output: 469
🏁 Script executed:
# Check the full context around lines 1224-1229
echo "=== Full context of lines 1220-1235 ==="
sed -n '1220,1235p' massgen/mcp_tools/subagent/_subagent_mcp_server.py
Repository: massgen/MassGen
Length of output: 707
🏁 Script executed:
# Verify whether subprocess module is imported
echo "=== Checking imports ==="
head -50 massgen/mcp_tools/subagent/_subagent_mcp_server.py | grep -E "^import|^from"
Repository: massgen/MassGen
Length of output: 422
🏁 Script executed:
# Search in manager.py for _active_processes definition
echo "=== Searching manager.py for _active_processes ==="
rg -n "_active_processes" massgen/subagent/manager.py -B 1 -A 1 | head -30
# Also check the full file path
find . -name "manager.py" -path "*/subagent/*" 2>/dev/null
Repository: massgen/MassGen
Length of output: 1399
🏁 Script executed:
# Get asyncio Process API details
python3 <<'PY'
import asyncio
import inspect
# Check the wait method
wait_method = asyncio.subprocess.Process.wait
print("asyncio.subprocess.Process.wait is a coroutine?", inspect.iscoroutinefunction(wait_method))
print("Signature:", inspect.signature(wait_method))
# Check subprocess.Process for comparison
import subprocess
print("\nsubprocess.Process.wait signature:", inspect.signature(subprocess.Process.wait))
print("subprocess.Process.wait has timeout parameter?", 'timeout' in str(inspect.signature(subprocess.Process.wait)))
# Show whether timeout is supported
sig = inspect.signature(subprocess.Process.wait)
for param_name, param in sig.parameters.items():
if 'timeout' in param_name.lower():
print(f"Found timeout param: {param_name}")
PY
Repository: massgen/MassGen
Length of output: 283
🏁 Script executed:
# Find where processes are created/stored in manager
echo "=== Looking for process creation in manager.py ==="
rg -n "asyncio.subprocess\|subprocess.Popen\|create_subprocess" massgen/subagent/manager.py | head -20
Repository: massgen/MassGen
Length of output: 109
🏁 Script executed:
# Verify the exact error that would occur
python3 <<'PY'
import asyncio
import subprocess
# Create a dummy async process-like object to test
class DummyProcess:
returncode = None
# Try calling wait with timeout on asyncio.subprocess.Process signature
try:
# Simulate what happens when calling wait(timeout=3) on asyncio.subprocess.Process
import inspect
sig = inspect.signature(asyncio.subprocess.Process.wait)
# Call with timeout parameter
bound = sig.bind_partial(None, timeout=3)
except TypeError as e:
print(f"Error when calling with timeout: {e}")
# Confirm subprocess.TimeoutExpired is from subprocess module
print(f"subprocess.TimeoutExpired exists: {hasattr(subprocess, 'TimeoutExpired')}")
print(f"asyncio.subprocess.Process.wait parameters: {list(inspect.signature(asyncio.subprocess.Process.wait).parameters.keys())}")
PY
Repository: massgen/MassGen
Length of output: 228
process.wait(timeout=3) is invalid for asyncio.subprocess.Process.
_active_processes contains asyncio.subprocess.Process objects, but line 1225 attempts to call wait(timeout=3), which does not support the timeout parameter. This raises TypeError immediately, not subprocess.TimeoutExpired, causing the exception handler to fail and skip the intended force-kill path.
Fix by checking returncode directly in this synchronous atexit context:
Minimal safe fix
- try:
- process.wait(timeout=3)
- except subprocess.TimeoutExpired:
- logger.warning(f"[SubagentMCP] Force killing {subagent_id}")
- process.kill()
+ if process.returncode is None:
+ logger.warning(f"[SubagentMCP] Force killing {subagent_id}")
+ process.kill()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if process.returncode is None:
    logger.warning(f"[SubagentMCP] Force killing {subagent_id}")
    process.kill()
except Exception as e:
```
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 1229-1229: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/mcp_tools/subagent/_subagent_mcp_server.py` around lines 1224 - 1229,
The code is calling process.wait(timeout=3) on asyncio.subprocess.Process
objects in _active_processes (in the SubagentMCP shutdown/atexit path), which
doesn't support a timeout and raises TypeError; replace that logic with a
synchronous check of process.returncode and if it's still None then log and call
process.kill() (use the existing variables process and subagent_id), avoiding
awaiting coroutines or using subprocess.TimeoutExpired; this ensures the
intended force-kill path runs in the synchronous atexit context.
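The API mismatch is verifiable from the standard library alone; the sketch below inspects `asyncio.subprocess.Process.wait` and shows the synchronous returncode check, with `FakeProc` standing in for a real process object:

```python
import asyncio
import inspect

# asyncio's Process.wait() is a coroutine and takes no timeout parameter,
# so process.wait(timeout=3) raises TypeError, never subprocess.TimeoutExpired.
params = list(inspect.signature(asyncio.subprocess.Process.wait).parameters)


class FakeProc:
    """Stand-in for asyncio.subprocess.Process in a sync atexit handler."""

    def __init__(self, returncode):
        self.returncode = returncode
        self.killed = False

    def kill(self):
        self.killed = True


def force_kill_if_running(proc) -> bool:
    # Synchronous check: returncode is None while the child is still alive.
    if proc.returncode is None:
        proc.kill()
        return True
    return False
```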
```diff
 if result.get("verdict") == iterate_action:
     result = dict(result)
     result["explanation"] = (
-        result.get("explanation", "") + " You now have your improvement plan. Implement the changes identified above, "
-        "verify them (screenshots, file checks, confirming changes landed correctly), "
-        "then call the `new_answer` **workflow tool** to submit your completed work. "
-        "Do not call `submit_checklist` again — verifying your own implementation "
-        "after making changes is expected and correct, but `submit_checklist` is "
-        "used once per answer."
+        result.get("explanation", "") + " NEXT: Call `propose_improvements` with specific improvements for each "
+        "failing criterion. Then implement your plan and call `new_answer`."
     )
```
Gate propose_improvements guidance to accepted checklist submissions only.
At Line 1170, the NEXT: Call propose_improvements... instruction is appended for every iterate verdict, including invalid/incomplete submissions. That conflicts with the resubmission path and can misdirect the agent.
💡 Proposed fix
- if result.get("verdict") == iterate_action:
+ if result.get("verdict") == iterate_action and not submission_has_validation_error:
result = dict(result)
result["explanation"] = (
result.get("explanation", "") + " NEXT: Call `propose_improvements` with specific improvements for each " "failing criterion. Then implement your plan and call `new_answer`."
    )

```python
self._register_injected_answer_updates(agent_id, list(selected_answers.keys()))
self._refresh_checklist_state_for_agent(agent_id)
self.agent_states[agent_id].restart_pending = self._has_unseen_answer_updates(agent_id)

_inj_emitter = get_event_emitter()
if _inj_emitter:
    _inj_emitter.emit_injection_received(
        agent_id=agent_id,
        source_agents=list(selected_answers.keys()),
        injection_type="mid_stream",
    )
logger.info(
    "[Orchestrator] MCP hook: injecting %d peer answer(s) for %s",
    len(selected_answers),
    agent_id,
)

self.coordination_tracker.track_agent_action(
    agent_id,
    ActionType.UPDATE_INJECTED,
    f"Mid-stream (MCP hook): {len(selected_answers)} answer(s)",
)
self.coordination_tracker.update_agent_context_with_new_answers(
    agent_id,
    list(selected_answers.keys()),
)
```
Refresh checklist state after context-label updates in this MCP hook path.
Line 7025 refreshes checklist state before Line 7047 updates context labels, so available_agent_labels can be stale during this turn.
💡 Proposed fix
self._register_injected_answer_updates(agent_id, list(selected_answers.keys()))
- self._refresh_checklist_state_for_agent(agent_id)
self.agent_states[agent_id].restart_pending = self._has_unseen_answer_updates(agent_id)
@@
self.coordination_tracker.update_agent_context_with_new_answers(
agent_id,
list(selected_answers.keys()),
)
+ self._refresh_checklist_state_for_agent(agent_id)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
self._register_injected_answer_updates(agent_id, list(selected_answers.keys()))
self.agent_states[agent_id].restart_pending = self._has_unseen_answer_updates(agent_id)

logger.info(
    "[Orchestrator] MCP hook: injecting %d peer answer(s) for %s",
    len(selected_answers),
    agent_id,
)

_inj_emitter = get_event_emitter()
if _inj_emitter:
    _inj_emitter.emit_injection_received(
        agent_id=agent_id,
        source_agents=list(selected_answers.keys()),
        injection_type="mid_stream",
    )

self.coordination_tracker.track_agent_action(
    agent_id,
    ActionType.UPDATE_INJECTED,
    f"Mid-stream (MCP hook): {len(selected_answers)} answer(s)",
)
self.coordination_tracker.update_agent_context_with_new_answers(
    agent_id,
    list(selected_answers.keys()),
)
self._refresh_checklist_state_for_agent(agent_id)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/orchestrator.py` around lines 7024 - 7051, The checklist is refreshed
too early in the MCP hook path: _refresh_checklist_state_for_agent(agent_id) is
called before
coordination_tracker.update_agent_context_with_new_answers(agent_id, ...),
causing available_agent_labels to be stale; move or add the refresh so it runs
after update_agent_context_with_new_answers (or re-run it immediately after) and
ensure agent_states[agent_id].restart_pending =
self._has_unseen_answer_updates(agent_id) is computed after the refresh/update
to reflect the new context labels.
```diff
 def _save_registry(self, data: dict[str, list[dict[str, Any]]]) -> None:
-    """Save registry to disk."""
+    """Save registry to disk (no lock — callers must hold a lock themselves)."""
```
Locking contract is incomplete for one write path.
Line 110 says callers must hold a lock before _save_registry, but delete_session still performs an unlocked read-modify-write and calls _save_registry directly (Line 337). Under concurrent register/complete/delete flows, updates can be lost.
🔧 Suggested fix (route delete through `_update_registry`)
def delete_session(self, session_id: str) -> bool:
@@
- registry = self._load_registry()
-
- initial_count = len(registry["sessions"])
- registry["sessions"] = [s for s in registry["sessions"] if s["session_id"] != session_id]
-
- if len(registry["sessions"]) < initial_count:
- self._save_registry(registry)
- logger.info(f"Deleted session from registry: {session_id}")
- return True
-
- return False
+ deleted = False
+
+ def _do_delete(registry: dict[str, list[dict[str, Any]]]) -> dict[str, list[dict[str, Any]]]:
+ nonlocal deleted
+ initial_count = len(registry["sessions"])
+ registry["sessions"] = [s for s in registry["sessions"] if s["session_id"] != session_id]
+ deleted = len(registry["sessions"]) < initial_count
+ if deleted:
+ logger.info(f"Deleted session from registry: {session_id}")
+ return registry
+
+ self._update_registry(_do_delete)
+ return deleted📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _save_registry(self, data: dict[str, list[dict[str, Any]]]) -> None:
    """Save registry to disk (no lock — callers must hold a lock themselves)."""


def delete_session(self, session_id: str) -> bool:
    deleted = False

    def _do_delete(registry: dict[str, list[dict[str, Any]]]) -> dict[str, list[dict[str, Any]]]:
        nonlocal deleted
        initial_count = len(registry["sessions"])
        registry["sessions"] = [s for s in registry["sessions"] if s["session_id"] != session_id]
        deleted = len(registry["sessions"]) < initial_count
        if deleted:
            logger.info(f"Deleted session from registry: {session_id}")
        return registry

    self._update_registry(_do_delete)
    return deleted
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/session/_registry.py` around lines 109 - 110, delete_session performs
an unlocked read-modify-write then calls _save_registry directly, breaking the
locking contract of _save_registry; modify delete_session to stop touching the
file/registry directly and instead route the deletion through the existing
_update_registry helper (or wrap the read-modify-write+_save_registry call with
the same lock logic used by _update_registry) so that the registry modification
and the call to _save_registry happen while holding the lock; reference the
_save_registry, delete_session, and _update_registry symbols when making the
change and remove the direct unlocked call to _save_registry in delete_session.
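A minimal in-memory sketch of the locked-updater pattern the prompt describes, with `threading.Lock` standing in for whatever locking the real `_update_registry` uses (`MiniRegistry` is illustrative, not the project's class):

```python
import threading
from typing import Any, Callable


class MiniRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[str, list[dict[str, Any]]] = {"sessions": []}

    def _update(self, fn: Callable[[dict], dict]) -> None:
        # Every read-modify-write goes through one lock, so concurrent
        # register/complete/delete calls cannot lose each other's updates.
        with self._lock:
            self._data = fn(self._data)

    def register(self, session_id: str) -> None:
        self._update(lambda reg: {"sessions": reg["sessions"] + [{"session_id": session_id}]})

    def delete(self, session_id: str) -> bool:
        deleted = False

        def _do(reg: dict) -> dict:
            nonlocal deleted
            kept = [s for s in reg["sessions"] if s["session_id"] != session_id]
            deleted = len(kept) < len(reg["sessions"])
            return {"sessions": kept}

        self._update(_do)
        return deleted
```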
```python
# Race condition guard: subprocess may have finished but parent status
# hasn't been updated yet. Check for answer.txt as a completion signal.
answer_file = workspace / "answer.txt"
if answer_file.exists():
    msg = f"Subagent '{subagent_id}' has already completed" " (answer.txt found in workspace)"
    logger.warning(f"[SubagentManager] Cannot send message: {msg}")
    return (False, msg)
```
Avoid false “already completed” rejections for active continuations.
Line 207 currently treats any existing answer.txt as terminal. In continuation flows that reuse the same workspace, a stale answer.txt from an earlier run can cause runtime messages to be rejected even while the subagent is actively running.
💡 Suggested guard refinement
- answer_file = workspace / "answer.txt"
- if answer_file.exists():
+ answer_file = workspace / "answer.txt"
+ has_active_process = any(
+ proc.returncode is None
+ and (
+ proc_id == subagent_id
+ or proc_id.startswith(f"{subagent_id}_cont_")
+ )
+ for proc_id, proc in self._active_processes.items()
+ )
+ if answer_file.exists() and not has_active_process:
msg = f"Subagent '{subagent_id}' has already completed" " (answer.txt found in workspace)"
logger.warning(f"[SubagentManager] Cannot send message: {msg}")
return (False, msg)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@massgen/subagent/manager.py` around lines 204 - 210, The current guard treats
any existing answer.txt as terminal and can incorrectly reject messages for
active continuations; change the check to only treat answer.txt as completion
when the subagent is not currently running — i.e., replace the unconditional
answer_file.exists() branch with a combined check like if answer_file.exists()
and not self._is_subagent_running(subagent_id) (or check membership in whatever
running-subagent registry you maintain), then log and return the rejection;
reference symbols: answer_file, workspace, subagent_id, and the
SubagentManager's running-subagent check method/structure (implement
_is_subagent_running if it doesn't exist).
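A reduced sketch of that refined guard, with an `is_running` callback standing in for the manager's process-liveness check (the helper and its signature are hypothetical):

```python
from pathlib import Path
from typing import Callable


def can_send_message(workspace: Path, subagent_id: str, is_running: Callable[[str], bool]) -> tuple[bool, str]:
    """Reject only when answer.txt exists AND no process for the subagent is alive."""
    answer_file = workspace / "answer.txt"
    if answer_file.exists() and not is_running(subagent_id):
        return (False, f"Subagent '{subagent_id}' has already completed (answer.txt found in workspace)")
    return (True, "ok")
```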
PR Title Format
Your PR title must follow the format:
`<type>: <brief description>`

Valid types:
- `fix:` - Bug fixes
- `feat:` - New features
- `breaking:` - Breaking changes
- `docs:` - Documentation updates
- `refactor:` - Code refactoring
- `test:` - Test additions/modifications
- `chore:` - Maintenance tasks
- `perf:` - Performance improvements
- `style:` - Code style changes
- `ci:` - CI/CD configuration changes

Examples:
- `fix: resolve memory leak in data processing`
- `feat: add export to CSV functionality`
- `breaking: change API response format`
- `docs: update installation guide`

Description
Brief description of the changes in this PR
Type of change
- (`fix:`) - Non-breaking change which fixes an issue
- (`feat:`) - Non-breaking change which adds functionality
- (`breaking:`) - Fix or feature that would cause existing functionality to not work as expected
- (`docs:`) - Documentation updates
- (`refactor:`) - Code changes that neither fix a bug nor add a feature
- (`test:`) - Adding missing tests or correcting existing tests
- (`chore:`) - Maintenance tasks, dependency updates, etc.
- (`perf:`) - Code changes that improve performance
- (`style:`) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)
- (`ci:`) - Changes to CI/CD configuration files and scripts

Checklist
Pre-commit status
How to Test
Add test method for this PR.
Test CLI Command
Write down the test bash command. If there are prerequisites, please emphasize them.
Expected Results
Description/screenshots of expected results.
Additional context
Add any other context about the PR here.
Summary by CodeRabbit
New Features
Documentation
UI/UX