
[FEATURE] Formalize Changedoc + Memory → Skill Pipeline in Coordination Modes #1018

@ncrispino



Context

MassGen already has the building blocks for a changedoc+memory-to-skill pipeline, but they're loosely connected:

  1. Changedocs (tasks/changedoc.md) — structured decision journals agents produce during coordination. Every DEC-NNN entry captures choice, rationale, alternatives rejected, and implementation references. Created in real-time, inherited across agents, and saved to logs.
  2. Session memory (memory/short_term/ and memory/long_term/) — two-tier filesystem memory that agents write during runs. Short-term memories (verification replays, quick notes, task context, user prefs) auto-inject every turn. Long-term memories (skill effectiveness analysis, approach patterns, post-mortems) persist for on-demand use. Together these capture the operational knowledge agents accumulate — what worked, what didn't, user preferences, and pattern observations.
  3. Evolving skills (tasks/evolving_skill/SKILL.md) — workflow plans agents create during runs. The system prompt says "Use tasks/changedoc.md as the canonical decision log for your evolving skill" (system_prompt_sections.py:5552), but this connection is a one-liner hint — agents don't systematically distill changedoc decisions or memory content into skill structure.
  4. Analysis mode (user profile) — reads logs post-run and distills them into reusable skills in .agent/skills/. Has lifecycle modes (create_or_update, create_new, consolidate). But it works from raw logs, not from the structured changedoc + memory data sitting right there.
  5. Previous session skill loading: scan_previous_session_skills() can find evolving skills from past runs and make them available. But changedoc and memory content don't flow into this.
  6. Skill organization mode — merges overlapping skills, creates SKILL_REGISTRY.md. Operates on the skill side only, unaware of changedocs or memory.

Problem

The changedoc + memory → skill connection is informal. In practice:

  • Changedocs have rich structured data (decisions, alternatives, implementations) that evolving skills and analysis mode don't systematically leverage
  • Session memory captures operational knowledge (what worked, patterns observed, user preferences, verification outcomes) that should directly feed skill "Tips/Gotchas/Learnings" sections — but doesn't
  • Analysis mode "user profile" reads raw logs to create skills but doesn't use the already-parsed changedoc structure or memory files
  • Evolving skills are told to use changedoc as their "decision log" but don't actually ingest/transform DEC entries into skill workflow steps, and don't pull from memory tiers
  • Across many runs, recurring decision patterns in changedocs and recurring observations in memory (e.g., "chose CSS-only animation over JS", "approach X consistently outperforms Y") could inform skill creation, but nothing aggregates these patterns

Proposal

Make the changedoc + memory → skill pipeline explicit and structured across the relevant coordination modes:

1. Evolving Skill Section: Structured Changedoc + Memory Ingestion

Instead of just "use changedoc as your decision log," the evolving skill system prompt should instruct agents to:

  • Map each DEC-NNN to a workflow step in the skill
  • Convert "Alternatives considered" into the skill's "Tips/Gotchas" or anti-patterns section
  • Convert "Implementation" references into the skill's "Tools to Use" or file reference sections
  • Use the Deliberation Trail to identify which approaches survived refinement (= proven techniques worth encoding)
  • Pull from memory/short_term/verification_latest.md to capture what verification strategies worked
  • Pull from memory/long_term/approach_patterns.md and skill_effectiveness.md to encode proven patterns
  • Pull from memory/short_term/user_prefs.md to capture user-specific preferences into the skill
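The DEC-to-skill mapping above could be sketched roughly as follows. This is a minimal illustration, not MassGen code: the changedoc entry layout (`### DEC-NNN: title` followed by `**Field:** value` lines), the `Decision` class, and the functions `parse_decisions` and `to_skill_sections` are all assumptions made for the example. The real format of tasks/changedoc.md may differ.

```python
import re
from dataclasses import dataclass, field

# ASSUMPTION: entries look roughly like this; the real changedoc layout may differ.
SAMPLE_CHANGEDOC = """\
### DEC-001: Animation approach
**Choice:** CSS-only animation
**Rationale:** No JS bundle cost
**Alternatives considered:** JS animation library
**Implementation:** styles/hero.css
"""

HEADER_RE = re.compile(r"^#+\s*(DEC-\d+):\s*(.+)$")
FIELD_RE = re.compile(r"^\*\*(.+?):\*\*\s*(.*)$")


@dataclass
class Decision:
    """Hypothetical in-memory form of one DEC-NNN entry."""
    dec_id: str
    title: str
    fields: dict[str, str] = field(default_factory=dict)


def parse_decisions(text: str) -> list[Decision]:
    """Collect DEC entries and their '**Field:** value' lines, in document order."""
    decisions: list[Decision] = []
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            decisions.append(Decision(header.group(1), header.group(2).strip()))
            continue
        fld = FIELD_RE.match(line)
        if fld and decisions:
            # Field names are lowercased so lookups are case-insensitive.
            decisions[-1].fields[fld.group(1).strip().lower()] = fld.group(2).strip()
    return decisions


def to_skill_sections(decisions: list[Decision]) -> dict[str, list[str]]:
    """Map each decision onto the skill sections named in the proposal."""
    sections: dict[str, list[str]] = {"Workflow": [], "Tips/Gotchas": [], "Tools to Use": []}
    for step, dec in enumerate(decisions, start=1):
        choice = dec.fields.get("choice", dec.title)
        sections["Workflow"].append(f"{step}. {choice} ({dec.dec_id})")
        alternatives = dec.fields.get("alternatives considered")
        if alternatives:
            sections["Tips/Gotchas"].append(f"Avoid: {alternatives} (rejected in {dec.dec_id})")
        implementation = dec.fields.get("implementation")
        if implementation:
            sections["Tools to Use"].append(f"{implementation} ({dec.dec_id})")
    return sections
```

Keeping the DEC id on every generated line preserves a link back to the decision, which may help when the skill is later merged or audited.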

2. Analysis Mode: Changedoc + Memory-Aware Skill Distillation

The user-profile analysis mode (get_log_analysis_prompt_prefix with profile="user") should:

  • Explicitly look for changedoc.md files in the log directory, not just raw logs
  • Explicitly look for memory/short_term/ and memory/long_term/ files in agent workspaces
  • Use the changedoc's structured decisions as primary input for skill workflow steps
  • Use memory content as primary input for skill learnings, tips, and preferences
  • Map: DEC entries → skill workflow, alternatives → anti-patterns, memory observations → learnings, verification replays → verification section

3. Cross-Session Changedoc + Memory Aggregation

New capability: scan changedocs and memory files across multiple log sessions to identify recurring patterns:

  • Similar to how scan_previous_session_skills() finds evolving skills across logs
  • A scan_previous_session_changedocs() could extract DEC entries across runs
  • A scan_previous_session_memories() could extract long-term memory observations across runs
  • When multiple runs in the same domain produce similar decisions or memory observations, that's a strong signal for skill creation
  • This could feed into analysis mode or skill organization mode

4. Skill Lifecycle Integration

Connect changedoc + memory provenance to the skill lifecycle:

  • Skills created from changedocs + memory should include changedoc_origin and memory_sources metadata
  • When create_or_update matches an existing skill, new changedoc decisions and memory observations should be merged into the skill's workflow/learnings
  • Consolidation mode should consider changedoc and memory overlap when merging skills
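A minimal sketch of the merge step in create_or_update, assuming a skill is held as a plain dict with list-valued `workflow`, `learnings`, `changedoc_origin`, and `memory_sources` keys. The two provenance field names come from this proposal. Their list-of-strings shape, the dict representation, and the `merge_into_skill` helper are hypothetical.

```python
from typing import Any

# Keys whose values are merged as ordered, de-duplicated lists.
MERGED_KEYS = ("workflow", "learnings", "changedoc_origin", "memory_sources")


def _merge_unique(existing: list[str], incoming: list[str]) -> list[str]:
    """Append new items while preserving order and dropping exact duplicates."""
    return list(dict.fromkeys([*existing, *incoming]))


def merge_into_skill(skill: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of skill with the update's content and provenance merged in.

    ASSUMPTION: provenance values are strings such as a log path plus DEC id.
    """
    merged = dict(skill)
    for key in MERGED_KEYS:
        merged[key] = _merge_unique(skill.get(key, []), update.get(key, []))
    return merged
```

For example, merging an update whose `changedoc_origin` lists a new session into an existing skill would extend the provenance list without touching the skill's name or any other field. Consolidation mode could then compare `changedoc_origin` lists to spot skills built from the same decisions.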

Key Files

| File | Role |
| --- | --- |
| massgen/system_prompt_sections.py:4237-4400 | Changedoc creation/inheritance prompts |
| massgen/system_prompt_sections.py:2460-2600 | Memory filesystem instructions (short-term/long-term tiers, examples) |
| massgen/system_prompt_sections.py:5423-5580 | Evolving skill section (where changedoc hint lives) |
| massgen/system_prompt_sections.py:4540-4542 | Memory write instructions in learning capture |
| massgen/cli.py:1548-1652 | Analysis mode user profile (skill distillation) |
| massgen/cli.py:1655-1734 | Skill organization mode |
| massgen/filesystem_manager/skills_manager.py | Skill scanning, previous session skills |
| massgen/changedoc.py | Changedoc reader utility |
| massgen/system_message_builder.py:1048-1080 | Memory tier loading (short_term + long_term) |
| massgen/orchestrator.py:6532-6546 | Changedoc extraction from workspace |
| docs/source/user_guide/tools/skills_lifecycle_and_consolidation.rst | Lifecycle docs |

Acceptance Criteria

  • Evolving skill prompt systematically maps DEC entries → skill sections (not just "use as decision log")
  • Evolving skill prompt systematically pulls from memory tiers → skill learnings/tips/preferences
  • Analysis mode user profile reads changedoc.md + memory files from logs and uses them for skill generation
  • Cross-session changedoc scanning function exists (parallel to scan_previous_session_skills)
  • Cross-session memory scanning function exists for recurring observations
  • Skills generated from changedocs + memory include provenance metadata (changedoc_origin, memory_sources)
  • Tests covering changedoc + memory → skill transformation
  • At least 2 real runs demonstrate the pipeline producing higher-quality skills than the current approach
