Skip to content

[Bug] Unvalidated JSON Serialization in AgentStateModel Allows State Spoofing and Evaluation Poisoning #717

Description

@QiuYucheng2003

Describe the bug
In src/state.py, the AgentStateModel stores the agent's state in a state_stack_json string field without enforcing strict schema validation. The methods update_latest_state and add_to_current_state directly accept a dict and serialize it using json.dumps() before committing to the SQLite database.

Because there is no structural validation (e.g., via Pydantic or strict ORM fields), a highly capable or malicious Agent can output deeply nested, mutated JSON containing unauthorized key-value pairs (such as spoofed verification flags or evaluation metrics like {"task_verified": true}). These spoofed metrics are blindly saved into the database and treated as legitimate system metrics upon deserialization (json.loads()), allowing the agent to silently manipulate evaluation results or system states.

How To Reproduce
Steps to reproduce the behavior (example):

  1. Initialize a project and trigger an Agent execution.

  2. Simulate or prompt the Agent to output a state dictionary that includes arbitrary, unauthorized keys alongside standard keys (e.g., {"internal_monologue": "working...", "completed": true, "ground_truth_matched": 100}).

  3. The update_latest_state method in src/state.py receives this dictionary and directly appends it to the state stack.

  4. The unvalidated JSON is written to the SQLite database. Any external evaluator or UI reading AgentState.get_latest_state() will parse and trust these spoofed fields.

Expected behavior
The system should enforce strict schema validation before serializing the state dictionary to the database. The state dictionary should be validated against a strict data model (e.g., a Pydantic schema matching the structure in new_state()) to strip out any unexpected or malicious fields before calling json.dumps().

Screenshots and logs
Note: This vulnerability was identified via static source code security auditing. The flaw is a logical vulnerability in the data pipeline rather than a runtime crash, so there are no error logs. The issue resides in how src/state.py handles DB commits.

Configuration

  • OS: [All]
  • Python version: [All supported versions]
  • Node version: [N/A]
  • bun version: [N/A]
  • search engine: [N/A]
  • Model: [Any highly capable reasoning model]

Additional context
The vulnerability is primarily located in src/state.py within the update_latest_state and add_to_current_state functions. Transitioning from a raw state_stack_json: str field to proper SQLModel relational fields, or at least wrapping the incoming state: dict with a strict Pydantic model before serialization, will completely mitigate this attack vector.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions