
Automatic context compaction and context usage visibility for BI Copilot agent #188

@NaveenSanjaya

Description

Problem

During extended Ballerina AI Copilot agent sessions (multi-step code generation, iterative bug fixing, or large-scale refactoring), the accumulated conversation history, tool call records, and tool results can grow large enough to exhaust the model's context window (200K tokens for Claude models).

When this limit is reached:

  • The agent fails to produce relevant responses
  • The user loses the entire session state and must start over from scratch
  • Previously completed work must be repeated

There is also no indication of how much context has been consumed or how close the session is to the limit, so the user cannot take preventive action before the session breaks.

Proposed Solution

Two complementary capabilities are needed:

  • Automatic context compaction: When the conversation history approaches the context limit, the agent should automatically summarize and compress older history in the background, preserving the task intent and recent work so that the session can continue without user intervention or data loss.

  • Context usage indicator: A real-time widget in the AI chat input area showing how much of the context window is currently in use (as a percentage and token count), with a breakdown by category (system instructions, tool definitions, conversation messages, tool results).

Alternatives

No response

Version

No response
