Remove run_id from GenAI Utils #228

Merged
zhirafovod merged 1 commit into main from hybim-447_remove-run-id on Mar 18, 2026

@keith-decker

Remove run_id from GenAI Utils

Summary

This PR removes the run_id and parent_run_id fields from the GenAI types and eliminates the associated span/entity registries from TelemetryHandler. OpenTelemetry's native trace context propagation handles parent-child relationships, making these custom tracking mechanisms redundant.

Changes

Core Types (types.py)

  • Removed run_id: UUID field from GenAI base class
  • Removed parent_run_id: Optional[UUID] field from GenAI base class
  • Removed uuid4 import (no longer needed)

TelemetryHandler (handler.py)

  • Removed _span_registry: dict[str, Span], which was used to look up spans by run_id
  • Removed _entity_registry: dict[str, GenAI], which was used to look up entities by run_id
  • Removed all registry population/cleanup code from start/stop/fail methods

Emitters

  • Updated SpanEmitter, MetricsEmitter, SplunkEmitter, TestEmitter to remove run_id from attribute dictionaries
  • Removed run_id from debug output in debug.py

Instrumentation Packages

  • Updated LangChain callback handler to remove run_id/parent_run_id mapping logic
  • Updated LlamaIndex callback handler and workflow instrumentation
  • Updated OpenAI Agents span processor tests

Examples & Tests

  • Removed run_id references from all example scripts
  • Updated test assertions to not check for run_id

Why

  1. Redundant: OpenTelemetry's context propagation already provides parent-child linking via each span's trace_id and span_id
  2. Memory overhead: The registries retained references to ended spans/entities
  3. Complexity: Extra bookkeeping that complicated the instrumentation code
  4. Not in spec: run_id is not part of OpenTelemetry semantic conventions for GenAI

@keith-decker requested review from a team as code owners on March 10, 2026 17:41
@keith-decker requested a review from wrisa on March 12, 2026 17:08

@wrisa left a comment

LGTM. Sometime later we need to move agent_id out of GenAI invocation type.

  agent_name_value = context_agent.agent_name or context_agent.name
  inv.agent_name = _safe_str(agent_name_value)
- inv.agent_id = str(context_agent.run_id)
+ inv.agent_id = _agent_span_id(context_agent)

I think agent_id should come from the framework and should not be the span ID. Let's discuss with the team?

"run_id": str(invocation.run_id),
"parent_run_id": str(invocation.parent_run_id)
if invocation.parent_run_id
"trace_id": f"{invocation.trace_id:032x}"

worker.py still references run_id. Check _process_evaluation, line 182.

Also in evaluation.py.

  if isinstance(agent, AgentInvocation):
      try:
-         if self._agent_context_stack:
+         if self._agent_context_stack and agent.agent_id is not None:

Wouldn't agent_id be None for the outermost agent, since no parent sets it? In that case it's pushed onto the stack in start (https://github.com/signalfx/splunk-otel-python-contrib/pull/228/changes#diff-ceaf683887146bf95f37a7ebb888621d695e881d6e69059e4c7b3a9bc0120f9bR972) with a None value but never popped here because of the not-None check.

  if isinstance(agent, AgentInvocation):
      try:
-         if self._agent_context_stack:
+         if self._agent_context_stack and agent.span_id is not None:

fail_agent checks agent.span_id instead of agent_id. In start, agent_id is pushed onto the stack (https://github.com/signalfx/splunk-otel-python-contrib/pull/228/changes#diff-ceaf683887146bf95f37a7ebb888621d695e881d6e69059e4c7b3a9bc0120f9bR972), so comparing against the hex span_id will never match.
Both this and the mismatch in stop_agent will cause stack accumulation.
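
The stack concerns raised in this review can be reproduced in a small model. This is a sketch under stated assumptions, not the handler's actual code: `Agent`, `Handler`, and every method name below are hypothetical, and the only behavior taken from the review is that start pushes agent_id while the stop and fail paths guard the pop on a different condition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Agent:
    agent_id: Optional[str] = None  # None when the framework supplies no id
    span_id: Optional[str] = None


class Handler:
    def __init__(self) -> None:
        self.stack: List[Optional[str]] = []

    def start_agent(self, agent: Agent) -> None:
        # Pushes unconditionally, so a None agent_id lands on the stack.
        self.stack.append(agent.agent_id)

    def stop_agent_guarded(self, agent: Agent) -> None:
        # Pop is skipped when agent_id is None: the pushed None is never removed.
        if self.stack and agent.agent_id is not None:
            if self.stack[-1] == agent.agent_id:
                self.stack.pop()

    def fail_agent_mismatched(self, agent: Agent) -> None:
        # Compares span_id against a stack of agent_id values: never matches.
        if self.stack and agent.span_id is not None:
            if self.stack[-1] == agent.span_id:
                self.stack.pop()

    def stop_agent_symmetric(self, agent: Agent) -> None:
        # One possible fix: pop on the same key that start pushed, None included.
        if self.stack and self.stack[-1] == agent.agent_id:
            self.stack.pop()


outer = Agent(agent_id=None, span_id="00f067aa0ba902b7")
named = Agent(agent_id="planner", span_id="53995c3f42cd8ad8")

leaky = Handler()
leaky.start_agent(outer)
leaky.stop_agent_guarded(outer)
leaked_after_stop = len(leaky.stack)

leaky.start_agent(named)
leaky.fail_agent_mismatched(named)
leaked_after_fail = len(leaky.stack)

fixed = Handler()
for agent in (outer, named):
    fixed.start_agent(agent)
    fixed.stop_agent_symmetric(agent)
```

In this model the leaky handler keeps one entry per affected agent, while the symmetric variant ends with an empty stack. An alternative with the same effect would be to skip the push in start whenever agent_id is None.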

Isn't _span_registry removed?

if processor._workflow is not None:
    # Parent run_id should be set (either to agent or workflow)
    assert (
        getattr(tool_state.invocation, "parent_run_id", None) is not None

This still references parent_run_id. Can you please check the whole codebase for references to run_id, parent_run_id, and _span_registry?
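
One way to run such a check is a short script. This is only a sketch: `find_references` and `PATTERN` are hypothetical names, and the identifier list is taken from the comment above. The word boundaries keep run_id from matching inside unrelated names such as runner_id.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple

# Identifiers this PR is meant to remove; \b avoids partial-word matches.
PATTERN = re.compile(r"\b(parent_run_id|run_id|_span_registry)\b")


def find_references(root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, stripped line) for each leftover reference."""
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                yield path, lineno, line.strip()
```

Calling `list(find_references(Path(".")))` from the repository root would list every remaining hit. Matches inside comments and strings are reported too, which is useful here because stale comments were also flagged in this review.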


  with self._pending_lock:
-     self._pending[run_id] = invocation
+     self._pending[span_id] = invocation

Will span_id ever be None? If it can, then it will be overridden by multiple

* lint updates, test fixes
* remove auto generation of agent_id
* resolve rebase conflicts
* restore context clarification and tests
* clean up dead code in llamaindex, align agent_id references
@keith-decker force-pushed the hybim-447_remove-run-id branch from 1143763 to 82a9511 on March 18, 2026 19:51
@zhirafovod merged commit a769522 into main on Mar 18, 2026
14 checks passed
@zhirafovod deleted the hybim-447_remove-run-id branch on March 18, 2026 19:59
@github-actions bot locked and limited conversation to collaborators on Mar 18, 2026
4 participants