Replies: 3 comments 2 replies
Hi @sauravGit! I'd recommend opening an issue in the brand-new https://github.com/open-telemetry/semantic-conventions-genai repository.
RFC v0.3 update: the upstream issue is now open as semantic-conventions-genai #101, consolidating open issues #14, #23, #76, and #93 into a single proposal for SIG review. After reviewing alignment with the existing OTel GenAI spec, RFC v0.3 makes three key fixes:
Full RFC v0.3: https://github.com/sauravGit/open-llm-observability/blob/main/RFC.md
I welcome any continued feedback from the SIG on the open questions (cost unit, error instrument type, TTFT streaming attribute).
For an OpenTelemetry-facing convention, I would keep the distinction between spans, metrics, logs/events, and derived evaluation scores very explicit. A production LLM trace usually needs at least three layers:
The convention should avoid putting high-cardinality or sensitive values directly into metric attributes. Prompt text, completion text, retrieved chunks, and user IDs should be trace/log payloads with redaction controls, not metric dimensions.
I would also include lineage attributes early: prompt version, model route, dataset/eval version, tool name/version, retrieval collection version, and guardrail policy version. Those are the fields teams need when a regression appears and they have to identify whether the model, prompt, retrieval corpus, or policy changed.
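As a rough illustration of the split described above, the sketch below routes one set of attributes into low-cardinality metric dimensions and a redacted span/log payload. It uses only the Python standard library rather than the OpenTelemetry SDK. The key names other than gen_ai.system, gen_ai.request.model, and gen_ai.operation.name (for example app.prompt.version and user.id), the allow-lists, and the hashing-based redaction are all assumptions made for the example, not part of any published convention.

```python
import hashlib

# Hypothetical allow-list: low-cardinality keys considered safe as metric dimensions.
METRIC_SAFE_KEYS = {"gen_ai.system", "gen_ai.request.model", "gen_ai.operation.name"}

# Hypothetical keys whose values are sensitive or high-cardinality.
SENSITIVE_KEYS = {"prompt.text", "completion.text", "retrieval.chunks", "user.id"}


def redact(value: str) -> str:
    """Replace a sensitive value with a short, stable hash (one possible redaction control)."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    return f"redacted:{digest}"


def split_attributes(
    attrs: dict[str, str], redact_payload: bool = True
) -> tuple[dict[str, str], dict[str, str]]:
    """Return (metric_attrs, span_attrs) for one LLM call.

    metric_attrs keeps only allow-listed keys, so sensitive or
    high-cardinality values never become metric dimensions.
    span_attrs keeps everything, including lineage fields, with
    sensitive values redacted unless redact_payload is False.
    """
    metric_attrs = {k: v for k, v in attrs.items() if k in METRIC_SAFE_KEYS}
    span_attrs = {
        k: redact(v) if (redact_payload and k in SENSITIVE_KEYS) else v
        for k, v in attrs.items()
    }
    return metric_attrs, span_attrs


if __name__ == "__main__":
    call_attrs = {
        "gen_ai.system": "example-provider",
        "gen_ai.request.model": "example-model",
        "gen_ai.operation.name": "chat",
        "app.prompt.version": "v7",                 # hypothetical lineage attribute
        "app.guardrail.policy.version": "2024-05",  # hypothetical lineage attribute
        "user.id": "user-12345",
        "prompt.text": "Summarize this contract.",
    }
    metric_attrs, span_attrs = split_attributes(call_attrs)
    print(metric_attrs)
    print(span_attrs)
```

Keeping the lineage fields on the span only is a choice made for this sketch; whether any of them are low-cardinality enough to also serve as metric dimensions is a question for the convention itself.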
Hi OpenTelemetry community,
I'm working on a vendor-neutral, OTEL-compatible semantic convention and SDK layer for standardizing LLM observability across providers, frameworks, and platforms — and I'd genuinely value feedback from GenAI SIG maintainers and contributors on the core metric set and OTEL mapping.
The problem: Every LLM observability tool today defines its own metric names, KPI sets, and attribute schemas. Teams instrumenting production LLM apps have to re-instrument every time they change providers or backends. There is no shared language.
What I'm proposing: A canonical gen_ai.* metric schema, built on top of the existing OpenTelemetry GenAI semantic conventions, that covers:
gen_ai.system, gen_ai.request.model, gen_ai.operation.name
Why I'm bringing it here: The OTel GenAI SIG has done excellent work on span conventions and is the natural home for a standardized metric layer. I want to make sure this proposal complements (not duplicates or conflicts with) the existing gen-ai-metrics spec in development.
Specific questions for this community:
gen_ai.client.token.usage histogram vs counter approach)?
semantic-conventions as an Issue/PR?
Links:
This is v0.1 and explicitly designed to evolve based on community input. I'm not trying to build another vendor tool — the goal is a shared language for LLM telemetry that any OTEL-compatible backend can consume.
Happy to be redirected to the right SIG channel, mailing list, or issue tracker if this is the wrong venue. Thank you for any feedback.
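On the histogram-versus-counter question raised above for gen_ai.client.token.usage, the following stdlib-only sketch may help frame the trade-off. It is not the OpenTelemetry SDK: TokenCounter and TokenHistogram are hypothetical classes, and the bucket bounds are illustrative values chosen for the example. It shows that an explicit-bucket histogram carries a sum and a count alongside its buckets, so a counter-style total can be derived from it, while a counter alone cannot recover the per-request distribution.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class TokenCounter:
    """Counter-style instrument: keeps only a running total."""
    total: int = 0

    def add(self, tokens: int) -> None:
        self.total += tokens


@dataclass
class TokenHistogram:
    """Explicit-bucket histogram: keeps bucket counts plus sum and count."""
    bounds: tuple[int, ...] = (1, 4, 16, 64, 256, 1024, 4096)  # illustrative bounds
    bucket_counts: list[int] = field(default_factory=list)
    count: int = 0
    total: int = 0

    def __post_init__(self) -> None:
        # One bucket per upper bound, plus a final overflow bucket.
        self.bucket_counts = [0] * (len(self.bounds) + 1)

    def record(self, tokens: int) -> None:
        # Bucket i holds values in (bounds[i-1], bounds[i]]; the last bucket is overflow.
        self.bucket_counts[bisect_left(self.bounds, tokens)] += 1
        self.count += 1
        self.total += tokens


if __name__ == "__main__":
    counter, histogram = TokenCounter(), TokenHistogram()
    for tokens_used in (12, 300, 5000):  # made-up per-request token counts
        counter.add(tokens_used)
        histogram.record(tokens_used)

    # The histogram's sum matches the counter's total ...
    print(counter.total, histogram.total)
    # ... and it additionally exposes the request count and the distribution.
    print(histogram.count, histogram.bucket_counts)
```

The cost of the histogram is more data per time series (one value per bucket), which is part of why attribute cardinality matters more for histograms than for counters.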