Inference observability. Part of the MIST stack.
Install:

    go get github.com/greynewell/tokentrace

Run the server:

    tokentrace serve --addr :8700 --max-spans 100000

Report spans from Go:

    reporter := tokentrace.NewReporter("myapp", "http://localhost:8700")
    reporter.Report(ctx, span)

Query the HTTP API:

    curl localhost:8700/traces                    # list trace IDs
    curl 'localhost:8700/traces/recent?limit=10'  # recent spans
    curl localhost:8700/traces/{trace-id}         # by trace
    curl localhost:8700/stats                     # aggregated metrics

Configuration:

    cfg := tokentrace.Config{
        Addr:          ":8700",
        MaxSpans:      100_000,
        AlertCooldown: 5 * time.Minute,
        AlertRules: []tokentrace.AlertRule{
            {Metric: "error_rate", Op: ">", Threshold: 0.05, Level: "warning"},
            {Metric: "latency_p99", Op: ">", Threshold: 5000, Level: "critical"},
        },
    }

| Metric | Type |
|---|---|
| total_spans | count |
| error_count / error_rate | count / ratio |
| latency_p50_ms / latency_p99_ms | milliseconds |
| total_tokens_in / total_tokens_out | count |
| total_cost_usd | USD |