feat: add opentelemetry tracing and metrics #202
tqwewe wants to merge 16 commits into lunatic-solutions:main
tracer: Arc::new(BoxedTracer::new(Box::new(NoopTracer::new()))),
tracer_context: Arc::new(Context::new()),
process_context: Context::new(),
meter_provider: GlobalMeterProvider::new(NoopMeterProvider::new()),
logger: Arc::new(
    env_logger::Builder::new()
        .filter_level(log::LevelFilter::Off)
        .build(),
),
I haven't fully understood new_dist_state, so I've made all logging for it a no-op.
Is there any advice on what I should do here? Should logging to the terminal print on the control server?
So dist state only has references to control client and "node". Node implements communication with other nodes (for example, it receives a spawn command from another node). Control client talks to the control server.
We can leave it out for now.
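For readers unfamiliar with the placeholder approach discussed above, here is a minimal std-only sketch of the same idea: hold the logger behind a trait object and start with an implementation that discards everything, so call sites need no special casing until a real sink is wired in. `LogSink`, `NoopSink`, `MemorySink` and `ExampleState` are hypothetical names for illustration only; they are not part of lunatic, `log`, `env_logger` or `opentelemetry`.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical logging interface (an assumption, not a real crate API).
pub trait LogSink: Send + Sync {
    fn log(&self, message: &str);
}

/// Discards every record, like the no-op placeholders in the diff above.
pub struct NoopSink;

impl LogSink for NoopSink {
    fn log(&self, _message: &str) {}
}

/// Collects records in memory; stands in for a real exporter.
#[derive(Default)]
pub struct MemorySink {
    records: Mutex<Vec<String>>,
}

impl MemorySink {
    /// Returns a copy of everything logged so far.
    pub fn records(&self) -> Vec<String> {
        self.records.lock().unwrap().clone()
    }
}

impl LogSink for MemorySink {
    fn log(&self, message: &str) {
        self.records.lock().unwrap().push(message.to_string());
    }
}

/// Hypothetical state that owns the sink behind a shared trait object.
pub struct ExampleState {
    logger: Arc<dyn LogSink>,
}

impl ExampleState {
    /// Starts with a no-op sink so callers never have to check for `None`.
    pub fn new() -> Self {
        Self { logger: Arc::new(NoopSink) }
    }

    /// Uses a real sink instead; call sites stay unchanged.
    pub fn with_logger(logger: Arc<dyn LogSink>) -> Self {
        Self { logger }
    }

    /// Example call site: logs through whichever sink is installed.
    pub fn on_spawn(&self, process_id: u64) {
        self.logger.log(&format!("spawned process {process_id}"));
    }
}
```

With `ExampleState::new()` the `on_spawn` call is silently dropped; with `ExampleState::with_logger(Arc::new(MemorySink::default()))` the same call is recorded. Swapping the sink later is then a one-line change at construction time.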
bincode = "1.3"
dashmap = "5.4"
log = "0.4"
opentelemetry = { version = "0.19", git = "https://github.com/tqwewe/opentelemetry-rust", branch = "cow", features = ["metrics"] }
Unfortunately this PR uses some changes in opentelemetry-rust which are not yet published.
The two PRs are:
open-telemetry/opentelemetry-rust#1009
open-telemetry/opentelemetry-rust#1018
I saw the PRs got merged. Do they plan to release a newer version soon?
Looking at their previous releases, it seems they don't push releases very frequently :(
I've just opened a discussion on it; hopefully we can get more insight there:
open-telemetry/opentelemetry-rust#1031
withtypes left a comment
Overall it looks good! All metrics include some attributes like node id, environment id, and process id. They are not implicitly set, right? We have to set them on each call? We also want to set them in the VM, not the guest, so that they can be trusted.
Actually, the process_id and environment_id are not attached to every trace/metric, but only to the parent spans. I'll work on injecting this data into every span/log/metric.
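One way to picture the injection described above is a host-side helper that merges whatever attributes the guest supplies with identity values only the VM knows, writing the host values last so they always win. This is a rough std-only sketch, not the code in this PR: `ProcessIdentity`, `with_trusted_attributes` and the attribute key strings are assumptions made for illustration, and a real implementation would use the opentelemetry attribute types rather than a string map.

```rust
use std::collections::BTreeMap;

/// Identity known to the host (VM); never read from guest input.
/// Hypothetical type for illustration.
#[derive(Clone, Copy, Debug)]
pub struct ProcessIdentity {
    pub node_id: u64,
    pub environment_id: u64,
    pub process_id: u64,
}

/// Merges guest-supplied attributes with host-owned ones.
/// Host values are inserted last, so a guest cannot overwrite them
/// by sending an attribute with the same key.
pub fn with_trusted_attributes(
    identity: ProcessIdentity,
    guest_attributes: &[(&str, &str)],
) -> BTreeMap<String, String> {
    let mut attributes = BTreeMap::new();

    // Untrusted values first.
    for (key, value) in guest_attributes {
        attributes.insert(key.to_string(), value.to_string());
    }

    // Trusted values last; the key names here are assumed, not taken from the PR.
    attributes.insert("node_id".to_string(), identity.node_id.to_string());
    attributes.insert(
        "environment_id".to_string(),
        identity.environment_id.to_string(),
    );
    attributes.insert("process_id".to_string(), identity.process_id.to_string());

    attributes
}
```

Calling the helper on every span, log and metric call inside the host functions would give each record the same identity attributes, regardless of what the guest passes in.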
Todo:
- target_info. (Turns out it's not possible)
- Add push and take functions for resource sharing. Though this should not "move" the resource; it would probably be better to allow multiple processes to share metric resources.
  - I don't think this is needed in this PR for now.
  - Spans cannot be shared across processes, as they are in a tree structure, and sharing them would make it possible to drop a parent span before its child, which wouldn't make sense.
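The ordering constraint on spans mentioned in the list above can be made concrete with a small sketch. The registry below tracks open spans and their parents, and refuses to end a span that still has open children. `SpanTree` and `EndError` are hypothetical names, and this is not how opentelemetry or this PR track spans; it only illustrates why a parent must outlive its children, which is hard to guarantee once spans are handed to other processes.

```rust
use std::collections::HashMap;

/// Reasons why ending a span can fail in this sketch.
#[derive(Debug, PartialEq)]
pub enum EndError {
    UnknownSpan,
    HasOpenChildren,
}

/// Hypothetical registry of open spans, keyed by span id.
#[derive(Default)]
pub struct SpanTree {
    next_id: u64,
    /// Maps each open span to its parent (`None` for a root span).
    open: HashMap<u64, Option<u64>>,
}

impl SpanTree {
    /// Opens a new span under `parent` and returns its id.
    pub fn start(&mut self, parent: Option<u64>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.open.insert(id, parent);
        id
    }

    /// Ends a span, but only if none of its children are still open.
    pub fn end(&mut self, id: u64) -> Result<(), EndError> {
        if !self.open.contains_key(&id) {
            return Err(EndError::UnknownSpan);
        }
        if self.open.values().any(|parent| *parent == Some(id)) {
            return Err(EndError::HasOpenChildren);
        }
        self.open.remove(&id);
        Ok(())
    }
}
```

Within a single process this order falls out of normal scoping, since a child span is created and dropped inside its parent's lifetime. If another process held the child, nothing would stop the parent's owner from ending it first, which is the `HasOpenChildren` case here.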
Based on the spawn process benchmark, this PR does not seem to affect the performance of spawning processes.
Related PRs:
open-telemetry/opentelemetry-rust#1009
open-telemetry/opentelemetry-rust#1018
Screenshots for examples/metrics.rs
https://github.com/lunatic-solutions/lunatic-rs/blob/4681561eb78d1164bc1b2eef7c436bcab36622ab/examples/metrics.rs#L21-L78
[Screenshots: Terminal, Jaeger, Prometheus]