Description
Traces are recorded per request and can produce a huge amount of data. Different mechanisms exist to reduce the load while keeping the relevant data. Tail-based sampling requires all spans of a trace to be processed in the same collector instance, which makes it a complex problem that might be better solved in the routing or backend layer. A head-based sampling mechanism, in contrast, can be offered simply as part of the telemetry module.
Istio already provides a mechanism to set a probabilistic sampling decision on the incoming request, which is then propagated to all involved components. However, if some components do not implement sampling properly and ignore upstream sampling decisions, or if Istio is not used at all, this approach may not be sufficient.
An alternative is to use the https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/probabilisticsamplerprocessor, which runs in the collector after the spans have been emitted but does not need all spans of a trace to be present to make a decision. The decision is derived solely from the trace_id and is applied consistently to all processed spans that share that trace_id.
The effect is the same as with the Istio approach, but it can be configured as part of the telemetry setup and works for all traces, including those in which Istio is not involved.
The feature is also available for logs, where sampling is based on unique log IDs.
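To illustrate why a decision derived only from the trace_id applies consistently to every span of a trace, here is a minimal sketch. It does not reproduce the processor's actual hashing algorithm; the function names `keep_span` and `sample`, the use of SHA-256, and the 10,000-bucket resolution are assumptions made for illustration only.

```python
import hashlib


def keep_span(trace_id: str, sampling_percentage: float, hash_seed: int = 0) -> bool:
    """Return a keep/drop decision that depends only on the trace_id.

    Illustrative only: the hash function and bucket count are assumptions,
    not the algorithm used by the probabilistic sampler processor.
    """
    seed = hash_seed.to_bytes(4, "big")
    digest = hashlib.sha256(seed + bytes.fromhex(trace_id)).digest()
    # Map the hash onto 10,000 buckets, i.e. a resolution of 0.01 percent.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < sampling_percentage * 100


def sample(spans, sampling_percentage):
    """Filter spans one by one; no knowledge of the other spans is needed."""
    return [s for s in spans if keep_span(s["trace_id"], sampling_percentage)]


# Hypothetical spans: the first two belong to the same trace.
spans = [
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "name": "GET /orders"},
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "name": "SELECT orders"},
    {"trace_id": "0af7651916cd43dd8448eb211c80319c", "name": "GET /health"},
]
kept = sample(spans, 50)
```

Because the decision is a pure function of the trace_id and the seed, the two spans of the first trace are always kept or dropped together, even if they are processed at different times or by different collector instances that share the same seed.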
Goal: Support probabilistic sampling for traces and, optionally, for logs
Criteria:
- There is a concept in place that defines which attributes are configurable
- There is a decision on whether it is meaningful to offer the feature for logs as well
- There is a way to configure a sampling percentage per pipeline, using the default mode
- If a percentage is configured, it gets applied to all spans of a trace