
feat(metrics): add bound instruments behind experimental feature flag#3392

Open
bryantbiggs wants to merge 20 commits into open-telemetry:main from bryantbiggs:bound-instruments

Conversation

@bryantbiggs
Contributor

@bryantbiggs bryantbiggs commented Feb 25, 2026

Refs #1374, #2328

Summary

Adds BoundCounter and BoundHistogram to the public API behind the experimental_metrics_bound_instruments feature flag. These types cache the resolved aggregator reference for a fixed attribute set, allowing subsequent measurements to bypass per-call sort, dedup, hash, and HashMap lookup entirely.

let counter = meter.u64_counter("requests").build();
let bound = counter.bind(&[KeyValue::new("method", "GET")]);
bound.add(1); // ~1.9ns — no attribute lookup

Architecture

TrackerEntry

pub(crate) struct TrackerEntry<A: Aggregator> {
    pub(crate) aggregator: A,
    pub(crate) has_been_updated: AtomicBool,
    pub(crate) bound_count: AtomicUsize,
}

has_been_updated tracks whether any measurement occurred since the last collect. bound_count tracks how many live BoundCounter/BoundHistogram handles reference this entry.

Delta collection (collect_and_reset)

In-place iteration under a read lock (no write lock in steady state):

  1. Export only entries where has_been_updated was true, then reset the flag
  2. Entries with bound_count > 0 are never evicted — they persist across idle cycles but produce no export when there are no updates
  3. Stale unbound entries (has_been_updated == false && bound_count == 0) are evicted under a write lock with a TOCTOU re-check

Cardinality overflow handling

ValueMap::bind() returns Option<Arc<TrackerEntry>>, yielding None when at the cardinality limit. Bound handle types use a Direct/Fallback enum:

  • Direct: normal case — dedicated tracker, single-digit ns, no map lookup per call
  • Fallback: at overflow — delegates every call to the unbound Measure::call() path

This gives correct overflow attribution (same behavior as unbound at overflow) with automatic recovery when delta collect frees cardinality space. An otel_debug! log fires when fallback occurs.

Aggregator types that don't support direct binding (ExponentialHistogram, LastValue, PrecomputedSum) also use the fallback path gracefully instead of panicking.

Feature flag

All public API, traits, noop impls, and internal plumbing are gated behind experimental_metrics_bound_instruments. The testing feature enables it by default.
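Enabling the flag would look roughly like this in Cargo.toml (the version numbers are placeholders, not a statement of which release includes this PR):

```toml
[dependencies]
# versions are placeholders; pin to a release that includes this feature
opentelemetry = { version = "*", features = ["experimental_metrics_bound_instruments"] }
opentelemetry_sdk = { version = "*", features = ["experimental_metrics_bound_instruments"] }
```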

Benchmark Results

Apple M4 Max, 16 cores (12 performance + 4 efficiency), macOS 15.4:

| Benchmark | Time | vs Unbound |
|---|---|---|
| Counter Unbound | 53.2 ns | |
| Counter Bound | 1.87 ns | ~28x faster |
| Histogram Unbound | 58.6 ns | |
| Histogram Bound | 6.57 ns | ~8.9x faster |

EC2 results across 9 instance configurations (c7i/c7a/c7g × large/xlarge/4xlarge) show 5-28x counter speedup and 5-9x histogram speedup depending on architecture. Full results in the expandable sections below.

EC2 Counter Results

| Instance | Unbound | Bound | Speedup |
|---|---|---|---|
| c7i.large | 86.2 ns | 7.47 ns | 11.5x |
| c7i.xlarge | 88.9 ns | 7.62 ns | 11.7x |
| c7i.4xlarge | 90.1 ns | 7.65 ns | 11.8x |
| c7a.large | 77.4 ns | 2.71 ns | 28.6x |
| c7a.xlarge | 75.0 ns | 2.71 ns | 27.7x |
| c7a.4xlarge | 75.0 ns | 2.71 ns | 27.7x |
| c7g.large | 111.8 ns | 6.87 ns | 16.3x |
| c7g.xlarge | 111.4 ns | 6.84 ns | 16.3x |
| c7g.4xlarge | 112.9 ns | 6.73 ns | 16.8x |

EC2 Histogram Results

| Instance | Unbound | Bound | Speedup |
|---|---|---|---|
| c7i.large | 97.1 ns | 17.1 ns | 5.7x |
| c7i.xlarge | 99.0 ns | 17.9 ns | 5.5x |
| c7i.4xlarge | 98.5 ns | 18.4 ns | 5.4x |
| c7a.large | 92.5 ns | 13.7 ns | 6.8x |
| c7a.xlarge | 92.4 ns | 13.7 ns | 6.7x |
| c7a.4xlarge | 92.5 ns | 13.7 ns | 6.8x |
| c7g.large | 133.2 ns | 27.7 ns | 4.8x |
| c7g.xlarge | 133.5 ns | 27.6 ns | 4.8x |
| c7g.4xlarge | 133.8 ns | 27.2 ns | 4.9x |
Previous experiments (EXP-002, EXP-003) and why bound instruments are the right approach

EXP-002: Micro-optimizations on existing architecture

  • Changes: Relaxed atomic ordering, stale entry eviction during delta collect
  • Result: <2% improvement — the HashMap bottleneck is untouched

EXP-003: Sharded HashMap with pre-hashing

  • Changes: 16 Mutex-guarded shards, pre-computed foldhash, PassthroughHasher
  • Result: +10-33% regression across all architectures. HashedAttributes::from_raw() on every measure() call is more expensive than the existing two-lookup pattern.

Key insight

Profiling shows 78% of CPU in HashMap key hashing (52%) and key comparison (26%). The only way to meaningfully improve measure() throughput is to skip the lookup entirely. Bound instruments achieve this by moving the cost to bind-time (once) rather than measure-time (every call).

Test Coverage

15 tests covering:

  • Cumulative and delta temporality for both Counter and Histogram
  • Bound + unbound sharing the same data point (same attribute set)
  • Idle delta cycles (no update = no export, but handle persists)
  • Cardinality overflow fallback to unbound path (Counter and Histogram)
  • Recovery after delta eviction frees cardinality space
  • Overflow fallback handles working across multiple delta cycles
  • Drop enabling eviction of stale entries
  • Multiple bound handles sharing the same tracker (ref-counting)
  • Binding with empty attributes

Impact on existing code

  • Zero regression on unbound measurement path
  • No changes to collect path semantics for unbound instruments
  • All existing tests pass
  • Additive public API behind feature flag — no breaking changes

Related

@bryantbiggs bryantbiggs force-pushed the bound-instruments branch 2 times, most recently from 9ce05fc to 1bb4efe Compare February 25, 2026 21:36
@codecov

codecov bot commented Feb 25, 2026

Codecov Report

❌ Patch coverage is 89.47368% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.7%. Comparing base (69ab1c4) to head (b7e9197).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| opentelemetry-sdk/src/metrics/mod.rs | 93.7% | 27 Missing ⚠️ |
| opentelemetry-sdk/src/metrics/internal/mod.rs | 90.8% | 11 Missing ⚠️ |
| ...pentelemetry-sdk/src/metrics/internal/aggregate.rs | 0.0% | 6 Missing ⚠️ |
| opentelemetry-sdk/src/metrics/noop.rs | 0.0% | 5 Missing ⚠️ |
| opentelemetry/src/metrics/noop.rs | 0.0% | 5 Missing ⚠️ |
| opentelemetry/src/metrics/instruments/histogram.rs | 60.0% | 4 Missing ⚠️ |
| ...-sdk/src/metrics/internal/exponential_histogram.rs | 62.5% | 3 Missing ⚠️ |
| ...entelemetry-sdk/src/metrics/internal/last_value.rs | 25.0% | 3 Missing ⚠️ |
| ...emetry-sdk/src/metrics/internal/precomputed_sum.rs | 40.0% | 3 Missing ⚠️ |
| opentelemetry/src/metrics/instruments/counter.rs | 66.6% | 3 Missing ⚠️ |
| ... and 1 more | | |
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #3392     +/-   ##
=======================================
+ Coverage   82.6%   82.7%   +0.1%     
=======================================
  Files        128     128             
  Lines      24645   25288    +643     
=======================================
+ Hits       20360   20928    +568     
- Misses      4285    4360     +75     

attrs,
tracker.aggregator.clone_and_reset(&self.config),
));
tracker.evicted.store(true, Ordering::Release);
Member

thinking out loud:
do we need to evict bound instruments after delta collect()? Can we define the behavior so that bound instruments remain forever, unless the user explicitly calls an unbind()?
If there are no new updates in a collect() cycle, then we need to not export anything, but the value map keeps the tracker.

Contributor Author

Addressed in 936993f — replaced the swap-and-drain collect_and_reset with in-place delta collection that supports bound instrument persistence.

What changed:

TrackerEntry now has two status fields instead of evicted:

  • has_been_updated: AtomicBool — set on every measure()/bound.add() call, checked and reset during collect. Only entries updated since last collect are exported. No updates in a cycle = no export, but the tracker stays in the map.
  • bound_count: AtomicUsize — incremented on bind(), decremented on Drop of the bound handle. Entries with bound_count > 0 are never evicted, even if they had no updates in a cycle.

Delta collect behavior:

  1. Iterate trackers under a read lock (no write lock needed in steady state)
  2. Export only entries where has_been_updated was true, then reset the flag
  3. Collect stale candidates: entries where has_been_updated=false AND bound_count=0 AND not overflow sentinel
  4. If stale candidates exist, acquire write lock, re-check has_been_updated under the lock (TOCTOU guard against concurrent measure()), then evict

Bound handles no longer check an evicted flag or have a fallback path — they always go direct to the aggregator. The Drop impl decrements bound_count so the entry becomes eligible for eviction after all handles are dropped.

The drain_and_reset path is preserved for async/Observable instruments (PrecomputedSum) that need true staleness detection via map clearing.

This design was informed by a previous experiment branch (feat/delta-collect-optimization, part of #2328) which validated the has_been_updated + in-place iteration pattern on EC2 benchmarks. The bound_count mechanism extends it to prevent eviction of bound entries.

self.count.fetch_add(1, Ordering::SeqCst);
new_tracker
} else {
// Over cardinality limit — bind to the overflow tracker
Member

this is tricky... when the user called bind(), we may have been over the limit, so it gets bound to overflow. But after a delta collect, unused points are reclaimed, yet the bound instrument still goes to overflow? Or does the delta collect set evicted to true, so it falls back to the slow path?

We may have to see whether bound instruments should be exempt from the cardinality limit (that is not good and would violate the spec too). Open for ideas.

Contributor Author

@bryantbiggs bryantbiggs Feb 26, 2026

Addressed in 7b531c4 with an overflow fallback to the unbound path.

Solution: bind() returns Option, with None at overflow

ValueMap::bind() now returns Option<Arc<TrackerEntry>>, yielding None when at/over the cardinality limit. The bound handle types (BoundSumHandle, BoundHistogramHandle) use a Direct/Fallback enum:

  • Direct: normal case — dedicated tracker, single-digit ns, no map lookup on each call
  • Fallback: at overflow — delegates every .call() to the unbound Measure::call() path

This gives us:

  1. Correct overflow attribution — same behavior as unbound measurements at overflow, no data merging between distinct bound handles sharing an overflow tracker
  2. Automatic recovery — when delta collect evicts stale entries and frees cardinality space, measure.call() creates a proper entry for those attrs on the next invocation
  3. No bound_count inflation on the overflow tracker (was a problem with the previous approach of binding directly to it)

Trade-off: a Fallback handle is no faster than unbound (full HashMap lookup each time). But if you're at the cardinality limit when calling bind(), you're already in degraded mode — the user still gets correct data, just not the fast-path benefit. The guidance remains: bind early during initialization before cardinality fills up.

Benchmarks confirm zero regression on the Direct hot path from the enum match:

| Benchmark | Before | After |
|---|---|---|
| Counter Bound | 1.81 ns | 1.83 ns |
| Histogram Bound | 6.57 ns | 6.57 ns |
| Counter Unbound | 51.40 ns | 51.75 ns |
| Histogram Unbound | 55.91 ns | 57.35 ns |

Added 4 new tests covering overflow + delta interaction scenarios.

@cijothomas
Member

cijothomas commented Feb 26, 2026

Looks very promising! Left some comments in code, and response to the open questions below. Will need to close some edge cases (cardinality cap, evict etc.). Also more test cases on this would give higher confidence!

Should BoundCounter/BoundHistogram auto-rebind after delta eviction, or is explicit bind() sufficient?
I left a comment in the code a bit ago; I think we should not evict the bound instrument after each collect(). It should be still kept bound. But don't export if no updates in a collect() cycle.

Should bound instruments be behind a feature flag initially?
Yes

Is there interest in bound UpDownCounter or Gauge support?
Not now. Just Counter alone is sufficient for proof of concept and to gather feedback. This PR already makes it happen for histogram, which is awesome.

Naming: bind() vs with_attributes() vs something else?

bind() sounds good. This was the original name for this idea when it was part of OTel spec in 2020, but got removed for scope control.

use opentelemetry::{metrics::MeterProvider as _, KeyValue};
use opentelemetry_sdk::metrics::{ManualReader, SdkMeterProvider, Temporality};

fn create_provider(temporality: Temporality) -> SdkMeterProvider {
Member

@bryantbiggs If you can include the bench results in code comment (similar to what we do in many other benches), that'll make it easy to get a quick feel of things. PR description has it, let's just copy it here. (do include hardware info if you are comfortable with that).

The numbers shared meets our expectation: single digit ns for counters!

There is contention still, but solving that is a separate effort. (Already solved in OpenTelemetry Arrow project, we can steal it when the time comes)

Contributor Author

@bryantbiggs bryantbiggs Feb 26, 2026

Added benchmark results as a comment block at the top of benches/bound_instruments.rs with Apple M4 Max hardware info and all benchmark numbers (Counter/Histogram unbound vs bound, plus multi-threaded bound counter at 2/4/8 threads).

Single-digit ns for bound counters as expected (~1.87 ns bound vs ~53 ns unbound = ~28x speedup). Histogram bound is ~6.6 ns vs ~59 ns unbound (~8.9x).

@bryantbiggs
Contributor Author

Benchmark Snapshot — Before Delta Collect Refactor

Baseline numbers on the current bound-instruments branch (swap-and-drain collect_and_reset with evicted flag) before porting the in-place delta collection design. This serves as the "before" watermark.

Hardware: Apple M4 Max (MacBook Pro), macOS 15.4

| Benchmark | Time |
|---|---|
| Counter_Unbound_Delta | 48.10 ns |
| Counter_Bound_Delta | 1.70 ns |
| Histogram_Unbound_Delta | 54.01 ns |
| Histogram_Bound_Delta | 9.90 ns |
| Counter_Bound_Multithread/2 | 20.97 µs |
| Counter_Bound_Multithread/4 | 36.03 µs |
| Counter_Bound_Multithread/8 | 68.19 µs |

Speedups: Counter 28.3x, Histogram 5.5x (bound vs unbound)

…rsistence

Replace swap-and-drain delta collection with in-place iteration using a
has_been_updated flag on TrackerEntry. This eliminates the evicted flag
and fallback path from bound instruments, and removes the second
trackers_for_collect HashMap entirely.

Key changes:

TrackerEntry now has two status fields:
- has_been_updated: AtomicBool — set on every measure()/bound.add() call,
  checked and reset during collect. Only entries updated since last collect
  are exported. This implements the "no updates = no export" semantic.
- bound_count: AtomicUsize — incremented on bind(), decremented on Drop
  of the bound handle. Entries with bound_count > 0 are never evicted
  from the map, even if they had no updates in a cycle.

collect_and_reset() (sync instruments, delta temporality):
- Iterates trackers under a read lock (no write lock needed in steady state)
- Exports only entries where has_been_updated was true, resets the flag
- Evicts stale unbound entries (has_been_updated=false, bound_count=0)
- Preserves bound entries and overflow sentinel indefinitely
- TOCTOU re-check under write lock prevents race with concurrent measure()

drain_and_reset() (async instruments like PrecomputedSum):
- Preserved for Observable instruments that need true staleness detection
- Uses std::mem::take instead of the old two-map swap pattern

Bound instrument handles (BoundSumHandle, BoundHistogramHandle):
- Always use the fast path — no eviction check, no fallback
- Set has_been_updated on every call so the tracker is exported at collect
- Decrement bound_count on Drop so the entry can be evicted after unbind

This design was informed by the delta-collect-optimization branch (EXP-002)
which validated the has_been_updated + in-place iteration pattern. The
bound_count mechanism extends it to prevent eviction of bound entries,
addressing the maintainer's feedback that bound instruments should persist
until explicitly unbound.

Part of open-telemetry#1374, open-telemetry#2328
@bryantbiggs
Contributor Author

Benchmark Snapshot — After Delta Collect Refactor (936993f)

Numbers after replacing swap-and-drain collect_and_reset with in-place delta collection using has_been_updated flag and bound_count persistence.

Hardware: Apple M4 Max (MacBook Pro), macOS 15.4

| Benchmark | Before | After | Change |
|---|---|---|---|
| Counter_Unbound_Delta | 48.10 ns | 51.40 ns | +6.9% (noise) |
| Counter_Bound_Delta | 1.70 ns | 1.81 ns | +6.5% (noise) |
| Histogram_Unbound_Delta | 54.01 ns | 55.91 ns | +3.5% (noise) |
| Histogram_Bound_Delta | 9.90 ns | 6.57 ns | -33.6% |
| Counter_Bound_Multithread/2 | 20.97 µs | 20.68 µs | -1.4% (noise) |
| Counter_Bound_Multithread/4 | 36.03 µs | 35.12 µs | -2.5% |
| Counter_Bound_Multithread/8 | 68.19 µs | 66.14 µs | -3.0% |

Key observations:

  • Histogram bound path improved 34% (9.90ns → 6.57ns) — eliminating the evicted flag check and fallback branch removes a conditional on the hot path
  • Counter bound and unbound paths show +3-7% which is run-to-run variance on a laptop (sub-nanosecond absolute differences). EC2 benchmarks would show this as noise.
  • Multithread bound counters show slight improvement from removing the evicted flag load

Speedups vs unbound (after refactor): Counter 28.4x, Histogram 8.5x (up from 5.5x before)

…unbound path

When bind() is called at/over the cardinality limit, return None from
ValueMap::bind() instead of binding to the shared overflow tracker.
BoundSumHandle and BoundHistogramHandle now use a Direct/Fallback enum:
- Direct: fast path with dedicated tracker (normal case)
- Fallback: delegates to unbound Measure::call() path, ensuring correct
  overflow attribution and automatic recovery when delta collect frees space

Also fixes pre-existing counter_aggregation_overflow_delta test failure
caused by collect_and_reset's two-phase eviction (stale entries need a
second collect cycle to be evicted, unlike the old drain_and_reset).

Adds 4 new tests for overflow and delta interaction scenarios.
@bryantbiggs
Contributor Author

Benchmark Snapshot — After Overflow Fallback (7b531c4)

Numbers after adding a Direct/Fallback enum to bound handles for cardinality overflow handling. bind() now returns Option; None at overflow falls back to the unbound Measure::call() path.

Hardware: Apple M4 Max (MacBook Pro), macOS 15.4

| Benchmark | Before (936993f) | After (7b531c4) | Change |
|---|---|---|---|
| Counter_Unbound_Delta | 51.40 ns | 51.75 ns | +0.7% (noise) |
| Counter_Bound_Delta | 1.81 ns | 1.83 ns | +1.1% (noise) |
| Histogram_Unbound_Delta | 55.91 ns | 57.35 ns | +2.6% (noise) |
| Histogram_Bound_Delta | 6.57 ns | 6.57 ns | 0.0% |
| Counter_Bound_Multithread/2 | 20.68 µs | 21.05 µs | +1.8% (noise) |
| Counter_Bound_Multithread/4 | 35.12 µs | 34.18 µs | -2.7% (noise) |
| Counter_Bound_Multithread/8 | 66.14 µs | 63.63 µs | -3.8% |

Key observations:

  • Zero regression on the Direct hot path from the enum match — compiler optimizes the single-variant branch prediction perfectly
  • All differences are within run-to-run noise (sub-nanosecond absolute)
  • The Fallback path (not benchmarked here) performs identically to unbound since it delegates to Measure::call()

Speedups vs unbound (unchanged): Counter 28.3x, Histogram 8.7x

Add `experimental_metrics_bound_instruments` feature flag to both
opentelemetry and opentelemetry-sdk crates. All bound instrument
public API (Counter::bind, Histogram::bind, BoundCounter,
BoundHistogram, BoundSyncInstrument) and internal plumbing
(BoundMeasure, Measure::bind, ValueMap::bind) are gated behind
this flag.

Follows existing naming convention (experimental_metrics_*).
The testing feature enables it by default for test coverage.
… coverage

Add benchmark results as code comments in benches/bound_instruments.rs
with Apple M4 Max hardware info, following the project convention.

Add 7 new tests for bound instruments covering:
- Histogram delta temporality with reset verification
- Histogram bound+unbound sharing same data point
- Histogram idle delta cycle (no update = no export)
- Histogram overflow fallback to unbound path
- Counter drop enabling eviction of stale entries
- Multiple bound handles sharing same tracker (ref-counting)
- Counter with empty attributes via bind()
@bryantbiggs
Contributor Author

Known limitation: bind(&[]) vs counter.add(_, &[]) produce separate data points

bind(&[]) goes through the hashmap path via ValueMap::bind_sorted(vec![]), while counter.add(_, &[]) uses the dedicated no_attribute_tracker fast path. This means they don't share aggregation — a user mixing both patterns on empty attributes would see two data points instead of one.

This is an edge case (binding with empty attributes is unusual — the whole point of bind() is to pre-resolve a fixed attribute set), and fixing it would require either routing bind(&[]) to the no_attribute_tracker (with bound_count support added to it) or routing empty-attr measure() calls through the hashmap. Either change touches the hot path for all instruments, not just bound ones.

Should we: (a) document this as a known limitation for now, (b) fix it in this PR, or (c) track as a follow-up issue?

…ported aggregators

Aggregator types that don't support direct binding (ExponentialHistogram,
LastValue, PrecomputedSum) now gracefully fall back to the unbound
Measure::call() path instead of panicking at runtime. This prevents
crashes when bind() is called on instruments with view-remapped
aggregations (e.g., a Counter mapped to Base2ExponentialHistogram).

Also adds otel_debug! log when bind() hits cardinality overflow, and
#[must_use] on BoundCounter/BoundHistogram to catch accidental drops.
@bryantbiggs bryantbiggs changed the title feat: add BoundCounter and BoundHistogram for pre-bound attribute sets feat(metrics): add bound instruments behind experimental feature flag Feb 26, 2026
@bryantbiggs bryantbiggs force-pushed the bound-instruments branch 2 times, most recently from 578e22a to 235ab3d Compare February 26, 2026 16:16
…d instruments

- Use stable rustfmt import ordering for cfg-gated imports
- Gate otel_debug import behind experimental_metrics_bound_instruments feature
- Fix clippy iter_kv_map warning in trace_asserter (pre-existing)
- Fix clippy collapsible-if and redundant-closure warnings
- Update CHANGELOG.md entries for both opentelemetry and opentelemetry-sdk
@bryantbiggs bryantbiggs marked this pull request as ready for review February 26, 2026 17:03
@bryantbiggs bryantbiggs requested a review from a team as a code owner February 26, 2026 17:03
"All overflow-bound measurements should accumulate in overflow bucket"
);

// Cycle 2: bound handles still work after delta collect
Member

This is the only part I am not so sure of. If bind() occurred while we were in overflow, we will always remain in overflow, even after space has been cleaned up. This should not be an issue in practice, as bound instruments have a very specialized use case, and the bind typically occurs at the very start of the application, so it is unlikely they'd go into overflow.

Leaving this comment for us to revisit in future, not a blocker for this PR.

Some options to discuss in future:

  1. bind() can return err/none if overflowing, so users know upfront that they are not using bind() properly. This return-value approach has not been done in the OTel world so far, but bound is special and might call for an exception.
  2. See if we can internally adjust bindings from overflow to normal during a delta collect.
  3. Exclude bind() from cardinality capping - users leveraging bind() know that no matter what happens elsewhere, the bound ones always report properly and never overflow.
  4. Other ideas.

Contributor Author

@bryantbiggs bryantbiggs Feb 26, 2026

Captured this as a TODO comment directly in the code at the overflow None return in bind_sorted() (b7e9197), linking back to this discussion thread so it doesn't get lost if/when the PR merges.

let mut stale_entries: Vec<Arc<TrackerEntry<A>>> = Vec::new();

{
let Ok(trackers) = self.trackers.read() else {
Member

the previous design of maintaining 2 trackers allowed us to hold a lock only for a single-instruction swap. But now we need the read() lock held throughout the iteration over the entries, and then the write() lock throughout the stale-entry cleanup.

Given we are not aggressively clearing entries for delta in each collect, this might be less impactful than it appears. The typical pattern is to reuse the majority of trackers, so they don't get evicted, and hence the update/hot path won't need the write lock. New attribute sets (or ones that went unused and are making a comeback) would suffer slightly as they compete for the write lock with collect, and the time they wait is directly proportional to the overall count of trackers.

This may be a reasonable tradeoff to accept, as the previous situation was definitely bad due to force-clearing after every collect(), thereby forcing a write lock for every attribute set after every collect().

Need some more time to think through this fully, but feel free to comment if my observations are correct.

Contributor Author

Your observations are correct. Here's how the locking trade-off breaks down:

Old design (swap-and-drain): Write lock on trackers held only for mem::swap (pointer swap — near instant), then iteration happens on a second map. But every delta collect clears all entries, so the next measure() for every attribute set must take a write lock to re-insert — O(n) write lock acquisitions on the hot path per collection cycle.

New design (in-place iteration): Read lock held throughout iteration, then write lock only if there are stale entries to evict.

Steady state (the common case): Attribute sets are reused across cycles → zero stale entries → no write lock during collect at all. Both measure() and collect_and_reset hold only read locks and run concurrently. This is strictly better than the old design where every attr set had to re-acquire a write lock after every collect.

Edge case — new/returning attrs during collect: As you noted, a measure() that needs to insert a new entry competes for the write lock with the read lock held by collect. Wait time is proportional to total tracker count. But this only affects never-before-seen attribute sets arriving during the iteration window. Once inserted, they're back on the read-lock fast path.

Stale eviction write lock: Duration proportional to the number of stale entries (the retain() call), not total tracker count. In steady state this never fires.

Bound instruments: Completely unaffected — they hold Arc<TrackerEntry> directly, bypassing the map and all locking entirely.

Atomics safety during concurrent read-lock iteration: has_been_updated.swap(false) and the aggregator read/reset use atomics internally (AtomicI64/AtomicU64 for sums, Mutex for histograms). If a concurrent measure() sets has_been_updated back to true after the swap, that measurement is captured in the next cycle — not lost.

Net: the new design trades a brief read-lock hold during collect for eliminating O(n) write-lock acquisitions on the hot path every cycle. The only regression is the narrow window where brand-new attribute sets must wait for collect's read lock to release before inserting.

Member

Thanks for confirming! My conclusion is the same - this is a trade-off, and I think this is a reasonable balance.

Briefly chatted with @utpilla on this as well (who has most experience with all perf things we have in this area!).

TrackerEntry::has_been_updated now starts as false (no recording yet)
instead of true. The measure() paths explicitly set it to true after
the first update(). This prevents phantom exports if a tracker is
ever created without an immediate recording (e.g., via bind()).

Also adds a TODO capturing maintainer feedback about permanent
overflow-fallback for bound instruments created at cardinality limit.