Replace NanosSinceEpoch with UniqueTimestamp (HLC) in Bifrost Record.created_at #4516

## Context

In #4515, we identified that the vqueue inbox ordering breaks when multiple Bifrost records arrive within the same millisecond. The root cause is that `Record.created_at` is a `NanosSinceEpoch` which gets truncated to `MillisSinceEpoch` in the partition processor, then reconstructed into a `UniqueTimestamp` with logical clock = 0 via `from_unix_millis_unchecked()`. This makes same-millisecond entries indistinguishable in the vqueue inbox key ordering.
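The lossy path can be illustrated with a minimal sketch. The types below are simplified stand-ins written for this issue, not the real Restate definitions: the field layout of `UniqueTimestamp`, the `to_millis` helper, and the `reconstruct` function are all assumptions. Only the shape of the problem (truncate to milliseconds, then rebuild with logical clock 0) is taken from the description above.

```rust
// Simplified stand-ins for illustration only; the real types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct NanosSinceEpoch(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct MillisSinceEpoch(u64);

/// Assumed HLC layout: physical milliseconds plus a logical counter.
/// Field order matters: the derived `Ord` compares `physical_millis` first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct UniqueTimestamp {
    physical_millis: u64,
    logical: u32,
}

impl NanosSinceEpoch {
    /// Hypothetical helper: truncating conversion that drops sub-millisecond digits.
    fn to_millis(self) -> MillisSinceEpoch {
        MillisSinceEpoch(self.0 / 1_000_000)
    }
}

impl UniqueTimestamp {
    /// Stand-in mirroring the behaviour described above: logical clock is always 0.
    fn from_unix_millis_unchecked(millis: MillisSinceEpoch) -> Self {
        UniqueTimestamp { physical_millis: millis.0, logical: 0 }
    }
}

/// The lossy path a record's `created_at` takes in the partition processor.
fn reconstruct(created_at: NanosSinceEpoch) -> UniqueTimestamp {
    UniqueTimestamp::from_unix_millis_unchecked(created_at.to_millis())
}
```

Two records created 0.8 ms apart inside the same millisecond compare as distinct `NanosSinceEpoch` values, but `reconstruct` maps both to the same `UniqueTimestamp`, so any key built from it cannot order them.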

The immediate fix (#4515) adds a deterministic HLC-like counter in the state machine that increments the logical clock for same-millisecond records. The HLC state is persisted to the FSM table so that crash recovery produces correctly ordered timestamps. This works correctly but is a workaround — the information loss happens because `Record.created_at` is typed as `NanosSinceEpoch` rather than `UniqueTimestamp`.
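A rough sketch of what such a counter could look like follows. This is not the code from #4515: `HlcState`, its `next` method, and the handling of a record whose millisecond is older than the last one seen are assumptions made for illustration.

```rust
/// Assumed HLC layout: physical milliseconds plus a logical counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct UniqueTimestamp {
    physical_millis: u64,
    logical: u32,
}

/// Hypothetical state-machine field. Per the issue, the real value is
/// persisted to the FSM table so replay after a crash resumes from it.
#[derive(Debug, Default)]
struct HlcState {
    last: Option<UniqueTimestamp>,
}

impl HlcState {
    /// Derives a strictly increasing timestamp from the record's millisecond.
    /// The result depends only on the input and the persisted state, so
    /// replaying the same records yields the same timestamps.
    fn next(&mut self, record_millis: u64) -> UniqueTimestamp {
        let ts = match self.last {
            // Same millisecond (or, by assumption, an older one): keep the
            // physical part and bump the logical clock.
            // Overflow handling for `logical` is omitted in this sketch.
            Some(last) if record_millis <= last.physical_millis => UniqueTimestamp {
                physical_millis: last.physical_millis,
                logical: last.logical + 1,
            },
            // A newer millisecond resets the logical clock.
            _ => UniqueTimestamp { physical_millis: record_millis, logical: 0 },
        };
        self.last = Some(ts);
        ts
    }
}
```

Restoring `last` from persisted state before processing the next record is what keeps the ordering stable across a crash. Without it, a restarted state machine would hand out logical clock 0 again for a millisecond it had already seen.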

## Proposal

Change `Record.created_at` from `NanosSinceEpoch` to `UniqueTimestamp` (HLC) so the monotonic ordering guarantee lives at the Bifrost layer. This would:

1. **Eliminate information loss**: The state machine would receive a proper `UniqueTimestamp` directly, with no reconstruction needed
2. **Benefit all consumers**: Any future consumer of Record ordering gets correct monotonic timestamps for free
3. **Leverage existing infrastructure**: The on-disk record format already supports HLC timestamps via `RecordFlags::HlcTimestamp`, and both log-server and local-loglet decoders handle it
4. **Remove the persisted HLC workaround**: The state machine's HLC counter and its FSM table persistence (added in #4515) can be removed

## Design considerations

### Sequencer-assigned timestamps

Ideally, the `created_at` HLC timestamp should be assigned by the **sequencer** rather than by the record producer. Currently, `NanosSinceEpoch::now()` is called at the producer side (`InputRecord::from(Arc<T>)`), meaning different nodes can stamp records with their own wall clocks before sending to the sequencer. This causes **timestamp skew** — records from nodes with slightly different clocks can arrive at the sequencer in a different order than their timestamps suggest.
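To make the skew concrete, here is a small illustrative sketch. The `StampedRecord` type, the node names, and the timestamp values are invented for the example; it only shows how producer-side stamping can disagree with arrival order at the sequencer.

```rust
/// Hypothetical record carrying a producer-assigned wall-clock timestamp.
#[derive(Debug, Clone, Copy)]
struct StampedRecord {
    producer: &'static str,
    created_at_nanos: u64,
}

/// Returns true if timestamps never decrease along arrival order,
/// i.e. the order in which the sequencer would assign LSNs.
fn timestamps_follow_arrival_order(records: &[StampedRecord]) -> bool {
    records
        .windows(2)
        .all(|pair| pair[0].created_at_nanos <= pair[1].created_at_nanos)
}

/// Records listed in arrival order. In this made-up scenario node-a's
/// clock runs about 1.5 ms ahead of node-b's.
fn skewed_arrivals() -> Vec<StampedRecord> {
    vec![
        StampedRecord { producer: "node-a", created_at_nanos: 1_000_002_000_000 },
        StampedRecord { producer: "node-b", created_at_nanos: 1_000_000_500_000 },
    ]
}
```

Here the record from `node-a` arrives first but carries the later timestamp, so `timestamps_follow_arrival_order(&skewed_arrivals())` is `false`: the LSN order and the `created_at` order disagree.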

If the sequencer assigns the HLC timestamp, it guarantees:
- **No cross-node clock skew**: A single clock source determines the ordering
- **Monotonicity aligned with LSN**: The sequencer already assigns LSNs, so aligning HLC assignment with LSN assignment ensures the two ordering signals are consistent
- **Simpler producer API**: Producers don't need access to an HLC clock

The tradeoff is that `created_at` would reflect sequencer time rather than producer time, which may slightly affect latency metrics (write-to-read latency). This is likely acceptable since the sequencer is on the critical path anyway.
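One possible shape for sequencer-side assignment is sketched below. The `Sequencer` struct, its `append` method, and the `Lsn` wrapper are hypothetical and much simpler than the real sequencer. The wall-clock reading is passed in as a parameter so the sketch stays deterministic.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Lsn(u64);

/// Assumed HLC layout: physical milliseconds plus a logical counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct UniqueTimestamp {
    physical_millis: u64,
    logical: u32,
}

/// Hypothetical sequencer that hands out the LSN and the HLC timestamp together.
struct Sequencer {
    next_lsn: u64,
    last_ts: Option<UniqueTimestamp>,
}

impl Sequencer {
    fn new() -> Self {
        Sequencer { next_lsn: 1, last_ts: None }
    }

    /// Assigns the next LSN and a timestamp strictly greater than the previous
    /// one, even if the local wall clock stalls or steps backwards.
    fn append(&mut self, wall_clock_millis: u64) -> (Lsn, UniqueTimestamp) {
        let ts = match self.last_ts {
            // Clock did not advance: keep the physical part, bump logical.
            // Overflow handling for `logical` is omitted in this sketch.
            Some(last) if wall_clock_millis <= last.physical_millis => UniqueTimestamp {
                physical_millis: last.physical_millis,
                logical: last.logical + 1,
            },
            // Clock advanced: adopt it and reset the logical counter.
            _ => UniqueTimestamp { physical_millis: wall_clock_millis, logical: 0 },
        };
        self.last_ts = Some(ts);
        let lsn = Lsn(self.next_lsn);
        self.next_lsn += 1;
        (lsn, ts)
    }
}
```

Because both values come from the same call, a higher LSN always carries a higher timestamp. For wall-clock readings of 1000, 1000, 999, and 1002 ms, this sketch produces timestamps (1000, 0), (1000, 1), (1000, 2), and (1002, 0) for LSNs 1 through 4.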

## Key changes needed

- Change `Record.created_at` field type from `NanosSinceEpoch` to `UniqueTimestamp`
- Add an HLC clock to the sequencer (preferred) or Bifrost append path
- Update `InputRecord` creation to either omit `created_at` (sequencer fills it) or use a provisional value
- Set `RecordFlags::HlcTimestamp` in on-disk encoders (infrastructure already exists)
- Handle bilrost wire format compatibility (rolling upgrade)
- Update latency metrics derived from `created_at` to use millisecond instead of nanosecond precision (an acceptable loss)
- The state machine's `record_created_at` could become `UniqueTimestamp` directly
- Remove the persisted HLC workaround in the state machine (FSM table field `LAST_RECORD_UNIQUE_TS`, `StateMachine.last_record_unique_ts`)

## References

- Root cause analysis: #4515
- `RecordFlags::HlcTimestamp` already exists in `log-server/src/rocksdb_logstore/record_format.rs` and `bifrost/src/providers/local_loglet/record_format.rs`
- `LogsHlcClock` exists in `types/src/logs/builder.rs` (used for metadata, not records — but the type can be reused)
