Connection stability and E2EE key exchange improvements #3761

Draft
alisonjenkins wants to merge 6 commits into element-hq:livekit from alisonjenkins:alisonjenkins/connection-stability-e2ee

@alisonjenkins alisonjenkins commented Feb 28, 2026

Summary

  • Skip local participant in MatrixAudioRenderer track filter to prevent audio feedback
  • Debounce homeserver connection signals and reconnecting$ to avoid spurious "Reconnecting..." states, and deduplicate local connection updates
  • Add E2EE key exchange timing diagnostics to MatrixKeyProvider for debugging key distribution latency
  • Expose useKeyDelay and keyRotationGracePeriodMs config options for fine-tuning E2EE behavior
  • Pre-warm Olm sessions during lobby phase to speed up E2EE key exchange when joining a call

Test plan

  • Verify audio rendering doesn't include local participant's audio
  • Confirm connection status doesn't flicker during brief network interruptions
  • Check E2EE key exchange timing logs appear in developer console
  • Verify useKeyDelay and keyRotationGracePeriodMs config options are respected
  • Test that Olm session pre-warming reduces time to first encrypted frame after joining

🤖 Generated with Claude Code

alisonjenkins and others added 6 commits February 28, 2026 07:48
The useTracks hook returns all audio tracks including the local
participant's, but validIdentities intentionally excludes the local
user (you shouldn't hear your own mic). This caused a spurious warning
on every track update. Filter out local tracks early to resolve the
existing TODO.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
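
The early filter described in this commit could look roughly like the sketch below. The `AudioTrackRef` shape, the `isLocal` flag, and the `selectRenderableTracks` helper are hypothetical stand-ins for illustration, not the actual types used by MatrixAudioRenderer or returned by `useTracks`.

```typescript
// Hypothetical, simplified track shape (assumption); the real renderer
// works with LiveKit track references.
interface AudioTrackRef {
  participantIdentity: string;
  isLocal: boolean;
}

// Drop local tracks first, then warn only about remote identities that
// are not expected in the call.
function selectRenderableTracks(
  tracks: AudioTrackRef[],
  validIdentities: Set<string>,
  warn: (message: string) => void,
): AudioTrackRef[] {
  return tracks
    .filter((track) => !track.isLocal)
    .filter((track) => {
      const isValid = validIdentities.has(track.participantIdentity);
      if (!isValid) {
        warn(`Unexpected audio publisher: ${track.participantIdentity}`);
      }
      return isValid;
    });
}

const warnings: string[] = [];
const renderable = selectRenderableTracks(
  [
    { participantIdentity: "@me:example.org", isLocal: true },
    { participantIdentity: "@alice:example.org", isLocal: false },
    { participantIdentity: "@mallory:example.org", isLocal: false },
  ],
  new Set(["@alice:example.org"]),
  (message) => warnings.push(message),
);
```

Because the local track is removed before the validity check, it never reaches the warning branch, while a genuinely unexpected remote publisher still does.
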
The /sync request occasionally aborts (AbortError), causing syncing$
to briefly go false. This cascaded through combined$ → tracks paused →
"Reconnecting..." toast, even though the SDK auto-retries and sync
resumes within seconds.

- Add 8s debounce on syncing$ since sync errors are transient
- Add 2s debounce on combined$ for membershipConnected$/certainlyConnected$
- Add per-condition diagnostic logging to identify which signal flaps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
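
To see why an 8s debounce hides a short sync abort, here is a small deterministic model of debounce semantics over explicit timestamps. It is not the RxJS code from this PR; `TimedValue` and `debounceTimeline` are hypothetical names, and the model assumes the stream does not complete.

```typescript
interface TimedValue<T> {
  atMs: number;
  value: T;
}

// Simplified debounce model: a value is emitted windowMs after it arrived,
// but only if no newer value arrived in the meantime.
function debounceTimeline<T>(
  events: TimedValue<T>[],
  windowMs: number,
): TimedValue<T>[] {
  const emitted: TimedValue<T>[] = [];
  events.forEach((event, index) => {
    const next = events[index + 1];
    if (next === undefined || next.atMs - event.atMs >= windowMs) {
      emitted.push({ atMs: event.atMs + windowMs, value: event.value });
    }
  });
  return emitted;
}

// syncing$ drops to false for 2s (an aborted /sync), then recovers.
const syncingEvents: TimedValue<boolean>[] = [
  { atMs: 0, value: true },
  { atMs: 10_000, value: false },
  { atMs: 12_000, value: true },
];

const debouncedSyncing = debounceTimeline(syncingEvents, 8_000);
```

In this model the `false` at 10s is superseded 2s later and never emitted, so downstream consumers only ever see `true`. The trade-off is that a real, lasting sync failure is also reported 8s late.
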
- Debounce reconnecting$ with 1.5s delay so brief disconnections don't
  flash the "Reconnecting..." toast to the user
- Add distinctUntilChanged to localConnection$ to prevent redundant
  "Local connection updated" log spam (was firing 26+ times per event)
- Add diagnostic logging on localConnectionState$

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
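
The deduplication idea can be illustrated with a plain array version of "emit only when the value changed". The `LocalConnectionSnapshot` shape and the `dedupeConsecutive` helper are assumptions made for this sketch; the PR itself uses the RxJS `distinctUntilChanged` operator on `localConnection$`.

```typescript
// Hypothetical snapshot of a local connection (assumption); the real
// object carries more state.
interface LocalConnectionSnapshot {
  transportUrl: string;
  state: "connecting" | "connected" | "disconnected";
}

// Keep a value only if it differs from the previously kept one.
function dedupeConsecutive<T>(
  values: T[],
  isSame: (a: T, b: T) => boolean,
): T[] {
  const kept: T[] = [];
  for (const value of values) {
    if (kept.length === 0 || !isSame(kept[kept.length - 1], value)) {
      kept.push(value);
    }
  }
  return kept;
}

// 26 structurally identical updates followed by one real change.
const updates: LocalConnectionSnapshot[] = [
  ...Array.from(
    { length: 26 },
    (): LocalConnectionSnapshot => ({
      transportUrl: "https://sfu.example.org",
      state: "connected",
    }),
  ),
  { transportUrl: "https://sfu.example.org", state: "disconnected" },
];

const dedupedUpdates = dedupeConsecutive(
  updates,
  (a, b) => a.transportUrl === b.transportUrl && a.state === b.state,
);
```

Note that each update here is a fresh object, so a reference comparison would not collapse them. RxJS's `distinctUntilChanged` compares with `===` by default and accepts a comparator for cases like this; whether `localConnection$` needs one depends on how its values are produced.
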
Record a join timestamp when setRTCSession is called and log per-key
timing on each EncryptionKeyChanged event: time since join (key delivery
latency) and crypto.subtle.importKey duration. These diagnostics help
identify whether slow media decryption after joining a call is caused by
key delivery (Olm/sync) or key processing (Web Crypto).

Filter console by [MatrixKeyProvider] to see the new logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
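
A rough sketch of the two measurements described in this commit follows. The `KeyTimingDiagnostics` class, its method names, and the log line format are hypothetical; the sketch is synchronous and takes an injected clock so it stays deterministic, whereas the real `crypto.subtle.importKey` call is asynchronous.

```typescript
class KeyTimingDiagnostics {
  private readonly now: () => number;
  private joinedAtMs: number | undefined = undefined;

  constructor(now: () => number) {
    this.now = now;
  }

  // Called when the session is set, i.e. at join time.
  markJoined(): void {
    this.joinedAtMs = this.now();
  }

  // Wraps the key import and reports delivery latency and import duration.
  measureImport<T>(
    participantId: string,
    keyIndex: number,
    importKey: () => T,
  ): { result: T; line: string } {
    const receivedAtMs = this.now();
    const result = importKey();
    const importedAtMs = this.now();
    const sinceJoin =
      this.joinedAtMs === undefined
        ? "n/a"
        : `${receivedAtMs - this.joinedAtMs}ms`;
    const line =
      `[MatrixKeyProvider] key ${keyIndex} for ${participantId}: ` +
      `sinceJoin=${sinceJoin} import=${importedAtMs - receivedAtMs}ms`;
    return { result, line };
  }
}

// Fake clock: join at 1000ms, key received at 1250ms, import done at 1253ms.
const ticks = [1000, 1250, 1253];
let tickIndex = 0;
const diagnostics = new KeyTimingDiagnostics(() => ticks[tickIndex++] ?? 0);

diagnostics.markJoined();
const measured = diagnostics.measureImport("@alice:example.org", 0, () => "imported-key");
```

A large `sinceJoin` with a small `import` value would point at key delivery (Olm/sync) rather than Web Crypto as the source of the delay.
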
Add use_key_delay_ms and key_rotation_grace_period_ms to the
matrix_rtc_session config block, allowing operators to tune E2EE key
exchange timing without code changes.

- use_key_delay_ms: delay between sending a new key and encrypting with
  it, giving other participants time to receive the key (SDK default 1s)
- key_rotation_grace_period_ms: grace period during which new joiners
  reuse the existing key instead of triggering another rotation (SDK
  default 10s)

Both values are passed through to joinRTCSession and documented in the
sample/devenv config files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
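
One way the pass-through could be wired is sketched below. The config keys come from this commit and the option names from the PR summary; the interface names and the `keyTimingOptions` helper are hypothetical, and the exact shape of the options accepted by `joinRTCSession` is an assumption.

```typescript
// Subset of the matrix_rtc_session config block described above.
interface MatrixRtcSessionConfig {
  use_key_delay_ms?: number;
  key_rotation_grace_period_ms?: number;
}

// Assumed option names, taken from the PR summary.
interface JoinKeyTimingOptions {
  useKeyDelay?: number;
  keyRotationGracePeriodMs?: number;
}

// Only set options the operator configured, so unset values fall back
// to the SDK defaults (1s and 10s according to the commit message).
function keyTimingOptions(
  config: MatrixRtcSessionConfig | undefined,
): JoinKeyTimingOptions {
  const options: JoinKeyTimingOptions = {};
  if (config?.use_key_delay_ms !== undefined) {
    options.useKeyDelay = config.use_key_delay_ms;
  }
  if (config?.key_rotation_grace_period_ms !== undefined) {
    options.keyRotationGracePeriodMs = config.key_rotation_grace_period_ms;
  }
  return options;
}

const partialOptions = keyTimingOptions({ use_key_delay_ms: 2000 });
const defaultOptions = keyTimingOptions(undefined);
```

Leaving unset keys out of the result, rather than hard-coding defaults in the app, keeps a single source of truth for the default values in the SDK.
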
When a user is in the call lobby with per-participant E2EE enabled, call
CryptoApi.prepareToEncrypt(room) to trigger /keys/claim for any devices
we don't yet have Olm sessions with. This moves the Olm session
establishment cost from join-time to lobby-time, so encryption keys can
be delivered immediately when the user clicks Join.

In testing, key delivery to some participants took ~48s after join due
to /keys/claim round-trips. With pre-warming, the Olm sessions are
already established by the time the call is joined.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
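
A minimal sketch of the lobby-time pre-warm, assuming local stand-in types: `RoomLike` and `CryptoLike` only mirror the members used here, and the once-per-room guard is an assumption of this sketch rather than something the commit message states.

```typescript
// Minimal stand-ins for the matrix-js-sdk Room and CryptoApi types.
interface RoomLike {
  roomId: string;
}
interface CryptoLike {
  prepareToEncrypt(room: RoomLike): void;
}

// Rooms that were already pre-warmed (assumed guard, to avoid repeat calls).
const prewarmedRooms = new Set<string>();

// Returns true if a pre-warm was started for this room.
function prewarmOlmSessions(
  cryptoApi: CryptoLike | undefined,
  room: RoomLike,
  perParticipantE2ee: boolean,
): boolean {
  if (!perParticipantE2ee || cryptoApi === undefined) return false;
  if (prewarmedRooms.has(room.roomId)) return false;
  prewarmedRooms.add(room.roomId);
  cryptoApi.prepareToEncrypt(room);
  return true;
}

// Fake crypto implementation that records which rooms were prepared.
const preparedRoomIds: string[] = [];
const fakeCrypto: CryptoLike = {
  prepareToEncrypt: (room) => {
    preparedRoomIds.push(room.roomId);
  },
};

const lobbyRoom: RoomLike = { roomId: "!call:example.org" };
const firstAttempt = prewarmOlmSessions(fakeCrypto, lobbyRoom, true);
const secondAttempt = prewarmOlmSessions(fakeCrypto, lobbyRoom, true);
const withoutE2ee = prewarmOlmSessions(fakeCrypto, { roomId: "!other:example.org" }, false);
```

The call takes the room as its argument rather than a list of call members, so its scope is the room, not only the current call participants.
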

CLAassistant commented Feb 28, 2026

CLA assistant check
All committers have signed the CLA.

Comment on lines +47 to +49
logger.info(
`Pre-warming Olm sessions for room ${room.roomId} (${memberships.length} call members)`,
);
Contributor

@toger5 toger5 Mar 12, 2026

The log is wrong, or at least misleading. We are preparing for all room members with this, not just the N call members, right?

Comment on lines 80 to 83
if (!isValid) {
// TODO make sure to also skip the warn logging for the local identity
// Log that there is an invalid identity, that means that someone is publishing audio that is not expected to be in the call.
prefixedLogger.warn(
Contributor

Why do we remove this?
