matrix: Fix flaky room creation test #4372

Draft
backspace wants to merge 14 commits into main from matrix/flaky-room-creation-test

@backspace
Contributor

I’m seeing this repeatedly fail, like here.

backspace self-assigned this Apr 9, 2026
backspace added the bug label Apr 9, 2026
@github-actions

github-actions bot commented Apr 9, 2026

Realm Server Test Results

  1 files  ±0    1 suites  ±0   13m 37s ⏱️ -30s
844 tests +1  844 ✅ +1  0 💤 ±0  0 ❌ ±0 
915 runs  +1  915 ✅ +1  0 💤 ±0  0 ❌ ±0 

Results for commit 3295e62. ± Comparison against base commit e549179.

This pull request removes 1 and adds 2 tests. Note that renamed tests count towards both.
Removed:
default ‑ should handle streaming requests
Added:
default ‑ should fall back to generation cost API when inline cost is missing
default ‑ should handle streaming requests and deduct credits from inline cost

♻️ This comment has been updated with latest results.

backspace and others added 6 commits April 9, 2026 14:56
- Increase test timeout to 120s for the room deletion/creation test
- Wait for the deleted room to leave the DOM before polling for the new one
- Increase waitUntil timeout to 60s for room auto-creation under CI load
- Only upload blob reports from repeat=1 to avoid corrupted merge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each repeat's blob report gets a unique artifact name. On download,
each artifact goes to its own subdirectory. A flatten step copies all
.zip files into a single directory with unique prefixes so duplicate
filenames across repeats don't collide. This ensures the merged
Playwright report includes results from all 30 runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
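
A minimal sketch of the renaming idea behind the flatten step described above. It only computes the copy plan rather than touching the filesystem. The function name `planFlatten`, the artifact directory names, and the `<artifact>-<file>` prefix scheme are assumptions for illustration, not taken from the workflow in this PR.

```typescript
// Hypothetical input shape: one entry per downloaded artifact subdirectory.
interface ArtifactDir {
  artifact: string; // subdirectory name, unique per repeat
  files: string[];  // file names found inside it
}

interface CopyStep {
  from: string; // path relative to the download root
  to: string;   // flat file name in the merged directory
}

// Prefix every blob report with its artifact name so that identical
// file names from different repeats cannot overwrite each other.
function planFlatten(dirs: ArtifactDir[]): CopyStep[] {
  const steps: CopyStep[] = [];
  for (const dir of dirs) {
    for (const file of dir.files) {
      if (!file.endsWith(".zip")) continue; // only blob reports are merged
      steps.push({
        from: `${dir.artifact}/${file}`,
        to: `${dir.artifact}-${file}`,
      });
    }
  }
  return steps;
}

const plan = planFlatten([
  { artifact: "blob-report-1", files: ["report.zip"] },
  { artifact: "blob-report-2", files: ["report.zip", "notes.txt"] },
]);
```

Because artifact names are unique per repeat, the prefixed destination names are unique as well, which is what lets the merge step see every run.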
The { timeout } object passed as 2nd arg to test() is for
annotations/tags, not timeout config. Use test.setTimeout() inside
the test body, which is the correct Playwright API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The root cause of the flaky test is that room creation after deleting
all rooms is slow: it loads skill cards from the realm server and
uploads them to Matrix before creating the room. Under CI load this
can take 60+ seconds or fail entirely.

Fix: when creating a fallback room (after the last room is deleted),
pass skipDefaultSkills to avoid the expensive loadDefaultSkills() call.
The room is created with empty skills, which is fine for an initial
landing room. Also await the createNewSession() call for correctness.

Test improvements:
- Detect [data-test-room-error] to fail fast with a clear message
  instead of polling until timeout when room creation errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions bot commented Apr 10, 2026

Host Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 3295e62. ± Comparison against base commit e549179.

♻️ This comment has been updated with latest results.

@github-actions

Preview deployments

backspace and others added 6 commits April 10, 2026 08:46
The newSessionId getter checks roomResources.has(id), but doLeaveRoom
deletes the room from roomResourcesCache right before this check. So
the getter always returns undefined, the comparison
(this.newSessionId === roomId) is always false, and localStorage is
never cleared.

Later, a Matrix sync event can re-add the deleted room to the cache
via setRoomData (which calls roomResourcesCache.set if the key is
missing). Now newSessionId returns the stale room ID, and
createNewSession enters the deleted room instead of creating a new one.

Fix: check localStorage directly instead of going through the getter.
Also await createNewSession() to prevent floating promises.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
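
The race described above can be reproduced in isolation. This is a simplified model, not the service code: the `SessionStore` class, the two `Map`s standing in for `roomResourcesCache` and `localStorage`, and the `"newSessionId"` storage key are all assumptions made for illustration.

```typescript
class SessionStore {
  rooms = new Map<string, object>();   // stands in for roomResourcesCache
  storage = new Map<string, string>(); // stands in for localStorage

  // Getter as described: only reports an ID whose room is still cached.
  get newSessionId(): string | undefined {
    const id = this.storage.get("newSessionId");
    return id !== undefined && this.rooms.has(id) ? id : undefined;
  }

  // Buggy order: the room is already gone from the cache, so the getter
  // returns undefined and the stored ID is never cleared.
  leaveRoomBuggy(roomId: string): void {
    this.rooms.delete(roomId);
    if (this.newSessionId === roomId) {
      this.storage.delete("newSessionId");
    }
  }

  // Fixed: compare against the stored value directly, bypassing the getter.
  leaveRoomFixed(roomId: string): void {
    this.rooms.delete(roomId);
    if (this.storage.get("newSessionId") === roomId) {
      this.storage.delete("newSessionId");
    }
  }
}

function simulate(fixed: boolean): string | undefined {
  const store = new SessionStore();
  store.rooms.set("!old", {});
  store.storage.set("newSessionId", "!old");
  if (fixed) {
    store.leaveRoomFixed("!old");
  } else {
    store.leaveRoomBuggy("!old");
  }
  store.rooms.set("!old", {}); // a late sync event re-adds the deleted room
  return store.newSessionId;
}

const staleId = simulate(false);  // the deleted room's ID resurfaces
const clearedId = simulate(true); // nothing stale is left to enter
```

In the buggy variant the stored ID survives the leave, so once the sync event repopulates the cache the getter starts returning the deleted room again.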
With the localStorage race condition fixed in the service, room
creation after deletion is reliable. Remove the inflated timeouts
and the wait-for-deletion step that were compensating for the bug.

Keep the non-blocking polling (better than getRoomId, which blocks
on waitFor) and the error detection (fails fast with a clear message).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
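
A rough sketch of the polling-with-fail-fast pattern mentioned above, written without Playwright so it stands alone. The `pollForRoom` helper and the `Probe` shape are hypothetical. In the real test the probe would presumably query the DOM for the new room and for `[data-test-room-error]`; here it is an injected function.

```typescript
// Result of one look at the page: a new room ID, an error message, or neither yet.
type Probe = { roomId?: string; error?: string };

async function pollForRoom(
  probe: () => Promise<Probe>,
  { timeoutMs = 5000, intervalMs = 50 } = {},
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { roomId, error } = await probe();
    // Fail fast: a visible error means waiting longer cannot help.
    if (error) throw new Error(`Room creation failed: ${error}`);
    if (roomId) return roomId;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`No room appeared within ${timeoutMs}ms`);
}
```

Each probe returns immediately, so the loop keeps control between checks and can report a creation error as soon as it shows up instead of running out the full timeout.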
When no explicit skills are provided, create the room immediately
without skills so the UI updates fast, then load and apply default
skills in the background via a room state event update.

Previously, loadDefaultSkills() blocked room creation: it fetched skill
cards from the realm server and uploaded them to Matrix before the
room could be entered. This made room creation unreliable under load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous commit deferred skills for ALL room creation, breaking
tests that expect skills to be present immediately. Scope this to
only the fallback path (creating a room after all rooms are deleted)
via a deferDefaultSkills flag. All other room creation (new session
button, initial load, error retry) loads skills synchronously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
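
The two commits above amount to a flag that chooses between loading skills up front and attaching them afterwards. The sketch below illustrates that shape under stated assumptions: `createSession`, the `Deps` interface, and the `Room` type are hypothetical, and mutating `room.skills` stands in for the room state event update the commit message mentions.

```typescript
type Room = { id: string; skills: string[] };

// Injected so the sketch runs without a Matrix or realm server.
interface Deps {
  createRoom: () => Promise<Room>;
  loadDefaultSkills: () => Promise<string[]>;
}

async function createSession(
  deps: Deps,
  opts: { deferDefaultSkills?: boolean } = {},
): Promise<{ room: Room; skillsReady: Promise<void> }> {
  if (!opts.deferDefaultSkills) {
    // Default path: skills are loaded before the room is handed back.
    const skills = await deps.loadDefaultSkills();
    const room = await deps.createRoom();
    room.skills = skills;
    return { room, skillsReady: Promise.resolve() };
  }
  // Fallback path: hand back an empty room now, attach skills when they arrive.
  const room = await deps.createRoom();
  const skillsReady = deps
    .loadDefaultSkills()
    .then((skills) => {
      room.skills = skills;
    })
    .catch((err) => console.error("default skills failed to load", err));
  return { room, skillsReady };
}
```

Returning `skillsReady` alongside the room lets a caller, or a test, wait for the background work when it needs to, while the UI can use the room immediately.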
Remove incorrect await inside Promise.all that made room creation
and module loading run sequentially instead of in parallel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
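
For context on why a stray `await` inside `Promise.all` serializes the work, here is a small standalone demonstration. The `step` helper and the `"room"` / `"modules"` labels are illustrative stand-ins, not the actual calls changed in this PR.

```typescript
const log: string[] = [];

async function step(name: string, ms: number): Promise<string> {
  log.push(`start ${name}`);
  await new Promise((resolve) => setTimeout(resolve, ms));
  log.push(`end ${name}`);
  return name;
}

// Sequential by accident: array elements are evaluated left to right, so the
// inner await finishes "room" before step("modules") is even called.
async function sequential(): Promise<string[]> {
  return Promise.all([await step("room", 20), step("modules", 20)]);
}

// Parallel: both promises are created first, then awaited together.
async function parallel(): Promise<string[]> {
  return Promise.all([step("room", 20), step("modules", 20)]);
}
```

After `sequential()` the log shows "room" starting and ending before "modules" starts. After `parallel()` both start entries are logged before either end entry.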