Skip to content

Fix flaky VerifyTransparentStatement_ThreadSafety test timeout#59264

Open
JonathanCrd wants to merge 2 commits into
Azure:mainfrom
JonathanCrd:fix/flaky-transparent-statement-test
Open

Fix flaky VerifyTransparentStatement_ThreadSafety test timeout#59264
JonathanCrd wants to merge 2 commits into
Azure:mainfrom
JonathanCrd:fix/flaky-transparent-statement-test

Conversation

@JonathanCrd
Copy link
Copy Markdown
Member

@JonathanCrd JonathanCrd commented May 14, 2026

Azure.Security.CodeTransparency - Fix flaky thread safety test

Fixes #59267

Problem

The VerifyTransparentStatement_ThreadSafety_ParallelCallsShouldNotFail test has been consistently failing across all recent core pipeline runs (pipeline #1180). The test runs 8 threads × 3 iterations = 24 crypto-heavy operations, taking ~31-32 seconds — just over the 30-second CI global timeout.

This is blocking the entire core pipeline on main.

Fix

Reduced the test workload from 8 threads × 3 iterations to 4 threads × 2 iterations. The test still validates thread safety via barrier synchronization ensuring all threads start simultaneously. With the reduced workload, the test completes in ~244ms locally.

Verification

  • Confirmed the same test fails in the last 5+ pipeline runs on main
  • Test passes locally in 244ms after the fix

🤖 Created by JonathanCrd's copilot

Reduce thread count from 8 to 4 and iterations from 3 to 2 in the
thread safety test. The test was consistently taking ~31-32s, just
over the 30s CI global timeout, causing all core pipeline runs to
fail. The reduced workload (4x2=8 parallel calls) still validates
thread safety via the barrier synchronization while completing well
within the timeout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Reduces the workload of a flaky thread-safety test to avoid exceeding the CI 30-second global timeout.

Changes:

  • Lower threadCount from 8 to 4
  • Lower iterationsPerThread from 3 to 2

Explain why threadCount and iterationsPerThread are kept low,
referencing the 30s CI global timeout from GlobalTimeoutTearDown.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
byte[] transparentStatementBytes = readFileBytes(name: "transparent_statement.cose");
int threadCount = 8;
int iterationsPerThread = 3;
// Keep thread count and iterations low to stay within the 30s CI global timeout
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems brittle, as the overall load on a CI run is going to influence the timing. Depending on what else gets pulled into pullrequest you'll still potentially exceed the timeout.

For something non-deterministic like this, we may want to consider putting in a local cancellation token that triggers after ~25 seconds and does an ignore assert.

It's not ideal, but basically defaults to "we couldn't complete this test" instead of failing the run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: VerifyTransparentStatement_ThreadSafety_ParallelCallsShouldNotFail exceeds 30s CI timeout

3 participants