
feat: keep flakiness in table and use server-side pagination for flakiness overview #920

Open
meryemefe wants to merge 1 commit into staging from feat/flaky-test-optimization


@meryemefe
Contributor

Motivation

After deploying the Flakiness Overview feature to the test environment, I found that calculating flakiness scores on the fly causes critical performance issues, including a Java heap OutOfMemoryError. It is therefore essential to store flakiness scores in the database and to use server-side pagination in the Flakiness Overview implementation.

Description

  • Created the test_case_flakiness table, which stores flakiness_score, default_branch_failure_rate, and combined_failure_rate only for tests with a flakiness score > 0.
  • The test_case_flakiness table is updated in TestResultProcessor after test_case_statistics is updated.
  • Added pagination to the getFlakyTestsOverview service, which now returns results in pages sorted by flakiness_score.
  • In the frontend, the Flakiness Overview page now uses the new API call and is compatible with server-side pagination.
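
For readers unfamiliar with the approach, the paging contract described above (filter, sort by score, return one page plus the filtered count) can be sketched in memory as follows. This is only an illustration, not the code in this PR: FlakyRow, FlakyPage, and FlakyPager are hypothetical names, and a real service would push the filter, sort, and limit into the database query instead of loading all rows.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical row shape; only a subset of the table's columns is modeled.
record FlakyRow(String testName, String className, double flakinessScore) {}

// Hypothetical response shape: one page of rows plus the count after filtering.
record FlakyPage(List<FlakyRow> flakyTests, long filteredCount) {}

final class FlakyPager {
  private FlakyPager() {}

  /** Filters by search term, sorts by score (descending), and returns one zero-based page. */
  static FlakyPage page(List<FlakyRow> rows, String searchTerm, int page, int size) {
    String needle = searchTerm == null ? "" : searchTerm.toLowerCase(Locale.ROOT);
    List<FlakyRow> filtered =
        rows.stream()
            .filter(
                r ->
                    needle.isEmpty()
                        || r.testName().toLowerCase(Locale.ROOT).contains(needle)
                        || r.className().toLowerCase(Locale.ROOT).contains(needle))
            .sorted(Comparator.comparingDouble(FlakyRow::flakinessScore).reversed())
            .toList();
    // Clamp the slice so a page past the end yields an empty list instead of throwing.
    int from = (int) Math.min((long) page * size, filtered.size());
    int to = Math.min(from + size, filtered.size());
    return new FlakyPage(filtered.subList(from, to), filtered.size());
  }
}
```

The client only ever receives `size` rows per request, and filteredCount lets the table render the correct number of pages without fetching the rest.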

Testing Instructions

Prerequisites:

  • Access to a Helios instance with at least one repository whose tests have different flakiness scores.

Flow:

  1. Log in to Helios as a Developer.
  2. The test_case_flakiness table is initially empty. Run a test workflow to populate it with flakiness scores.
  3. Navigate to the Flakiness Overview page for a repository.
  4. Verify:
    4.1. Table loads successfully.
    4.2. Search and severity filter still update the displayed page.
    4.3. Pagination controls behave consistently with server-side results.

Screenshots

Screenshot 2026-03-18 at 23 09 54

Checklist

General

Server

  • Code is performant and follows best practices
  • I documented the Java code using JavaDoc style.

Client

  • I documented the TypeScript code using JSDoc style.
  • I added multiple screenshots/screencasts of my UI changes.

@codacy-production

codacy-production bot commented Mar 18, 2026

Coverage summary from Codacy


Coverage variation | Diff coverage
-0.37% (target: -1.00%) | 50.68%

Coverage variation details

Commit | Coverable lines | Covered lines | Coverage
Common ancestor commit (437fcaa) | 8387 | 2715 | 32.37%
Head commit (fab952b) | 8436 (+49) | 2700 (-15) | 32.01% (-0.37%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

Scope | Coverable lines | Covered lines | Diff coverage
Pull request (#920) | 148 | 75 | 50.68%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
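
As a worked check of the two formulas above, the following sketch recomputes the reported figures from the line counts in this comment. CoverageMath is a hypothetical helper written for this illustration; it is not part of Codacy or of this PR.

```java
import java.util.Locale;

final class CoverageMath {
  private CoverageMath() {}

  /** Percentage of covered lines among coverable lines. */
  static double coveragePercent(int coveredLines, int coverableLines) {
    return 100.0 * coveredLines / coverableLines;
  }

  /** Head coverage minus common-ancestor coverage, in percentage points. */
  static double variation(int headCovered, int headCoverable, int baseCovered, int baseCoverable) {
    return coveragePercent(headCovered, headCoverable)
        - coveragePercent(baseCovered, baseCoverable);
  }

  public static void main(String[] args) {
    // Diff coverage: 75 covered out of 148 coverable lines added or modified.
    System.out.printf(Locale.ROOT, "diff coverage: %.2f%%%n", coveragePercent(75, 148));
    // Variation: head (2700 / 8436) minus common ancestor (2715 / 8387).
    System.out.printf(Locale.ROOT, "variation: %.2f%%%n", variation(2700, 8436, 2715, 8387));
  }
}
```

The variation is computed from the unrounded coverages, which is why it comes out as -0.37% even though subtracting the rounded values (32.01% - 32.37%) would give -0.36%.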



Copilot AI left a comment


Pull request overview

This PR addresses performance issues in the Flakiness Overview by persisting flakiness metrics in a dedicated DB table and switching the overview endpoint + UI to server-side pagination/filtering/sorting.

Changes:

  • Add test_case_flakiness table + JPA entity/repository and recompute/update it after test statistics updates.
  • Update flaky tests overview endpoint/service to query the precomputed table with pagination + filters and return filteredCount.
  • Update Angular Flakiness Overview page to use lazy server-side pagination/filtering/sorting and align OpenAPI-generated client types.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

File | Description
server/application-server/src/main/resources/db/migration/V38__add_test_case_flakiness.sql | Adds new table + index to persist flakiness metrics efficiently.
server/application-server/src/main/java/de/tum/cit/aet/helios/tests/TestCaseStatisticsService.java | Implements flakiness persistence + paged overview query from the new table.
server/application-server/src/main/java/de/tum/cit/aet/helios/tests/TestResultProcessor.java | Triggers recomputation/persistence after statistics updates.
server/application-server/src/main/java/de/tum/cit/aet/helios/tests/TestResultController.java | Extends /api/tests/flaky with pagination/filter/search params.
server/application-server/src/main/java/de/tum/cit/aet/helios/tests/FlakyTestOverviewDto.java | Adjusts DTO to match server-side paging (filteredCount) and new storage model.
server/application-server/openapi.yaml | Documents new query params and updated response shape.
client/src/app/pages/flaky-tests-overview/* | Moves Flakiness Overview table to server-side lazy loading and updates UI accordingly.
server/client tests & OpenAPI generated files | Updates tests and generated client types for the API change.


[loading]="query.isPending()"
(onPage)="onPage($event)"
(onSort)="onSort($event)"
[sortField]="paginationService.sortField()"
Comment on lines +130 to +133
for (TestCase testCase : suite.getTestCases()) {
  var flakinessInfo =
      computeFlakinessInfo(
          testCase.getName(), testCase.getClassName(), defaultBranchStats, combinedStats);
Comment on lines +119 to +127
var suiteClassNames = testSuites.stream().map(TestSuite::getName).distinct().toList();

List<TestCaseStatistics> defaultBranchStats =
    statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
        suiteClassNames, defaultBranch, repository.getRepositoryId());

List<TestCaseStatistics> combinedStats =
    statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
        suiteClassNames, "combined", repository.getRepositoryId());
Comment on lines +121 to +134
List<TestCaseStatistics> defaultBranchStats =
    statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
        suiteClassNames, defaultBranch, repository.getRepositoryId());

List<TestCaseStatistics> combinedStats =
    statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
        suiteClassNames, "combined", repository.getRepositoryId());

for (TestSuite suite : testSuites) {
  for (TestCase testCase : suite.getTestCases()) {
    var flakinessInfo =
        computeFlakinessInfo(
            testCase.getName(), testCase.getClassName(), defaultBranchStats, combinedStats);
    double flakinessScore = flakinessInfo.flakinessScore();
Comment on lines +106 to +159
/**
 * Recomputes and persists flakiness scores for all tests belonging to the given suites.
 * Reads statistics for the default branch and combined branches to compute scores, then updates
 * the {@code test_case_flakiness} table accordingly. Tests that are no longer flaky (score <= 0)
 * will have their flakiness record removed to keep the table focused on only flaky tests.
 *
 * @param testSuites the suites processed in this run
 * @param defaultBranch the repository's default branch name
 * @param repository the repository
 */
public void updateFlakinessForTestSuite(
    List<TestSuite> testSuites, String defaultBranch, GitRepository repository) {

  var suiteClassNames = testSuites.stream().map(TestSuite::getName).distinct().toList();

  List<TestCaseStatistics> defaultBranchStats =
      statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
          suiteClassNames, defaultBranch, repository.getRepositoryId());

  List<TestCaseStatistics> combinedStats =
      statisticsRepository.findByTestSuiteNameInAndBranchNameAndRepositoryRepositoryId(
          suiteClassNames, "combined", repository.getRepositoryId());

  for (TestSuite suite : testSuites) {
    for (TestCase testCase : suite.getTestCases()) {
      var flakinessInfo =
          computeFlakinessInfo(
              testCase.getName(), testCase.getClassName(), defaultBranchStats, combinedStats);
      double flakinessScore = flakinessInfo.flakinessScore();

      if (flakinessScore <= 0.0) {
        // Test is not flaky anymore; remove any existing flakiness record so the
        // precomputed table only contains genuinely flaky tests.
        flakinessRepository
            .findByTestNameAndClassNameAndTestSuiteNameAndRepositoryRepositoryId(
                testCase.getName(),
                testCase.getClassName(),
                suite.getName(),
                repository.getRepositoryId())
            .ifPresent(flakinessRepository::delete);
        continue;
      }

      upsertFlakiness(
          testCase.getName(),
          testCase.getClassName(),
          suite.getName(),
          flakinessScore,
          flakinessInfo.defaultBranchFailureRate(),
          flakinessInfo.combinedFailureRate(),
          repository);
    }
  }
}
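
The keep-only-flaky rule in the method above (upsert when the score is positive, delete otherwise) can be illustrated with an in-memory stand-in for the table. This is a sketch under assumptions: FlakinessKey and FlakinessTableSketch are hypothetical names, the key fields are guessed from the repository lookup shown above, and the real code goes through flakinessRepository rather than a map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite key identifying one test within a repository.
record FlakinessKey(long repositoryId, String suiteName, String className, String testName) {}

// In-memory stand-in for the test_case_flakiness table (illustration only).
final class FlakinessTableSketch {
  private final Map<FlakinessKey, Double> scores = new HashMap<>();

  /** Stores the score when it is positive; otherwise removes any existing row. */
  void apply(FlakinessKey key, double flakinessScore) {
    if (flakinessScore <= 0.0) {
      scores.remove(key); // no longer flaky: drop the row if present
    } else {
      scores.put(key, flakinessScore); // insert or overwrite (upsert)
    }
  }

  int size() {
    return scores.size();
  }

  /** Returns the stored score, or null when the test has no row. */
  Double scoreOf(FlakinessKey key) {
    return scores.get(key);
  }
}
```

Applying the rule repeatedly for the same key never creates duplicates, which is the property the table needs when a suite is processed on every test run.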
type: integer
format: int64
required:
- flakyTests
Comment on lines +29 to 31
imports: [CommonModule, TableModule, TagModule, TooltipModule, ButtonModule, SkeletonModule, PageHeadingComponent, TablerIconComponent, TableFilterPaginatedComponent],
providers: [PaginatedTableService, { provide: PAGINATED_FILTER_OPTIONS_TOKEN, useFactory: createFlakinessFilterOptions }, provideTablerIcons({ IconBug, IconAlertTriangle })],
templateUrl: './flaky-tests-overview.component.html',
Comment on lines +41 to 80
+  queryOptions = computed(() => {
+    const state = this.paginationService.paginationState();
+    const filterType = state.filterType as 'ALL' | 'HIGH' | 'MEDIUM' | 'LOW' | undefined;
+    return getFlakyTestsOverviewOptions({
+      query: {
+        page: state.page,
+        size: state.size,
+        sortDirection: state.sortDirection,
+        filterType,
+        searchTerm: state.searchTerm,
+      },
+    });
+  });

-  filteredFlakyTests = computed(() => {
-    const data = this.flakyTestsQuery.data()?.flakyTests ?? [];
-    const search = this.debouncedSearch().toLowerCase();
-    const severity = this.severityFilter();
+  query = injectQuery(() => this.queryOptions());

-    let filtered = data;
-    if (search) {
-      filtered = filtered.filter(t => t.testName.toLowerCase().includes(search) || t.className.toLowerCase().includes(search) || t.testSuiteName.toLowerCase().includes(search));
-    }
-    if (severity !== 'all') {
-      filtered = filtered.filter(t => {
-        const score = t.flakinessScore ?? 0;
-        if (severity === 'high') return score > 70;
-        if (severity === 'medium') return score > 30 && score <= 70;
-        return score <= 30;
-      });
-    }
-    return filtered;
-  });
+  get typedPaginationService() {
+    return this.paginationService as PaginatedTableService;
+  }

-  paginatedFlakyTests = computed(() => {
-    const filtered = this.filteredFlakyTests();
-    const start = this.currentPage() * this.pageSize();
-    return filtered.slice(start, start + this.pageSize());
-  });
+  flakyTests(): FlakyTestDto[] {
+    return this.query.data()?.flakyTests ?? [];
+  }

-  totalRecords = computed(() => this.filteredFlakyTests().length);
+  totalElements(): number {
+    return this.query.data()?.filteredCount ?? 0;
+  }

-  onSearchChange(value: string) {
-    this.searchTerm.set(value);
-    if (this.searchDebounceTimer) {
-      clearTimeout(this.searchDebounceTimer);
-    }
-    this.searchDebounceTimer = setTimeout(() => {
-      this.debouncedSearch.set(value);
-      this.currentPage.set(0);
-    }, 300);
+  onPage(event: TablePageEvent) {
+    this.paginationService.onPage(event);
   }

-  onPageChange(event: PaginatorState) {
-    this.currentPage.set(event.page ?? 0);
-    this.pageSize.set(event.rows ?? 10);
+  onSort(event: SortMeta) {
+    this.paginationService.onSort(event);
   }

-  setSeverityFilter(filter: SeverityFilter) {
-    this.severityFilter.set(filter);
-    this.currentPage.set(0);
+  clearFilters() {
+    this.filterComponent.clearSearch();
+    this.paginationService.clearFilters();
   }
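
The client-side severity filter removed in this hunk used fixed score bands (high > 70, medium > 30 and <= 70, low <= 30). A server-side counterpart could look roughly like the sketch below. It assumes the server keeps the same thresholds, which this excerpt does not confirm; SeverityFilterSketch is a hypothetical name whose constants follow the filterType values sent by the client.

```java
// Hypothetical enum; constants follow the filterType values 'ALL' | 'HIGH' | 'MEDIUM' | 'LOW'.
enum SeverityFilterSketch {
  ALL,
  HIGH,
  MEDIUM,
  LOW;

  /** Returns true when the score falls into this filter's band. */
  boolean matches(double score) {
    return switch (this) {
      case ALL -> true;
      case HIGH -> score > 70; // same bound as the removed client code
      case MEDIUM -> score > 30 && score <= 70;
      case LOW -> score <= 30;
    };
  }
}
```

In a database-backed service the same bands would normally become range conditions on flakiness_score, so that filtering happens before the page is cut.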
@mertilginoglu
Contributor

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fab952bb97


Comment on lines +3 to +7
CREATE TABLE test_case_flakiness (
    id BIGSERIAL PRIMARY KEY,
    repository_id BIGINT NOT NULL,
    test_name VARCHAR(255) NOT NULL,
    class_name VARCHAR(255) NOT NULL,


P1: Backfill existing flakiness rows before switching reads

getFlakyTestsOverview() and getFlakinessScoresForTests() now read exclusively from test_case_flakiness, but this migration only creates the table. On upgrade, any repository that already has historical test_case_statistics data but has not processed a new test run yet will suddenly show zero flaky tests and zero flakiness scores until each affected suite reruns, which is a visible regression for existing installations.
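
One way to picture the backfill suggested here is a one-off pass over the existing statistics that produces rows only for positive scores. The sketch below is an illustration under assumptions: StatRow, BackfilledRow, and FlakinessBackfillSketch are hypothetical names, and the scoring function is passed in as a parameter because the PR's actual formula (which combines default-branch and combined statistics) is not shown in this excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical, simplified view of one historical statistics row.
record StatRow(String testName, String className, int failedRuns, int totalRuns) {}

// Hypothetical flakiness row produced by the backfill.
record BackfilledRow(String testName, String className, double flakinessScore) {}

final class FlakinessBackfillSketch {
  private FlakinessBackfillSketch() {}

  /**
   * Runs the given scoring function over existing statistics once and keeps only
   * rows with a positive score, mirroring the "score > 0" rule of the new table.
   */
  static List<BackfilledRow> backfill(List<StatRow> history, ToDoubleFunction<StatRow> scorer) {
    List<BackfilledRow> rows = new ArrayList<>();
    for (StatRow stat : history) {
      double score = scorer.applyAsDouble(stat);
      if (score > 0.0) {
        rows.add(new BackfilledRow(stat.testName(), stat.className(), score));
      }
    }
    return rows;
  }
}
```

Calling `FlakinessBackfillSketch.backfill(history, s -> 100.0 * s.failedRuns() / s.totalRuns())` uses a plain failure rate as a placeholder scorer. That placeholder is not the project's flakiness formula.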


Comment on lines +64 to +68
@Column(nullable = false, name = "flakiness_score")
private double flakinessScore;

@Column(nullable = false, name = "default_branch_failure_rate")
private double defaultBranchFailureRate;


P2: Record which default branch the cached score was computed from

These rows persist flakinessScore and defaultBranchFailureRate, but they do not capture which branch those values came from. GitHubBranchSyncService#processBranch can change the repo's default branch, and the old implementation recomputed against the current default branch on every request; after this cache is introduced, the overview and CI score endpoint will keep serving values derived from the old default branch until a fresh run rewrites every affected suite.
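
A minimal sketch of the idea, assuming the cached row gains one extra field: CachedFlakiness and computedFromBranch are hypothetical names introduced for illustration, not fields of the entity in this PR.

```java
// Hypothetical cached row that also records the branch its score was computed from.
record CachedFlakiness(String testName, double flakinessScore, String computedFromBranch) {

  /** A row is stale when the repository's current default branch differs from the recorded one. */
  boolean isStaleFor(String currentDefaultBranch) {
    return !computedFromBranch.equals(currentDefaultBranch);
  }
}
```

With such a field, readers of the table could skip or recompute stale rows after a default branch change instead of serving scores derived from the old branch.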

