Enhance dque fs sync #397

nickytd · 2025-11-24T13:38:53Z

How to categorize this PR?
/kind enhancement
/area logging

What this PR does / why we need it:
This PR enhances the performance testing framework for the fluent-bit logging plugin and improves disk synchronization in the dque buffering client. It adds support for single namespace performance tests targeting seed clusters, extends the Plutono dashboard with comprehensive metrics visualization, and fixes a typo in the dque client.

Code changes:

- client: Fix typo - Corrected isStooped to isStopped in the dque client implementation
- client: Add dque size metric - Introduced a new Prometheus metric dque_size to track the current queue size and added periodic turbo sync operations (every 30 seconds) to ensure data consistency when turbo mode is enabled
- example: Enhanced performance dashboard - Significantly improved the Plutono dashboard with:
  - New row sections for Fluent Bit, Output Plugin, and ValiTail metrics
  - Replaced legacy graph panels with modern timeseries visualizations
  - Added stat panels showing total records, errors, and dropped logs
  - Implemented comprehensive metrics tracking including input/output bytes, records rate, latency percentiles, and DQue buffer status
  - Added new template variables for filtering by host and URL
- example: Support single namespace performance test load - Extended the performance test framework with:
  - New SCENARIO parameter supporting both "shoot" (multiple namespaces) and "seed" (single namespace) test modes
  - Refactored up.sh → run.sh with scenario-based execution
  - Updated Makefile to pass scenario parameter to test scripts
  - Enhanced down.sh to support both test scenarios
  - README documentation updated to reflect new testing capabilities
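
For readers who want a feel for the dque_size idea without opening the diff, here is a minimal sketch of a queue that mirrors its length into a gauge after every mutation. It is not the code from pkg/client/dque.go: `gauge` is a hypothetical stand-in for a Prometheus gauge, `meteredQueue` is a hypothetical in-memory stand-in for the on-disk queue, and updating the gauge on every enqueue and dequeue is an assumption about how such a metric could be kept current.

```go
package main

import (
	"fmt"
	"sync"
)

// gauge is a hypothetical stand-in for a Prometheus gauge such as dque_size.
type gauge struct {
	mu sync.Mutex
	v  float64
}

// Set stores the latest value.
func (g *gauge) Set(v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.v = v
}

// Value returns the last value that was set.
func (g *gauge) Value() float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.v
}

// meteredQueue is a hypothetical in-memory queue that mirrors its length
// into a gauge after every mutation.
type meteredQueue struct {
	mu    sync.Mutex
	items []string
	size  *gauge
}

func newMeteredQueue(g *gauge) *meteredQueue {
	return &meteredQueue{size: g}
}

// Enqueue appends an item and refreshes the size gauge.
func (q *meteredQueue) Enqueue(item string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
	q.size.Set(float64(len(q.items)))
}

// Dequeue removes the oldest item, if any, and refreshes the size gauge.
func (q *meteredQueue) Dequeue() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	item := q.items[0]
	q.items = q.items[1:]
	q.size.Set(float64(len(q.items)))
	return item, true
}

func main() {
	g := &gauge{}
	q := newMeteredQueue(g)
	q.Enqueue("a")
	q.Enqueue("b")
	q.Dequeue()
	fmt.Println("dque_size:", g.Value())
}
```

Keeping the gauge update inside the same critical section as the mutation means the exported value never lags behind the queue, at the cost of a little extra work per operation.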

Additional context:
The enhancements improve observability of the logging pipeline and provide flexibility in performance testing scenarios. The new metrics and dashboard improvements enable better monitoring of the buffer queue health and overall system performance. The seed scenario support allows testing concentrated load patterns that better simulate single-cluster stress conditions.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:
Please note the extensive changes to the dashboard JSON file - these are primarily visual improvements and metric additions that enhance monitoring capabilities.

Release note:

Enhanced performance testing framework with support for seed cluster scenarios and improved fluent-bit metrics dashboard with comprehensive visualization of queue sizes, latency percentiles, and throughput metrics

gardener-robot · 2025-11-24T13:39:56Z

@hyperspace-insights[bot] Command /add is not available to you but only to a Maintainer, Member, Author.

gardener-robot · 2025-11-24T13:39:58Z

@hyperspace-insights[bot] Command /add is not available to you but only to a Maintainer, Member, Author.

gardener-robot · 2025-11-24T13:40:00Z

@hyperspace-insights[bot] You have mentioned internal references in the public. Please check.

gardener-robot · 2025-11-24T13:40:55Z

@nickytd You have mentioned internal references in the public. Please check.

gardener-robot · 2025-11-24T13:41:16Z

@nickytd You have mentioned internal references in the public. Please check.

gardener-robot · 2025-11-24T13:41:37Z

@hyperspace-insights[bot] Command /add is not available to you but only to a Maintainer, Member, Author.

gardener-robot · 2025-11-24T13:41:39Z

@hyperspace-insights[bot] Command /add is not available to you but only to a Maintainer, Member, Author.

hyperspace-insights

Summary

This pull request introduces support for seed scenario testing alongside the existing shoot scenario, refactors the performance test scripts, fixes a typo in variable naming (isStooped → isStopped), and adds queue metrics monitoring. While the changes are generally well-structured, there are several issues that need to be addressed:

Key Issues Found:

- Shell scripting best practices: Missing quotes around variable expansions in function calls could lead to word splitting issues
- Makefile variable quoting: The SCENARIO variable has unnecessary quotes that could cause comparison issues
- Missing newline at EOF: The down.sh file lacks a trailing newline, which violates POSIX standards
- Performance concern: The timer implementation in dque.go uses a select with default case in a tight loop, which could cause unnecessary CPU overhead
- Error handling: The TurboSync() error is only logged without proper handling

Please address these issues before merging, particularly the shell scripting issues and the busy-wait pattern in the dequeuer loop.

PR Bot Information

Version: 1.15.5

Event Trigger: issue_comment.created
Correlation ID: 4ee9bd40-c93b-11f0-90b7-69570e81753d

example/performance-test/Makefile

example/performance-test/down.sh

example/performance-test/run.sh

pkg/client/dque.go

gardener-robot · 2025-11-24T13:43:28Z

@nickytd Command "/InviteCommand" failed with "Review cannot be requested from pull request author.".

Additional Information

Redacted in public. Check backend logs.

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

nickytd added 4 commits November 21, 2025 22:56

client: fix typo in dque client

b2112fc

client: add dque size metric

e7b72c3

example: enhance fluent-bit performance dashboard

4eeed0d

example: support single namespace performance test load

0289326

gardener-robot added the needs/review (Needs review), size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files), and needs/second-opinion (Needs second review by someone else) labels Nov 24, 2025

gardener-robot added the area/logging (Logging related) and kind/enhancement (Enhancement, improvement, extension) labels Nov 24, 2025

Merge branch 'master' into share-dque-with-controller

5228761

hyperspace-insights bot reviewed Nov 24, 2025

hyperspace-insights bot deleted a comment from nickytd Nov 24, 2025

nickytd and others added 6 commits November 24, 2025 14:47

Default value should not be quoted

db765c5

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

Update example/performance-test/down.sh

c7abdc1

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

Update example/performance-test/run.sh

60e302e

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

Update example/performance-test/down.sh

48efb63

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

Update example/performance-test/down.sh

9ba1408

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

Update example/performance-test/run.sh

4a902f7

Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>

nickytd merged commit af691ce into master Nov 24, 2025
56 checks passed

nickytd deleted the share-dque-with-controller branch November 24, 2025 14:01

gardener-robot added the status/closed (Issue is closed, either delivered or triaged) label Nov 24, 2025

2 participants