@nickytd (Collaborator) commented Nov 24, 2025

How to categorize this PR?
/kind enhancement
/area logging

What this PR does / why we need it:
This PR enhances the performance testing framework for the fluent-bit logging plugin and improves disk synchronization in the dque buffering client. It adds support for single namespace performance tests targeting seed clusters, extends the Plutono dashboard with comprehensive metrics visualization, and fixes a typo in the dque client.

Code changes:

  • client: Fix typo - Corrected isStooped to isStopped in the dque client implementation
  • client: Add dque size metric - Introduced a new Prometheus metric dque_size to track the current queue size and added periodic turbo sync operations (every 30 seconds) to ensure data consistency when turbo mode is enabled
  • example: Enhanced performance dashboard - Significantly improved the Plutono dashboard with:
    • New row sections for Fluent Bit, Output Plugin, and ValiTail metrics
    • Replaced legacy graph panels with modern timeseries visualizations
    • Added stat panels showing total records, errors, and dropped logs
    • Implemented comprehensive metrics tracking including input/output bytes, records rate, latency percentiles, and DQue buffer status
    • Added new template variables for filtering by host and URL
  • example: Support single namespace performance test load - Extended the performance test framework with:
    • New SCENARIO parameter supporting both "shoot" (multiple namespaces) and "seed" (single namespace) test modes
    • Refactored up.sh/run.sh with scenario-based execution
    • Updated Makefile to pass scenario parameter to test scripts
    • Enhanced down.sh to support both test scenarios
    • README documentation updated to reflect new testing capabilities

Additional context:
The enhancements improve observability of the logging pipeline and provide flexibility in performance testing scenarios. The new metrics and dashboard improvements enable better monitoring of the buffer queue health and overall system performance. The seed scenario support allows testing concentrated load patterns that better simulate single-cluster stress conditions.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:
Please note the extensive changes to the dashboard JSON file - these are primarily visual improvements and metric additions that enhance monitoring capabilities.

Release note:

Enhanced performance testing framework with support for seed cluster scenarios and improved fluent-bit metrics dashboard with comprehensive visualization of queue sizes, latency percentiles, and throughput metrics

@gardener-robot added the needs/review, size/XL, and needs/second-opinion labels Nov 24, 2025
@gardener-robot

@hyperspace-insights[bot] Command /add is not available to you but only to a Maintainer, Member, Author.

@gardener-robot

@hyperspace-insights[bot] You have mentioned internal references in the public. Please check.

@gardener-robot added the area/logging and kind/enhancement labels Nov 24, 2025
@gardener-robot

@nickytd You have mentioned internal references in the public. Please check.

@hyperspace-insights (bot, Contributor) left a comment

Summary

This pull request introduces support for seed scenario testing alongside the existing shoot scenario, refactors the performance test scripts, fixes a typo in variable naming (isStooped → isStopped), and adds queue metrics monitoring. While the changes are generally well-structured, several issues need to be addressed:

Key Issues Found:

  1. Shell scripting best practices: Missing quotes around variable expansions in function calls could lead to word splitting issues
  2. Makefile variable quoting: The SCENARIO variable has unnecessary quotes that could cause comparison issues
  3. Missing newline at EOF: The down.sh file lacks a trailing newline, which violates POSIX standards
  4. Performance concern: The timer implementation in dque.go uses a select with default case in a tight loop, which could cause unnecessary CPU overhead
  5. Error handling: The TurboSync() error is only logged without proper handling

Please address these issues before merging, particularly the shell scripting issues and the busy-wait pattern in the dequeuer loop.

PR Bot Information

Version: 1.15.5

  • Event Trigger: issue_comment.created
  • Correlation ID: 4ee9bd40-c93b-11f0-90b7-69570e81753d

@hyperspace-insights hyperspace-insights bot deleted a comment from nickytd Nov 24, 2025
@gardener-robot
Copy link

@nickytd Command "/InviteCommand" failed with "Review cannot be requested from pull request author.".

Additional Information
Redacted in public. Check backend logs.

nickytd and others added 6 commits November 24, 2025 14:47
Co-authored-by: hyperspace-insights[bot] <209611008+hyperspace-insights[bot]@users.noreply.github.com>
@nickytd nickytd merged commit af691ce into master Nov 24, 2025
56 checks passed
@nickytd nickytd deleted the share-dque-with-controller branch November 24, 2025 14:01
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Nov 24, 2025
Labels

  • area/logging: Logging related
  • kind/enhancement: Enhancement, improvement, extension
  • needs/review: Needs review
  • needs/second-opinion: Needs second review by someone else
  • size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
  • status/closed: Issue is closed (either delivered or triaged)

2 participants