Added configuration options to disable Prometheus metrics and endpoints #15499
kenliao94 wants to merge 2 commits into rabbitmq:main
Move `normalize_disabled_metrics/1` and `normalize_metric_name/1` from `prometheus_rabbitmq_core_metrics_collector` into a new `rabbit_prometheus_util` module so that the functions are properly exported and testable. Update `rabbit_prometheus_disabled_metrics_SUITE` to call `rabbit_prometheus_util:normalize_metric_name/1` directly instead of duplicating the implementation locally.
lukebakken left a comment
Thank you for this contribution. After reviewing the changes, I have the following feedback:
Breaking default for disable_per_object_endpoint
The schema sets {default, true} for disable_per_object_endpoint, which silently disables /metrics/per-object for all existing deployments. This endpoint is documented as always accessible regardless of return_per_object_metrics, so changing its availability by default is a breaking change that will affect users who scrape it directly. Please change the default to false so that existing behavior is preserved and operators must explicitly opt in to disabling it.
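To illustrate the opt-in behavior requested here, a minimal Python sketch (not the plugin's Erlang code) follows. The `DEFAULTS` table and the `endpoint_enabled` helper are hypothetical names used only for this illustration; only the flag name `disable_per_object_endpoint` comes from the PR.

```python
# Hypothetical defaults table: the "disable" flag defaults to False,
# so an endpoint stays reachable unless an operator opts in to disabling it.
DEFAULTS = {"disable_per_object_endpoint": False}


def endpoint_enabled(config: dict, flag: str) -> bool:
    """Return True when the endpoint guarded by `flag` should be served."""
    disabled = config.get(flag, DEFAULTS.get(flag, False))
    return not disabled


# An existing deployment with no new settings keeps the endpoint.
existing_deployment = {}
# An operator who explicitly opts in loses it.
opted_in = {"disable_per_object_endpoint": True}
```

With a `false` default, an upgrade leaves `/metrics/per-object` untouched for anyone who has not changed their configuration.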
normalize_metric_name/1 was duplicated in the test suite
The test suite defined its own local copy of normalize_metric_name/1 rather than testing the production function. This has been addressed in a follow-up commit by extracting the normalization logic into a new rabbit_prometheus_util module, which is now called from both the collector and the test suite.
Performance
normalize_disabled_metrics/1 is called on every scrape, re-reading and re-normalizing the disabled metrics list from app env each time. This is minor when the list is short, but worth addressing in a follow-up.
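One possible shape for that follow-up is to cache the normalized list and recompute it only when the configured value changes. The Python sketch below shows the idea only; the class name, the `recomputations` counter, and the injected `normalize` callable are assumptions made for illustration. An Erlang version would more likely keep the result in process state or `persistent_term`.

```python
class DisabledMetricsCache:
    """Caches the normalized disabled-metrics set between scrapes."""

    def __init__(self, normalize):
        self._normalize = normalize      # function applied to each metric name
        self._raw = None                 # last raw config value seen
        self._normalized = frozenset()   # cached result for that value
        self.recomputations = 0          # exposed only to make the effect visible

    def get(self, raw_names):
        key = tuple(raw_names)
        if key != self._raw:
            # Config changed (or first call): normalize once and remember it.
            self._raw = key
            self._normalized = frozenset(self._normalize(n) for n in key)
            self.recomputations += 1
        return self._normalized


# Stand-in normalizer for the demo: strips a leading "rabbitmq_" only.
cache = DisabledMetricsCache(lambda n: n.replace("rabbitmq_", "", 1))
for _ in range(3):  # three scrapes with an unchanged configuration
    result = cache.get(["rabbitmq_queue_messages"])
```

Repeated scrapes with the same configuration then cost one tuple comparison instead of a full normalization pass.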
Documentation
The website documentation needs to be updated to describe the new configuration options and their defaults. The relevant sections are in docs/prometheus/index.md in the rabbitmq-website repository:
Each endpoint section (/metrics/memory-breakdown, /metrics/per-object, /metrics/detailed) should note the corresponding disable option, and a new section should cover prometheus.disabled_metrics.
Proposed Changes
This PR adds configuration options to the rabbitmq_prometheus plugin that allow operators to:
1. Disable specific Prometheus endpoints: new configuration flags disable the `/metrics/per-object`, `/metrics/detailed`, and `/metrics/memory-breakdown` endpoints individually.
2. Disable individual metrics: a new `disabled_metrics` configuration option accepts a list of metric names to exclude from scrape responses. The implementation normalizes metric names by stripping the `rabbitmq_`, `rabbitmq_detailed_`, and `rabbitmq_cluster_` prefixes, so users can specify a metric with or without any of these prefixes.
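The prefix normalization described above can be sketched in Python as follows. This is an illustration of the described behavior, not the Erlang code in `rabbit_prometheus_util`, which may differ in detail; the metric names in the example are chosen for illustration.

```python
# Longer prefixes are checked first so that "rabbitmq_detailed_" is removed
# as a whole rather than being cut short by the plain "rabbitmq_" prefix.
PREFIXES = ("rabbitmq_detailed_", "rabbitmq_cluster_", "rabbitmq_")


def normalize_metric_name(name: str) -> str:
    """Strip one known prefix, if present; otherwise return the name unchanged."""
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def normalize_disabled_metrics(names) -> set:
    """Normalize every configured name so lookups ignore the prefix used."""
    return {normalize_metric_name(n) for n in names}


configured = ["rabbitmq_queue_messages", "rabbitmq_detailed_queue_messages", "connections"]
disabled = normalize_disabled_metrics(configured)
```

Because every spelling collapses to the same bare name, the two `queue_messages` entries above end up as a single element of `disabled`.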
Why this is useful: