Helm installed application metric Grafana dashboards #800
poussa merged 4 commits into opea-project:main
@eero-t Is opea-project/GenAIComps#1280 related to this failure?

@lianhao Yes. For some reason CI did not catch the original change failing the CI... (That issue does not happen in production. Only CI creates multiple Orchestrator instances in the same program.)

Rebased to @lianhao CI still failing, but this time all failures are due to: |
|
Bugs in GenAIExamples and GenAIComps meant that the latest changes were not populated into the CI's image registry repo. These bugs were fixed just yesterday. I'll manually populate the related container images.

I'm afraid the chart release job will treat "assets" as a Helm chart and report a failure.

UIDs identifying the dashboards for Grafana are based on the chart "fullname" values. That way, if dashboard configMaps get installed multiple times as dependencies of different OPEA application charts, Grafana can differentiate them based on those UIDs. (At least as long as those application instances are given different Helm release names, not just installed to separate namespaces.)

Another alternative would have been omitting the dashboard UIDs, to let Grafana generate them. However, that would have meant dashboard URLs changing on every Helm re-deployment, as the UID is part of the Grafana dashboard URL!

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
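
The reasoning in this commit message can be illustrated with a small Python sketch. It is not the chart's actual template logic: the function names, the `metrics` suffix, the release names, the Grafana base URL and the slug are all hypothetical. The only thing taken as given is that a Grafana dashboard URL contains the UID, in the form `/d/<uid>/<slug>`.

```python
def dashboard_uid(fullname: str, kind: str) -> str:
    """Build a deterministic UID from a chart "fullname" (hypothetical scheme)."""
    return f"{fullname}-{kind}"


def dashboard_url(base_url: str, uid: str, slug: str) -> str:
    """Compose a Grafana dashboard URL, which embeds the UID."""
    return f"{base_url.rstrip('/')}/d/{uid}/{slug}"


# Two application instances with different (made-up) Helm release names.
uid_a = dashboard_uid("chatqna-prod", "metrics")
uid_b = dashboard_uid("docsum-test", "metrics")

# Re-deploying the same release yields the same UID, hence the same URL.
url_first = dashboard_url("http://grafana.example", uid_a, "opea-metrics")
url_again = dashboard_url(
    "http://grafana.example",
    dashboard_uid("chatqna-prod", "metrics"),
    "opea-metrics",
)

print(uid_a != uid_b)          # distinct releases get distinct UIDs
print(url_first == url_again)  # URL is stable across re-deployments
```

Because the namespace is not an input here, two releases with the same name in different namespaces would end up with the same UID, which mirrors the caveat in the parenthesis above.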
* s/{{err}}/{{ printf "{{err}}" }}/
* s/{{le}}/{{ printf "{{le}}" }}/
Another possibility would be putting the dashboard JSON spec to a
separate file and reading it to Helm configMap template with:
{{ .Files.Get "dashboards/metrics.json" | toJson | indent 4 }}
But that would require extra scripting to replace dashboard spec
"title" and "uid" fields with suitable templated values.
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
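
The two `s/…/…/` substitutions in this commit message wrap Grafana's own `{{…}}` legend placeholders so that Helm emits them literally instead of trying to evaluate them. A minimal Python sketch of the same rewrite follows. The function name and the sample input are hypothetical, and the pattern assumes placeholders are plain words without inner spaces, as in `{{err}}` and `{{le}}`.

```python
import re

# Matches simple placeholders such as {{err}} or {{le}}, but skips ones that
# are already wrapped as the string argument of a printf call.
_PLACEHOLDER = re.compile(r'(?<!printf ")\{\{(\w+)\}\}')


def escape_for_helm(text: str) -> str:
    """Wrap {{name}} as {{ printf "{{name}}" }} so Helm renders it literally."""
    return _PLACEHOLDER.sub(r'{{ printf "{{\1}}" }}', text)


sample = '"legendFormat": "{{le}}"'
escaped = escape_for_helm(sample)
print(escaped)
```

The look-behind makes the helper safe to run more than once on the same text, which a plain `sed` substitution would not be.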
This assumes the "prometheusNamespace" variable value to be correct. The metrics dashboard is installed when monitoring is enabled, and the scaling dashboard when HPA autoscaling is also enabled.

NOTE: Regardless of which application installs the dashboard(s), they are identical except for the title in the Grafana dashboards list (and the dashboard internal "uid"). Both dashboards show metrics for _any_ of the OPEA applications that process streamed tokens with an LLM, not just for ChatQnA or DocSum.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
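
The installation rule described in this commit message can be written out as a short Python sketch. The function and parameter names are hypothetical stand-ins and do not reflect the chart's actual values keys.

```python
def dashboards_to_install(monitoring: bool, autoscaling: bool) -> list[str]:
    """Return which dashboards get installed for the given settings."""
    dashboards = []
    if monitoring:
        # The metrics dashboard only needs monitoring to be enabled.
        dashboards.append("metrics")
        if autoscaling:
            # The scaling dashboard additionally needs HPA autoscaling.
            dashboards.append("scaling")
    return dashboards


print(dashboards_to_install(monitoring=True, autoscaling=False))
print(dashboards_to_install(monitoring=True, autoscaling=True))
```

Under this reading, enabling autoscaling without monitoring installs no dashboard at all.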
* Describe new Helm functionality in monitoring.md
* Update kubernetes-addons/Observability/
  - Add index to README + move dashboard import info to its own section
  - Remove duplicate Prometheus / Grafana install info
  - Move related files to better locations
* New dashboard screenshots

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

@eero-t Would you please add a filter for assets here?

@yongfengdu I filed a separate PR for fixing the workflow, as it had bugs that were not related to this PR, see #814.

Btw. In this PR "ChatQnA" & "DocSum" both have |
Description
Note: to make reviewing the changes easier, I'll file a separate PR for removing redundant files from under kubernetes/Observability/ after this is merged. The scaling dashboard in there included LLM metrics only for TGI; the new one also supports vLLM.
Issues
n/a.
Type of change
Dependencies
Requires GH workflow fix #814.
Tests
Manually tested that dashboard installation and the dashboards themselves work.