docs: Add MLflow integration guide #2344
Add documentation for using TruLens feedback functions as MLflow GenAI scorers.

Includes:
- Installation instructions
- Available scorers (RAG and Agent trace)
- Usage examples (direct calls and batch evaluation)
- Model configuration for multiple providers
- Threshold configuration
- MLflow tracing integration
- Best practices and troubleshooting

Resolves truera#2343

Signed-off-by: debu-sinha <debusinha2009@gmail.com>
Note: This is a docs-only PR adding MLflow integration documentation. The |
| @@ -0,0 +1,238 @@ | |||
| # MLflow Integration | |||
Instead of creating a new /integrations/ folder, place this in docs/component_guides/evaluations.
sfc-gh-jreini
left a comment
thanks again for contributing - a few small things to address, then I can approve.
| Install MLflow with TruLens support: | ||
| ```bash | ||
| pip install 'mlflow>=3.10.0' trulens |
will also need trulens-providers-litellm I believe
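A quick way to confirm that an environment matches the install line plus the extra package suggested above is to query installed distribution metadata. The following is a minimal sketch using only the standard library. It assumes the distribution names are exactly `mlflow`, `trulens`, and `trulens-providers-litellm`, and that the last one is in fact required (the comment above only says "I believe"). The helper names `parse_version`, `missing_requirements`, and `installed_versions` are illustrative and are not part of MLflow or TruLens.

```python
from importlib import metadata

# Distribution names taken from the install line and the review comment;
# treating trulens-providers-litellm as required is an assumption.
REQUIRED = ("mlflow", "trulens", "trulens-providers-litellm")
MIN_MLFLOW = (3, 10, 0)


def parse_version(text):
    """Turn '3.10.0' (or '3.10.0rc1') into a 3-tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '3.10' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def missing_requirements(installed):
    """Return a list of problems for a {name: version-or-None} mapping."""
    problems = []
    for name in REQUIRED:
        version = installed.get(name)
        if version is None:
            problems.append(f"{name} is not installed")
        elif name == "mlflow" and parse_version(version) < MIN_MLFLOW:
            problems.append(f"mlflow {version} is older than 3.10.0")
    return problems


def installed_versions():
    """Look up each required distribution in the current environment."""
    found = {}
    for name in REQUIRED:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for problem in missing_requirements(installed_versions()):
        print(problem)
```

The check is split so that `missing_requirements` takes a plain mapping, which keeps the version logic testable without touching the real environment. Pre-release suffixes are ignored in this sketch, so `3.10.0rc1` is treated the same as `3.10.0`.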
| @@ -0,0 +1,238 @@ | |||
| # MLflow Integration | |||
| TruLens feedback functions are available as first-class scorers in MLflow's GenAI evaluation framework starting with MLflow 3.10.0. This integration was contributed by [Debu Sinha](https://github.com/debu-sinha) in [MLflow PR #19492](https://github.com/mlflow/mlflow/pull/19492). | |||
Mentioning the PR here seems nonstandard. Can we call this out in the contribution guide instead (saying that integrating TruLens with other libraries is a new category of contributions)?
| | `Groundedness` | Evaluates whether the response is grounded in the provided context | | ||
| | `ContextRelevance` | Evaluates whether the retrieved context is relevant to the query | | ||
| | `AnswerRelevance` | Evaluates whether the response is relevant to the input query | | ||
| | `Coherence` | Evaluates the coherence and logical flow of the response | |
Coherence should be in a separate category/not limited to RAG. You could call it an Output Scorer
| ## Dynamic Scorer Creation | ||
| Use `get_scorer` to create scorers dynamically: |
Add a sentence or two on why you would want to create the scorers dynamically
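One common motivation for name-based creation is that the set of scorers can come from configuration (a YAML file, CLI flags, an experiment definition) rather than from hard-coded imports. The sketch below illustrates that pattern only. It does not call MLflow or TruLens: `StubScorer` and `make_scorer` are hypothetical stand-ins for the real scorer classes and for `get_scorer`, whose actual signature is not shown in this thread. The rule that a score at or above the threshold passes is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class StubScorer:
    """Hypothetical stand-in for a real scorer: just a name and a threshold."""
    name: str
    threshold: float = 0.5

    def passes(self, score):
        # Assumption: a score at or above the threshold counts as passing.
        return score >= self.threshold


# Scorer names taken from the table in the guide under review.
KNOWN_SCORERS = ("Groundedness", "ContextRelevance", "AnswerRelevance", "Coherence")


def make_scorer(name, **kwargs):
    """Hypothetical name-based factory, playing the role get_scorer plays in the guide."""
    if name not in KNOWN_SCORERS:
        raise ValueError(f"Unknown scorer: {name!r}")
    return StubScorer(name=name, **kwargs)


# The scorer list lives in data, so it can change without editing imports.
config = [
    {"name": "Groundedness", "threshold": 0.7},
    {"name": "Coherence"},
]
scorers = [make_scorer(**entry) for entry in config]
```

Because each scorer is selected by a string, the same evaluation script can run different scorer sets per experiment, and an unknown name fails early with a clear error instead of an import error.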
Changes:
- Move docs from integrations/ to evaluation/ folder per reviewer request
- Add trulens-providers-litellm to installation instructions
- Remove PR reference from intro (nonstandard)
- Recategorize Coherence as "Output Scorer" (not RAG-specific)
- Add explanation for dynamic scorer creation use case
- Update related resources link

Signed-off-by: debu-sinha <debusinha2009@gmail.com>
|
@sfc-gh-jreini All review feedback addressed:
Ready for re-review! |
sfc-gh-jreini
left a comment
LGTM!
Would love to share broadly about this integration. @debu-sinha Interested in co-authoring a blog about this?
|
Thanks for the review and glad the docs look good! Absolutely interested in co-authoring a blog. The TruLens + MLflow integration opens up some interesting possibilities - especially the agent trace scorers covering the TRAIL evaluation framework. Happy to contribute wherever helpful. Do you have a preferred format or platform in mind? I can draft an outline covering the key use cases (RAG evaluation, agent traces, etc.) if that would be a good starting point. Let me know how you'd like to proceed. |
I've got a draft started, can I add the gmail listed on your github? Or would you prefer a different email |
|
The gmail on my GitHub works - looking forward to seeing the draft! |
|
Thanks for the blog draft and review. I finished my edits on the blog last week -- let me know if everything looks good or if anything needs adjusting. |
|
Thanks Debu, appreciate your contribution to the blog. Will reach out if anything is needed, otherwise expecting to publish this aligned with the mlflow release |
Summary
Adds documentation for using TruLens feedback functions as MLflow GenAI scorers.
Resolves #2343
Changes
- docs/component_guides/integrations/ - For third-party integrations
- docs/component_guides/integrations/index.md - Integrations index
- docs/component_guides/integrations/mlflow.md - MLflow integration guide

Documentation includes:
mlflow.genai.evaluate)
Context
The TruLens integration was merged into MLflow in PR #19492 and ships in MLflow 3.10.0. This documentation helps TruLens users discover and use this integration.
Navigation
This creates a new "Integrations" section under Component Guides. The navigation may need to be updated in the site config if not auto-discovered.
Important
Adds documentation for integrating TruLens feedback functions with MLflow as GenAI scorers, including installation, usage, and troubleshooting.
- mlflow.md in docs/component_guides/integrations/ for MLflow integration guide.
- docs/component_guides/integrations/ for third-party integrations.
- index.md to list available integrations.

This description was created for 47dc3ea.