I still need to write the tests, but I am sharing the PR already to get feedback on the implementation. A small note: you will see that the load methods let you select a different LLM or embedding client than the one used to generate the KnowledgeBase. I added this purely for flexibility and a better developer experience; it was neither necessary nor requested in the issue.
TODO
davidberenstein1957
left a comment
Hi @GTimothee, this already looks nice. I left some comments with suggested improvements.
    Hugging Face token for authentication. If None, will use local token.
private : bool
    Whether to make the repo private or public.
"""
Should we also pass kwargs through to the push-to-hub / upload-file calls, etc.?
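One way this suggestion could look is sketched below. It is only an illustration under stated assumptions: `push_to_hub`, `upload_fn`, and `fake_upload` are hypothetical names, not the PR's actual code and not a real Hub client API. The idea is simply that any extra keyword arguments are handed on to whatever function performs the upload.

```python
from typing import Any, Callable, Optional


def push_to_hub(
    repo_id: str,
    upload_fn: Callable[..., Any],
    token: Optional[str] = None,
    private: bool = False,
    **upload_kwargs: Any,
) -> Any:
    """Hypothetical helper: forwards extra keyword arguments to the uploader."""
    # The documented parameters are passed explicitly; everything else
    # travels through **upload_kwargs untouched.
    return upload_fn(repo_id=repo_id, token=token, private=private, **upload_kwargs)


def fake_upload(**kwargs: Any) -> dict:
    # Stand-in for a real upload call (assumption); it only echoes its arguments.
    return kwargs


result = push_to_hub("user/kb", fake_upload, private=True, commit_message="Add KB")
```

Here `commit_message` is just an example of an option the helper itself does not know about; it reaches the uploader unchanged.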
kb = cls(
    data=data,
    columns=config.get("columns"),
    llm_client=llm_client,
    embedding_model=embedding_model,
    chunk_size=config.get("chunk_size", 2048),
    seed=config.get("seed"),
    min_topic_size=config.get("min_topic_size"),
)
Doesn't this already create the embeddings during initialisation?
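To make the concern concrete, here is a minimal sketch of one way to avoid recomputation when loading. It assumes embeddings are computed lazily and can be injected at construction time; `LazyKnowledgeBase` and `counting_embed` are hypothetical stand-ins, not Giskard's `KnowledgeBase` or its embedding client.

```python
from typing import Callable, List, Optional, Sequence


class LazyKnowledgeBase:
    """Hypothetical stand-in that computes embeddings only on first access."""

    def __init__(
        self,
        documents: Sequence[str],
        embed_fn: Callable[[str], List[float]],
        embeddings: Optional[List[List[float]]] = None,
    ) -> None:
        self._documents = list(documents)
        self._embed_fn = embed_fn
        # Precomputed embeddings (e.g. restored from disk) are stored as-is.
        self._embeddings = embeddings

    @property
    def embeddings(self) -> List[List[float]]:
        # Compute only if nothing was injected or cached so far.
        if self._embeddings is None:
            self._embeddings = [self._embed_fn(doc) for doc in self._documents]
        return self._embeddings


calls: List[str] = []


def counting_embed(text: str) -> List[float]:
    # Toy embedding (assumption): records each call and returns the text length.
    calls.append(text)
    return [float(len(text))]


fresh = LazyKnowledgeBase(["a", "bb"], counting_embed)
restored = LazyKnowledgeBase(["a", "bb"], counting_embed, embeddings=[[1.0], [2.0]])
```

With this shape, constructing either object triggers no embedding calls; only `fresh.embeddings` invokes `counting_embed`, while `restored.embeddings` returns the injected values.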
Co-authored-by: David Berenstein <david.m.berenstein@gmail.com>
Hi @GTimothee, thanks for opening this PR. Giskard v3 is coming! So, we are closing some active issues and PRs related to v2 and shifting our focus to nailing v3. At the moment, we are still actively developing v3, so we are not looking for external contributions yet, but we would love to hear your early feedback and expectations on our roadmap and discussion: https://github.com/orgs/Giskard-AI/discussions/2250. Feel free to close the PR, or potentially reopen it as a contribution to our v3 :)
Description
Related Issue
#2145
Type of Change
It is an improvement, as it avoids recomputing embeddings.
It is a new feature, because we can now save/load to/from disk and the Hugging Face Hub.
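As a rough illustration of the disk round trip described above, the following sketch stores documents, embeddings, and config as JSON files. The function names and file layout are assumptions made for the example and do not reflect the PR's actual format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Tuple


def save_kb(
    folder: str,
    documents: List[str],
    embeddings: List[List[float]],
    config: Dict[str, Any],
) -> None:
    """Hypothetical save: one JSON file per artifact."""
    path = Path(folder)
    path.mkdir(parents=True, exist_ok=True)
    (path / "documents.json").write_text(json.dumps(documents))
    (path / "embeddings.json").write_text(json.dumps(embeddings))
    (path / "config.json").write_text(json.dumps(config))


def load_kb(folder: str) -> Tuple[List[str], List[List[float]], Dict[str, Any]]:
    """Hypothetical load: reads back what save_kb wrote, with no re-embedding."""
    path = Path(folder)
    documents = json.loads((path / "documents.json").read_text())
    embeddings = json.loads((path / "embeddings.json").read_text())
    config = json.loads((path / "config.json").read_text())
    return documents, embeddings, config


docs = ["first document", "second document"]
embs = [[0.1, 0.2], [0.3, 0.4]]
cfg = {"chunk_size": 2048, "seed": 42}

with tempfile.TemporaryDirectory() as tmp:
    save_kb(tmp, docs, embs, cfg)
    loaded_docs, loaded_embs, loaded_cfg = load_kb(tmp)
```

Because the embeddings are persisted alongside the documents, loading needs no embedding client at all in this sketch.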
Checklist
CODE_OF_CONDUCT.md document.
CONTRIBUTING.md guide.
pdm.lock running pdm update-lock (only applicable when pyproject.toml has been modified)