Labels
community-request (issue reported or requested by someone from the community), documentation (improvements to documentation), usability (improvements to user experience)
Description
Current tutorials only demonstrate using OpenAI's Responses API for model serving, and we lack documentation for configuring our other supported inference options.
Design
The vLLM model server should get a dedicated page in the Model Server section. It should cover how the middleware converts between the Chat Completions and Responses API formats, and how to use both a remote vLLM endpoint and a local vLLM instance.
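To make the conversion concrete for the doc page, here is a minimal sketch of what the chat-to-responses mapping might look like. This is an illustrative assumption, not the project's actual middleware: the function name `chat_to_responses`, the field subset, and the model name are all hypothetical.

```python
def chat_to_responses(chat_response: dict) -> dict:
    """Map an OpenAI Chat Completions-shaped response onto a minimal
    Responses-API-shaped dict (illustrative field subset only)."""
    choice = chat_response["choices"][0]
    return {
        "id": chat_response["id"],
        "model": chat_response["model"],
        # Responses API wraps assistant text in an "output" list of
        # message items, each holding typed content parts.
        "output": [
            {
                "type": "message",
                "role": choice["message"]["role"],
                "content": [
                    {"type": "output_text", "text": choice["message"]["content"]}
                ],
            }
        ],
        "usage": chat_response.get("usage", {}),
    }

# Example Chat Completions payload, as a vLLM OpenAI-compatible
# endpoint might return it (all values made up for illustration):
chat = {
    "id": "chatcmpl-123",
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2},
}
resp = chat_to_responses(chat)
```

A worked example like this in the doc page would show users exactly which fields survive the round trip between the two API shapes.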
Acceptance Criteria:
- Getting Started guides link to the vLLM docs for ease of user navigation
- a dedicated doc page for vLLM exists