Skip to content

[XPU] Update doc and add scripts for downloading dependencies#2845

Merged
yulangz merged 9 commits into
PaddlePaddle:developfrom
yulangz:update_xvllm_download
Jul 16, 2025
Merged

[XPU] Update doc and add scripts for downloading dependencies#2845
yulangz merged 9 commits into
PaddlePaddle:developfrom
yulangz:update_xvllm_download

Conversation

@yulangz

@yulangz yulangz commented Jul 15, 2025

Copy link
Copy Markdown
Collaborator
  1. XVLLM 的下载方法统一集中到 download_dependencies.sh 脚本中,减少后期维护成本。
  2. 增加 kunlunxin_xpu_deployment.md 文档描述已适配模型及部署方法,并从安装文档挪到 usage 目录。
  3. 修复 xpu_model_runner 中 block_table 的 shape 被错误设置的问题,在部分显存非常充裕的小模型场景会导致算子报错。

@paddle-bot

paddle-bot Bot commented Jul 15, 2025

Copy link
Copy Markdown

Thanks for your contribution!

@@ -0,0 +1,54 @@
#!/bin/bash

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

脚本名改成 download_dependencies.sh 吧

fi

echo "Installation completed in: $THIRDPARTY_DIR"
echo "You can set environment variables to use XVLLM and XTDK in the following way:"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set environment variables as follows to use XVLLM and XTDK:

Comment thread dockerfiles/Dockerfile.xpu Outdated
wget https://klx-sdk-release-public.su.bcebos.com/xre/kl3-release/5.0.21.21/xre-Linux-x86_64-5.0.21.21.tar.gz && \
tar -zxf xre-Linux-x86_64-5.0.21.21.tar.gz && mv xre-Linux-x86_64-5.0.21.21 xre
tar -zxf xre-Linux-x86_64-5.0.21.21.tar.gz && mv xre-Linux-x86_64-5.0.21.21 xre && \
cd /workspace/FastDeploy && bash custom_ops/xpu_ops/src/download_dependency.sh stable

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

脚本名改成 download_dependencies.sh 吧


For detailed OpenAI protocol specifications, see [OpenAI Chat Compeltion API](https://platform.openai.com/docs/api-reference/chat/create). Differences from the standard OpenAI protocol are documented in [OpenAI Protocol-Compatible API Server](../../online_serving/README.md).

## Supported Models

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个能放在 ## Quick start 前面吗?
然后在Quick start 可以删掉这一大段『The P800 supports the deployment of the ERNIE-4.5-300B-A47B-Paddle model using the following configurations (Note: Different configurations may result in variations in performance).

  • 32K WINT4 with 8 XPUs (Recommended)
  • 128K WINT4 with 8 XPUs
  • 32K WINT4 with 4 XPUs』
    同时,『#### Start service 』只保留一个推荐的启动方法吧。

```bash
XTDK: https://klx-sdk-release-public.su.bcebos.com/xtdk_15fusion/dev/latest/xtdk-llvm15-ubuntu2004_x86_64.tar.gz
XVLLM: https://klx-sdk-release-public.su.bcebos.com/xinfer/daily/eb/latest/output.tar.gz
bash custom_ops/xpu_ops/src/download_dependency.sh develop

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上


OpenAI 协议的更多说明可参考文档 [OpenAI Chat Compeltion API](https://platform.openai.com/docs/api-reference/chat/create),以及与 OpenAI 协议的区别可以参考 [兼容 OpenAI 协议的服务化部署](../../online_serving/README.md)。

## 支持的模型

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上

@hong19860320 hong19860320 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hong19860320 hong19860320 changed the title [XPU] update xvllm download [XPU] Update doc and add scripts for downloading dependencies Jul 15, 2025
@yulangz yulangz merged commit 17314ee into PaddlePaddle:develop Jul 16, 2025
3 of 4 checks passed
xiaoguoguo626807 pushed a commit to xiaoguoguo626807/FastDeploy that referenced this pull request May 7, 2026
…Paddle#2845)

* [XPU] update xvllm download

* update supported models

* fix xpu model runner in huge memory with small model

* update doc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants