
Copilot/add patch for ascend in lmdeploy#291

Merged
jinminxi104 merged 11 commits into DeepLink-org:main from jinminxi104:copilot/add-patch-for-ascend-in-lmdeploy
Dec 29, 2025

Conversation

@jinminxi104 (Collaborator)

Workaround for the incomplete attention_with_kv_cache op.

Copilot AI and others added 7 commits December 27, 2025 04:46 (Co-authored-by: jinminxi104 <18713681+jinminxi104@users.noreply.github.com>)

  • Added a comment regarding a workaround for a specific issue.
  • Added prefill scheduling logic with kv-cache optimization, enforcing batch-size and token-count limits.

Copilot AI left a comment


Pull request overview

This PR adds a workaround patch for Ascend devices to address an incomplete prefill_attention_with_kvcache operation in lmdeploy. The patch monkey-patches the Scheduler._schedule_prefill method to add custom logic for handling prefill sequences with KV-cache optimization.

Key Changes

  • Introduces _schedule_prefill_ascend function that replaces the default scheduler's prefill method
  • Adds logic to control batching based on whether sequences have new tokens for prefill operations
  • Implements early-break conditions to handle prefill-with-kvcache sequences differently
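The monkey-patching pattern described above can be sketched as follows. This is a minimal illustration, not lmdeploy's actual implementation: the `Sequence` and `Scheduler` classes below are stand-ins, and the field names (`num_new_tokens`, `max_batches`, `max_tokens`) are assumptions chosen for the example, not lmdeploy's real attribute names.

```python
# Hypothetical sketch of replacing a scheduler's prefill method, in the
# spirit of this PR's _schedule_prefill_ascend workaround. All classes
# and attribute names here are stand-ins, not lmdeploy's real API.
from dataclasses import dataclass


@dataclass
class Sequence:
    seq_id: int
    num_new_tokens: int  # tokens still to be prefilled for this sequence


class Scheduler:
    def __init__(self, max_batches: int = 4, max_tokens: int = 32):
        self.max_batches = max_batches  # batch-size limit
        self.max_tokens = max_tokens    # token-count limit per step
        self.waiting: list[Sequence] = []

    def _schedule_prefill(self):
        # Default behavior: schedule everything that is waiting.
        batch, self.waiting = self.waiting, []
        return batch


def _schedule_prefill_ascend(self):
    """Replacement prefill scheduler: batches sequences while
    respecting batch-size and token-count limits, and breaks early
    when it hits a prefill-with-kvcache sequence (no new tokens),
    so that case is handled in its own scheduling step."""
    batch, tokens = [], 0
    while self.waiting:
        seq = self.waiting[0]
        if len(batch) >= self.max_batches:
            break  # batch-size limit reached
        if tokens + seq.num_new_tokens > self.max_tokens:
            break  # token-count limit would be exceeded
        if seq.num_new_tokens == 0 and batch:
            break  # prefill-with-kvcache sequence: defer to its own step
        batch.append(self.waiting.pop(0))
        tokens += seq.num_new_tokens
    return batch


# Monkey-patch: replace the default method on the class, analogous to
# how the PR swaps in _schedule_prefill_ascend for Ascend devices.
Scheduler._schedule_prefill = _schedule_prefill_ascend
```

Because the function is assigned to the class (not an instance), Python binds it as a regular method, so every existing and future `Scheduler` instance picks up the Ascend-specific behavior without any change to call sites.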


@jinminxi104 jinminxi104 merged commit 059e42c into DeepLink-org:main Dec 29, 2025
4 checks passed
jinminxi104 added a commit to jinminxi104/dlinfer that referenced this pull request Apr 2, 2026


4 participants