I am using the 910B NPU with the Qwen2.5-VL-3B model to evaluate BLINK, and it reported an error:
[2026-01-25 15:38:11] ERROR - RUN - run.py: main - 2090: Model Qwen2.5-VL-3B-Instruct x Dataset VStarBench combination failed: NPU out of memory. Tried to allocate 5.51 GiB (NPU 0; 60.96 GiB total capacity; 53.99 GiB already allocated; 53.99 GiB current active; 4.53 GiB free; 55.50 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
How can I resolve this issue? Even when I use 4×910B NPUs, the memory usage does not seem to be distributed evenly across the devices.
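Is something like the following environment setup the right direction? This is only a sketch: I am assuming torch_npu reads `PYTORCH_NPU_ALLOC_CONF` analogously to how stock PyTorch reads `PYTORCH_CUDA_ALLOC_CONF`, and that `ASCEND_RT_VISIBLE_DEVICES` controls which cards the process can see; the `128` value is a guess, not a tuned number.

```shell
# Expose all four 910B cards to the process
# (Ascend analogue of CUDA_VISIBLE_DEVICES)
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3

# Try to reduce allocator fragmentation, as the OOM message suggests.
# Assumption: torch_npu honors this variable the way stock PyTorch
# honors PYTORCH_CUDA_ALLOC_CONF with max_split_size_mb.
export PYTORCH_NPU_ALLOC_CONF="max_split_size_mb:128"
```

Even with all four devices visible, I suspect the model still loads onto a single NPU unless the evaluation framework itself shards it across devices, so I would also appreciate guidance on whether multi-device sharding is supported here.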