Release Notes
vLLM-MindSpore Plugin 0.3.0 Release Notes
The following are the key new features and models supported in version 0.3.0 of the vLLM-MindSpore Plugin.
New Features
vLLM 0.9.1 V1 Architecture Basic Features, including chunked prefill and automatic prefix caching;
V0 Multi-step Scheduling;
V0 Chunked Prefill;
V0 Automatic Prefix Caching;
V0 DeepSeek MTP (Multi-Token Prediction);
GPTQ Quantization;
SmoothQuant Quantization;
V1 Sampling Enhancements.
New Models
DeepSeek-V3/R1
Qwen2.5-0.5B/1.5B/7B/14B/32B/72B
Qwen3-0.6B/1.7B/4B/8B/14B/32B