Release Notes

vLLM-MindSpore Plugin 0.3.0 Release Notes

Version 0.3.0 of the vLLM-MindSpore Plugin adds the following key features and supported models.

New Features

  • Basic Features of the vLLM 0.9.1 V1 Architecture, including chunked prefill and automatic prefix caching;

  • V0 Multi-step Scheduling;

  • V0 Chunked Prefill;

  • V0 Automatic Prefix Caching;

  • V0 DeepSeek MTP (Multi-Token Prediction);

  • GPTQ Quantization;

  • SmoothQuant Quantization;

  • V1 Sampling Enhancements.
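
As a rough illustration of how two of the features above might be switched on, the sketch below collects vLLM engine arguments for chunked prefill and automatic prefix caching, plus an optional multi-step scheduling setting for the V0 engine. It is a minimal sketch under assumptions, not the plugin's documented procedure: the helper names build_engine_kwargs and create_llm are hypothetical, the argument names follow upstream vLLM's engine arguments, and the plugin is assumed to be activated by importing vllm_mindspore before vllm.

```python
from typing import Any, Dict, Optional


def build_engine_kwargs(
    model: str,
    chunked_prefill: bool = True,
    prefix_caching: bool = True,
    num_scheduler_steps: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for vllm.LLM (hypothetical helper).

    The keys mirror upstream vLLM engine arguments; whether every
    combination is accepted by the plugin is an assumption.
    """
    kwargs: Dict[str, Any] = {
        "model": model,
        "enable_chunked_prefill": chunked_prefill,
        "enable_prefix_caching": prefix_caching,
    }
    # Multi-step scheduling is listed as a V0 feature, so it is only
    # added when the caller asks for it explicitly.
    if num_scheduler_steps is not None:
        if num_scheduler_steps < 1:
            raise ValueError("num_scheduler_steps must be >= 1")
        kwargs["num_scheduler_steps"] = num_scheduler_steps
    return kwargs


def create_llm(**engine_kwargs: Any):
    """Instantiate the engine (hypothetical helper).

    Imports are deferred so build_engine_kwargs can be used on a
    machine where neither vLLM nor the plugin is installed.
    """
    # Assumption: the plugin must be imported before vllm so that it
    # can register the MindSpore backend.
    import vllm_mindspore  # noqa: F401
    from vllm import LLM

    return LLM(**engine_kwargs)
```

A call such as create_llm(**build_engine_kwargs("Qwen/Qwen2.5-7B-Instruct")) would then build an engine with both features enabled; the model identifier here is only an example placeholder.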

New Models

  • DeepSeek-V3/R1

  • Qwen2.5-0.5B/1.5B/7B/14B/32B/72B

  • Qwen3-0.6B/1.7B/4B/8B/14B/32B