Installation Guide
This document introduces the version compatibility of the vLLM-MindSpore Plugin, the installation steps, and a quick verification to confirm that the installation succeeded. Two installation methods are provided:
Docker Installation: Suitable for quick deployment scenarios.
Source Code Installation: Suitable for incremental development of vLLM-MindSpore Plugin.
Version Compatibility
OS: Linux-aarch64
Python: 3.9 / 3.10 / 3.11
Dependent software version compatibility:

| Software | Version and Link |
| --- | --- |
| CANN | |
| MindSpore | |
| MSAdapter | |
| MindSpore Transformers | |
| vLLM | |

Source code and download links of the vLLM-MindSpore Plugin:

| Item | Link |
| --- | --- |
| Source code | Source Code Link |
| Package | Package Link |
Docker Installation
We recommend using Docker for quick deployment of the vLLM-MindSpore Plugin environment. Below are the steps:
Building the Image
Users can execute the following command to clone the vLLM-MindSpore Plugin code repository:
git clone https://gitee.com/mindspore/vllm-mindspore.git
Build the image according to your NPU type:

For Atlas 800I A2:
bash build_image.sh

For Atlas 300I Duo:
bash build_image.sh -a 310p
After a successful build, users will see output similar to the following:
Successfully built e40bcbeae9fc
Successfully tagged vllm_ms_20250726:latest
Here, e40bcbeae9fc is the image ID, and vllm_ms_20250726:latest is the image name and tag. Users can run the following command to confirm that the Docker image has been created successfully:
docker images
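For example, to filter the listing for the newly built image (the vllm_ms tag shown above comes from a sample build; the actual tag may differ):

docker images | grep vllm_ms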
Creating a Container
After building the image, set DOCKER_NAME and IMAGE_NAME to the container and image names, then execute the following command to create the container:
export DOCKER_NAME=vllm-mindspore-container # your container name
export IMAGE_NAME=vllm_ms_20250726:latest # your image name
docker run -itd --name=${DOCKER_NAME} --ipc=host --network=host --privileged=true \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /etc/vnpu.cfg:/etc/vnpu.cfg \
--shm-size="250g" \
${IMAGE_NAME} \
bash
The container ID will be returned if the container is created successfully. Users can also check it by executing the following command:
docker ps
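To narrow the listing to this container, docker ps also supports filtering by name, using the DOCKER_NAME set above:

docker ps --filter name=${DOCKER_NAME}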
Entering the Container
After creating the container, users can enter it using the environment variable DOCKER_NAME:
docker exec -it $DOCKER_NAME bash
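Once inside the container, users can confirm that the NPU devices are visible; npu-smi is mounted into the container by the docker run command above:

npu-smi info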
Source Code Installation
CANN Installation
For CANN installation methods and environment configuration, please refer to CANN Community Edition Installation Guide. If you encounter any issues during CANN installation, please consult the Ascend FAQ for troubleshooting.
The default installation path for CANN is /usr/local/Ascend. After completing CANN installation, configure the environment variables with the following commands:
LOCAL_ASCEND=/usr/local/Ascend # the root directory of run package
source ${LOCAL_ASCEND}/ascend-toolkit/set_env.sh
export ASCEND_CUSTOM_PATH=${LOCAL_ASCEND}/ascend-toolkit
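These variables apply only to the current shell. As a minimal sketch, users who want them configured in every new shell can append the same lines to ~/.bashrc (assuming bash is the login shell):

# Persist the CANN environment setup across shells.
cat >> ~/.bashrc << 'EOF'
LOCAL_ASCEND=/usr/local/Ascend
source ${LOCAL_ASCEND}/ascend-toolkit/set_env.sh
export ASCEND_CUSTOM_PATH=${LOCAL_ASCEND}/ascend-toolkit
EOF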
vLLM Prerequisites Installation
For vLLM environment configuration and installation methods, please refer to the vLLM Installation Guide.
vLLM-MindSpore Plugin Installation
The vLLM-MindSpore Plugin can be installed in either of the following two ways. Quick Installation is suitable for scenarios where users need fast deployment and usage; Manual Installation is suitable for scenarios where users require custom modifications to the dependent components.
vLLM-MindSpore Plugin Quick Installation
To install the vLLM-MindSpore Plugin, users need to pull the vLLM-MindSpore Plugin source code and then run the following commands to install the dependencies:
git clone https://gitee.com/mindspore/vllm-mindspore.git
cd vllm-mindspore
bash install_depend_pkgs.sh
Compile and install vLLM-MindSpore Plugin:
pip install .
Alternatively, users can refer to Version Compatibility, check their Python version, download the matching vLLM-MindSpore Plugin whl package, and install it with pip.
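Either way, a quick sanity check is to import the module (vllm_mindspore is the import name used in the verification script below); this assumes the environment variables from the CANN installation step are configured:

python -c "import vllm_mindspore"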
vLLM-MindSpore Plugin Manual Installation
If users require custom modifications to dependent components such as vLLM, MindSpore, or MSAdapter, they can prepare the modified installation packages locally and perform manual installation in a specific sequence. The installation sequence requirements are as follows:
Install vLLM
pip install /path/to/vllm-*.whl
Install MindSpore
pip install /path/to/mindspore-*.whl
Install MindSpore Transformers
pip install /path/to/mindformers-*.whl
Install MSAdapter
pip install /path/to/msadapter-*.whl
Install vLLM-MindSpore Plugin
Finally, pull the vLLM-MindSpore Plugin source code and run the installation:
git clone https://gitee.com/mindspore/vllm-mindspore.git
cd vllm-mindspore
pip install .
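After installation, users can confirm that all components are present with a pip listing; the package-name patterns below are assumptions based on the whl file names above:

pip list | grep -Ei "vllm|mindspore|mindformers|msadapter"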
Quick Verification
Users can verify the installation with a simple offline inference test. First, configure the environment variable with the following command:
export VLLM_MS_MODEL_BACKEND=MindFormers # Use MindSpore Transformers as the model backend.
For more details about the environment variable above, users can refer to the environment variables section.
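Users can confirm that the variable is set in the current shell:

echo $VLLM_MS_MODEL_BACKEND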
Users can use the following Python script to verify the installation with Qwen2.5-7B:
import vllm_mindspore # Add this line at the top of the script.

from vllm import LLM, SamplingParams

# Sample prompts.
prompts = [
    "I am",
    "Today is",
    "Llama is",
]

# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.0, top_p=0.95)

# Create an LLM.
llm = LLM(model="Qwen2.5-7B-Instruct")

# Generate texts from the prompts. The output is a list of RequestOutput objects
# that contain the prompt, generated text, and other information.
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}. Generated text: {generated_text!r}")
If successful, the output will resemble:
Prompt: 'I am'. Generated text: ' trying to create a virtual environment for my Python project, but I am encountering some'
Prompt: 'Today is'. Generated text: ' the 100th day of school. To celebrate, the teacher has'
Prompt: 'Llama is'. Generated text: ' a 100% natural, biodegradable, and compostable alternative'
Alternatively, refer to the Quick Start guide for online inference verification.