Environment Variable List

| Environment Variable | Function | Type | Values | Description |
|---|---|---|---|---|
| VLLM_MS_MODEL_BACKEND | Specifies the model backend. If unset, the backend is selected automatically in the priority order MindFormers > Native > MindONE; if set, the specified backend is used. | String | MindFormers: the MindSpore Transformers backend. Native: the native backend. MindONE: the MindONE backend. | The native backend currently supports the Qwen2.5, Qwen2.5VL, Qwen3 and Llama series; the MindSpore Transformers backend supports Qwen, DeepSeek and TeleChat models. When using MindSpore Transformers, also set export PYTHONPATH=/path/to/mindformers/:$PYTHONPATH. |
| MINDFORMERS_MODEL_CONFIG | Configuration file for MindSpore Transformers models. Required for Qwen2.5 series or DeepSeek series models. | String | Path to the model configuration file. | This environment variable will be removed in future versions. Example: export MINDFORMERS_MODEL_CONFIG=/path/to/research/deepseek3/deepseek_r1_671b/predict_deepseek_r1_671b_w8a8.yaml. |
| GLOO_SOCKET_IFNAME | Specifies the network interface name for inter-machine communication using gloo. | String | Interface name (e.g., enp189s0f0). | Used in multi-machine scenarios. The interface name can be found via ifconfig by matching the IP address. |
| TP_SOCKET_IFNAME | Specifies the network interface name for inter-machine communication using TP. | String | Interface name (e.g., enp189s0f0). | Used in multi-machine scenarios. The interface name can be found via ifconfig by matching the IP address. |
| HCCL_SOCKET_IFNAME | Specifies the network interface name for inter-machine communication using HCCL. | String | Interface name (e.g., enp189s0f0). | Used in multi-machine scenarios. The interface name can be found via ifconfig by matching the IP address. |
| ASCEND_RT_VISIBLE_DEVICES | Specifies which devices are visible to the current process; accepts one or more device IDs. | String | Device IDs as a comma-separated string (e.g., "0,1,2,3,4,5,6,7"). | Recommended for Ray usage scenarios. |
| HCCL_BUFFSIZE | Controls the buffer size for data sharing between two NPUs. | Integer | Buffer size in MB (e.g., 2048). | Usage reference: HCCL_BUFFSIZE. Example: for DeepSeek hybrid parallelism (Data Parallel: 32, Expert Parallel: 32) with max-num-batched-tokens=256, set export HCCL_BUFFSIZE=2048. |
| MS_MEMPOOL_BLOCK_SIZE | Sets the size of the device memory pool block in PyNative mode. | String | A positive number given as a string, in GB. | |
| vLLM_USE_NPU_ADV_STEP_FLASH_OP | Whether to use the Ascend operator adv_step_flash. | String | on: use the operator. off: do not use it. | If set to off, the model uses an implementation built from smaller operators instead. |
| VLLM_TORCH_PROFILER_DIR | Enables profiling data collection; takes effect once a save path is configured. | String | Path where profiling data is saved. | |
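The variables above are typically exported together before a multi-machine launch. The following is a minimal sketch rather than an official recipe: it reuses only the example values given in the table (interface name, device list, buffer size, paths), all of which must be replaced with values for the actual cluster. The helper `ifname_for_ip` is hypothetical and not part of vLLM-MindSpore; it assumes the Linux `ip -o -4 addr show` output format as an alternative to reading ifconfig output by eye.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of vLLM-MindSpore): reads lines in the format of
# `ip -o -4 addr show` on stdin and prints the interface that owns the given IPv4 address.
ifname_for_ip() {
  awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) { print $2; exit } }'
}

# Example interface name taken from the table. On a real host, something like
#   IFNAME=$(ip -o -4 addr show | ifname_for_ip <node IP>)
# could be used instead.
IFNAME=enp189s0f0

# Select the MindSpore Transformers backend and make it importable.
export VLLM_MS_MODEL_BACKEND=MindFormers
export PYTHONPATH=/path/to/mindformers/:${PYTHONPATH:-}

# Model configuration file (placeholder path from the table).
export MINDFORMERS_MODEL_CONFIG=/path/to/research/deepseek3/deepseek_r1_671b/predict_deepseek_r1_671b_w8a8.yaml

# Use the same network interface for gloo, TP and HCCL communication.
export GLOO_SOCKET_IFNAME="$IFNAME"
export TP_SOCKET_IFNAME="$IFNAME"
export HCCL_SOCKET_IFNAME="$IFNAME"

# Expose eight devices to the process and set the HCCL buffer size in MB.
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export HCCL_BUFFSIZE=2048
```

Keeping the interface name in a single shell variable is a small design choice that avoids the three `*_SOCKET_IFNAME` settings drifting apart when the script is copied between nodes.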

For more information on environment variables, refer to the following links: