Environment Variable List

| Environment Variable | Function | Type | Values | Description |
| --- | --- | --- | --- | --- |
| `VLLM_MS_MODEL_BACKEND` | Specifies the model backend. If unset, the backend is selected automatically in the priority order MindFormers > Native > MindONE; if set, the specified backend is used. | String | `MindFormers`: MindSpore Transformers backend; `Native`: native backend; `MindONE`: MindONE backend. | The native backend currently supports the Qwen2.5, Qwen2.5VL, Qwen3 and Llama series; the MindSpore Transformers backend supports Qwen, DeepSeek and TeleChat models. |
| `GLOO_SOCKET_IFNAME` | Specifies the network interface used for inter-machine communication over gloo. | String | Interface name (e.g., `enp189s0f0`). | Used in multi-machine scenarios. To find the interface name, run `ifconfig` and look for the entry with the machine's IP address. |
| `TP_SOCKET_IFNAME` | Specifies the network interface used for inter-machine communication over TP. | String | Interface name (e.g., `enp189s0f0`). | Used in multi-machine scenarios. To find the interface name, run `ifconfig` and look for the entry with the machine's IP address. |
| `HCCL_SOCKET_IFNAME` | Specifies the network interface used for inter-machine communication over HCCL. | String | Interface name (e.g., `enp189s0f0`). | Used in multi-machine scenarios. To find the interface name, run `ifconfig` and look for the entry with the machine's IP address. |
| `ASCEND_RT_VISIBLE_DEVICES` | Specifies which devices are visible to the current process. Accepts one or more device IDs. | String | Comma-separated device IDs (e.g., `"0,1,2,3,4,5,6,7"`). | Recommended when using Ray. |
| `HCCL_BUFFSIZE` | Controls the size of the buffer used to share data between two NPUs. | Integer | Buffer size in MB (e.g., `2048`). | Usage reference: HCCL_BUFFSIZE. Example: for DeepSeek hybrid parallelism (data parallel: 32, expert parallel: 32) with `max-num-batched-tokens=256`, set `export HCCL_BUFFSIZE=2048`. |
| `MS_MEMPOOL_BLOCK_SIZE` | Sets the size of the device memory pool block in PyNative mode. | String | A positive number given as a string, in GB. | |
| `vLLM_USE_NPU_ADV_STEP_FLASH_OP` | Controls whether the Ascend operator `adv_step_flash` is used. | String | `on`: use the operator; `off`: do not use it. | When set to `off`, the model falls back to an implementation built from smaller operators. |
| `VLLM_TORCH_PROFILER_DIR` | Enables profiling data collection. Takes effect only when a save path is configured. | String | Path where profiling data is saved. | |
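
As an illustration of how the multi-machine variables in the table above fit together, the following is a minimal sketch of a helper that assembles them into a dictionary before a launch. `build_env` and its parameters are hypothetical names introduced here, not part of vLLM-MindSpore. The sketch also assumes that gloo, TP and HCCL all use the same network interface, which may not hold on every cluster.

```python
from typing import Dict, Iterable, Optional

# Backend names as listed in the table above.
VALID_BACKENDS = ("MindFormers", "Native", "MindONE")


def build_env(
    ifname: str,
    device_ids: Iterable[int],
    backend: Optional[str] = None,
    hccl_buffsize_mb: Optional[int] = None,
) -> Dict[str, str]:
    """Return environment variables for a multi-machine launch.

    Hypothetical helper for illustration only.
    """
    if not ifname:
        raise ValueError("ifname must be a non-empty interface name")

    env = {
        # Assumption of this sketch: one interface carries gloo, TP and HCCL traffic.
        "GLOO_SOCKET_IFNAME": ifname,
        "TP_SOCKET_IFNAME": ifname,
        "HCCL_SOCKET_IFNAME": ifname,
        # Comma-separated device IDs, e.g. "0,1,2,3".
        "ASCEND_RT_VISIBLE_DEVICES": ",".join(str(i) for i in device_ids),
    }

    # Leaving the backend unset keeps the automatic selection order.
    if backend is not None:
        if backend not in VALID_BACKENDS:
            raise ValueError(
                f"unknown backend {backend!r}; expected one of {VALID_BACKENDS}"
            )
        env["VLLM_MS_MODEL_BACKEND"] = backend

    # HCCL_BUFFSIZE is an integer number of megabytes.
    if hccl_buffsize_mb is not None:
        if hccl_buffsize_mb <= 0:
            raise ValueError("HCCL_BUFFSIZE must be a positive size in MB")
        env["HCCL_BUFFSIZE"] = str(hccl_buffsize_mb)

    return env


if __name__ == "__main__":
    env = build_env(
        "enp189s0f0", range(8), backend="MindFormers", hccl_buffsize_mb=2048
    )
    # Print the equivalent shell exports.
    for name, value in sorted(env.items()):
        print(f"export {name}={value}")
```

The resulting dictionary can be merged into the current environment when starting a child process, for example with `subprocess.run(cmd, env={**os.environ, **env})`, or printed as `export` lines as in the `__main__` block. Variables generally need to be set before the serving process starts.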

For more information on environment variables, see the following links: