lite_boost.parallel.initialize_usp

lite_boost.parallel.initialize_usp()

Initialize the HCCL distributed environment for parallel inference.

This function configures the NPU runtime settings and initializes the HCCL distributed process group by reading the following environment variables:

  • RANK: Local rank of the current process. Default: 0.

  • WORLD_SIZE: Total number of distributed processes. Default: 1.

  • MASTER_ADDR: IP address of the master node. Default: "127.0.0.1".

  • MASTER_PORT: Port of the master node. Default: 29502.

  • NUM_THREADS: Number of CPU threads per process. Default: 24.
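
The lookup described above can be pictured with a minimal sketch that reads the same variables and falls back to the documented defaults. The helper name read_dist_env and the shape of the returned dictionary are assumptions made for illustration only; they are not part of lite_boost and do not show how initialize_usp is actually implemented.

```python
import os

# Defaults copied from the list above. The helper name and the returned
# dict layout are illustrative assumptions, not part of lite_boost.
_DEFAULTS = {
    "RANK": "0",
    "WORLD_SIZE": "1",
    "MASTER_ADDR": "127.0.0.1",
    "MASTER_PORT": "29502",
    "NUM_THREADS": "24",
}


def read_dist_env(environ=None):
    """Return the distributed settings, using documented defaults when unset."""
    env = os.environ if environ is None else environ
    raw = {name: env.get(name, default) for name, default in _DEFAULTS.items()}
    return {
        "rank": int(raw["RANK"]),
        "world_size": int(raw["WORLD_SIZE"]),
        "master_addr": raw["MASTER_ADDR"],
        "master_port": int(raw["MASTER_PORT"]),
        "num_threads": int(raw["NUM_THREADS"]),
    }
```

Calling read_dist_env() with no argument inspects os.environ, which can be a quick way to check what a process would see before initialize_usp() runs.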

If the distributed process group has not yet been initialized, this function initializes it with the hccl backend. The NPU device corresponding to RANK is then set as the active device.

Note

This function must be called before constructing ParallelManager. It is typically invoked at the entry point of a distributed training or inference script.

Raises:

RuntimeError – If HCCL process group initialization fails.

Examples

>>> import os
>>> os.environ["RANK"] = "0"
>>> os.environ["WORLD_SIZE"] = "1"
>>> from lite_boost.parallel import initialize_usp
>>> initialize_usp()
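
For runs with more than one process, each process needs its own RANK while sharing WORLD_SIZE, MASTER_ADDR, and MASTER_PORT. The sketch below builds such a per-rank environment using only the standard library. The helper build_rank_env is hypothetical and not part of lite_boost; how the processes are actually launched depends on your setup.

```python
import os


def build_rank_env(rank, world_size, master_addr="127.0.0.1",
                   master_port=29502, base_env=None):
    """Return a copy of the environment with the variables for one rank set.

    Hypothetical helper for illustration; not part of lite_boost.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    env = dict(os.environ if base_env is None else base_env)
    # Environment variables are strings, so numeric values are converted.
    env.update({
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
    })
    return env
```

Each returned dictionary could then be passed as the env argument of subprocess.Popen when starting a script that calls initialize_usp() at its entry point.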