lite_boost.parallel.initialize_usp
lite_boost.parallel.initialize_usp()
Initialize the HCCL distributed environment for parallel inference.
This function configures the NPU runtime settings and initializes the HCCL distributed process group by reading the following environment variables:
- RANK: Local rank of the current process. Default: 0.
- WORLD_SIZE: Total number of distributed processes. Default: 1.
- MASTER_ADDR: IP address of the master node. Default: "127.0.0.1".
- MASTER_PORT: Port of the master node. Default: 29502.
- NUM_THREADS: Number of CPU threads per process. Default: 24.
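The variables and defaults listed above can be sketched as a small parsing helper. This is only an illustration of the documented defaults, not the library's actual implementation: `DistEnv` and `read_dist_env` are hypothetical names that do not exist in `lite_boost`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DistEnv:
    """Hypothetical container for the documented settings (not part of lite_boost)."""
    rank: int
    world_size: int
    master_addr: str
    master_port: int
    num_threads: int


def read_dist_env(env: Optional[Mapping[str, str]] = None) -> DistEnv:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return DistEnv(
        rank=int(env.get("RANK", "0")),
        world_size=int(env.get("WORLD_SIZE", "1")),
        master_addr=env.get("MASTER_ADDR", "127.0.0.1"),
        master_port=int(env.get("MASTER_PORT", "29502")),
        num_threads=int(env.get("NUM_THREADS", "24")),
    )
```

Calling `read_dist_env({})` yields the documented defaults, which can be handy for checking what a process would use before `initialize_usp()` is invoked.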
If the distributed process group has not been initialized, this function will initialize it with the hccl backend. After initialization, the NPU device corresponding to RANK is set as the active device.

Note
This function must be called before constructing ParallelManager. It is typically invoked at the entry point of a distributed training or inference script.

Raises:
RuntimeError – If HCCL process group initialization fails.
Examples
>>> import os
>>> os.environ["RANK"] = "0"
>>> os.environ["WORLD_SIZE"] = "1"
>>> from lite_boost.parallel import initialize_usp
>>> initialize_usp()
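For runs with more than one process, each process needs its own RANK and a shared WORLD_SIZE, MASTER_ADDR, and MASTER_PORT. The sketch below builds such an environment for one rank. `build_rank_env` is a hypothetical helper, not part of `lite_boost`, and it assumes a single-node launch in which RANK also selects the NPU device, as described above.

```python
import os
from typing import Dict


def build_rank_env(
    rank: int,
    world_size: int,
    master_addr: str = "127.0.0.1",
    master_port: int = 29502,
) -> Dict[str, str]:
    """Return a copy of the current environment with the per-rank variables set.

    Hypothetical helper for illustration only; the defaults mirror the
    documented defaults of initialize_usp().
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    env = dict(os.environ)  # copy, so the parent environment is left untouched
    env.update(
        {
            "RANK": str(rank),              # environment values must be strings
            "WORLD_SIZE": str(world_size),
            "MASTER_ADDR": master_addr,
            "MASTER_PORT": str(master_port),
        }
    )
    return env
```

Each worker would then be started with its own environment (for example via the `env` argument of `subprocess.Popen`) and call `initialize_usp()` at its entry point, before constructing ParallelManager.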