mindspore.mint.distributed.init_process_group
- mindspore.mint.distributed.init_process_group(backend='hccl', init_method=None, timeout=None, world_size=-1, rank=-1, store=None, pg_options=None, device_id=None)
Initialize the collective communication library and create a default collective communication group.
Note
This method isn't supported in the GPU and CPU versions of MindSpore. On Ascend hardware platforms, this API should be called before the definition of any Tensor or Parameter, and before the instantiation or execution of any operation or network.
- Parameters
backend (str, optional) – The backend to use. Default is "hccl"; currently only hccl is supported.
init_method (str, optional) – URL specifying how to initialize the collective communication group. Default is None.
timeout (timedelta, optional) – Timeout for API execution. Default is None. Currently, this parameter is only supported for host-side cluster network configuration using init_method or store.
world_size (int, optional) – Number of processes participating in the job. Default is -1.
rank (int, optional) – Rank of the current process. Default is -1.
store (Store, optional) – An object that stores key/value data, facilitating the exchange of inter-process communication addresses and connection information. Default is None. Currently, only the TCPStore type is supported.
pg_options (ProcessGroupOptions, invalid) – Process group options specifying which additional options to pass in when constructing a specific process group. This is a reserved parameter; setting it currently has no effect.
device_id (int, invalid) – The device ID to execute on. This is a reserved parameter; setting it currently has no effect.
- Raises
ValueError – If backend is not hccl.
ValueError – If world_size is neither -1 nor the process group number.
ValueError – If both init_method and store are set.
ValueError – If world_size is not set to a positive integer when initializing with init_method or store.
ValueError – If rank is not set to a non-negative integer when initializing with init_method or store.
RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails, or the environment variables RANK_ID/MINDSPORE_HCCL_CONFIG_PATH have not been exported when backend is HCCL.
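The argument-related ValueError conditions listed above can be checked before launching a job. The following is a minimal sketch in plain Python, not MindSpore's actual implementation. The helper name check_init_args is hypothetical, and it only mirrors the documented rules that can be evaluated without a running cluster. The comparison of world_size against the process group number is left out because it needs runtime information.

```python
def check_init_args(backend="hccl", init_method=None, store=None,
                    world_size=-1, rank=-1):
    """Hypothetical pre-check mirroring the documented ValueError rules.

    This is an illustrative helper and is not part of MindSpore.
    """
    # Only the hccl backend is documented as supported.
    if backend != "hccl":
        raise ValueError(f"backend must be 'hccl', got {backend!r}")

    # init_method and store must not be set at the same time.
    if init_method is not None and store is not None:
        raise ValueError("init_method and store cannot both be set")

    # When either init_method or store is used, world_size and rank
    # must be given explicitly instead of keeping the -1 defaults.
    if init_method is not None or store is not None:
        if not isinstance(world_size, int) or world_size <= 0:
            raise ValueError("world_size must be a positive integer")
        if not isinstance(rank, int) or rank < 0:
            raise ValueError("rank must be a non-negative integer")
```

With the defaults (no init_method and no store), the check passes and world_size and rank may stay at -1.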
- Supported Platforms:
Ascend
Examples
Note
Before running the following examples, you need to configure the communication environment variables.
For Ascend devices, it is recommended to use the msrun startup method, which has no third-party or configuration file dependencies. See the msrun startup documentation for more details.
>>> import mindspore as ms
>>> from mindspore.mint.distributed import init_process_group, destroy_process_group
>>> ms.set_device(device_target="Ascend")
>>> init_process_group()
>>> destroy_process_group()
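The example pairs init_process_group with destroy_process_group. One way to make sure the group is destroyed even when the code in between raises is a small context manager. The sketch below is an assumption for illustration only: process_group is a hypothetical helper, not a MindSpore API. It takes the init and destroy functions as arguments so that it does not depend on MindSpore being importable.

```python
from contextlib import contextmanager


@contextmanager
def process_group(init_fn, destroy_fn, **init_kwargs):
    """Hypothetical helper: call init_fn on entry and destroy_fn on exit."""
    init_fn(**init_kwargs)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        destroy_fn()


# Intended usage on an Ascend machine launched with msrun (not run here):
#
#     import mindspore as ms
#     from mindspore.mint.distributed import (
#         init_process_group, destroy_process_group)
#
#     ms.set_device(device_target="Ascend")
#     with process_group(init_process_group, destroy_process_group):
#         ...  # distributed work goes here
```

If init_fn itself raises, destroy_fn is not called, since no group was created.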