mindspore.communication
Collective communication interface.
-
mindspore.communication.init(backend_name='hccl')
Initializes the distributed backend (e.g., hccl or nccl). This must be called before any communication service can be used.
Note
The full name of hccl is Huawei Collective Communication Library. The full name of nccl is NVIDIA Collective Communication Library.
- Parameters
backend_name (str) – Name of the distributed backend, e.g., 'hccl' or 'nccl'. Default: 'hccl'.
- Raises
TypeError – If backend_name is not a string.
RuntimeError – If backend is invalid or distributed init fails.
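A minimal sketch of guarding the initialization call against the two documented exceptions. The helper `safe_init`, its `init_fn` parameter, and the stand-in `fake_init` are hypothetical and exist only so the pattern can be exercised without a device. With MindSpore installed, one would presumably pass `mindspore.communication.init` as `init_fn`. The assumption that only 'hccl' and 'nccl' are valid names belongs to the stand-in, not to the library.

```python
def safe_init(init_fn, backend_name="hccl"):
    """Call init_fn(backend_name) and map documented failures to a status string.

    init_fn is a stand-in (assumption) for mindspore.communication.init.
    """
    try:
        init_fn(backend_name)
    except TypeError:
        # Documented: raised when the backend name is not a string.
        return "bad-type"
    except RuntimeError:
        # Documented: raised when the backend is invalid or init fails.
        return "init-failed"
    return "ok"


def fake_init(backend_name):
    """Hypothetical stand-in that mimics the documented Raises section."""
    if not isinstance(backend_name, str):
        raise TypeError("backend_name must be a string")
    if backend_name not in ("hccl", "nccl"):  # assumed set of valid names
        raise RuntimeError("invalid backend")


print(safe_init(fake_init, "hccl"))
print(safe_init(fake_init, "mpi"))
print(safe_init(fake_init, 0))
```

Returning a status string keeps the sketch easy to test. A real training script would more likely log the error and exit.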
-
mindspore.communication.release()
Releases distributed resources, e.g., hccl/nccl.
- Raises
RuntimeError – If distributed resource release fails.
-
mindspore.communication.get_rank(group='hccl_world_group')
Gets the rank ID of the current device in the specified collective communication group.
- Parameters
group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.
- Returns
int, the rank ID of the calling process within the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
-
mindspore.communication.get_group_size(group='hccl_world_group')
Gets the rank size of the specified collective communication group.
- Parameters
group (str) – ProcessGroup, the process group to work on.
- Returns
int, the rank size of the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
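One common use of the rank ID and rank size is to split a dataset so that each device processes a distinct shard. The following is a sketch in plain Python, not MindSpore code: `shard_indices` is a hypothetical helper, and `rank` and `group_size` are hard-coded stand-ins for the values that get_rank() and get_group_size() would return on a real device.

```python
def shard_indices(num_samples, rank, group_size):
    """Return the sample indices assigned to `rank` under round-robin sharding."""
    if not 0 <= rank < group_size:
        raise ValueError("rank must be in [0, group_size)")
    # Rank r takes samples r, r + group_size, r + 2 * group_size, ...
    return list(range(rank, num_samples, group_size))


# Stand-ins for get_rank() and get_group_size() (assumed values).
rank, group_size = 3, 4
print(shard_indices(10, rank, group_size))
```

Round-robin assignment is only one possible choice. Contiguous blocks per rank would work equally well as long as every rank uses the same rule.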
-
mindspore.communication.get_world_rank_from_group_rank(group, group_rank_id)
Gets the rank ID in the world communication group that corresponds to the given rank ID in the specified user communication group.
Note
Nccl is not supported. The parameter group should not be “hccl_world_group”.
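- Parameters
group (str) – The user communication group. It must not be “hccl_world_group”.
group_rank_id (int) – A rank ID in the user communication group.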
- Returns
int, the rank ID in world communication group.
- Raises
TypeError – If group_rank_id is not an int or group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
-
mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)
Gets the rank ID in the specified user communication group that corresponds to the given rank ID in the world communication group.
Note
Nccl is not supported. The parameter group should not be “hccl_world_group”.
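- Parameters
world_rank_id (int) – A rank ID in the world communication group.
group (str) – The user communication group. It must not be “hccl_world_group”.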
- Returns
int, the rank ID in user communication group.
- Raises
TypeError – If world_rank_id is not an int or group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
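The relationship between the two conversion functions can be modeled in plain Python. This is a sketch under a stated assumption, not the library's implementation: it assumes a user group is described by a list of world rank IDs and that group ranks are assigned by position in that list. The helpers `to_world_rank` and `to_group_rank` are hypothetical names.

```python
def to_world_rank(rank_ids, group_rank_id):
    """Map a rank ID inside the group to its world rank ID."""
    if not 0 <= group_rank_id < len(rank_ids):
        raise ValueError("group_rank_id is outside the group")
    # Assumption: group rank i corresponds to the i-th entry of rank_ids.
    return rank_ids[group_rank_id]


def to_group_rank(rank_ids, world_rank_id):
    """Map a world rank ID to its rank ID inside the group."""
    if world_rank_id not in rank_ids:
        raise ValueError("world_rank_id is not a member of the group")
    return rank_ids.index(world_rank_id)


rank_ids = [4, 5, 6, 7]  # world ranks that form a hypothetical user group
print(to_world_rank(rank_ids, 2))
print(to_group_rank(rank_ids, 6))
```

Under this model the two functions are inverses of each other for every member of the group.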
-
mindspore.communication.create_group(group, rank_ids)
Creates a user collective communication group.
Note
Nccl is not supported. The size of rank_ids should be larger than 1, and rank_ids should not contain duplicates.
- Parameters
- Raises
TypeError – If group is not a string or rank_ids is not a list.
ValueError – If the size of rank_ids is not larger than 1, rank_ids contains duplicates, or backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
Examples
>>> from mindspore.communication import init, create_group
>>> init()
>>> group = "0-1"
>>> rank_ids = [0, 1]
>>> create_group(group, rank_ids)
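The argument rules listed under Note and Raises can be checked before the call is made. The following sketch mirrors only the documented conditions in plain Python. `check_group_args` is a hypothetical helper and not part of MindSpore, and it does not cover the backend checks.

```python
def check_group_args(group, rank_ids):
    """Raise the documented exception types for invalid create_group arguments."""
    if not isinstance(group, str):
        raise TypeError("group must be a string")
    if not isinstance(rank_ids, list):
        raise TypeError("rank_ids must be a list")
    if len(rank_ids) <= 1:
        raise ValueError("rank_ids must contain more than one rank")
    if len(set(rank_ids)) != len(rank_ids):
        raise ValueError("rank_ids must not contain duplicates")


for args in [("0-1", [0, 1]), ("0-1", [0]), ("0-1", [0, 0]), (0, [0, 1])]:
    try:
        check_group_args(*args)
        print(args, "accepted")
    except (TypeError, ValueError) as exc:
        print(args, "rejected:", type(exc).__name__)
```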
-
mindspore.communication.get_group(group)
Gets the global world communication group if group is the default world communication group.
-
mindspore.communication.get_local_rank(group='hccl_world_group')
Gets the local rank ID of the current device in the specified collective communication group.
Note
Nccl is not supported.
- Parameters
group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.
- Returns
int, the local rank ID of the calling process within the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
-
mindspore.communication.get_local_rank_size(group='hccl_world_group')
Gets the local rank size of the specified collective communication group.
Note
Nccl is not supported.
- Parameters
group (str) – ProcessGroup, the process group to work on.
- Returns
int, the local rank size for the calling process within the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.
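This page does not define how the local rank relates to the world rank. One common convention, assumed here and not confirmed for hccl, numbers devices contiguously across servers, so that the local rank is the position of a device within its own server and the local rank size is the number of devices per server. The helper names and the value of `devices_per_server` below are illustrative.

```python
def local_rank(world_rank, devices_per_server):
    """Position of a device within its server under contiguous numbering."""
    return world_rank % devices_per_server


def server_index(world_rank, devices_per_server):
    """Index of the server that hosts the device under contiguous numbering."""
    return world_rank // devices_per_server


devices_per_server = 8  # assumed local rank size
for world_rank in (0, 7, 10):
    print(world_rank,
          server_index(world_rank, devices_per_server),
          local_rank(world_rank, devices_per_server))
```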
-
mindspore.communication.destroy_group(group)
Destroys a user collective communication group.
Note
Nccl is not supported. The parameter group should not be “hccl_world_group”.
- Parameters
group (str) – ProcessGroup, the process group to destroy.
- Raises
TypeError – If group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If hccl/nccl is not available or the nccl backend does not support this operation.