mindspore.communication

Collective communication interface.

class mindspore.communication.GlobalComm[source]

Global communication information, such as the backend in use and the name of the world communication group.

mindspore.communication.create_group(group, rank_ids)[source]

Creates a user-defined collective communication group.

Note

NCCL is not supported. The size of rank_ids should be larger than 1, and rank_ids must not contain duplicate values.

Parameters
  • group (str) – ProcessGroup, the process group to create.

  • rank_ids (list) – List of device IDs.

Raises
  • TypeError – If group is not a string or rank_ids is not a list.

  • ValueError – If the size of rank_ids is not larger than 1, rank_ids contains duplicate values, or the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available, or the NCCL backend does not support this operation.

Examples

>>> from mindspore.communication import create_group
>>> group = "0-1"
>>> rank_ids = [0, 1]
>>> create_group(group, rank_ids)

mindspore.communication.destroy_group(group)[source]

Destroys a user-defined collective communication group.

Note

NCCL is not supported. The parameter group must not be “hccl_world_group”.

Parameters

group (str) – ProcessGroup, the process group to destroy.

Raises
  • TypeError – If group is not a string.

  • ValueError – If group is “hccl_world_group” or the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available, or the NCCL backend does not support this operation.

mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)[source]

Gets the rank ID in the specified user communication group that corresponds to the given rank ID in the world communication group.

Note

NCCL is not supported. The parameter group must not be “hccl_world_group”.

Parameters
  • world_rank_id (int) – A rank ID in the world communication group.

  • group (str) – The user communication group.

Returns

int, the rank ID in the user communication group.

Raises
  • TypeError – If world_rank_id is not an int or group is not a string.

  • ValueError – If group is “hccl_world_group” or the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available, or the NCCL backend does not support this operation.
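Conceptually, when a group is created from a list of world rank IDs, the group rank of a device is the index of its world rank within that list. A pure-Python sketch of the mapping (illustration only, not a MindSpore call; rank_ids stands for the list passed to create_group):

```python
def group_rank_from_world_rank(world_rank_id, rank_ids):
    # rank_ids is the list passed to create_group; the group rank is
    # the position of the world rank within that list.
    if world_rank_id not in rank_ids:
        raise ValueError(f"rank {world_rank_id} is not in this group")
    return rank_ids.index(world_rank_id)

# A group created from world ranks [4, 5, 6]:
print(group_rank_from_world_rank(5, [4, 5, 6]))  # group rank 1
```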

mindspore.communication.get_group_size(group='hccl_world_group')[source]

Gets the rank size of the specified collective communication group.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the rank size of the group.

Raises
  • TypeError – If group is not a string.

  • ValueError – If the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available.

mindspore.communication.get_local_rank(group='hccl_world_group')[source]

Gets the local rank ID for the current device in the specified collective communication group.

Note

NCCL is not supported.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the local rank ID of the calling process within the group.

Raises
  • TypeError – If group is not a string.

  • ValueError – If the backend is invalid.

  • RuntimeError – If HCCL is not available.

mindspore.communication.get_local_rank_size(group='hccl_world_group')[source]

Gets the local rank size of the specified collective communication group.

Note

NCCL is not supported.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the local rank size of the node on which the calling process is running, within the group.

Raises
  • TypeError – If group is not a string.

  • ValueError – If the backend is invalid.

  • RuntimeError – If HCCL is not available.

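Conceptually, the local rank distinguishes devices that share a node: on a homogeneous cluster, a world rank decomposes into a node index and a local rank, and the local rank size is the number of devices on that node. A pure-Python sketch of this decomposition (illustration only, not a MindSpore call; devices_per_node is an assumed topology parameter):

```python
def local_topology(world_rank, devices_per_node):
    # With a homogeneous cluster, the node index and local rank of a
    # device follow from its world rank; the local rank size is the
    # number of devices sharing that node.
    node_id = world_rank // devices_per_node
    local_rank = world_rank % devices_per_node
    local_rank_size = devices_per_node
    return node_id, local_rank, local_rank_size

# 8 devices per node: world rank 11 is local rank 3 on node 1.
print(local_topology(11, 8))  # (1, 3, 8)
```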
mindspore.communication.get_rank(group='hccl_world_group')[source]

Gets the rank ID for the current device in the specified collective communication group.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the rank ID of the calling process within the group.

Raises
  • TypeError – If group is not a string.

  • ValueError – If the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available.

mindspore.communication.get_world_rank_from_group_rank(group, group_rank_id)[source]

Gets the rank ID in the world communication group that corresponds to the given rank ID in the specified user communication group.

Note

NCCL is not supported. The parameter group must not be “hccl_world_group”.

Parameters
  • group (str) – The user communication group.

  • group_rank_id (int) – A rank ID in the user communication group.

Returns

int, the rank ID in the world communication group.

Raises
  • TypeError – If group_rank_id is not an int or group is not a string.

  • ValueError – If group is “hccl_world_group” or the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available, or the NCCL backend does not support this operation.
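This is the inverse of get_group_rank_from_world_rank: the world rank is the entry of the group's rank list at the group rank position. A pure-Python sketch (illustration only, not a MindSpore call; rank_ids stands for the list passed to create_group):

```python
def world_rank_from_group_rank(rank_ids, group_rank_id):
    # rank_ids is the list passed to create_group; indexing it with the
    # group rank recovers the world rank.
    return rank_ids[group_rank_id]

# A group created from world ranks [4, 5, 6]:
print(world_rank_from_group_rank([4, 5, 6], 1))  # world rank 5
```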

mindspore.communication.init(backend_name='hccl')[source]

Initializes the distributed backend, e.g., HCCL or NCCL; this is required before the communication service can be used.

Note

The full name of hccl is Huawei Collective Communication Library. The full name of nccl is NVIDIA Collective Communication Library.

Parameters

backend_name (str) – The backend to initialize, “hccl” or “nccl”. Default: “hccl”.

Raises
  • TypeError – If backend_name is not a string.

  • RuntimeError – If backend is invalid or distributed init fails.
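A typical lifecycle pairs init with release around the communication calls. A minimal sketch, assuming an Ascend cluster with the HCCL backend and a multi-process launcher that has already provisioned the devices (it will not run on a single unprovisioned machine):

```python
def main():
    # Assumes the process was started by a distributed launcher and
    # device/rank environment variables are already set.
    from mindspore.communication import init, get_rank, get_group_size, release

    init("hccl")                 # initialize the HCCL backend
    rank = get_rank()            # world rank of this process
    size = get_group_size()      # total number of processes
    print(f"rank {rank} of {size}")
    release()                    # free communication resources before exit

if __name__ == "__main__":
    main()
```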

mindspore.communication.release()[source]

Releases distributed resources, e.g., HCCL or NCCL.

Raises

RuntimeError – If the release of distributed resources fails.