mindspore.communication

Collective communication interface. Note that the APIs in this module require the communication environment variables to be set in advance. On Ascend devices, users need to prepare the rank table and set rank_id and device_id; see the Ascend tutorial for details. On GPU devices, users need to prepare the host file and MPI; see the GPU tutorial for details.

class mindspore.communication.GlobalComm[source]

World communication information. GlobalComm is a global class with the following members:

  • BACKEND: The communication library in use, either HCCL or NCCL.

  • WORLD_COMM_GROUP: Global communication domain.

mindspore.communication.create_group(group, rank_ids)[source]

Create a user collective communication group.

Note

GPU version of MindSpore doesn’t support this method. rank_ids must contain more than one entry and must not contain duplicates. This method should be used after init(). PyNative mode supports only the single global communication group. Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters
  • group (str) – The name of the communication group to be created.

  • rank_ids (list) – A list of device IDs.

Raises
  • TypeError – If group is not a string or rank_ids is not a list.

  • ValueError – If rank_ids size is not larger than 1, or rank_ids has duplicate data, or backend is invalid.

  • RuntimeError – If HCCL is not available or MindSpore is GPU version.
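The argument constraints listed above can be checked in plain Python before create_group is called. The following is a minimal sketch only: `check_create_group_args` is a hypothetical helper, not part of MindSpore, and it mirrors just the documented TypeError and ValueError conditions on group and rank_ids (it does not check the backend).

```python
def check_create_group_args(group, rank_ids):
    """Hypothetical pre-check mirroring the documented create_group constraints."""
    if not isinstance(group, str):
        raise TypeError("group must be a string")
    if not isinstance(rank_ids, list):
        raise TypeError("rank_ids must be a list")
    if len(rank_ids) <= 1:
        raise ValueError("rank_ids must contain more than one rank")
    if len(set(rank_ids)) != len(rank_ids):
        raise ValueError("rank_ids must not contain duplicates")
    return True


# Arguments taken from the create_group example in this section.
print(check_create_group_args("0-8", [0, 8]))
```

Running such a check first gives an early, local error message instead of a failure inside the communication library.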

Supported Platforms:

Ascend

Examples

>>> from mindspore import set_context
>>> from mindspore.ops import operations as ops
>>> from mindspore.communication.management import init, create_group
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-8"
>>> rank_ids = [0,8]
>>> create_group(group, rank_ids)
>>> allreduce = ops.AllReduce(group)
mindspore.communication.destroy_group(group)[source]

Destroy the user collective communication group.

Note

GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init().

Parameters

group (str) – The communication group to destroy. The group must have been created by create_group.

Raises
  • TypeError – If group is not a string.

  • ValueError – If group is “hccl_world_group” or backend is invalid.

  • RuntimeError – If HCCL is not available or MindSpore is GPU version.

Supported Platforms:

Ascend

mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)[source]

Get the rank ID in the specified user communication group corresponding to the rank ID in the world communication group.

Note

GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters
  • world_rank_id (int) – A rank ID in the world communication group.

  • group (str) – The communication group to work on. The group is created by create_group.

Returns

int, the rank ID in the user communication group.

Raises
  • TypeError – If world_rank_id is not an integer or the group is not a string.

  • ValueError – If group is ‘hccl_world_group’ or backend is invalid.

  • RuntimeError – If HCCL is not available or MindSpore is GPU version.

Supported Platforms:

Ascend

Examples

>>> from mindspore import set_context
>>> from mindspore.communication.management import init, create_group, get_group_rank_from_world_rank
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-4"
>>> rank_ids = [0,4]
>>> create_group(group, rank_ids)
>>> group_rank_id = get_group_rank_from_world_rank(4, group)
>>> print("group_rank_id is: ", group_rank_id)
group_rank_id is: 1
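
The relationship between world ranks and group ranks in the example above can be sketched in plain Python. This is an illustration under a stated assumption, not the library's implementation: it assumes that a device's group rank is its position in the rank_ids list passed to create_group, which is consistent with the example output (world rank 4 in [0, 4] has group rank 1). Both function names are hypothetical.

```python
def group_rank_of(world_rank_id, rank_ids):
    # Assumption: group rank = index of the world rank within rank_ids.
    return rank_ids.index(world_rank_id)


def world_rank_of(group_rank_id, rank_ids):
    # Inverse lookup under the same assumption.
    return rank_ids[group_rank_id]


rank_ids = [0, 4]
print("group rank of world rank 4:", group_rank_of(4, rank_ids))
print("world rank of group rank 1:", world_rank_of(1, rank_ids))
```

Under this assumption the two lookups are inverses of each other for every rank in the group.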
mindspore.communication.get_group_size(group=GlobalComm.WORLD_COMM_GROUP)[source]

Get the rank size of the specified collective communication group.

Note

This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters

group (str) – The communication group to work on. Normally, the group is created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.

Returns

int, the rank size of the group.

Raises
Supported Platforms:

Ascend GPU

Examples

>>> import mindspore as ms
>>> from mindspore.communication.management import init, get_group_size
>>> ms.set_auto_parallel_context(device_num=8)
>>> init()
>>> group_size = get_group_size()
>>> print("group_size is: ", group_size)
group_size is: 8
mindspore.communication.get_local_rank(group=GlobalComm.WORLD_COMM_GROUP)[source]

Get the local rank ID of the current device in the specified collective communication group.

Note

GPU version of MindSpore doesn’t support this method. This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters

group (str) – The communication group to work on. Normally, the group is created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.

Returns

int, the local rank ID of the calling process within the group.

Raises
Supported Platforms:

Ascend

Examples

>>> import mindspore as ms
>>> from mindspore.communication.management import init, get_rank, get_local_rank
>>> ms.set_context(device_target="Ascend")
>>> ms.set_auto_parallel_context(device_num=16) # 2 servers, each with 8 NPUs.
>>> init()
>>> world_rank = get_rank()
>>> local_rank = get_local_rank()
>>> print("local_rank is: {}, world_rank is {}".format(local_rank, world_rank))
local_rank is: 1, world_rank is 9
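
The example output above (world rank 9, local rank 1, with 8 NPUs per server) can be reproduced with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that the source does not state: world ranks are assigned contiguously server by server, and every server hosts the same number of devices. `split_world_rank` is a hypothetical helper.

```python
def split_world_rank(world_rank, local_rank_size):
    """Return (server_index, local_rank) under the contiguous-layout assumption."""
    if local_rank_size <= 0:
        raise ValueError("local_rank_size must be positive")
    # divmod gives the quotient (server index) and remainder (local rank).
    return divmod(world_rank, local_rank_size)


server_index, local_rank = split_world_rank(9, 8)
print("server_index is: {}, local_rank is: {}".format(server_index, local_rank))
```

With other rank layouts, such as a custom rank table, this arithmetic does not apply, and the values returned by get_local_rank and get_local_rank_size should be used.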
mindspore.communication.get_local_rank_size(group=GlobalComm.WORLD_COMM_GROUP)[source]

Get the local rank size of the specified collective communication group.

Note

GPU version of MindSpore doesn’t support this method. This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters

group (str) – The communication group to work on. This is either a group created by create_group or the default world communication group. Default: WORLD_COMM_GROUP.

Returns

int, the local rank size of the group on the server where the calling process runs.

Raises
Supported Platforms:

Ascend

Examples

>>> import mindspore as ms
>>> from mindspore.communication.management import init, get_local_rank_size
>>> ms.set_context(device_target="Ascend")
>>> ms.set_auto_parallel_context(device_num=16) # 2 servers, each with 8 NPUs.
>>> init()
>>> local_rank_size = get_local_rank_size()
>>> print("local_rank_size is: ", local_rank_size)
local_rank_size is: 8
mindspore.communication.get_rank(group=GlobalComm.WORLD_COMM_GROUP)[source]

Get the rank ID for the current device in the specified collective communication group.

Note

This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters

group (str) – The communication group to work on. Normally, the group is created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.

Returns

int, the rank ID of the calling process within the group.

Raises
Supported Platforms:

Ascend GPU

Examples

>>> from mindspore.communication import init, get_rank
>>> init()
>>> rank_id = get_rank()
>>> print(rank_id)
>>> # the result is the rank_id in world_group
mindspore.communication.get_world_rank_from_group_rank(group, group_rank_id)[source]

Get the rank ID in the world communication group corresponding to the rank ID in the specified user communication group.

Note

GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters
  • group (str) – The communication group to work on. The group is created by create_group.

  • group_rank_id (int) – A rank ID in the communication group.

Returns

int, the rank ID in the world communication group.

Raises
  • TypeError – If group_rank_id is not an integer or the group is not a string.

  • ValueError – If group is ‘hccl_world_group’ or backend is invalid.

  • RuntimeError – If HCCL is not available or MindSpore is GPU version.

Supported Platforms:

Ascend

Examples

>>> from mindspore import set_context
>>> from mindspore.communication.management import init, create_group, get_world_rank_from_group_rank
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-4"
>>> rank_ids = [0,4]
>>> create_group(group, rank_ids)
>>> world_rank_id = get_world_rank_from_group_rank(group, 1)
>>> print("world_rank_id is: ", world_rank_id)
world_rank_id is: 4
mindspore.communication.init(backend_name=None)[source]

Initialize the distributed backend, e.g. HCCL or NCCL. This is required before using the communication service.

Note

  • The full name of HCCL is Huawei Collective Communication Library.

  • The full name of NCCL is NVIDIA Collective Communication Library.

  • The full name of MCCL is MindSpore Collective Communication Library.

  • This method should be used after set_context. Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Parameters

backend_name (str) – The backend to use: HCCL, NCCL, or MCCL. If backend_name is None, the system selects the backend according to the device target. Default: None.
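
The fallback for backend_name=None can be pictured as a lookup from device target to backend. The sketch below is an assumption-laden illustration, not MindSpore's code: the mapping Ascend to "hccl" and GPU to "nccl" is inferred from the backends and platforms named in this section, the lowercase strings are assumed labels, and `resolve_backend` is a hypothetical helper.

```python
# Assumed mapping, inferred from the platforms and backends named in this section.
BACKEND_BY_TARGET = {"Ascend": "hccl", "GPU": "nccl"}


def resolve_backend(device_target, backend_name=None):
    """Hypothetical helper: pick a backend name the way the text describes."""
    if backend_name is not None:
        if not isinstance(backend_name, str):
            raise TypeError("backend_name must be a string")
        return backend_name
    if device_target not in BACKEND_BY_TARGET:
        raise RuntimeError("invalid device target: {!r}".format(device_target))
    return BACKEND_BY_TARGET[device_target]


print(resolve_backend("Ascend"))
print(resolve_backend("GPU"))
```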

Raises
  • TypeError – If backend_name is not a string.

  • RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails, or the environment variables RANK_ID/MINDSPORE_HCCL_CONFIG_PATH have not been exported when backend is HCCL.
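
Because init() raises RuntimeError when RANK_ID or MINDSPORE_HCCL_CONFIG_PATH has not been exported for the HCCL backend, a launch script may want to verify both variables first. The following is a minimal sketch in plain Python; `missing_hccl_env` is a hypothetical helper, and the rank table path in the usage line is a placeholder.

```python
import os

# The two variables named in the RuntimeError description above.
REQUIRED_HCCL_ENV = ("RANK_ID", "MINDSPORE_HCCL_CONFIG_PATH")


def missing_hccl_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_HCCL_ENV if not env.get(name)]


# Usage with an explicit mapping; the path is a placeholder, not a real file.
example_env = {"RANK_ID": "0", "MINDSPORE_HCCL_CONFIG_PATH": "/path/to/rank_table.json"}
print(missing_hccl_env(example_env))
```

Passing the environment as a mapping keeps the check easy to test; calling it without arguments inspects the real process environment.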

Supported Platforms:

Ascend GPU

Examples

>>> from mindspore.communication import init
>>> init()
mindspore.communication.release()[source]

Release distributed resources, e.g. HCCL/NCCL.

Note

This method should be used after init(). Before running the following example, preset the communication environment variables; see the docstring of mindspore.communication.

Raises

RuntimeError – If the distributed resources fail to be released.

Supported Platforms:

Ascend GPU

Examples

>>> from mindspore.communication import init, release
>>> init()
>>> release()
mindspore.communication.HCCL_WORLD_COMM_GROUP

The string “hccl_world_group”, which refers to the default communication group created by HCCL.

mindspore.communication.NCCL_WORLD_COMM_GROUP

The string “nccl_world_group”, which refers to the default communication group created by NCCL.
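
Several methods in this module reject “hccl_world_group” as the group argument, so user code that manages its own groups may want to recognize the default group names. The sketch below uses only the two string values documented above; the dictionary keys "hccl" and "nccl" are assumed labels, and `is_world_group` is a hypothetical helper, not a MindSpore API.

```python
# String values as documented above; the keys are assumed backend labels.
WORLD_GROUP_BY_BACKEND = {
    "hccl": "hccl_world_group",
    "nccl": "nccl_world_group",
}


def is_world_group(group):
    """Return True if the name is one of the default world communication groups."""
    return group in WORLD_GROUP_BY_BACKEND.values()


print(is_world_group("hccl_world_group"))
print(is_world_group("0-4"))
```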