mindspore.ops.communication.gather_into_tensor

mindspore.ops.communication.gather_into_tensor(output_tensor, input_tensor, dst=0, group=None, async_op=False)

Gathers tensors from all processes in the specified communication group. The input tensors from the processes are gathered along dimension 0.

Note

Only the process dst (global rank) keeps the gathered tensor. The other processes keep a tensor with shape [1], which has no mathematical meaning.
Only PyNative mode is supported; Graph mode is not currently supported.

Parameters:

output_tensor (Tensor) – Output tensor to accommodate tensor elements from all ranks.
input_tensor (Tensor) – The tensor to be gathered. The shape of tensor is \((x_1, x_2, ..., x_R)\). The input tensors in this API must have the same size across all ranks.
dst (int, optional) – The global rank of the process that receives the gathered tensor. Only process dst receives the result. Default: 0.
group (str, optional) – The communication group to work on. Default: None, which means "hccl_world_group" on Ascend.
async_op (bool, optional) – Whether this operator should be an async operator. Default: False.
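The relationship between input_tensor and output_tensor can be sketched with a small shape calculation. The helper name expected_output_shape is hypothetical and not part of MindSpore. It assumes that gathering along dimension 0 multiplies the first dimension by the number of ranks in the group, which is inferred from the example on this page (2 ranks, input shape [2, 2], output shape [4, 2]).

```python
from typing import Sequence, Tuple


def expected_output_shape(input_shape: Sequence[int], world_size: int) -> Tuple[int, ...]:
    """Hypothetical helper (not a MindSpore API).

    Assumption: every rank contributes a tensor of shape (x_1, x_2, ..., x_R),
    and the gathered result stacks them along dimension 0, giving
    (world_size * x_1, x_2, ..., x_R).
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    if len(input_shape) == 0:
        raise ValueError("input_shape must have at least one dimension")
    return (input_shape[0] * world_size, *input_shape[1:])


# Two ranks, each holding a [2, 2] tensor, as in the example on this page.
print(expected_output_shape((2, 2), world_size=2))  # (4, 2)
```

A check like this can be useful before the call to confirm that the pre-allocated output_tensor is large enough for the group size in use.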

Returns:

CommHandle. If async_op is set to True, CommHandle is an async work handle. If async_op is set to False, CommHandle will be None.

Raises:

TypeError – If the type of the input_tensor or output_tensor parameter is not Tensor, dst is not an int, group is not a str, or async_op is not bool.
RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails.

Supported Platforms: Ascend

Examples

Note

Before running the following examples, you need to configure the communication environment variables.

For Ascend devices, it is recommended to use the msrun startup method without any third-party or configuration file dependencies. Please see the msrun startup for more details.

This example should be run with 2 devices.

>>> import numpy as np
>>> import mindspore as ms
>>> import mindspore.nn as nn
>>> from mindspore.ops.communication import init_process_group
>>> from mindspore import Tensor
>>> from mindspore.ops.communication import gather_into_tensor
>>> # Launch 2 processes.
>>>
>>> init_process_group()
>>> input = Tensor(np.arange(4).reshape([2, 2]).astype(np.float32))
>>> output = Tensor(np.zeros([4, 2]).astype(np.float32))
>>> handle = gather_into_tensor(output, input, dst=0)
>>> print(output)
Process with rank 0: [[0. 1.],
                      [2. 3.],
                      [0. 1.],
                      [2. 3.]]
Process with rank 1:  [[0. 0.],
                      [0. 0.],
                      [0. 0.],
                      [0. 0.]]
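
The output printed above can be reproduced without any devices by a single-process NumPy simulation. This is only a sketch of the gather semantics shown in the example, not the MindSpore implementation: simulate_gather_into_tensor is a hypothetical function, and it assumes that the destination rank receives the inputs concatenated along dimension 0 while every other rank keeps its pre-allocated output unchanged.

```python
import numpy as np


def simulate_gather_into_tensor(outputs, inputs, dst=0):
    """Hypothetical single-process stand-in for a gather along dimension 0.

    outputs[r] and inputs[r] play the role of output_tensor and input_tensor
    on rank r. Only outputs[dst] is written; other ranks are left untouched.
    """
    gathered = np.concatenate(inputs, axis=0)  # stack all ranks along dim 0
    outputs[dst][...] = gathered               # in-place write on dst only
    return outputs


world_size = 2
# Every rank holds the same [2, 2] input, as in the example above.
inputs = [np.arange(4, dtype=np.float32).reshape(2, 2) for _ in range(world_size)]
# Every rank pre-allocates a zero-filled [4, 2] output.
outputs = [np.zeros((4, 2), dtype=np.float32) for _ in range(world_size)]

simulate_gather_into_tensor(outputs, inputs, dst=0)
for rank, out in enumerate(outputs):
    print(f"Process with rank {rank}:")
    print(out)
```

In this simulation rank 0 ends up with both [2, 2] inputs stacked into a [4, 2] array, and rank 1 still holds zeros, matching the printed result of the example.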