mindspore.ops.communication.all_reduce
- mindspore.ops.communication.all_reduce(tensor, op=ReduceOp.SUM, group=None, async_op=False)
Reduces tensors across all devices so that every device obtains the same final result, and returns the all-reduced tensor.
Note
The tensors must have the same shape and format in all processes of the collection.
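As an illustration (not spelled out on this page): with \(N\) devices and the default op=ReduceOp.SUM, every device ends up holding the element-wise sum
\[\text{output} = \sum_{k=0}^{N-1} \text{tensor}^{(k)},\]
where \(\text{tensor}^{(k)}\) is the input tensor on device \(k\); the other ops replace the sum with an element-wise prod, max, or min.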
- Parameters
tensor (Tensor) – The input tensor of the collective. The shape of the tensor is \((x_1, x_2, ..., x_R)\). If the function operates in-place, this is also the output of the collective.
op (str, optional) – Specifies the operation used for element-wise reductions, such as sum, prod, max, and min (illustrated in the sketch after this list). Default: ReduceOp.SUM.
group (str, optional) – The communication group to work on. Default: None, which means "hccl_world_group" on Ascend.
async_op (bool, optional) – Whether this operator should be asynchronous. Default: False.
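For example, a minimal sketch of passing non-default arguments; it assumes ReduceOp is importable from mindspore.ops, which this page does not state:

>>> from mindspore.ops import ReduceOp  # assumption: ReduceOp may live elsewhere
>>> # Element-wise maximum across all devices in the default Ascend group
>>> _ = all_reduce(tensor, op=ReduceOp.MAX, group="hccl_world_group")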
- Returns
If the function operates in-place, returns a CommHandle.
If the function does not operate in-place, returns a tuple (Tensor, CommHandle); the first element stores the output result, and the second element is the CommHandle.
In both cases, when async_op is True, the CommHandle is an asynchronous work handle; when async_op is False, the CommHandle is None.
- Raises
TypeError – If the first input parameter is not a Tensor, if op or group is not a str, if op is out of the legal range, or if async_op is not a bool.
RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails.
- Supported Platforms:
Ascend CPU
Examples
Note
Before running the following examples, you need to configure the communication environment variables.
For Ascend devices, it is recommended to use the msrun startup method, which has no third-party or configuration-file dependencies. Please see msrun startup for more details.
This example should be run with 2 devices.
>>> import numpy as np
>>> from mindspore.ops.communication import init_process_group
>>> from mindspore.ops.communication import all_reduce
>>> from mindspore import Tensor
>>>
>>> init_process_group()
>>> tensor = Tensor(np.ones([2, 8]).astype(np.float32))
>>> output = all_reduce(tensor)
>>> print(tensor)
[[2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]]
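A further sketch of the asynchronous path, assuming the returned CommHandle exposes a wait() method, which this page does not spell out:

>>> # Hedged sketch: in-place all-reduce with async_op=True
>>> t = Tensor(np.ones([2, 8]).astype(np.float32))
>>> handle = all_reduce(t, async_op=True)
>>> if handle is not None:  # handle is None when async_op is False
...     handle.wait()  # assumption: block until the reduction completes
>>> print(t.shape)
(2, 8)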