mindspore.communication.comm_func.all_to_all_v_c
- mindspore.communication.comm_func.all_to_all_v_c(output, input, send_count_matrix, group=None, async_op=False)
- Splits the input tensor according to the user-specified split sizes and sends the chunks to the other devices; each device then merges the chunks it receives into a single output tensor.
- Note
- Only PyNative mode is supported; Graph mode is not currently supported.
- Parameters
- output (Tensor) – The output tensor, formed by concatenating the chunks gathered from remote ranks.
- input (Tensor) – The tensor to be scattered to remote ranks.
- send_count_matrix (list[int]) – The send and receive counts of all ranks. \(\text{send_count_matrix}[i*\text{rank_size}+j]\) is the amount of data sent by rank i to rank j, measured in elements along the first dimension. Here, rank_size is the size of the communication group.
- group (str, optional) – The communication group to work on. If None, "hccl_world_group" is used on Ascend. Default: None.
- async_op (bool, optional) – Whether this operator should be an async operator. Default: False.
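The flattened layout of send_count_matrix can be easier to read when decoded per rank. The following is a minimal pure-Python sketch, not part of the MindSpore API: `split_counts` is a hypothetical helper that reads row `rank` as the send counts and column `rank` as the receive counts. That the first dimension of output should equal the sum of the receive counts is an assumption drawn from the description above, not a documented check.

```python
def split_counts(send_count_matrix, rank_size, rank):
    """Hypothetical helper: return (send_counts, recv_counts) for one rank."""
    if len(send_count_matrix) != rank_size * rank_size:
        raise ValueError("send_count_matrix must have rank_size * rank_size entries")
    # Row `rank`: first-dimension elements this rank sends to each peer j.
    send_counts = [send_count_matrix[rank * rank_size + j] for j in range(rank_size)]
    # Column `rank`: first-dimension elements each peer i sends to this rank.
    recv_counts = [send_count_matrix[i * rank_size + rank] for i in range(rank_size)]
    return send_counts, recv_counts


matrix = [0, 3, 3, 0]  # the 2-rank matrix used in the example on this page
for rank in range(2):
    send_counts, recv_counts = split_counts(matrix, rank_size=2, rank=rank)
    # Assumption: sum(recv_counts) is the first-dimension size expected for `output`.
    print(rank, send_counts, recv_counts, sum(recv_counts))
```

With this matrix, each rank sends 3 elements to the other rank and none to itself, so each rank would need an output with 3 elements along the first dimension.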
 
- Returns
- CommHandle. If async_op is True, CommHandle is an async work handle; if async_op is False, it is None.
- Raises
- TypeError – If input or output is not a Tensor, group is not a str, or async_op is not a bool.
 - Supported Platforms:
- Ascend
 - Examples
- Note
- Before running the following example, you need to configure the communication environment variables.
- For Ascend devices, it is recommended to use the msrun startup method, which has no third-party or configuration file dependencies. Please see the msrun startup documentation for more details.
- This example should be run with 2 devices.

>>> import numpy as np
>>> import mindspore
>>> from mindspore.mint.distributed import init_process_group, get_rank
>>> from mindspore.communication.comm_func import all_to_all_v_c
>>> from mindspore import Tensor
>>> from mindspore.ops import zeros
>>>
>>> init_process_group()
>>> this_rank = get_rank()
>>> # Rank 0 sends 3 elements to rank 1, and rank 1 sends 3 elements to rank 0.
>>> if this_rank == 0:
...     output = Tensor(np.zeros([3]).astype(np.float32))
...     tensor = Tensor([0, 1, 2.]) * this_rank
...     result = all_to_all_v_c(output, tensor, [0, 3, 3, 0])
...     print(output)
>>> if this_rank == 1:
...     output = Tensor(np.zeros([3]).astype(np.float32))
...     tensor = Tensor([0, 1, 2.]) * this_rank
...     result = all_to_all_v_c(output, tensor, [0, 3, 3, 0])
...     print(output)
rank 0:
[0. 1. 2.]
rank 1:
[0. 0. 0.]
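To see why rank 0 ends up with rank 1's data and vice versa, the exchange can be mimicked on plain Python lists without any devices. This is only an illustrative sketch: `simulate_all_to_all_v_c` is a hypothetical function, and it assumes that each rank's input is cut into consecutive chunks in destination-rank order and that received chunks are concatenated in source-rank order, which is the usual all-to-all-v convention rather than something this page states.

```python
def simulate_all_to_all_v_c(inputs, send_count_matrix):
    """Hypothetical single-process model of the exchange on Python lists."""
    rank_size = len(inputs)
    outputs = [[] for _ in range(rank_size)]
    for src in range(rank_size):
        offset = 0  # read position in the source rank's input
        for dst in range(rank_size):
            count = send_count_matrix[src * rank_size + dst]
            # Assumption: chunks leave `src` in destination order and are
            # appended at `dst` in source order (src loops in ascending order).
            outputs[dst].extend(inputs[src][offset:offset + count])
            offset += count
    return outputs


# Same data as the example: rank r holds [0, 1, 2] * r.
inputs = [[0.0 * r, 1.0 * r, 2.0 * r] for r in range(2)]
outputs = simulate_all_to_all_v_c(inputs, [0, 3, 3, 0])
for rank, out in enumerate(outputs):
    print(f"rank {rank}: {out}")
# rank 0: [0.0, 1.0, 2.0]
# rank 1: [0.0, 0.0, 0.0]
```

Rank 0's input is all zeros, so rank 1 receives zeros, while rank 0 receives rank 1's [0, 1, 2].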