mindscience.distributed

manager

mindscience.distributed.manager.get_context_parallel_group

Get the context parallel group object.

mindscience.distributed.manager.get_context_parallel_group_name

Get the name of the context parallel group.

mindscience.distributed.manager.get_context_parallel_rank

Get the context parallel rank of the current process.

mindscience.distributed.manager.get_context_parallel_world_size

Get the size of the context parallel group.

mindscience.distributed.manager.get_data_context_parallel_group

Get the data-context parallel group object.

mindscience.distributed.manager.get_data_context_parallel_group_name

Get the name of the data-context parallel group.

mindscience.distributed.manager.get_data_context_parallel_rank

Get the data-context parallel rank of the current process.

mindscience.distributed.manager.get_data_context_parallel_world_size

Get the size of the data-context parallel group.

mindscience.distributed.manager.get_data_parallel_group

Get the data parallel group object.

mindscience.distributed.manager.get_data_parallel_group_name

Get the name of the data parallel group.

mindscience.distributed.manager.get_data_parallel_rank

Get the data parallel rank of the current process.

mindscience.distributed.manager.get_data_parallel_world_size

Get the size of the data parallel group.

mindscience.distributed.manager.get_tensor_parallel_group

Get the tensor parallel group object.

mindscience.distributed.manager.get_tensor_parallel_group_name

Get the name of the tensor parallel group.

mindscience.distributed.manager.get_tensor_parallel_rank

Get the tensor parallel rank of the current process.

mindscience.distributed.manager.get_tensor_parallel_world_size

Get the size of the tensor parallel group.

mindscience.distributed.manager.initialize_parallel

Initialize parallel communication groups for distributed training.
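
Example: a minimal initialization sketch tying these helpers together. The keyword arguments passed to initialize_parallel below (tensor_parallel_size, context_parallel_size) and the zero-argument form of the rank/world-size getters are assumptions for illustration; check the actual signatures. The script must be started with a distributed launcher (for example msrun or mpirun) so that communication can be initialized.

    # Sketch only: argument names for initialize_parallel are assumed, not confirmed.
    from mindspore.communication import init
    from mindscience.distributed import manager

    init()  # requires a multi-process distributed launch

    # Hypothetical keyword arguments; the real signature may differ.
    manager.initialize_parallel(
        tensor_parallel_size=2,
        context_parallel_size=2,
    )

    # The getters report this process's position in each communication group.
    print("tp rank/size:", manager.get_tensor_parallel_rank(),
          manager.get_tensor_parallel_world_size())
    print("cp rank/size:", manager.get_context_parallel_rank(),
          manager.get_context_parallel_world_size())
    print("dp rank/size:", manager.get_data_parallel_rank(),
          manager.get_data_parallel_world_size())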

mappings

mindscience.distributed.mappings.all_to_all_from_hidden_to_sequence

Perform an all-to-all from the hidden layout to the sequence layout.

mindscience.distributed.mappings.all_to_all_from_sequence_to_hidden

Perform an all-to-all from the sequence layout to the hidden layout.

mindscience.distributed.mappings.copy_to_all

Forward the input to all ranks in the specified group.

mindscience.distributed.mappings.gather_from_hidden

Gather hidden-partitioned tensors along the last dimension.

mindscience.distributed.mappings.gather_from_sequence

Gather sequence partitions along the first dimension.

mindscience.distributed.mappings.reduce_from_all

Perform an all-reduce across all ranks in the group.

mindscience.distributed.mappings.reduce_scatter_to_sequence

Perform a reduce-scatter across sequence partitions along the first dimension.

mindscience.distributed.mappings.scatter_to_hidden

Scatter tensors into hidden partitions along the last dimension.

mindscience.distributed.mappings.scatter_to_sequence

Scatter tensors along the first dimension to form sequence partitions.
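
Example: a sketch of how the sequence-dimension mappings compose for context parallelism. It assumes scatter_to_sequence and gather_from_sequence take only a tensor and operate on the default context parallel group; the real signatures may include an explicit group argument. Run under the same distributed launch as above.

    # Sketch only: call signatures are assumed (tensor-only, default group).
    import mindspore as ms
    import mindspore.numpy as mnp
    from mindscience.distributed import mappings, manager

    cp_size = manager.get_context_parallel_world_size()

    # Full activation of shape (seq_len, batch, hidden).
    full = mnp.ones((8 * cp_size, 2, 16), dtype=ms.float32)

    # Split the sequence dimension so each rank holds seq_len / cp_size tokens.
    local = mappings.scatter_to_sequence(full)

    # ... run the local part of the model on `local` ...

    # Reassemble the full sequence on every rank when needed.
    restored = mappings.gather_from_sequence(local)
    assert restored.shape == full.shape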

modules

mindscience.distributed.modules.ColumnParallelLinear

Column-parallel linear layer that shards the output feature dimension across TP ranks.

mindscience.distributed.modules.RowParallelLinear

Row-parallel linear layer that shards the input feature dimension across TP ranks.

mindscience.distributed.modules.initialize_affine_weight

Initialize and (optionally) partition a weight tensor for parallelism.
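
Example: a sketch of pairing the two sharded linear layers in the usual Megatron-style pattern, where a column-parallel layer (output features sharded) feeds a row-parallel layer (input features sharded) so that the partial results are combined once at the end. The constructor arguments shown (input size, output size) are assumptions; consult the classes' actual signatures.

    # Sketch only: constructor arguments are assumed, not confirmed.
    import mindspore as ms
    from mindspore import nn
    import mindspore.numpy as mnp
    from mindscience.distributed.modules import ColumnParallelLinear, RowParallelLinear

    class ParallelMLP(nn.Cell):
        def __init__(self, hidden, ffn_hidden):
            super().__init__()
            # Output dimension sharded across tensor-parallel ranks.
            self.fc1 = ColumnParallelLinear(hidden, ffn_hidden)
            self.act = nn.GELU()
            # Input dimension sharded; partial sums are combined across ranks.
            self.fc2 = RowParallelLinear(ffn_hidden, hidden)

        def construct(self, x):
            return self.fc2(self.act(self.fc1(x)))

    mlp = ParallelMLP(hidden=16, ffn_hidden=64)
    y = mlp(mnp.ones((2, 16), dtype=ms.float32))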