mindscience.distributed

manager

- Get the context parallel group object.
- Get the name of the context parallel group.
- Get the context parallel rank of the current process.
- Get the size of the context parallel group.
- Get the data-context parallel group object.
- Get the name of the data-context parallel group.
- Get the data-context parallel rank of the current process.
- Get the size of the data-context parallel group.
- Get the data parallel group object.
- Get the name of the data parallel group.
- Get the data parallel rank of the current process.
- Get the size of the data parallel group.
- Get the tensor parallel group object.
- Get the name of the tensor parallel group.
- Get the tensor parallel rank of the current process.
- Get the size of the tensor parallel group.
- Initialize parallel communication groups for distributed training.
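The manager utilities amount to bookkeeping over process groups: one initializer partitions the world into data, tensor, and context parallel groups, and the getters expose each group, its name, the caller's rank within it, and its size. The sketch below shows what such bookkeeping could look like when built on `torch.distributed`; it is a minimal illustration only, and every name in it (`init_parallel_groups`, `get_tensor_parallel_rank`, the module-level group variables) is hypothetical rather than taken from `mindscience.distributed`.

```python
# Hypothetical sketch of parallel-group bookkeeping on top of torch.distributed.
# Only tensor-parallel and data-parallel groups are built here; a context-parallel
# axis would be added in the same way.
import torch.distributed as dist

_TP_GROUP = None  # tensor-parallel group containing this rank
_DP_GROUP = None  # data-parallel group containing this rank

def init_parallel_groups(tensor_parallel_size: int) -> None:
    """Partition the world into tensor-parallel and data-parallel groups.

    Assumes dist.init_process_group() has already been called.
    """
    global _TP_GROUP, _DP_GROUP
    world_size = dist.get_world_size()
    rank = dist.get_rank()
    assert world_size % tensor_parallel_size == 0

    # Consecutive ranks form a tensor-parallel group.
    for start in range(0, world_size, tensor_parallel_size):
        ranks = list(range(start, start + tensor_parallel_size))
        group = dist.new_group(ranks)  # every rank must participate in new_group
        if rank in ranks:
            _TP_GROUP = group

    # Ranks at the same position inside their TP group form a data-parallel group.
    for offset in range(tensor_parallel_size):
        ranks = list(range(offset, world_size, tensor_parallel_size))
        group = dist.new_group(ranks)
        if rank in ranks:
            _DP_GROUP = group

def get_tensor_parallel_group():
    return _TP_GROUP

def get_tensor_parallel_rank() -> int:
    return dist.get_rank(group=_TP_GROUP)

def get_tensor_parallel_world_size() -> int:
    return dist.get_world_size(group=_TP_GROUP)
```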
mappings

- Performs an all-to-all from hidden layout to sequence layout.
- Performs an all-to-all from sequence layout to hidden layout.
- Forwards the input to all ranks in the specified group.
- Gathers hidden-partitioned tensors along the last dimension.
- Gathers sequence partitions along the first dimension.
- Performs an all-reduce operation across all ranks in the group.
- Performs reduce-scatter across sequence partitions along the first dimension.
- Scatters tensors into hidden partitions along the last dimension.
- Scatters tensors across the first dimension to form sequence partitions.
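The two layout-switching all-to-alls are the least obvious of these primitives: a tensor partitioned along the sequence dimension (each rank holds `[s/p, b, h]`) is re-partitioned along the hidden dimension (each rank holds `[s, b, h/p]`), or vice versa, with a single collective. The sketch below illustrates the sequence-to-hidden direction using `torch.distributed.all_to_all_single`; the function name and tensor layout convention are assumptions made for illustration, not the library's actual implementation.

```python
import torch
import torch.distributed as dist

def sequence_to_hidden_all_to_all(x: torch.Tensor, group=None) -> torch.Tensor:
    """Illustrative only: re-partition from a sequence layout [s/p, b, h]
    to a hidden layout [s, b, h/p] with a single all-to-all."""
    p = dist.get_world_size(group=group)
    s_local, b, h = x.shape
    # Split the hidden dimension into p chunks and move the chunk index to the
    # front, so each contiguous block along dim 0 is destined for one peer rank.
    x = x.reshape(s_local, b, p, h // p).permute(2, 0, 1, 3).contiguous()
    out = torch.empty_like(x)
    dist.all_to_all_single(out, x, group=group)
    # Block i of the output is rank i's sequence slice of our hidden chunk, so
    # concatenating the blocks restores the full sequence dimension.
    return out.reshape(p * s_local, b, h // p)
```

The hidden-to-sequence direction is the inverse: the same all-to-all with the reshape and permute applied in reverse order.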
modules

- Column-parallel linear layer that shards the output feature dimension across TP ranks.
- Row-parallel linear layer that shards the input feature dimension across TP ranks.
- Initialize and (optionally) partition a weight tensor for parallelism.
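To make the sharding concrete, here is a minimal PyTorch-style sketch of a column-parallel linear layer: each tensor-parallel rank stores and applies only its slice of the output features, so gathering the per-rank outputs along the last dimension reproduces the full layer. The class name and constructor arguments are hypothetical; the row-parallel counterpart would instead slice the input features and all-reduce the partial products.

```python
import torch

class ColumnParallelLinearSketch(torch.nn.Module):
    """Hypothetical illustration: Y = X @ W.T with W sharded by output rows,
    so each TP rank produces a contiguous slice of the output features."""

    def __init__(self, in_features: int, out_features: int, tp_size: int):
        super().__init__()
        assert out_features % tp_size == 0
        self.local_out = out_features // tp_size
        # Each rank allocates and initializes only its own shard of the weight.
        self.weight = torch.nn.Parameter(torch.empty(self.local_out, in_features))
        torch.nn.init.xavier_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [..., in_features]  ->  local output: [..., out_features // tp_size]
        # All-gathering the results along the last dim across the TP group
        # would recover the full [..., out_features] activation.
        return torch.nn.functional.linear(x, self.weight)
```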