mindscience.distributed

manager

- Get the context parallel group object.
- Get the name of the context parallel group.
- Get the context parallel rank of the current process.
- Get the size of the context parallel group.
- Get the data-context parallel group object.
- Get the name of the data-context parallel group.
- Get the data-context parallel rank of the current process.
- Get the size of the data-context parallel group.
- Get the data parallel group object.
- Get the name of the data parallel group.
- Get the data parallel rank of the current process.
- Get the size of the data parallel group.
- Get the tensor parallel group object.
- Get the name of the tensor parallel group.
- Get the tensor parallel rank of the current process.
- Get the size of the tensor parallel group.
- Initialize parallel communication groups for distributed training.
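The manager utilities amount to bookkeeping over process groups: one initializer partitions the world into data, tensor, and context parallel groups, and the getters expose each group, its name, the caller's rank within it, and its size. The sketch below shows what such bookkeeping could look like when built on `torch.distributed`; it is a minimal illustration only, and every name in it (`init_parallel_groups`, `get_tensor_parallel_rank`, the module-level group variables) is hypothetical rather than taken from `mindscience.distributed`.

```python
# Hypothetical sketch of parallel-group bookkeeping on top of torch.distributed.
# Only tensor-parallel and data-parallel groups are built here; a context-parallel
# axis would be added in the same way.
import torch.distributed as dist

_TP_GROUP = None  # tensor-parallel group containing this rank
_DP_GROUP = None  # data-parallel group containing this rank

def init_parallel_groups(tensor_parallel_size: int) -> None:
    """Partition the world into tensor-parallel and data-parallel groups.

    Assumes dist.init_process_group() has already been called.
    """
    global _TP_GROUP, _DP_GROUP
    world_size = dist.get_world_size()
    rank = dist.get_rank()
    assert world_size % tensor_parallel_size == 0

    # Consecutive ranks form a tensor-parallel group.
    for start in range(0, world_size, tensor_parallel_size):
        ranks = list(range(start, start + tensor_parallel_size))
        group = dist.new_group(ranks)  # every rank must participate in new_group
        if rank in ranks:
            _TP_GROUP = group

    # Ranks at the same position inside their TP group form a data-parallel group.
    for offset in range(tensor_parallel_size):
        ranks = list(range(offset, world_size, tensor_parallel_size))
        group = dist.new_group(ranks)
        if rank in ranks:
            _DP_GROUP = group

def get_tensor_parallel_group():
    return _TP_GROUP

def get_tensor_parallel_rank() -> int:
    return dist.get_rank(group=_TP_GROUP)

def get_tensor_parallel_world_size() -> int:
    return dist.get_world_size(group=_TP_GROUP)
```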
mappings

- Performs an all-to-all from hidden layout to sequence layout.
- Performs an all-to-all from sequence layout to hidden layout.
- Forwards the input to all ranks in the specified group.
- Gathers hidden-partitioned tensors along the last dimension.
- Gathers sequence partitions along the first dimension.
- Performs an all-reduce operation across all ranks in the group.
- Performs reduce-scatter across sequence partitions along the first dimension.
- Scatters tensors into hidden partitions along the last dimension.
- Scatters tensors across the first dimension to form sequence partitions.
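The two layout-switching all-to-alls are the least obvious of these primitives: a tensor partitioned along the sequence dimension (each rank holds `[s/p, b, h]`) is re-partitioned along the hidden dimension (each rank holds `[s, b, h/p]`), or vice versa, with a single collective. The sketch below illustrates the sequence-to-hidden direction using `torch.distributed.all_to_all_single`; the function name and tensor layout convention are assumptions made for illustration, not the library's actual implementation.

```python
import torch
import torch.distributed as dist

def sequence_to_hidden_all_to_all(x: torch.Tensor, group=None) -> torch.Tensor:
    """Illustrative only: re-partition from a sequence layout [s/p, b, h]
    to a hidden layout [s, b, h/p] with a single all-to-all."""
    p = dist.get_world_size(group=group)
    s_local, b, h = x.shape
    # Split the hidden dimension into p chunks and move the chunk index to the
    # front, so each contiguous block along dim 0 is destined for one peer rank.
    x = x.reshape(s_local, b, p, h // p).permute(2, 0, 1, 3).contiguous()
    out = torch.empty_like(x)
    dist.all_to_all_single(out, x, group=group)
    # Block i of the output is rank i's sequence slice of our hidden chunk, so
    # concatenating the blocks restores the full sequence dimension.
    return out.reshape(p * s_local, b, h // p)
```

The hidden-to-sequence direction is the inverse: the same all-to-all with the reshape and permute applied in reverse order.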
modules

- Column-parallel linear layer that shards the output feature dimension across TP ranks.
- Row-parallel linear layer that shards the input feature dimension across TP ranks.
- Initialize and (optionally) partition a weight tensor for parallelism.
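To make the sharding concrete, here is a minimal PyTorch-style sketch of a column-parallel linear layer: each tensor-parallel rank stores and applies only its slice of the output features, so gathering the per-rank outputs along the last dimension reproduces the full layer. The class name and constructor arguments are hypothetical; the row-parallel counterpart would instead slice the input features and all-reduce the partial products.

```python
import torch

class ColumnParallelLinearSketch(torch.nn.Module):
    """Hypothetical illustration: Y = X @ W.T with W sharded by output rows,
    so each TP rank produces a contiguous slice of the output features."""

    def __init__(self, in_features: int, out_features: int, tp_size: int):
        super().__init__()
        assert out_features % tp_size == 0
        self.local_out = out_features // tp_size
        # Each rank allocates and initializes only its own shard of the weight.
        self.weight = torch.nn.Parameter(torch.empty(self.local_out, in_features))
        torch.nn.init.xavier_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [..., in_features]  ->  local output: [..., out_features // tp_size]
        # All-gathering the results along the last dim across the TP group
        # would recover the full [..., out_features] activation.
        return torch.nn.functional.linear(x, self.weight)
```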