mindspore.parallel
mindspore.parallel provides a large number of interfaces for automatic parallelization, including parallel base configuration, model loading and transformation, and functional parallel slicing.
The module can be imported as follows:
from mindspore import parallel
Parallel Base Configuration
- Encapsulate a top-level Cell or function to realize static-graph parallelism for a single network (see the sketch after this list).
- Implement parallel gradient accumulation for static graphs.
- Implement multi-copy splitting for static-graph parallelism so that computation and communication can run concurrently.
- Specify the number of micro_batch for pipeline parallelism and the division rules for stages.
- Perform gradient reduction and accumulation for pipeline parallelism in functional training scenarios.
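As a minimal sketch of the base configuration, the snippet below wraps a top-level Cell with the AutoParallel encapsulation summarized above. The parallel_mode="semi_auto" argument and the tiny network are illustrative assumptions; consult the AutoParallel API page for the exact signature.

import mindspore as ms
from mindspore import nn
from mindspore.parallel.auto_parallel import AutoParallel

class Net(nn.Cell):
    def __init__(self):
        super().__init__()
        self.dense = nn.Dense(16, 16)

    def construct(self, x):
        return self.dense(x)

ms.set_context(mode=ms.GRAPH_MODE)  # static-graph parallelism requires graph mode
net = Net()
# Encapsulate the top-level Cell so the whole network is parallelized as one
# static graph; "semi_auto" is an assumed mode name, not verified here.
parallel_net = AutoParallel(net, parallel_mode="semi_auto")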
Model Loading and Transformation
- Convert a distributed checkpoint from a source sharding strategy to a destination sharding strategy for a single rank (see the sketch after this list).
- Convert distributed checkpoints from a source sharding strategy to a destination sharding strategy, rank by rank, for a network.
- Load a checkpoint into a network for distributed prediction.
- Load checkpoint information from a specified file.
- List the source distributed checkpoint rank indices needed to obtain the target checkpoint of a given rank_id during distributed checkpoint conversion.
- Merge multiple safetensors files into a unified safetensors file.
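As a sketch of the checkpoint conversion flow for one rank, assuming strategy files were saved during training (all paths are placeholders) and that rank_list_for_transform and transform_checkpoint_by_rank keep their documented signatures:

from mindspore.parallel import rank_list_for_transform, transform_checkpoint_by_rank

dst_rank = 0
src_strategy = "./src_strategy.ckpt"  # placeholder: sharding strategy of the saved checkpoints
dst_strategy = "./dst_strategy.ckpt"  # placeholder: sharding strategy to convert to

# List the source ranks whose checkpoints are needed to build dst_rank's checkpoint.
needed_ranks = rank_list_for_transform(dst_rank, src_strategy, dst_strategy)
ckpt_map = {r: "./src_ckpt/rank_{}/net.ckpt".format(r) for r in needed_ranks}

# Convert the checkpoint of this single rank to the destination strategy.
transform_checkpoint_by_rank(dst_rank, ckpt_map,
                             "./dst_ckpt/rank_{}/net.ckpt".format(dst_rank),
                             src_strategy, dst_strategy)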
Functional Parallel Slicing
- Convert a tensor from one distributed arrangement to another (see the sketch after this list).
- Topological abstraction that describes the cluster devices, used to place tensor slices on the cluster.
- Specify the input and output slicing strategy for a Cell or function.
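A sketch of functional slicing, assuming an 8-device cluster abstracted as a 2 x 4 grid; the axis names "dp" and "mp" are illustrative:

import mindspore as ms
from mindspore import ops
from mindspore.parallel import Layout
from mindspore.parallel.function import reshard

# Topological abstraction: describe 8 devices as a (2, 4) grid with named axes.
layout = Layout((2, 4), ("dp", "mp"))

def forward(x):
    # Convert x to a distributed arrangement: rows sliced along "dp",
    # columns along "mp".
    return reshard(x, layout("dp", "mp"))

x = ops.ones((8, 8), ms.float32)
# In practice this is called inside a static graph compiled under AutoParallel;
# standalone execution is shown only to illustrate the call pattern.
y = forward(x)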
Others
- Extract the sharding strategy of each parameter in the network from a strategy file, for distributed inference scenarios (see the sketch after this list).
- Aggregate the sharding strategy files of all pipeline-parallel subgraphs into a destination file.
- Broadcast a parameter to the other ranks in the data-parallel dimension.
- Extract rank list information from communication domain files.
- Synchronize shared weights between stages in pipeline-parallel inference scenarios.
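Finally, a sketch combining two of the helpers above: aggregate the per-stage strategy files of a pipeline-parallel run, then read back the per-parameter sharding strategies (paths are placeholders):

from mindspore.parallel import merge_pipeline_strategys, build_searched_strategy

# Aggregate the sharding strategy files of all pipeline-parallel subgraphs.
merge_pipeline_strategys("./strategy_dir", "./merged_strategy.ckpt")

# Extract the sharding strategy of each parameter for distributed inference.
strategies = build_searched_strategy("./merged_strategy.ckpt")
for name in strategies:
    print(name)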