mindscience.distributed.modules.initialize_affine_weight

mindscience.distributed.modules.initialize_affine_weight(init_shape, tp_world_size, partition_dim, init_method='XavierUniform', init_dtype=ms.float32)

Initialize a weight tensor and, when tp_world_size is greater than 1, partition it along partition_dim for tensor parallelism.

Parameters

init_shape (Tuple[int]) – Shape of the weight tensor to initialize.
tp_world_size (int) – Tensor parallel world size.
partition_dim (int) – Dimension along which to partition the weight tensor.
init_method (Union[Initializer, str], optional) – Initialization method to use. Default: "XavierUniform".
init_dtype (mstype.dtype, optional) – Data type for the initialized weight. Default: ms.float32.

Returns

Parameter, the full parameter when tp_world_size == 1; otherwise the per-rank partition, split along partition_dim.
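
The shape of the returned partition can be worked out with plain arithmetic. The sketch below is an illustration only and does not call the library: partitioned_shape is a hypothetical helper, and it assumes the partitioned dimension divides evenly by tp_world_size (how the library handles a non-divisible size is not stated here).

```python
from typing import Tuple


def partitioned_shape(init_shape: Tuple[int, ...],
                      tp_world_size: int,
                      partition_dim: int) -> Tuple[int, ...]:
    """Hypothetical helper: expected per-rank shape, not the library's code."""
    if tp_world_size == 1:
        # No partitioning: the full shape is kept.
        return tuple(init_shape)
    size = init_shape[partition_dim]
    # Assumption: the dimension must split evenly across ranks.
    if size % tp_world_size != 0:
        raise ValueError("partition_dim size must be divisible by tp_world_size")
    shape = list(init_shape)
    shape[partition_dim] = size // tp_world_size
    return tuple(shape)


print(partitioned_shape((64, 64), 2, 0))  # (32, 64)
print(partitioned_shape((64, 64), 2, 1))  # (64, 32)
```

The first call uses the same arguments as the example in this page and gives the same shape, (32, 64).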

Examples

>>> import mindspore as ms
>>> from mindspore.communication import init
>>> from mindscience.distributed import initialize_parallel
>>> from mindscience.distributed.modules import initialize_affine_weight
>>> init()
>>> initialize_parallel(tensor_parallel_size=2)
>>> param = initialize_affine_weight((64, 64), 2, 0)
>>> print(param.shape)
(32, 64)
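
To picture what each rank might hold in the example above, the following sketch computes index ranges along the partitioned dimension. It is not library code: rank_slice is a hypothetical helper, and it assumes that rank r holds the r-th contiguous block of an evenly divisible dimension, which this page does not confirm.

```python
from typing import Tuple


def rank_slice(dim_size: int, tp_world_size: int, rank: int) -> Tuple[int, int]:
    """Hypothetical helper: (start, stop) indices for one rank's block."""
    # Assumption: equal-sized contiguous blocks, assigned in rank order.
    per_rank = dim_size // tp_world_size
    start = rank * per_rank
    return start, start + per_rank


# A dimension of size 64 split across 2 ranks, as in the example above.
for rank in range(2):
    print(rank, rank_slice(64, 2, rank))
# 0 (0, 32)
# 1 (32, 64)
```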