mindspore.nn.MaxPool3d

class mindspore.nn.MaxPool3d(kernel_size=1, stride=1, pad_mode='valid', padding=0, dilation=1, return_indices=False, ceil_mode=False)[source]

3D max pooling operation.

Applies a 3D max pooling over an input Tensor which can be regarded as a composition of 3D planes.

Typically the input is of shape \((N_{in}, C_{in}, D_{in}, H_{in}, W_{in})\), MaxPool outputs regional maximum in the \((D_{in}, H_{in}, W_{in})\)-dimension. Given kernel size is \(ks = (d_{ker}, h_{ker}, w_{ker})\) and stride is \(s = (s_0, s_1, s_2)\), the operation is as follows.

\[\text{output}(N_i, C_j, d, h, w) = \max_{l=0, \ldots, d_{ker}-1} \max_{m=0, \ldots, h_{ker}-1} \max_{n=0, \ldots, w_{ker}-1} \text{input}(N_i, C_j, s_0 \times d + l, s_1 \times h + m, s_2 \times w + n)\]

Parameters

kernel_size (Union[int, tuple[int]]) – The size of kernel used to take the maximum value, is an int number or a single element tuple that represents depth, height and width of the kernel, or a tuple of three int numbers that represent depth, height and width respectively. The value must be a positive integer. Default: 1.
stride (Union[int, tuple[int]]) – The moving stride of pooling operation, an int number or a single element tuple that represents the moving stride of pooling kernel in the directions of depth, height and the width, or a tuple of three int numbers that represent depth, height and width of movement respectively. The value must be a positive integer. If the value is None, the default value kernel_size is used. Default: 1.
pad_mode (str) –
The optional value for pad mode, is “same”, “valid” or “pad”, not case sensitive. Default: “valid”.
- same: The output shape is the same as the input shape evenly divided by stride.
- valid: The possible largest height and width of output will be returned without padding. Extra pixels will be discarded.
- pad: pads the input. Pads the top, bottom, left, and right sides of the input with padding number of zeros. If this mode is set, padding must be greater than or equal to 0.
padding (Union(int, tuple[int], list[int])) – Pooling padding value. Default: 0. padding can only be an integer or a tuple/list containing one or three integers. If padding is an integer or a tuple/list containing one integer, it will be padded in six directions of front, back, top, bottom, left and right of the input. If padding is a tuple/list containing three integers, it will be padded in front and back of the input padding[0] times, up and down padding[1] times, and left and right of the input padding[2] times.
dilation (Union(int, tuple[int])) – The spacing between the elements of the kernel in convolution, used to increase the receptive field of the pooling operation. If it is a tuple, it must contain one or three integers. Default: 1.
return_indices (bool) – If True, output is a Tuple of 2 Tensors, representing the maxpool result and where the max values are generated. Otherwise, only the maxpool result is returned. Default: False.
ceil_mode (bool) – Whether to use ceil or floor to calculate output shape. Default: False.

Inputs:

x (Tensor) - Tensor of shape \((N_{in}, C_{in}, D_{in}, H_{in}, W_{in})\) or \((C_{in}, D_{in}, H_{in}, W_{in})\).

Outputs:

If return_indices is False, output is a Tensor, with shape \((N_{out}, C_{out}, D_{out}, H_{out}, W_{out})\) or \((C_{out}, D_{out}, H_{out}, W_{out})\). It has the same data type as x.

If return_indices is True, output is a Tuple of 2 Tensors, representing the maxpool result and where the max values are generated.

output (Tensor) - Maxpooling result, with shape \((N_{out}, C_{out}, D_{out}, H_{out}, W_{out})\) or \((C_{out}, D_{out}, H_{out}, W_{out})\). It has the same data type as x.
argmax (Tensor) - Index corresponding to the maximum value. Data type is int64.

If pad_mode is in pad mode, the output shape calculation formula is as follows:

\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]

\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - \text{dilation}[2] \times (\text{kernel_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]

Raises

ValueError – If length of shape of x is not equal to 4 or 5.
TypeError – If kernel_size , stride , padding or dilation is neither an int nor a tuple.
ValueError – If kernel_size or stride is less than 1.
ValueError – If the padding parameter is neither an integer nor a tuple of length 3.
ValueError – If pad_mode is not set to ‘pad’, setting return_indices to True or dilation to a value other than 1.
ValueError – If padding is non-zero when pad_mode is not ‘pad’.

Supported Platforms:: Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import mindspore.nn as nn
>>> import numpy as np
>>> np_x = np.random.randint(0, 10, [5, 3, 4, 6, 7])
>>> x = Tensor(np_x, ms.float32)
>>> pool1 = nn.MaxPool3d(kernel_size=2, stride=1, pad_mode='pad', padding=1, dilation=3, return_indices=True)
>>> output = pool1(x)
>>> print(output[0].shape)
(5, 3, 3, 5, 6)
>>> print(output[1].shape)
(5, 3, 3, 5, 6)
>>> pool2 = nn.MaxPool3d(kernel_size=2, stride=1, pad_mode='pad', padding=1, dilation=3, return_indices=False)
>>> output2 = pool2(x)
>>> print(output2.shape)
(5, 3, 3, 5, 6)