mindspore.nn.MaxPool2d

class mindspore.nn.MaxPool2d(kernel_size=1, stride=1, pad_mode='valid', padding=0, dilation=1, return_indices=False, ceil_mode=False, data_format='NCHW')

Applies a 2D max pooling over an input Tensor which can be regarded as a composition of 2D planes.

Typically the input has shape \((N_{in}, C_{in}, H_{in}, W_{in})\), and MaxPool2d outputs the regional maximum over the \((H_{in}, W_{in})\) dimensions. Given kernel size \((h_{ker}, w_{ker})\) and stride \((s_0, s_1)\), the operation is as follows.

\[\text{output}(N_i, C_j, h, w) = \max_{m=0, \ldots, h_{ker}-1} \max_{n=0, \ldots, w_{ker}-1} \text{input}(N_i, C_j, s_0 \times h + m, s_1 \times w + n)\]
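
The formula above can be sketched in plain Python for a single \((H_{in}, W_{in})\) plane. This is only an illustrative reference, not MindSpore's implementation: the function name max_pool2d_valid is hypothetical, the plane is a nested list rather than a Tensor, and no padding or dilation is applied.

```python
def max_pool2d_valid(plane, kernel=(2, 2), stride=(1, 1)):
    """Max-pool one 2D plane (a list of rows) without padding or dilation."""
    h_ker, w_ker = kernel
    s0, s1 = stride
    h_in, w_in = len(plane), len(plane[0])
    # Number of window positions that fit entirely inside the plane.
    h_out = (h_in - h_ker) // s0 + 1
    w_out = (w_in - w_ker) // s1 + 1
    return [
        [
            # output(h, w) = max over m, n of input(s0 * h + m, s1 * w + n)
            max(
                plane[s0 * h + m][s1 * w + n]
                for m in range(h_ker)
                for n in range(w_ker)
            )
            for w in range(w_out)
        ]
        for h in range(h_out)
    ]


plane = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 8, 7, 6],
    [5, 4, 3, 2],
]
pooled = max_pool2d_valid(plane, kernel=(3, 3), stride=(1, 1))
print(pooled)  # [[9, 8], [9, 8]]
```

A 3x3 kernel with stride 1 fits in a 4x4 plane at two positions along each axis, so the result is 2x2.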
Parameters
  • kernel_size (Union[int, tuple[int]]) – The size of the kernel used to take the max value. It is either an int or a single-element tuple, in which case the height and width are both kernel_size, or a tuple of two ints giving the height and width respectively. Default: 1.

  • stride (Union[int, tuple[int]]) – The distance the kernel moves at each step. It is either an int or a single-element tuple, in which case the vertical and horizontal steps are both stride, or a tuple of two ints giving the vertical and horizontal steps respectively. Default: 1.

  • pad_mode (str) –

    The padding mode: “same”, “valid” or “pad”, case-insensitive. Default: “valid”.

    • same: The output height and width are the input height and width divided by stride, rounded up.

    • valid: The largest possible output height and width are returned without padding. Extra pixels that do not fill a complete window are discarded.

    • pad: Pads the top, bottom, left, and right sides of the input with the number of zeros given by padding. If this mode is set, padding must be greater than or equal to 0.

  • padding (Union[int, tuple[int], list[int]]) – The padding applied by the pooling operation. Default: 0. padding must be an integer or a tuple/list containing one or two integers. If padding is an integer or a tuple/list containing one integer, that many elements are padded on each of the four sides of the input. If padding is a tuple/list containing two integers, padding[0] elements are padded on the top and bottom and padding[1] elements on the left and right.

  • dilation (Union[int, tuple[int]]) – The spacing between the elements of the pooling kernel, used to increase the receptive field of the pooling operation. If it is a tuple, it must contain one or two integers. Default: 1.

  • return_indices (bool) – If True, the function will return both the result of max pooling and the indices of the max elements. Default: False.

  • ceil_mode (bool) – If True, use ceil to compute the output shape instead of floor. Default: False.

  • data_format (str) – The data format, either ‘NHWC’ or ‘NCHW’. Default: ‘NCHW’.
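
As a rough illustration of how pad_mode affects the output size along one spatial axis, the sketch below uses the conventional formulas for these two modes: round-up division for “same” and a no-padding window count for “valid”. These formulas are an assumption based on the mode descriptions above, not quoted from the MindSpore source, and the helper name pooled_size is hypothetical.

```python
import math


def pooled_size(in_size, kernel, stride, pad_mode="valid"):
    """Output length along one axis for 'same' or 'valid' (assumed formulas)."""
    mode = pad_mode.lower()  # pad_mode is documented as case-insensitive
    if mode == "same":
        # Assumption: input length divided by stride, rounded up.
        return math.ceil(in_size / stride)
    if mode == "valid":
        # Assumption: count only windows that fit without padding.
        return (in_size - kernel) // stride + 1
    raise ValueError("this sketch only covers 'same' and 'valid'")


print(pooled_size(4, kernel=3, stride=1, pad_mode="valid"))  # 2
print(pooled_size(5, kernel=3, stride=2, pad_mode="SAME"))   # 3
print(pooled_size(5, kernel=3, stride=2, pad_mode="valid"))  # 2
```

The first call corresponds to a 4x4 plane pooled with kernel_size=3 and stride=1 in “valid” mode, giving 2 along each axis.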

Inputs:
  • x (Tensor) - Tensor of shape \((N,C_{in},H_{in},W_{in})\) or \((C_{in},H_{in},W_{in})\).

Outputs:

If return_indices is False, output is a Tensor, with shape \((N, C_{out}, H_{out}, W_{out})\) or \((C_{out}, H_{out}, W_{out})\). It has the same data type as x.

If return_indices is True, output is a Tuple of 2 Tensors, representing the maxpool result and where the max values are generated.

  • output (Tensor) - Maxpooling result, with shape \((N, C_{out}, H_{out}, W_{out})\) or \((C_{out}, H_{out}, W_{out})\). It has the same data type as x.

  • argmax (Tensor) - Index corresponding to the maximum value. Data type is int64.

If pad_mode is ‘pad’, the output shape is calculated as follows:

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding[0]} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding[1]} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1\right\rfloor\]
Raises
  • TypeError – If kernel_size or stride is neither int nor tuple.

  • ValueError – If pad_mode is not ‘valid’, ‘same’ or ‘pad’ (case-insensitive).

  • ValueError – If data_format is neither ‘NCHW’ nor ‘NHWC’.

  • ValueError – If kernel_size or stride is less than 1.

  • ValueError – If length of shape of x is not equal to 3 or 4.

  • ValueError – If pad_mode is not ‘pad’ and any of the padding, dilation, return_indices, or ceil_mode parameters is not set to its default value.

  • ValueError – If the length of the tuple/list padding parameter is not 2.

  • ValueError – If the length of the tuple dilation parameter is not 2.

  • ValueError – If dilation parameter is neither an integer nor a tuple.

  • ValueError – If pad_mode is ‘pad’ and data_format is ‘NHWC’.

  • ValueError – If padding is non-zero when pad_mode is not ‘pad’.

Supported Platforms:

Ascend GPU CPU

Examples

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, nn
>>> pool = nn.MaxPool2d(kernel_size=3, stride=1)
>>> x = Tensor(np.random.randint(0, 10, [1, 2, 4, 4]), mindspore.float32)
>>> output = pool(x)
>>> print(output.shape)
(1, 2, 2, 2)
>>> np_x = np.random.randint(0, 10, [5, 3, 4, 5])
>>> x = Tensor(np_x, mindspore.float32)
>>> pool2 = nn.MaxPool2d(kernel_size=2, stride=1, pad_mode='pad', padding=1, dilation=1, return_indices=True)
>>> output = pool2(x)
>>> print(output[0].shape)
(5, 3, 5, 6)
>>> print(output[1].shape)
(5, 3, 5, 6)