mindspore.nn.MaxPool2d

class mindspore.nn.MaxPool2d(kernel_size=1, stride=1, pad_mode='valid', padding=0, dilation=1, return_indices=False, ceil_mode=False, data_format='NCHW')

Applies a 2D max pooling over an input Tensor which can be regarded as a composition of 2D planes.

Typically the input has shape \((N_{in}, C_{in}, H_{in}, W_{in})\), and MaxPool2d outputs the regional maximum over the \((H_{in}, W_{in})\) dimensions. Given kernel size \((h_{ker}, w_{ker})\) and stride \((s_0, s_1)\), the operation is as follows.

\[\text{output}(N_i, C_j, h, w) = \max_{m=0, \ldots, h_{ker}-1} \max_{n=0, \ldots, w_{ker}-1} \text{input}(N_i, C_j, s_0 \times h + m, s_1 \times w + n)\]
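The formula above can be sketched as a naive NumPy loop. This is only an illustrative reference, not MindSpore's implementation: the function name `max_pool2d_reference` is hypothetical, and the sketch assumes an NCHW array, no padding, and no dilation.

```python
import numpy as np

def max_pool2d_reference(x, kernel_size, stride):
    """Naive max pooling over the last two axes of an (N, C, H, W) array.

    Hypothetical helper for illustration: no padding, no dilation.
    """
    h_ker, w_ker = kernel_size
    s0, s1 = stride
    n, c, h_in, w_in = x.shape
    # Only windows that fit entirely inside the input are evaluated.
    h_out = (h_in - h_ker) // s0 + 1
    w_out = (w_in - w_ker) // s1 + 1
    out = np.empty((n, c, h_out, w_out), dtype=x.dtype)
    for h in range(h_out):
        for w in range(w_out):
            # Window rows s0*h .. s0*h + h_ker - 1, columns s1*w .. s1*w + w_ker - 1.
            window = x[:, :, s0 * h:s0 * h + h_ker, s1 * w:s1 * w + w_ker]
            out[:, :, h, w] = window.max(axis=(2, 3))
    return out

x = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
y = max_pool2d_reference(x, kernel_size=(2, 2), stride=(2, 2))
print(y[0, 0].tolist())  # [[5.0, 7.0], [13.0, 15.0]]
```

Each output element is the maximum of one \(h_{ker} \times w_{ker}\) window whose top-left corner sits at \((s_0 \times h, s_1 \times w)\).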
Parameters
  • kernel_size (Union[int, tuple[int]]) – The size of the kernel used to take the max value. It is either an int or a single-element tuple, in which case the height and width are both kernel_size, or a tuple of two ints that give the height and width respectively. Default: 1 .

  • stride (Union[int, tuple[int]]) – The distance the kernel moves. It is either an int or a single-element tuple, in which case the height and width of the movement are both stride, or a tuple of two ints that give the height and width of the movement respectively. Default: 1 .

  • pad_mode (str, optional) –

    Specifies the padding mode with a padding value of 0. It can be set to: "same" , "valid" or "pad" . Default: "valid" .

    • "same": Pad the input around its edges so that the input and output have the same shape when stride is set to 1. The amount of padding is calculated by the operator internally. If the amount is even, it is distributed uniformly around the input; if it is odd, the excess goes to the right/bottom side. If this mode is set, padding must be 0.

    • "valid": No padding is applied to the input, and the output returns the maximum possible height and width. Extra pixels that could not complete a full stride will be discarded. If this mode is set, padding must be 0.

    • "pad": Pad the input with a specified amount. In this mode, the amount of padding in the height and width directions is determined by the padding parameter. If this mode is set, padding must be greater than or equal to 0.

  • padding (Union[int, tuple[int], list[int]]) – Specifies the padding value of the pooling operation. Default: 0 . padding can only be an integer or a tuple/list containing one or two integers. If padding is an integer or a tuple/list containing one integer, the input is padded by that amount on all four sides. If padding is a tuple/list containing two integers, the input is padded by padding[0] at the top and bottom and by padding[1] at the left and right.

  • dilation (Union[int, tuple[int]]) – The spacing between the elements of the kernel, used to increase the receptive field of the pooling operation. If it is a tuple, it must contain one or two integers. Default: 1 .

  • return_indices (bool) – If True , the function will return both the result of max pooling and the indices of the max elements. Default: False .

  • ceil_mode (bool) – If True , use ceil to compute the output shape instead of floor. Default: False .

  • data_format (str) – The data format of the input, either 'NHWC' or 'NCHW' . Default: 'NCHW' .
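The "valid" and "same" descriptions above can be made concrete with a small size calculation along one spatial dimension. This is a sketch under stated assumptions, not the operator's actual code: the helper names are hypothetical, and "same" is assumed to follow the common convention that the output size is ceil(input size / stride), which reduces to the input size when stride is 1.

```python
import math

def valid_output_size(in_size, kernel, stride):
    """Output length for "valid": windows that do not fit are discarded."""
    return (in_size - kernel) // stride + 1

def same_padding(in_size, kernel, stride):
    """Output length and (before, after) padding for "same".

    Assumes out = ceil(in / stride); any odd excess goes to the
    right/bottom side, as described for the "same" mode.
    """
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    after = total - before
    return out_size, before, after

print(valid_output_size(8, 3, 2))  # 3 (the last input element is dropped)
print(same_padding(4, 3, 1))       # (4, 1, 1): even total, split evenly
print(same_padding(4, 2, 1))       # (4, 0, 1): odd total, excess on the right/bottom
```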

Inputs:
  • x (Tensor) - Tensor of shape \((N,C_{in},H_{in},W_{in})\) or \((C_{in},H_{in},W_{in})\).

Outputs:

If return_indices is False, output is a Tensor, with shape \((N, C_{out}, H_{out}, W_{out})\) or \((C_{out}, H_{out}, W_{out})\). It has the same data type as x.

If return_indices is True, output is a Tuple of 2 Tensors: the max pooling result and the indices of the max values.

  • output (Tensor) - Max pooling result, with shape \((N, C_{out}, H_{out}, W_{out})\) or \((C_{out}, H_{out}, W_{out})\). It has the same data type as x.

  • argmax (Tensor) - Index corresponding to the maximum value. Data type is int64.

If pad_mode is "pad", the output shape is calculated as follows:

\[H_{out} = \left\lfloor\frac{H_{in} + 2 * \text{padding[0]} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 * \text{padding[1]} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1\right\rfloor\]
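The two formulas above can be checked with a few lines of plain Python. The helper below is a hypothetical sketch (the names `_pair` and `pad_mode_output_hw` are not part of MindSpore); it simply evaluates the formulas, and for `ceil_mode=True` it assumes the only change is replacing floor with ceil, as the parameter description states.

```python
import math

def _pair(v):
    """Expand an int or a one-element sequence to a (height, width) pair."""
    if isinstance(v, int):
        return (v, v)
    v = tuple(v)
    return v * 2 if len(v) == 1 else v

def pad_mode_output_hw(h_in, w_in, kernel_size, stride,
                       padding=0, dilation=1, ceil_mode=False):
    """Evaluate the H_out / W_out formulas for pad_mode="pad"."""
    k, s, p, d = map(_pair, (kernel_size, stride, padding, dilation))
    rounding = math.ceil if ceil_mode else math.floor

    def one_dim(size, i):
        return rounding((size + 2 * p[i] - d[i] * (k[i] - 1) - 1) / s[i] + 1)

    return one_dim(h_in, 0), one_dim(w_in, 1)

# Same configuration as the second example in the Examples section:
# a 4 x 5 input with kernel_size=2, stride=1, padding=1, dilation=1.
print(pad_mode_output_hw(4, 5, kernel_size=2, stride=1, padding=1))  # (5, 6)
# Floor versus ceil when the stride does not divide evenly.
print(pad_mode_output_hw(5, 5, kernel_size=2, stride=2))                  # (2, 2)
print(pad_mode_output_hw(5, 5, kernel_size=2, stride=2, ceil_mode=True))  # (3, 3)
```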
Raises
  • TypeError – If kernel_size or stride is neither int nor tuple.

  • ValueError – If pad_mode is not one of "valid" , "same" or "pad" (case-insensitive).

  • ValueError – If data_format is neither 'NCHW' nor 'NHWC' .

  • ValueError – If kernel_size or stride is less than 1.

  • ValueError – If length of shape of x is not equal to 3 or 4.

  • ValueError – If pad_mode is not "pad" and any of the padding, dilation, return_indices or ceil_mode parameters is not set to its default value.

  • ValueError – If the length of the tuple/list padding parameter is not 2.

  • ValueError – If the length of the tuple dilation parameter is not 2.

  • ValueError – If dilation parameter is neither an integer nor a tuple.

  • ValueError – If pad_mode is "pad" and data_format is 'NHWC'.

  • ValueError – If padding is non-zero when pad_mode is not "pad".

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> pool = ms.nn.MaxPool2d(kernel_size=3, stride=1)
>>> x = ms.Tensor(np.random.randint(0, 10, [1, 2, 4, 4]), ms.float32)
>>> output = pool(x)
>>> print(output.shape)
(1, 2, 2, 2)
>>> np_x = np.random.randint(0, 10, [5, 3, 4, 5])
>>> x = ms.Tensor(np_x, ms.float32)
>>> pool2 = ms.nn.MaxPool2d(kernel_size=2, stride=1, pad_mode="pad", padding=1, dilation=1, return_indices=True)
>>> output = pool2(x)
>>> print(output[0].shape)
(5, 3, 5, 6)
>>> print(output[1].shape)
(5, 3, 5, 6)