mindspore.nn.FakeQuantWithMinMaxObserver

class mindspore.nn.FakeQuantWithMinMaxObserver(min_init=-6, max_init=6, ema=False, ema_decay=0.999, per_channel=False, channel_axis=1, num_channels=1, quant_dtype=QuantDtype.INT8, symmetric=False, narrow_range=False, quant_delay=0, neg_trunc=False, mode='DEFAULT')

Quantization-aware operation that provides a fake-quantization observer for data, based on its min and max values.

The quantization mode DEFAULT works as follows:

The running min/max \(x_{min}\) and \(x_{max}\) are computed as:

\[\begin{split}\begin{array}{ll} \\ x_{min} = \begin{cases} \min(\min(X), 0) & \text{if } ema = \text{False} \\ \min((1 - c) \min(X) + c \, x_{min}, 0) & \text{otherwise} \end{cases}\\ x_{max} = \begin{cases} \max(\max(X), 0) & \text{if } ema = \text{False} \\ \max((1 - c) \max(X) + c \, x_{max}, 0) & \text{otherwise} \end{cases} \end{array}\end{split}\]

where X is the input tensor, and \(c\) is the ema_decay.
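The update rule above can be sketched in plain Python. This is only an illustration of the formula, not the operator's implementation; the function name `update_min_max` and the use of a flat list of floats in place of a tensor are assumptions made for the example.

```python
def update_min_max(x, x_min, x_max, ema=False, ema_decay=0.999):
    """Return the updated (x_min, x_max) for a flat list of floats x."""
    batch_min, batch_max = min(x), max(x)
    if ema:
        # Exponential moving average: blend the batch statistic with the running one.
        c = ema_decay
        new_min = (1 - c) * batch_min + c * x_min
        new_max = (1 - c) * batch_max + c * x_max
    else:
        # Without EMA the running values are replaced by the batch statistics.
        new_min, new_max = batch_min, batch_max
    # Both branches clamp so that the range always contains zero.
    return min(new_min, 0.0), max(new_max, 0.0)


print(update_min_max([1.0, 2.0, 3.0], -6.0, 6.0))
print(update_min_max([-2.0, 4.0], -6.0, 6.0, ema=True, ema_decay=0.5))
```

Note that an all-positive input still yields a minimum of 0, because the formula clamps the range to include zero.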

The scale and zero point zp are computed as:

\[\begin{split}\begin{array}{ll} \\ scale = \begin{cases} \frac{x_{max} - x_{min}}{Q_{max} - Q_{min}} & \text{if } symmetric = \text{False} \\ \frac{2\max(x_{max}, \left | x_{min} \right |) }{Q_{max} - Q_{min}} & \text{otherwise} \end{cases}\\ zp\_min = Q_{min} - \frac{x_{min}}{scale} \\ zp = \left \lfloor \min(Q_{max}, \max(Q_{min}, zp\_min)) + 0.5 \right \rfloor \end{array}\end{split}\]

where \(Q_{max}\) and \(Q_{min}\) are determined by quant_dtype. For example, if quant_dtype=INT8, then \(Q_{max} = 127\) and \(Q_{min} = -128\).
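As a rough sketch of the scale and zero-point formulas, the following plain-Python function evaluates them for scalar bounds. The function name `scale_and_zero_point` is hypothetical, and the INT8 limits are passed as defaults only for convenience.

```python
import math


def scale_and_zero_point(x_min, x_max, q_min=-128, q_max=127, symmetric=False):
    """Evaluate scale and zp for scalar x_min / x_max."""
    if symmetric:
        scale = 2 * max(x_max, abs(x_min)) / (q_max - q_min)
    else:
        scale = (x_max - x_min) / (q_max - q_min)
    zp_min = q_min - x_min / scale
    # Clamp to the quantized range, then round half up via floor(v + 0.5).
    zp = math.floor(min(q_max, max(q_min, zp_min)) + 0.5)
    return scale, zp


scale, zp = scale_and_zero_point(-1.0, 3.0)
print(scale, zp)
```

When zp_min lands exactly halfway between two integers, the result of the floor can depend on floating-point rounding, so a real kernel may differ from this sketch at such ties.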

The fake quant output is computed as:

\[\begin{split}\begin{array}{ll} \\ u_{min} = (Q_{min} - zp) * scale \\ u_{max} = (Q_{max} - zp) * scale \\ u_X = \left \lfloor \frac{\min(u_{max}, \max(u_{min}, X)) - u_{min}}{scale} + 0.5 \right \rfloor \\ output = u_X * scale + u_{min} \end{array}\end{split}\]
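The fake-quant formula can be traced with a small plain-Python sketch. The function name `fake_quant` and the list-of-floats input are assumptions for illustration; the values of scale and zp are supplied by the caller rather than derived from an observer.

```python
import math


def fake_quant(x, scale, zp, q_min=-128, q_max=127):
    """Apply the DEFAULT-mode fake-quant formula to a flat list of floats."""
    u_min = (q_min - zp) * scale  # smallest representable real value
    u_max = (q_max - zp) * scale  # largest representable real value
    out = []
    for v in x:
        clipped = min(u_max, max(u_min, v))
        # Index of the nearest quantization level, counted from u_min.
        u_x = math.floor((clipped - u_min) / scale + 0.5)
        out.append(u_x * scale + u_min)
    return out


# Assumed setting: min = -6, max = 6, INT8, which gives scale = 12/255 and zp = 0.
print(fake_quant([1.0, -1.0, 0.0, 100.0], 12 / 255, 0))
```

Under that assumed setting the input 1.0 maps to roughly 0.98824, which agrees with the first entry printed in the Examples section. Inputs outside [u_min, u_max] are clipped before rounding, so 100.0 comes back as u_max.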

The quantization mode LEARNED_SCALE works as follows:

The fake quant output is computed as:

\[ \begin{align}\begin{aligned}\begin{split}\bar{X}=\left\{\begin{matrix} clip\left ( \frac{X}{maxq},0,1\right ) \qquad \quad \text{if } neg\_trunc\\ clip\left ( \frac{X}{maxq},-1,1\right )\qquad \ \text{otherwise} \end{matrix}\right. \\\end{split}\\output=\frac{floor\left ( \bar{X}\ast Q_{max}+0.5 \right ) \ast scale }{Q_{max}}\end{aligned}\end{align} \]

where X is the input tensor and \(Q_{max}\) (quant_max) is determined by quant_dtype and neg_trunc. For example, if quant_dtype=INT8 and neg_trunc is enabled, \(Q_{max} = 256\); otherwise \(Q_{max} = 127\).

maxq is updated during training, and its gradient is computed as follows:
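A minimal plain-Python sketch of the LEARNED_SCALE output formula follows. The function name `learned_scale_fake_quant` is hypothetical. The formula does not say how scale relates to maxq during training, so the sketch takes scale as an explicit argument; passing scale equal to maxq in the usage line is an assumption made only to get concrete numbers.

```python
import math


def learned_scale_fake_quant(x, maxq, scale, q_max=127, neg_trunc=False):
    """Apply the LEARNED_SCALE output formula to a flat list of floats."""
    lower = 0.0 if neg_trunc else -1.0
    out = []
    for v in x:
        # Normalize by maxq and clip to [lower, 1].
        x_bar = min(1.0, max(lower, v / maxq))
        out.append(math.floor(x_bar * q_max + 0.5) * scale / q_max)
    return out


# scale = maxq here is an assumption, not something the formula states.
print(learned_scale_fake_quant([10.0, -10.0, 1.0], maxq=2.0, scale=2.0))
```

With neg_trunc enabled, negative inputs are clipped to 0 before rounding, so they produce an output of 0.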

\[ \begin{align}\begin{aligned}\begin{split}\frac{\partial \ output}{\partial \ maxq} = \left\{\begin{matrix} -\frac{X}{maxq}+\left \lfloor \frac{X}{maxq} \right \rceil \qquad \text{if } bound_{lower}< \frac{X}{maxq}< 1\\ -1 \qquad \quad \qquad \quad \text{if } \frac{X}{maxq}\le bound_{lower}\\ 1 \qquad \quad \qquad \quad \text{if } \frac{X}{maxq}\ge 1 \qquad \quad \end{matrix}\right. \\\end{split}\\\begin{split}bound_{lower}= \left\{\begin{matrix} 0\qquad \quad \text{if } neg\_trunc\\ -1\qquad \text{otherwise} \end{matrix}\right.\end{split}\end{aligned}\end{align} \]
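The piecewise gradient can be written out for a single element as below. This is a sketch of the formula as printed, with a hypothetical function name `grad_wrt_maxq`. The rounding bracket is read as round-to-nearest; Python's built-in `round` resolves exact ties to the even integer, which may not match the operator's behavior at ties.

```python
def grad_wrt_maxq(v, maxq, neg_trunc=False):
    """Per-element gradient of the output with respect to maxq."""
    bound_lower = 0.0 if neg_trunc else -1.0
    r = v / maxq
    if r <= bound_lower:
        return -1.0  # saturated at the lower bound
    if r >= 1.0:
        return 1.0   # saturated at the upper bound
    # Inside the range: rounding error of the normalized input.
    return -r + round(r)


for value in (3.0, -3.0, 0.5, 1.5):
    print(value, grad_wrt_maxq(value, maxq=2.0))
```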

Then minq is computed as:

\[\begin{split}minq=\left\{\begin{matrix} 0 \qquad \qquad \quad \text{if } neg\_trunc\\ -maxq\qquad \text{otherwise} \end{matrix}\right.\end{split}\]

When exporting, the scale and zero point zp are computed as:

\[\begin{split}scale=\frac{maxq}{quant\_max} ,\quad zp=0 \\\end{split}\]

zp is always 0 because LEARNED_SCALE is symmetric.
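The minq and export formulas amount to a few lines of arithmetic, sketched here in plain Python. The helper name `export_params` and the sample value of maxq are made up for the example.

```python
def export_params(maxq, quant_max, neg_trunc=False):
    """Return (minq, scale, zp) following the LEARNED_SCALE formulas."""
    minq = 0.0 if neg_trunc else -maxq
    scale = maxq / quant_max
    zp = 0  # always zero in LEARNED_SCALE mode
    return minq, scale, zp


print(export_params(6.35, 127))
print(export_params(6.35, 127, neg_trunc=True))
```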

Parameters
  • min_init (int, float, list) – The initialized min value. Default: -6.

  • max_init (int, float, list) – The initialized max value. Default: 6.

  • ema (bool) – Whether to use the exponential moving average (EMA) algorithm to update min and max. Default: False.

  • ema_decay (float) – Exponential Moving Average algorithm parameter. Default: 0.999.

  • per_channel (bool) – Whether quantization is applied per channel rather than per layer. Default: False.

  • channel_axis (int) – Quantization by channel axis. Default: 1.

  • num_channels (int) – Declares the min and max channel size. Default: 1.

  • quant_dtype (QuantDtype) – The datatype of quantization, supporting 4 and 8 bits. Default: QuantDtype.INT8.

  • symmetric (bool) – Whether the quantization algorithm is symmetric or not. Default: False.

  • narrow_range (bool) – Whether the quantization algorithm uses narrow range or not. Default: False.

  • quant_delay (int) – Quantization delay parameters according to the global step. Default: 0.

  • neg_trunc (bool) – Whether the quantization algorithm uses negative truncation or not. Default: False.

  • mode (str) – Optional quantization mode. Currently only DEFAULT (QAT) and LEARNED_SCALE are supported. Default: "DEFAULT".

Inputs:
  • x (Tensor) - The input of FakeQuantWithMinMaxObserver. The input dimension is preferably 2D or 4D.

Outputs:

Tensor, with the same type and shape as x.

Raises
  • TypeError – If min_init or max_init is not int, float or list.

  • TypeError – If quant_delay is not an int.

  • ValueError – If quant_delay is less than 0.

  • ValueError – If min_init is not less than max_init.

  • ValueError – If mode is neither DEFAULT nor LEARNED_SCALE.

  • ValueError – If mode is LEARNED_SCALE and symmetric is not True.

  • ValueError – If mode is LEARNED_SCALE and narrow_range is not True, unless neg_trunc is True.
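The documented error conditions can be summarized as a standalone check. This is a hypothetical helper named `check_args` that mirrors the list above; it is not the validation code used by the class, and it only compares min_init with max_init when both are scalars.

```python
def check_args(min_init=-6, max_init=6, quant_delay=0, mode="DEFAULT",
               symmetric=False, narrow_range=False, neg_trunc=False):
    """Raise the documented errors for invalid argument combinations."""
    for name, value in (("min_init", min_init), ("max_init", max_init)):
        if not isinstance(value, (int, float, list)):
            raise TypeError(f"{name} must be int, float or list")
    if not isinstance(quant_delay, int):
        raise TypeError("quant_delay must be an int")
    if quant_delay < 0:
        raise ValueError("quant_delay must not be negative")
    # Scalar comparison only; list-valued bounds are skipped in this sketch.
    if not isinstance(min_init, list) and not isinstance(max_init, list):
        if min_init >= max_init:
            raise ValueError("min_init must be less than max_init")
    if mode not in ("DEFAULT", "LEARNED_SCALE"):
        raise ValueError("mode must be DEFAULT or LEARNED_SCALE")
    if mode == "LEARNED_SCALE":
        if not symmetric:
            raise ValueError("LEARNED_SCALE requires symmetric=True")
        if not neg_trunc and not narrow_range:
            raise ValueError("LEARNED_SCALE requires narrow_range=True "
                             "unless neg_trunc is True")


check_args()  # defaults are valid
check_args(mode="LEARNED_SCALE", symmetric=True, narrow_range=True)
```

In particular, switching to LEARNED_SCALE with the other defaults left unchanged would fail both of the last two checks, since symmetric and narrow_range default to False.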

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, nn
>>> fake_quant = nn.FakeQuantWithMinMaxObserver()
>>> x = Tensor(np.array([[1, 2, 1], [-2, 0, -1]]), mindspore.float32)
>>> result = fake_quant(x)
>>> print(result)
[[ 0.9882355  1.9764705  0.9882355]
 [-1.9764705  0.        -0.9882355]]

extend_repr()

Display instance object as string.

reset(quant_dtype=QuantDtype.INT8, min_init=-6, max_init=6)

Reset the quant max parameter (e.g. 256) and the initial values of the minq and maxq parameters. This function is currently only valid in LEARNED_SCALE mode.

Parameters
  • quant_dtype (QuantDtype) – The datatype of quantization, supporting 4 and 8 bits. Default: QuantDtype.INT8.

  • min_init (int, float, list) – The initialized min value. Default: -6.

  • max_init (int, float, list) – The initialized max value. Default: 6.