# oneflow.nn.MinMaxObserver¶

class oneflow.nn.MinMaxObserver(quantization_formula: str = 'google', quantization_bit: int = 8, quantization_scheme: str = 'symmetric', per_layer_quantization: bool = True)

Compute the quantization parameters of the input tensor.

First compute the max and min values of input tensor:

\begin{align}\begin{aligned}& max\_value = max(input)\\& min\_value = min(input)\end{aligned}\end{align}

Then compute the scale and zero_point with the following equations:

if quantization_scheme == "symmetric":

\begin{align}\begin{aligned}& denom = 2^{quantization\_bit - 1} - 1\\& scale = max(|max\_value|,|min\_value|) / denom\\& zero\_point = 0\end{aligned}\end{align}

elif quantization_scheme == "affine":

\begin{align}\begin{aligned}& denom = 2^{quantization\_bit} - 1\\& scale = (max\_value - min\_value) / denom\\& zero\_point = -min\_value / scale\end{aligned}\end{align}

If per_layer_quantization is False, then the shape of scale and zero_point will be (input.shape[0],).
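The formulas above, including the per-layer / per-channel distinction, can be sketched in plain NumPy (an illustration of the math, not the oneflow kernel itself; the function name `min_max_observer` here is just for this sketch):

```python
import numpy as np

def min_max_observer(x, quantization_bit=8, quantization_scheme="symmetric",
                     per_layer_quantization=True):
    """Compute (scale, zero_point) from the min/max of x, per the formulas above."""
    if per_layer_quantization:
        x = x.reshape(1, -1)           # one group covering the whole tensor
    else:
        x = x.reshape(x.shape[0], -1)  # one group per channel -> shape (x.shape[0],)
    max_value = x.max(axis=1)
    min_value = x.min(axis=1)
    if quantization_scheme == "symmetric":
        denom = 2.0 ** (quantization_bit - 1) - 1
        scale = np.maximum(np.abs(max_value), np.abs(min_value)) / denom
        zero_point = np.zeros_like(scale)
    else:  # "affine"
        denom = 2.0 ** quantization_bit - 1
        scale = (max_value - min_value) / denom
        zero_point = -min_value / scale
    return scale, zero_point
```

For an 8-bit symmetric scheme the denominator is 127, so a tensor whose largest magnitude is 2.0 gets a scale of 2.0/127 and a zero point of 0.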

Parameters
• input (oneflow.Tensor) – the input tensor to observe, of dtype oneflow.float32 (passed to the module's forward call, not the constructor).

• quantization_formula (str) – "google" or "cambricon". Defaults to "google".

• quantization_bit (int) – quantize input to uintX / intX, where X is in the range [2, 8]. Defaults to 8.

• quantization_scheme (str) – "symmetric" or "affine", i.e. quantize to a signed or an unsigned integer range. Defaults to "symmetric".

• per_layer_quantization (bool) – if True, compute one set of parameters for the whole layer; if False, compute one per channel. Defaults to True.

Returns

The scale and zero_point of the input tensor.

Return type

Tuple[oneflow.Tensor, oneflow.Tensor]

For example:

>>> import numpy as np
>>> import oneflow as flow

>>> weight = (np.random.random((2, 3, 4, 5)) - 0.5).astype(np.float32)

>>> input_tensor = flow.tensor(
...    weight, dtype=flow.float32
... )

>>> quantization_formula = "google"
>>> quantization_bit = 8
>>> quantization_scheme = "symmetric"
>>> per_layer_quantization = True

>>> min_max_observer = flow.nn.MinMaxObserver(
...     quantization_formula=quantization_formula,
...     quantization_bit=quantization_bit,
...     quantization_scheme=quantization_scheme,
...     per_layer_quantization=per_layer_quantization,
... )

>>> scale, zero_point = min_max_observer(input_tensor)
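The returned scale and zero_point are typically consumed by a downstream fake-quantization step. A minimal NumPy sketch of that use, assuming the signed clipping range of the "google" formula (the helper name `fake_quantize` is hypothetical, not a oneflow API):

```python
import numpy as np

def fake_quantize(x, scale, zero_point, quantization_bit=8, signed=True):
    # Hypothetical helper: quantize to the integer grid, then dequantize.
    # The clipping range assumes the "google" quantization formula.
    if signed:
        qmax = 2 ** (quantization_bit - 1) - 1
        qmin = -qmax
    else:
        qmin, qmax = 0, 2 ** quantization_bit - 1
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # dequantize back to float
```

Values inside the observed range round-trip with at most half a quantization step of error; values outside it are clamped to the representable range.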

__init__(quantization_formula: str = 'google', quantization_bit: int = 8, quantization_scheme: str = 'symmetric', per_layer_quantization: bool = True) → None

Calls super().__setattr__('a', a) instead of the typical self.a = a to avoid Module.__setattr__ overhead. Module's __setattr__ has special handling for parameters, submodules, and buffers but simply calls into super().__setattr__ for all other attributes.
