Rounding and Quantization Noise
When you quantize real numbers to a finite set of values, you must choose a rounding scheme to map continuous values onto discrete levels. Two common schemes are deterministic rounding (often "round-to-nearest") and stochastic rounding. In deterministic rounding, each value is mapped to the nearest quantization level. The operation can be expressed mathematically as:
$$Q_{\text{nearest}}(x) = \Delta \cdot \operatorname{round}\!\left(\frac{x}{\Delta}\right)$$

where $x$ is the input value and $\Delta$ is the quantization step size. In contrast, stochastic rounding randomly rounds $x$ up or down to one of the two nearest quantization levels, with probabilities determined by how close $x$ is to each level. The rule is:
$$Q_{\text{stochastic}}(x) = \begin{cases} \Delta \cdot \lfloor x/\Delta \rfloor & \text{with probability } 1 - p \\ \Delta \cdot \lceil x/\Delta \rceil & \text{with probability } p \end{cases}$$

where $p = \dfrac{x - \Delta \cdot \lfloor x/\Delta \rfloor}{\Delta}$. This means the closer $x$ is to the upper level, the more likely it is to round up.
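To make the two schemes concrete, here is a minimal NumPy sketch of both quantizers; the function names quantize_nearest and quantize_stochastic, the step size, and the example values are illustrative choices, not part of any particular library.

```python
import numpy as np

def quantize_nearest(x, delta):
    """Round-to-nearest uniform quantization: map x to the closest multiple of delta."""
    return delta * np.round(np.asarray(x, dtype=float) / delta)

def quantize_stochastic(x, delta, rng=None):
    """Stochastic rounding: round up with probability equal to the fractional position p."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(x, dtype=float) / delta
    lower = np.floor(scaled)
    p = scaled - lower                             # fractional distance above the lower level
    round_up = rng.random(size=scaled.shape) < p   # Bernoulli(p): True means take the upper level
    return delta * (lower + round_up)

x = np.array([0.10, 0.37, 0.51, -0.88])
print(quantize_nearest(x, delta=0.25))             # [ 0.    0.25  0.5  -1.  ]

# A value 30% of the way between two levels should round up about 30% of the time
samples = quantize_stochastic(np.full(10_000, 0.325), delta=0.25, rng=np.random.default_rng(0))
print(np.mean(samples == 0.5))                     # roughly 0.3
```

Note that np.round resolves exact ties to the nearest even multiple (banker's rounding), which is one common convention for round-to-nearest.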
No matter which rounding scheme you use, quantization introduces an error, called quantization noise. This error can be modeled as additive noise:
$$e(x) = Q(x) - x$$

where $Q(x)$ is the quantized value of $x$. To analyze the impact of quantization, you often compute the expectation and variance of this error. For round-to-nearest with uniform quantization, the error $e(x)$ is uniformly distributed in $[-\Delta/2, \Delta/2]$. The expectation (mean) of the error is:
$$\mathbb{E}[e(x)] = 0$$

and the variance is:
$$\operatorname{Var}[e(x)] = \frac{\Delta^2}{12}$$

For stochastic rounding, the expectation remains zero for every input, but the variance depends on where $x$ sits relative to the quantization levels: for a value lying a fraction $f$ of the way between two adjacent levels, the error variance is $\Delta^2 f(1-f)$, which can be higher or lower than $\Delta^2/12$.
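As a sanity check, a short Monte Carlo experiment (a sketch assuming inputs drawn uniformly over many quantization cells, with an arbitrary step size of 0.1) recovers these statistics empirically:

```python
import numpy as np

rng = np.random.default_rng(0)
delta = 0.1
x = rng.uniform(-1.0, 1.0, size=1_000_000)      # inputs spread over many quantization cells

# Round-to-nearest: mean error near 0, variance near delta^2 / 12
err_nearest = delta * np.round(x / delta) - x
print(err_nearest.mean(), err_nearest.var(), delta**2 / 12)

# Stochastic rounding: mean error still near 0; for uniform inputs the average variance is delta^2 / 6
lower = np.floor(x / delta)
p = x / delta - lower
err_stoch = delta * (lower + (rng.random(x.size) < p)) - x
print(err_stoch.mean(), err_stoch.var(), delta**2 / 6)
```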
Quantization bias is the expected value of the quantization error, $\mathbb{E}[e(x)]$. If the rounding scheme is not symmetric or the input distribution is not uniform, this bias can accumulate and systematically shift neural network outputs, leading to degraded model accuracy or even instability during training.
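To see a biased scheme in action, consider a quantizer that always truncates toward negative infinity. This is a deliberately asymmetric choice, used here only to illustrate the effect; its mean error settles near $-\Delta/2$ rather than zero:

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.1
x = rng.uniform(-1.0, 1.0, size=1_000_000)

# Truncating quantizer: always rounds down, so every error lies in (-delta, 0]
err_trunc = delta * np.floor(x / delta) - x
print(err_trunc.mean(), -delta / 2)   # both close to -0.05: a systematic, nonzero bias
```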
The choice between deterministic and stochastic rounding schemes has a direct effect on the distribution of quantization errors and the resulting neural network behavior. Deterministic round-to-nearest is simple and efficient, but it can introduce systematic bias, especially if values cluster near quantization boundaries. This bias can accumulate over many layers or operations, causing model drift. Stochastic rounding, while introducing more randomness per operation, ensures that the expected quantization error is zero, even for values near boundaries. This can help preserve statistical properties across layers and reduce the risk of bias accumulation, particularly in low-precision training or inference scenarios. However, the increased variance from stochastic rounding can make individual predictions noisier, so the choice of scheme should be matched to the needs of your application.
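The accumulation effect can be demonstrated with a classic low-precision running-sum experiment (a sketch; the step size 0.1, increment 0.04, and 1000 iterations are arbitrary choices). Because each increment is smaller than half a quantization step, round-to-nearest discards it every time and the accumulator never moves, while stochastic rounding tracks the true sum in expectation:

```python
import numpy as np

rng = np.random.default_rng(0)
delta, increment, steps = 0.1, 0.04, 1000

acc_nearest = 0.0
acc_stoch = 0.0
for _ in range(steps):
    # Deterministic: the increment is below half a step, so the update always rounds back down
    acc_nearest = delta * np.round((acc_nearest + increment) / delta)
    # Stochastic: rounds up often enough to stay unbiased
    scaled = (acc_stoch + increment) / delta
    frac = scaled - np.floor(scaled)
    acc_stoch = delta * (np.floor(scaled) + (rng.random() < frac))

print("true sum:           ", steps * increment)   # 40.0
print("round-to-nearest:   ", acc_nearest)         # stuck at 0.0
print("stochastic rounding:", acc_stoch)           # close to 40.0, up to random noise
```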