Quantization Theory for Neural Networks

Rounding and Quantization Noise

When you quantize real numbers to a finite set of values, you must choose a rounding scheme to map continuous values onto discrete levels. Two common schemes are deterministic rounding (often "round-to-nearest") and stochastic rounding. In deterministic rounding, each value is mapped to the nearest quantization level. The operation can be expressed mathematically as:

Q_{\text{nearest}}(x) = \Delta \cdot \text{round}\left(\frac{x}{\Delta}\right)

where x is the input value and Δ is the quantization step size. In contrast, stochastic rounding randomly rounds x up or down to one of the two adjacent quantization levels, with probabilities proportional to the distances. The rule is:

Q_{\text{stochastic}}(x) = \begin{cases} \Delta \cdot \left\lfloor \frac{x}{\Delta} \right\rfloor & \text{with probability } 1 - p \\ \Delta \cdot \left\lceil \frac{x}{\Delta} \right\rceil & \text{with probability } p \end{cases}

where p = (x - Δ⌊x/Δ⌋)/Δ. This means the closer x is to the upper level, the more likely it is to round up.
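To make the two schemes concrete, here is a minimal sketch in NumPy; the function names quantize_nearest and quantize_stochastic are just for this illustration, not from any particular library:

import numpy as np

def quantize_nearest(x, delta):
    # Deterministic round-to-nearest: snap x to the closest multiple of delta.
    # Note that np.round resolves exact ties (e.g. 0.5) to the even neighbor.
    return delta * np.round(x / delta)

def quantize_stochastic(x, delta, rng=None):
    # Stochastic rounding: round up with probability p equal to the
    # normalized distance from the lower quantization level.
    rng = np.random.default_rng() if rng is None else rng
    lower = delta * np.floor(x / delta)
    p = (x - lower) / delta
    return lower + delta * (rng.random(np.shape(x)) < p)

x = np.array([0.23, 0.31, 0.74])
print(quantize_nearest(x, delta=0.5))     # [0.  0.5 0.5]
print(quantize_stochastic(x, delta=0.5))  # random, e.g. [0.5 0.5 0.5]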

No matter which rounding scheme you use, quantization introduces an error, called quantization noise. This error can be modeled as additive noise:

e(x) = Q(x) - x

where Q(x) is the quantized value of x. To analyze the impact of quantization, you often compute the expectation and variance of this error. For round-to-nearest with uniform quantization, the error e(x) is uniformly distributed in [−Δ/2, Δ/2]. The expectation (mean) of the error is:

\mathbb{E}[e(x)] = 0

and the variance is:

\text{Var}[e(x)] = \frac{\Delta^2}{12}
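This value follows from the uniform-error model: the error has density 1/Δ on [−Δ/2, Δ/2] and zero mean, so its variance is the integral of e² against that density:

\text{Var}[e(x)] = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^2 \, de = \frac{1}{\Delta} \cdot \frac{\Delta^3}{12} = \frac{\Delta^2}{12}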

For stochastic rounding, the expectation is exactly zero for every input, but the variance depends on where x lies between quantization levels: for a given input it equals Δ² · p(1 − p), and for inputs spread uniformly within each cell it averages to Δ²/6, somewhat higher than the round-to-nearest value.
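A quick empirical check of these statistics, as a sketch assuming NumPy and inputs spread densely enough over the grid that the uniform-error model applies:

import numpy as np

rng = np.random.default_rng(0)
delta = 0.1
x = rng.uniform(-10, 10, size=1_000_000)

# Round-to-nearest error: mean ~ 0, variance ~ delta**2 / 12.
err_nearest = delta * np.round(x / delta) - x
print(err_nearest.mean(), err_nearest.var(), delta**2 / 12)

# Stochastic-rounding error: mean ~ 0, variance ~ delta**2 / 6 for inputs
# spread uniformly within each quantization cell.
lower = delta * np.floor(x / delta)
p = (x - lower) / delta
err_stochastic = lower + delta * (rng.random(x.shape) < p) - x
print(err_stochastic.mean(), err_stochastic.var(), delta**2 / 6)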

Note
Definition

Quantization bias is the expected value of the quantization error, E[e(x)]. If the rounding scheme is not symmetric or the input distribution is not uniform, this bias can accumulate and systematically shift neural network outputs, leading to degraded model accuracy or even instability during training.
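As an illustrative (hypothetical) contrast, truncation, i.e. always rounding down, is an asymmetric scheme whose bias is roughly −Δ/2, and that bias surfaces directly when many quantized values are summed:

import numpy as np

rng = np.random.default_rng(1)
delta = 0.01
x = rng.uniform(0, 1, size=100_000)

# Round-to-nearest: errors are symmetric around zero, so the bias is ~0.
err_nearest = delta * np.round(x / delta) - x
print(err_nearest.mean())    # ~ 0

# Truncation (always round down): every error is negative, bias ~ -delta/2.
err_trunc = delta * np.floor(x / delta) - x
print(err_trunc.mean())      # ~ -0.005

# The bias accumulates when summing: the quantized sum is off by roughly
# n * (-delta / 2) = -500.
print((delta * np.floor(x / delta)).sum() - x.sum())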

The choice between deterministic and stochastic rounding schemes has a direct effect on the distribution of quantization errors and the resulting neural network behavior. Deterministic round-to-nearest is simple and efficient, but it can introduce systematic bias, especially if values cluster near quantization boundaries. This bias can accumulate over many layers or operations, causing model drift. Stochastic rounding, while introducing more randomness per operation, ensures that the expected quantization error is zero, even for values near boundaries. This can help preserve statistical properties across layers and reduce the risk of bias accumulation, particularly in low-precision training or inference scenarios. However, the increased variance from stochastic rounding can make individual predictions noisier, so the choice of scheme should be matched to the needs of your application.
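The toy simulation below (hypothetical values, assuming NumPy) illustrates the accumulation argument for low-precision updates: when each update is smaller than Δ/2, round-to-nearest rounds it away at every step, while stochastic rounding still makes the expected progress:

import numpy as np

rng = np.random.default_rng(2)
delta = 0.01          # quantization step of the stored weight
update = 0.003        # each update is smaller than delta / 2
n_steps = 1000

w_nearest = 0.0
w_stochastic = 0.0
for _ in range(n_steps):
    # Round-to-nearest: the small update is rounded away every step.
    w_nearest = delta * np.round((w_nearest + update) / delta)

    # Stochastic rounding: the weight moves up with probability update / delta,
    # so on average it still tracks the full-precision trajectory.
    target = w_stochastic + update
    lower = delta * np.floor(target / delta)
    p = (target - lower) / delta
    w_stochastic = lower + delta * (rng.random() < p)

print(w_nearest)      # stays at 0.0: every update was lost
print(w_stochastic)   # ~ 3.0, close to the true sum n_steps * update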
