Floating-Point Arithmetic and Machine Precision
Understanding how computers represent numbers is fundamental to scientific computing. Computers use a system called floating-point representation to store real numbers, which allows you to work with a wide range of values but introduces certain limitations. In this system, numbers are stored in a finite number of bits, which means not all real numbers can be represented exactly. This leads to small errors, known as rounding errors, every time you store or manipulate a number. The precision of calculations is characterized by machine epsilon: the gap between 1.0 and the next larger representable number, which bounds the relative rounding error of a single operation. Because of these constraints, you must be aware of how digital number storage can affect the accuracy and reliability of numerical results.
import numpy as np

# Demonstrate floating-point rounding error
a = 0.1 + 0.2
b = 0.3
print("0.1 + 0.2 =", a)
print("0.3 =", b)
print("Are they equal?", a == b)

# Display machine epsilon for float64
eps = np.finfo(float).eps
print("Machine epsilon (float64):", eps)

# Show the effect of adding a very small number to 1.0
small_number = eps / 2
print("1.0 + (eps/2) == 1.0 ?", (1.0 + small_number) == 1.0)
print("1.0 + eps == 1.0 ?", (1.0 + eps) == 1.0)
Floating-point numbers are stored using a fixed number of bits, typically following the IEEE 754 standard. Each number is divided into a sign bit, an exponent, and a significand (or mantissa). This structure allows for a wide range of values but limits the precision of each number.
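You can inspect this sign/exponent/significand layout directly. The sketch below reinterprets a float64's 8 bytes as an integer and slices out the three fields; the helper name `float_bits` is just an illustrative choice, not a standard function.

```python
import struct

def float_bits(x):
    # Pack the float as IEEE 754 binary64 and reinterpret the 8 bytes as an integer
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with a bias of 1023
    mantissa = bits & ((1 << 52) - 1)  # 52 bits of the significand's fraction
    return sign, exponent, mantissa

sign, exponent, mantissa = float_bits(-2.5)
print("sign:", sign)                          # 1 means negative
print("exponent (biased):", exponent)
print("exponent (actual):", exponent - 1023)
print("mantissa bits:", bin(mantissa))
```

For -2.5 the magnitude is 1.25 × 2¹, so the actual exponent is 1 and the fraction bits encode 0.25; the sign bit is set because the number is negative.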
Overflow occurs when a calculation produces a result larger than the maximum value representable by the floating-point format. In Python, this often results in a value labeled as inf (infinity), which can disrupt further calculations.
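A minimal sketch of overflow: multiplying a value near the float64 maximum (about 1.8 × 10³⁰⁸) pushes the result past the representable range, and arithmetic on the resulting inf then produces nan.

```python
import math

big = 1e308            # close to the float64 maximum (~1.797e308)
overflowed = big * 10  # exceeds the representable range

print(overflowed)                # inf
print(math.isinf(overflowed))    # True
print(overflowed - overflowed)   # inf - inf is nan, not 0 -- this is how
                                 # overflow disrupts later calculations
```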
Underflow happens when a result is closer to zero than the smallest representable positive number. The number may be rounded down to zero, or represented as a subnormal (very small) value, causing a loss of significance in computations.
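Both behaviors can be seen with float64. Dividing the smallest positive normal number by 2 yields a nonzero subnormal, while dividing the smallest subnormal (about 5 × 10⁻³²⁴) by 2 rounds all the way to zero:

```python
import numpy as np

tiny = np.finfo(np.float64).tiny  # smallest positive *normal* float64
print(tiny)                       # ~2.2e-308

sub = tiny / 2                    # subnormal: still nonzero, but with reduced precision
print(sub, sub > 0)

print(5e-324 / 2)                 # below the smallest subnormal: rounds to 0.0
```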
The limited precision and range of floating-point numbers mean that errors can accumulate, especially in iterative calculations or when subtracting nearly equal quantities (catastrophic cancellation). Understanding these effects is crucial for designing stable and accurate numerical algorithms.
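Catastrophic cancellation can be demonstrated with the classic example of computing 1 − cos(x) for small x. Since cos(x) rounds to exactly 1.0 when x is tiny, the naive subtraction loses all significant digits, while the algebraically equivalent form 2·sin²(x/2) avoids the cancellation:

```python
import math

x = 1e-8
naive = 1 - math.cos(x)           # cos(x) rounds to exactly 1.0, so this is 0.0
stable = 2 * math.sin(x / 2)**2   # identical in exact arithmetic, but no cancellation

print(naive)    # 0.0 -- every significant digit lost
print(stable)   # ~5e-17, close to the true value x**2 / 2
```

Rewriting formulas to avoid subtracting nearly equal quantities, as done here, is a standard technique for stabilizing numerical algorithms.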