Tag Archives: IEEE_arithmetic

What Is IEEE Standard Arithmetic?

The IEEE Standard 754, published in 1985 and revised in 2008 and 2019, is a standard for binary and decimal floating-point arithmetic. The standard for decimal arithmetic (IEEE Standard 854) was separate when it was first published in 1987, but …

Posted in what-is

Half Precision Arithmetic: fp16 Versus bfloat16

The 2008 revision of the IEEE Standard for Floating-Point Arithmetic introduced a half precision 16-bit floating point format, known as fp16, as a storage format. Various manufacturers have adopted fp16 for computation, using the obvious extension of the rules for …

Posted in research

The Rise of Mixed Precision Arithmetic

For the last 30 years, most floating point calculations in scientific computing have been carried out in 64-bit IEEE double precision arithmetic, which provides the elementary operations of addition, subtraction, multiplication, and division at a relative accuracy of about $10^{-16}$. …

Posted in research

Tiny Relative Errors

Let $x \ne 0$ and $y$ be distinct floating point numbers. How small can the relative difference between $x$ and $y$ be? For IEEE double precision arithmetic the answer is $u = 2^{-53} \approx 1.1 \times 10^{-16}$, which is called the unit roundoff. What if we now let $x$ and $y$ be vectors and …

Posted in research
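
The scalar claim in the "Tiny Relative Errors" excerpt can be checked numerically. The following is a minimal sketch, not taken from the post: the helper name `rel_diff` and the choice of test points are illustrative assumptions, and it assumes Python 3.9 or later (for `math.nextafter`) on a platform whose `float` is IEEE double precision.

```python
import math
import sys

# Unit roundoff for IEEE double precision: half of machine epsilon (2**-52).
u = sys.float_info.epsilon / 2  # equals 2**-53


def rel_diff(x, y):
    """Relative difference |x - y| / |x| for nonzero x (hypothetical helper)."""
    return abs(x - y) / abs(x)


# Take x at a power of 2 and y as the next double below it. The spacing just
# below 2.0 is 2**-52, so the relative difference is 2**-52 / 2 = 2**-53.
x = 2.0
y = math.nextafter(x, 0.0)
print(rel_diff(x, y) == u)

# For comparison, 1.0 and the next double above it differ by 2**-52,
# giving a relative difference of 2u.
print(rel_diff(1.0, math.nextafter(1.0, 2.0)) == 2 * u)
```

Both comparisons involve only exactly representable quantities, so testing with `==` is safe here; in general a tolerance would be needed.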