15.4: Float Rounding Is Also Inexact

Last updated
Save as PDF

Page ID: 43753

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

While float equality is known to be evil, you have to pay attention to other aspects of floats. Let us illustrate that point with the following example.

2.8 truncateTo: 0.01
    → 2.8000000000000003

2.8 roundTo: 0.01
    → 2.8000000000000003

It is surprising but not false that 2.8 truncateTo: 0.01 does not return 2.8 but 2.8000000000000003. This is because truncateTo: and roundTo: perform several operations on floats: inexact operations on inexact numbers can lead to cumulative rounding errors as you saw above, and that’s just what happens again.

Even if you perform the operations exactly and then round to nearest Float, the result is inexact because of the initial inexact representation of 2.8 and 0.01.

(2.8 asTrueFraction roundTo: 0.01 asTrueFraction) asFloat
    → 2.8000000000000003

Using 0.01s2 rather than 0.01 let this example appear to work:

2.80 truncateTo: 0.01s2
    → 2.80s2

2.80 roundTo: 0.01s2
    → 2.80s2

But it’s just a case of luck, the fact that 2.8 is inexact is enough to cause other surprises as illustrated below:

2.8 truncateTo: 0.001s3.
    → 2.799s3

2.8 < 2.800s3.
    → true

Truncating in the Float world is absolutely unsafe. Though, using a ScaledDecimal for rounding is unlikely to cause such discrepancy, except when playing with last digits.