NaNs and reflexivity

In connection with an effort to reason about numerical programs we recently ran into what seems to be strange decisions in the design of the floating-point standard, IEEE 754. I discuss the issues in a blog post ("Reflexivity and other pillars of civilization"), showing in particular that the convention for equality regarding NaN (Not A Number) seem wrong; the reverse convention would be more in line with basic mathematical properties such as the reflexivity of equality.

We encountered the problem in connection with the work on revising the ISO standard for Eiffel, when we realized that transposing the IEEE 754 rule to the language would mean that we cannot put a value into an array and then assume that the value that we put is there! Neither can we assign a value v to a variable x and assume that afterward the value of x is equal to v.

See the detailed explanation at the page linked to above. I would be interested in the experience of people having run into similar issues.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

You simply need multiple

You simply need multiple equality predicates. Sometimes you're basically interested in the concrete representation, and sometimes you want to compare structure, or the "mathematical" value.

Common problem

That's old news: IEEE float equality is not an equivalence relation. Many people consider that a bad move, but we have to live with it. If you care about built-in equality having nice properties, then either make it inapplicable to floats (like e.g. = in SML), or give it a semantics different from IEEE. In either case, you want a separate operator to implement IEEE equality (e.g. Real.== in SML).

Link to specific stories

As a quick aside; when you link to your blog please link to a specific story. That makes it easier for somebody to find which blog post you were referring to when people stumble upon this LtU post months and years down the road. For example, use

http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/

instead of

http://bertrandmeyer.com/

You should be able to edit this story as well as your old stories to update the links. Of course, you probably want to give the link a nice textual description, instead of making the URL directly visible.

It's true

It's true, the IEEE 754 standard is nuts. No reasonable specification for computing would define a value system in which vital foundational properties like "for all x, x=x" don't hold.

The consensus here is reasonable: To support IEEE 754 in a sensible language, you shouldn't let the crazy specification's notion of equality pollute the language's notion of equality. Rather, expose its comparison as an ordinary function, e.g. ieee754_compare(a:float,b:float):bool.

Not Always

Using the = comparison operator on floats is generally a bad idea, due to all the nasty rounding side-effects, particularly on machines where the registers internal format is different from the stored one (e.g. x87: 80bits internal vs. 64bits IEEE doubles). There is little guarantee that if X:=Y, then X is_exactly_equal_to Y.

The result of IEEE comparaisons can have 4 values : '=' , '>', '<' and Unordered. Any comparison involving NaNs shall return 'Unordered'. From these 4 values, one can define 16 predicates, for example, in the SPARC Instruction Set :

FBA Branch Always
FBN Branch Never
FBU Branch on Unordered
FBG Branch on Greater
FBUG Branch on Unordered or Greater
FBL Branch on Less
FBUL Branch on Unordered or Less
FBLG Branch on Less or Greater
FBNE Branch on Not Equal
FBE Branch on Equal
FBUE Branch on Unordered or Equal
FBGE Branch on Greater or Equal
FBUGE Branch on Unordered or Greater or Equal
FBLE Branch on Less or Equal
FBULE Branch on Unordered or Less or Equal
FBO Branch on Ordered

A compiler can choose either FBE or FBUE for equality, or maybe FBN is the safest :-)

From the standard-defined operations, NaN appears as the result of sqrt(negative_number), 0/0, or as user-defined source operands. Then, the NaNs propagates across all operations, tracing that somewhere the calculation gone bad. There is no need to check for NaNs after every operation.

Sometimes, the valid answer to a question is 'i don't know' instead of 'true' or 'false'. NaN means 'Not a Number', literally, these are not values, it is like returning an error code. NaNs can also be used for detecting unitialised memory (the FFFFFFFF pattern is a QNaN), there is no legitimate comparison between unitialised variables.

Correctly handling floating point operations in a programming language is very complex. For example :
- Many operations are imprecises, and non associative, which is bad for optimisers. Combined Multiply-Accumulate is different than separate multiply and add operations.
- The IEEE standard mandates the availability of 4 rounding modes, with fun corner cases. There is very little support of that feature in programming languages.

The IEEE standard is diffcult, both for software handling and hardware implementation, nevertheless, it does the _right_ thing, a reasonable way to handle something that look likes algebra but isn't actually real maths.

See : Kahan Paper :Lecture notes on the status of IEEE Standard 754 for Binary Floating-Point Arithmetic

Then, the NaNs propagates

Then, the NaNs propagates across all operations, tracing that somewhere the calculation gone bad. There is no need to check for NaNs after every operation.

So floating point operations take place in the error monad.

Except for equality comparisons

Except for equality comparisons, which is the issue..

"There is little guarantee

"There is little guarantee that if X:=Y, then X is_exactly_equal_to Y."

Really?
Assignment only do guarantee equality. Using the = operator with floats makes sense in this case only, which is the issue here.

My fault

You're right, from a programming language design standpoint, you can decide that assignement shall yield equality, and make it so in the language implementation.

The fact that C programmers should not expect that is not a valid reason.

Excluding the NaN issue, the IEEE standard states that compare(+0,-0) -> equal, +0 and -0 have different bit patterns. Does that mean that "different" values can be equal ?

Reflexive, Transitive, and Symmetric

That's the defining characteristics of an equivalence relation; consider the rationals: 1/2 == 2/4 even though the bit patterns are different. So no, that's not really a problem in itself.

The bigger issue is that since +0 and -0 are equal, division is not a well-defined function with respect to IEEE equality, as 1 / +0 == +∞ != -∞ == 1 / -0

It can be amusing to play with various library implementations of sorting algorithms and their interaction with NaNs: Python actually behaves pretty well, as it's built-in sort appears to double as a topological sort, but Haskell's sort behaves quite badly.

Thanks for the informative post!

You might be interested in this

William Kahan's Berkeley home page

He has examples, including Java and Excel.

z-bo: your link is messed up (as of this writing)

...

fixed

thanks