C Ints are Finite Numbers!

In the last few months, we made some changes to math.h (and to C++’s <cmath>) to clean up isinf(), isnan() and related functions. These are defined by the C and C++ standards for all scalar floating point types. Our previous code used a slightly hacky implementation, which appears to be shared with most other libc implementations.

Since these macros are defined for float, double and long double, we were disambiguating the versions by comparing the size of the argument and the size of various types, with code something like this:

#define macro(x) \
    ((sizeof(x) == sizeof(float)) ? float_fn(x) : \
     (sizeof(x) == sizeof(double)) ? double_fn(x) : \
     long_double_fn(x))

This works fine as long as the macros are only called with floating point types. But what happens if they are invoked with an int, or perhaps a vector of two ints? The good news for our implementation is that this is undefined behaviour, so we are allowed to do whatever we want. Called as isnan(1), for example, this code would first convert the int value 1 to a float (because sizeof(int) is the same as sizeof(float); on an ILP64 platform, where int is 64 bits, it would convert it to a double instead) and then call the function intended for floats.
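To make the int case concrete, here is a minimal sketch of the size-based dispatch. The helpers float_fn, double_fn and long_double_fn are the placeholders from the macro above, stubbed out here (purely as an assumption for illustration) to report which one was chosen; which_fn and show_dispatch are hypothetical names, not anything in libc.

```c
#include <stdio.h>

/* Placeholder helpers (not real libc functions): each just reports
 * which one the macro selected. */
static const char *float_fn(float x)             { (void)x; return "float"; }
static const char *double_fn(double x)           { (void)x; return "double"; }
static const char *long_double_fn(long double x) { (void)x; return "long double"; }

/* Size-based dispatch: the argument's type is never inspected,
 * only its size. */
#define which_fn(x) \
    ((sizeof(x) == sizeof(float)) ? float_fn(x) : \
     (sizeof(x) == sizeof(double)) ? double_fn(x) : \
     long_double_fn(x))

/* Hypothetical driver showing that an int argument is silently accepted. */
void show_dispatch(void)
{
    printf("1.0f -> %s\n", which_fn(1.0f));
    printf("1.0  -> %s\n", which_fn(1.0));
    /* An int compiles without complaint and is converted to whichever
     * floating point type happens to share its size. */
    printf("1    -> %s\n", which_fn(1));
}
```

On a platform where int and float are the same size, the last line reports the float helper, and the compiler says nothing about it.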

This is a valid thing to do, but it’s also nonsense. It hides lots of potential logic errors. If you’re calling isnan(int) or isinf(int) then you’re almost certainly doing something wrong: ints are always numbers and they are always finite.

The new version avoids this. It still falls back to the old code, which works with any standards-compliant compiler, but if your compiler supports C11 _Generic() expressions (clang in the base system does), or the __builtin_types_compatible_p() and __builtin_choose_expr() GNU extensions, then it uses them instead. The former is very simple: it provides a way of selecting which expression to evaluate based on the type of its first argument. The above macro becomes something like this:

#define macro(x) _Generic((x), \
    float: float_fn(x), \
    double: double_fn(x), \
    long double: long_double_fn(x))

Not only is this more readable, it also generates a compile-time error if you try to invoke it with a type that is not listed. You can also provide a default: case if you want to emulate the old behaviour. The version using the GNU extensions is semantically equivalent, but a lot less readable. The __builtin_choose_expr() builtin lets you select between two cases based on a compile-time constant expression. The __builtin_types_compatible_p() builtin lets you compare whether two types are the same (so typeof(x) is compared to float, double, and long double, in turn).
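For comparison, the two type-based variants might look roughly like this side by side. This is a sketch rather than the code that went into math.h: the helpers are placeholder stubs that return a tag, and pick_generic and pick_gnu are hypothetical names. It assumes a compiler that supports both C11 and the GNU builtins, such as clang or a recent GCC in C mode.

```c
/* Placeholder helpers (not real libc functions): each returns a tag
 * identifying which one was selected. */
static int float_fn(float x)             { (void)x; return 1; }
static int double_fn(double x)           { (void)x; return 2; }
static int long_double_fn(long double x) { (void)x; return 3; }

/* C11 version: selection is driven by the type of (x). An unlisted
 * type is a compile-time error because there is no default case. */
#define pick_generic(x) _Generic((x), \
    float:       float_fn(x), \
    double:      double_fn(x), \
    long double: long_double_fn(x))

/* GNU-extension version: a chain of compile-time type tests. The final
 * (void)0 has type void, so trying to use the result for an unlisted
 * type fails to compile. */
#define pick_gnu(x) \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), float), \
        float_fn(x), \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), double), \
        double_fn(x), \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), long double), \
        long_double_fn(x), \
        (void)0)))
```

With either macro, something like int r = pick_generic(1); is rejected at compile time, which is exactly the stricter checking described above.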

It was a fairly simple change, and it didn’t make any difference to software that was not relying on undefined behaviour. And yet it generated a surprising amount of fallout in the ports tree. One common problem was that a lot of configure scripts checked for the existence of the isnan() macro by trying to compile isnan(1). Examples of this included the V8 JavaScript engine and the Mono compiler.

Mono was especially bad, because its configure script checked for the existence of isnan(int) and, if it didn’t find it, then it defined its own isnan(double) to use instead. I’m slightly alarmed at the idea of using a compiler written by people who don’t know the difference between int and double.

These were easy to fix. You can easily check for these functions by doing isnan(1.0) or isnan(1.0f) (if you want to check for float support as well as or instead of double), so the patches to the configure scripts were quite small.
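A corrected probe could be as small as the following sketch. The function name probe_isnan is hypothetical, and how a given configure script wraps and compiles the test is an assumption; the point is only that the arguments are floating point values rather than ints.

```c
#include <math.h>

/* Hypothetical configure-style probe: if this compiles and links,
 * isnan() is available for both double and float arguments. */
int probe_isnan(void)
{
    /* volatile keeps the compiler from folding the calls away entirely */
    volatile double d = 1.0;
    volatile float f = 1.0f;

    /* Neither value is NaN, so both calls yield zero. */
    return isnan(d) || isnan(f);
}
```

A script that only cares about double support can drop the float line.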

More surprising was that a number of programs, for example Blender, actually call things like isnan(unsigned int) in their code. These are real bugs, and they were found because we turned on slightly stricter compile-time checking in libc. Are there any C standard library APIs that you’ve seen frequently used wrongly and would like stronger checking for?