About maps and NaNs

nobrowser · November 21, 2022, 2:40am

I have just read this

I am curious what the behavior of the OCaml Map (and also Hashtbl) module is in this particular respect. (I know it would be relatively easy to try, so forgive my laziness here.)

–
Ian

jeffsco · November 21, 2022, 4:14am

Here is a session with OCaml 4.14.0. It’s definitely interesting to think about how to specify a reasonable behavior for nan.

# module FMap = Map.Make(Float);;
# let mymap = FMap.singleton nan "yes";;
val mymap : string FMap.t = <abstr>
# FMap.find nan mymap;;
- : string = "yes"
# nan = nan;;
- : bool = false
# FMap.cardinal (FMap.remove nan mymap);;
- : int = 0

alan · November 21, 2022, 5:16am

The Map.Make module uses the compare function defined in the input argument. From the Float documentation:

compare x y returns 0 if x is equal to y , a negative integer if x is less than y , and a positive integer if x is greater than y . compare treats nan as equal to itself and less than any other float value. This treatment of nan ensures that compare defines a total ordering relation.
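
To make the quoted behavior concrete, here is a minimal sketch using only the standard library. It contrasts the IEEE comparison operator with Float.compare; the binding names are just for illustration.

```ocaml
(* IEEE 754 comparison: nan is not equal to anything, itself included. *)
let ieee_equal = (nan = nan)

(* Float.compare: nan is equal to itself and below every other float. *)
let cmp_self = Float.compare nan nan
let cmp_low = Float.compare nan neg_infinity

(* Sorting with Float.compare therefore places nan first. *)
let sorted = List.sort Float.compare [ 1.0; nan; neg_infinity ]
```

Because Map.Make(Float) only ever calls Float.compare, it never sees the IEEE behavior of (=).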

Hashtbls also treat NaNs as equal:

# let tbl = Hashtbl.create 11;;
val tbl : ('_weak1, '_weak2) Hashtbl.t = <abstr>
# Hashtbl.add tbl nan 0;;
- : unit = ()
# Hashtbl.length tbl;;
- : int = 1
# Hashtbl.remove tbl nan;;
- : unit = ()
# Hashtbl.length tbl;;
- : int = 0

The polymorphic Hashtbl tests keys with the polymorphic compare function (compare x y = 0) rather than with (=), which is why nan is found again; you can read the source code.
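
For contrast, a sketch of what happens if a table is built with IEEE equality instead. Ieee_key and Ieee_tbl are hypothetical names made up for this example; only Hashtbl.Make and Hashtbl.hash come from the standard library.

```ocaml
(* Hypothetical key module that uses IEEE equality, under which
   nan is never equal to itself. *)
module Ieee_key = struct
  type t = float
  let equal (a : float) (b : float) = a = b
  let hash = Hashtbl.hash
end

module Ieee_tbl = Hashtbl.Make (Ieee_key)

(* Polymorphic table: keys are matched with compare, so nan is found. *)
let poly : (float, int) Hashtbl.t = Hashtbl.create 11
let () = Hashtbl.add poly nan 0
let poly_found = Hashtbl.find_opt poly nan

(* Functorial table with IEEE equality: the binding is stored but
   can no longer be looked up under nan. *)
let ieee : int Ieee_tbl.t = Ieee_tbl.create 11
let () = Ieee_tbl.add ieee nan 0
let ieee_found = Ieee_tbl.find_opt ieee nan
let ieee_length = Ieee_tbl.length ieee
```

With this equality the nan binding still counts towards the length but is unreachable through find, which is the kind of surprise the default table avoids.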

yawaramin · November 21, 2022, 5:30am

As is usually the case, whenever I read about a flaw discovered in other languages and ecosystems, I check how OCaml does its equivalent and it already does the correct thing.

benjamin-thomas · November 23, 2022, 6:07pm

To me, nan looks like a weird and ugly value, something that I wouldn’t want to deal with in my programs (I’d prefer getting an exception from unexpected inputs).

For instance, this looks illogical to me.

utop # nan = nan;;
- : bool = false
utop # nan > nan;;
- : bool = false
utop # nan < nan;;
- : bool = false

I mainly program at a high level, and if I understand things correctly CPUs themselves may return nan values when computing floating point operations (which seems reasonable).

I’m not sure I understand how that could be useful in my userland code though. Could anyone explain to me a use case where having a nan value would in fact be useful?

dbuenzli · November 23, 2022, 7:08pm

One use case is representing the absence of data. Basically you can see the float datatype as being a compact float option.
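
A minimal sketch of that idea, assuming nan is used as the "missing sample" marker in a float array. The function name mean_present is made up for the example.

```ocaml
(* Mean of the samples that are present; nan marks a missing sample.
   Returns nan when no sample is present at all. *)
let mean_present (xs : float array) : float =
  let sum, n =
    Array.fold_left
      (fun (s, n) x -> if Float.is_nan x then (s, n) else (s +. x, n + 1))
      (0.0, 0) xs
  in
  if n = 0 then nan else sum /. float_of_int n

let samples = [| 1.0; nan; 3.0 |]
let m = mean_present samples
```

The trade-off is that the type checker no longer forces you to handle the missing case, which a float option array would do at the cost of an extra allocation for each element that is present.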

nojb · November 23, 2022, 7:38pm

Something to keep in mind is that float does not model the mathematical concept of real number but rather implements IEEE 754, the de-facto standard for numeric computing. Doing something different here would make OCaml pretty much worthless for numeric computing.

Cheers,
Nicolas

edwin · November 24, 2022, 12:15am

NaN can also have multiple representations, e.g. both of these are NaN:

utop # Int64.float_of_bits 0x7ff8000000000002L;;
- : float = nan
utop # Int64.float_of_bits 0x7ff0000000000002L;;
- : float = nan

Different operations also produce different NaNs:

utop # Float.nan |> Int64.bits_of_float |> Printf.sprintf "%Lx";;
- : string = "7ff0000000000001"
utop # sqrt ~-.1. |> Int64.bits_of_float |> Printf.sprintf "%Lx";;
- : string = "fff8000000000000"
utop # (Float.nan +. 3.) |> Int64.bits_of_float |> Printf.sprintf "%Lx";;
- : string = "7ff8000000000001"
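
Tying this back to the original question: since Float.compare treats every NaN as equal, distinct bit patterns should collapse to a single Map key. A small sketch, with the two bit patterns picked arbitrarily:

```ocaml
module FMap = Map.Make (Float)

(* Two NaNs with different sign and payload bits. *)
let nan_a = Int64.float_of_bits 0x7ff8000000000002L
let nan_b = Int64.float_of_bits 0xfff8000000000000L

(* Float.compare does not look at the sign or payload of a NaN. *)
let same_key = Float.compare nan_a nan_b = 0

(* The second add replaces the first binding instead of adding a new one. *)
let m = FMap.(empty |> add nan_a "first" |> add nan_b "second")
let size = FMap.cardinal m
let value = FMap.find nan m
```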

More information is in the Wikipedia articles on IEEE 754-1985 and NaN.

In fact I’ve seen some C code that attempts to encode some debug information into the NaN to help tracing its origin. Just discovered that you can actually input such NaN with payloads in OCaml (they just don’t get printed):

utop # Float.of_string "nan(42)";;
- : float = nan
utop # Float.of_string "nan(42)" |> Int64.bits_of_float |> Printf.sprintf "%Lx";;
- : string = "7ff800000000002a"
utop # Float.of_string "-nan" |> Int64.bits_of_float |> Printf.sprintf "%Lx";;
- : string = "fff8000000000000"
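
Reading the payload back can be sketched as below, assuming the usual binary64 layout (bit 51 is the quiet bit, the low 51 bits are the payload). The helper names are made up, and the example builds its NaN with Int64.float_of_bits rather than Float.of_string, since how "nan(42)" is parsed may depend on the platform.

```ocaml
(* Assumed binary64 layout: bit 51 is the quiet bit, bits 0..50 the payload. *)
let quiet_bit = 0x0008_0000_0000_0000L
let payload_mask = 0x0007_FFFF_FFFF_FFFFL

(* True for NaNs that have the quiet bit set. *)
let is_quiet_nan (x : float) : bool =
  Float.is_nan x && Int64.logand (Int64.bits_of_float x) quiet_bit <> 0L

(* Payload of a NaN, or None for any other float. *)
let nan_payload (x : float) : int64 option =
  if Float.is_nan x then
    Some (Int64.logand (Int64.bits_of_float x) payload_mask)
  else None

let tagged = Int64.float_of_bits 0x7ff800000000002aL
let p = nan_payload tagged
```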

bluddy · November 24, 2022, 7:28am

The rabbit hole goes much deeper, as NaN-boxing is a technique that could completely replace the way OCaml does pointer tagging and allow us to represent both integers and floats as direct values. See the discussion here for example.
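
A toy sketch of the idea, to show the encoding only. This is not how the OCaml runtime represents values, and the tag layout below is an arbitrary choice for illustration: a 64-bit word is either an ordinary double or a quiet NaN whose low 32 bits carry an integer.

```ocaml
(* Hypothetical layout: the top 16 bits 0x7FFC mark a boxed integer. *)
let int_tag = 0x7FFC_0000_0000_0000L
let tag_mask = 0xFFFF_0000_0000_0000L
let canonical_nan = 0x7FF8_0000_0000_0000L

type word = int64
type view = Double of float | Small_int of int32

(* Real NaNs are canonicalised so they can never collide with the tag. *)
let of_float (x : float) : word =
  if Float.is_nan x then canonical_nan else Int64.bits_of_float x

(* Store the integer in the low 32 bits of a tagged quiet NaN. *)
let of_int32 (i : int32) : word =
  Int64.logor int_tag (Int64.logand (Int64.of_int32 i) 0xFFFF_FFFFL)

(* Decode a word by inspecting its top 16 bits. *)
let view (w : word) : view =
  if Int64.logand w tag_mask = int_tag then Small_int (Int64.to_int32 w)
  else Double (Int64.float_of_bits w)
```

Every word decodes without following a pointer, which is the appeal; the cost is that the payload of a real NaN is thrown away when it is canonicalised.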

benjamin-thomas · November 24, 2022, 9:19pm

Although I knew about float’s lack of precision, I somehow failed to understand that.

Now that I understand that 0.0 doesn’t actually mean zero, everything makes sense.

Thanks for insisting on this and the useful code snippets

Sorry for polluting your thread @nobrowser

nobrowser · November 25, 2022, 12:15am

No, it’s fascinating! TBH starting an interesting thread like this is one of my goals when I post. If it ranges wide so much the better!

–
Ian
