Number theoretically correct integer division is supposed to work so that (N / K) + (N mod K) = N.
I was very surprised to see that it’s not how (/) : int → int works!
# 3 mod 2 ;;
- : int = 1
Now, two questions. What is the justification for this behaviour? And does anything provid real integer division?
I assume you meant to include an example with negative numbers?
# 3 mod 2;;
- : int = 1
# (-3) mod 2;;
- : int = -1
# 3 mod (-2);;
- : int = 1
# (-3) mod (-2);;
- : int = -1
The mod operator always has the same sign as the numerator. I think this is for historical reasons, although I don’t know whether to point the finger at C, or x86, or something even earlier.
If you use Base, the % operator gives you the Euclidean modulo operator that I think you’re looking for. Its result always has the same sign as the denominator. This operator is basically equivalent to:
let (%) x y =
let z = x mod y in
if z < 0 then z + y else z
This is a “feature” in most programming languages and I think actually corresponds to the standard way division is implemented in the CPU itself (so it has little to do with OCaml). How this was allowed to become the standard I do not know.
One thing is that I get the impression that people who are not familiar with number theory find the following result extremely counter-intuitive : (-3) / 2 = -2
I believe it is because they think of integer division as an approximation of real division, rather than as being its own special thing, and from this perspective it makes no sense that making a number negative should change the result. They expect the identities that hold of real division (like (a*b) / c = a * (b/c)) to also hold for integer division. (I say “they” not to belittle the perspective, I totally see where they are coming from.)
But then if you have (-3) / 2 = -1, you need to have (-3) mod 2 = -1 to preserve the relation between / and mod that you mention (so you’ll note that the relation does hold in this system).
I tend to think that the behaviour where mod never returns anything negative, in addition to being what a mathematician would expect, is strictly more useful (for what I believe to be the typical use case of modulo over negative numbers in programming, which is indexing into a circular buffer). And I also think that you almost never divide negative numbers, so the useful behaviour for mod should have taken priority when deciding how all this works, and whether / is intuitive or not does not matter much in practice. But I have no idea who took that decision and whether such issues were even considered.
Thanks for this! So the one found in OCaml and most languages is T-division (for truncating) and the one I called more useful is E-division (for Euclidean) which the paper argues for. It says that T-division is found in Ada, that Lisp has two modulo operators, one that does T-division and one that does F-division (halfway between T and E, and works for the circular buffer case), and that Algol and Pascal break the relation between div and mod by doing T-division for div and E-division for mod (if I got it right). Interesting stuff.