Well, it is not as trivial as it looks, especially since OCaml tries not to do magic stuff or ad-hoc optimizations. But I think I’ve jumped over a few intermediate steps, so I will explain below.
OCaml, even with flambda, is a very predictable compiler. It basically follows the code that you write. It doesn’t try to be clever and guess what you’re trying to do; instead, it implements whatever you have written as efficiently as possible. For example, suppose you would like to write a function `int_of_bool`, especially since there is no such function in the standard library. So you will probably write something like this:
let int_of_bool x = if x then 1 else 0
And OCaml will faithfully translate this into a branching instruction. It won’t perform any fancy optimizations like branch elimination, hoisting, common subexpression elimination, or anything like this. It will, however, translate high-level functional programming concepts into a very efficient implementation, i.e., it will uncurry functions, try to remove many indirect calls, compile pattern matching to efficient binary trees, perform cross-module optimization, and, of course, apply expression optimizations (constant folding) and a fair amount of inlining. Optimizations are applied in the backend, where no type information is available, so the compiler no longer knows that the function `int_of_bool` has type `bool -> int`, nor that `bool` and `int` share the same representation, so there is nothing left to exploit here.
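To make the shared-representation point concrete, here is a minimal sketch that compares the branching definition with a reinterpretation through the built-in `%identity` primitive. The names `int_of_bool_branch` and `int_of_bool_identity` are made up for illustration, and the second definition relies on the current runtime representation of `bool` (`false` is 0, `true` is 1), so treat it as an assumption rather than a guarantee.

```ocaml
(* The straightforward definition: compiled to a conditional. *)
let int_of_bool_branch x = if x then 1 else 0

(* Assumption: bool and int share the same immediate representation,
   so the conversion can be expressed as the identity primitive.
   No code is executed for the conversion itself. *)
external int_of_bool_identity : bool -> int = "%identity"

let () =
  (* Both definitions agree on every possible input. *)
  List.iter
    (fun b -> assert (int_of_bool_branch b = int_of_bool_identity b))
    [ true; false ]
```

Newer releases of the standard library (OCaml 4.08 and later) also provide `Bool.to_int`, which may be preferable to a hand-written external.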
With Flambda the story is basically the same, except that Flambda applies optimizations more aggressively, and the optimizations are applied in the middle-end on a new, much richer intermediate representation, also called Flambda, which enables more optimizations.
None of this is to say that the OCaml compiler is bad, non-optimizing, or anything like this. It generates excellent and straightforward code, with no stupid jokes (I’m saying this as a guy who spends most of his life looking into binary code). OCaml is good, but it will faithfully translate bad code into bad binaries.
Concerning the `int_of_bool` problem, it could be easily resolved at the standard library level. It just never came to the table, so nobody looked into it. I’m not sure how much it affects your code or any production code. But if you find that you have this operation in a tight loop, you should know that you can always get rid of it, since it is a no-op under the hood. And it is not necessary to use `Obj.magic`; you can, for example, create your own bool type, e.g.,
module Bit : sig
  type t = private int
  val one : t
  val zero : t
  val of_int : int -> t
  val to_int : t -> int
  val is_set : t -> bool
  (* ... etc ... *)
end = struct
  type t = int
  let one = 1
  let zero = 0
  (* normalize any non-zero input to one, so the invariant holds *)
  let of_int x = if x <> 0 then 1 else 0 [@@inline]
  let to_int x = x [@@inline]
  let is_set x = x = 1 [@@inline]
end
So the `dd` function can now return `Bit.t`, which is an int under the hood (and the compiler knows that it has the `int` representation), but its invariant (that it is either `one` or `zero`) is protected by the module system.
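As a rough illustration of how such a type can be consumed without any conversion cost, the sketch below uses the `:>` coercion that a `private int` abbreviation permits. The helper `lsb` and the function `count_odd` are hypothetical names introduced only for this example; they are not part of the code discussed in the thread.

```ocaml
module Bit : sig
  type t = private int
  val one : t
  val zero : t
  val lsb : int -> t        (* hypothetical helper: lowest bit of an int *)
  val is_set : t -> bool
end = struct
  type t = int
  let one = 1
  let zero = 0
  let lsb x = x land 1      (* always 0 or 1, so the invariant holds *)
  let is_set x = x = 1
end

(* Count the odd numbers in a list. The coercion (_ :> int) is checked
   by the type system and generates no code. *)
let count_odd xs =
  List.fold_left (fun acc x -> acc + (Bit.lsb x :> int)) 0 xs
```

The coercion only goes one way: a `Bit.t` can be viewed as an `int`, but an arbitrary `int` cannot be used as a `Bit.t` outside the module.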
Because they are left undocumented, as they are considered a debugging tool for compiler developers, and the representation is also undocumented. But should it stop us? Forbidden fruit tastes the sweetest.
Wow, that’s actually very wrong. In fact, your use of `Obj.magic` here is totally inappropriate, as it breaks the type system and will lead to segmentation faults. For example, in this function you’re casting `'a reg` to `int reg`, but imagine what will happen if someone passes an `int32 reg` there? Another problem with your code is that you’re not using abstractions properly: `type t = int` doesn’t create a new type, it introduces a type alias, which is basically another name (or type constructor) for the same type. So all your registers are indexed with the same type, all having type `int reg`. This will introduce a new type:
type t = Fctrl of int [@@unboxed]
This will also introduce a new type:
module Fctrl : sig
  (* abstract, but known to be an immediate (unboxed) value *)
  type t [@@immediate]
  val bam : t
end = struct
  type t = int
  let bam = 0x00000400 (* Broadcast Accept Mode *)
end
and this will also create a new type,
module Fctrl : sig
  type t = private int
  val bam : t
end = struct
  type t = int
  let bam = 0x00000400 (* Broadcast Accept Mode *)
end
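The difference between an alias and a genuinely new type can be checked with a small sketch like the following. The names `Ttl`, `mask`, `ttl`, `bam_bits`, and `raw` are invented for the example; only `Fctrl` and `bam` come from the snippets above.

```ocaml
(* A plain alias: Ttl.t and int are interchangeable everywhere. *)
module Ttl = struct
  type t = int
end

(* A private abbreviation: values can be read as int, but only the
   module itself can create them. *)
module Fctrl : sig
  type t = private int
  val bam : t
end = struct
  type t = int
  let bam = 0x00000400 (* Broadcast Accept Mode *)
end

(* A single-constructor unboxed variant: a distinct type that has the
   same runtime representation as its payload. *)
type mask = Mask of int [@@unboxed]

let ttl : Ttl.t = 1000                   (* accepted, since Ttl.t = int *)
let bam_bits : int = (Fctrl.bam :> int)  (* explicit, one-way coercion *)
let (Mask raw) = Mask 0xff               (* wrapping does not allocate *)

(* The following line is rejected by the type checker, because an int
   is not an Fctrl.t:

   let bad : Fctrl.t = 0x400
*)
```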
Next, the whole idea of using a GADT instead of a plain ADT is that you can pass payloads independently of the constructor. Basically, instead of doing `set (TTL 1000)`, which will allocate a boxed value `(TTL 1000)`, pass it to `set`, which will extract the payload and throw the box away, you can do `set TTL 1000`, which will call a function with two immediate arguments, and the type of `TTL` will prescribe the type of the second argument.
So this is wrong (well, in our context):
type 'a reg =
| X : x -> x reg
| Y : y -> y reg
This is what you want:
type 'a reg =
| X : x reg
| Y : y reg
Now you have N different registers, each indexed with the type of the datum that you can store in it. In OOP terms, it means that you have N different classes sharing the same base. So when you are writing the `set` method, you should dispatch over all possibilities, e.g.,
module Status : sig
  type t = private int
  val s1 : t
  val s2 : t
  val s3 : t
  val s4 : t
end = struct
  type t = int
  let s1 = 0b01
  let s2 = 0b10
  let s3 = 0b11
  let s4 = 0b00
end

type 'a reg =
  | Status : Status.t reg
  | AL : int reg
  | AH : int reg

(* set_status_register, set_al, and set_ah are the low-level setters,
   assumed to be defined elsewhere *)
let set : type t. t reg -> t -> unit = fun reg arg -> match reg with
  | Status ->
    (* in this branch `t` is refined to `Status.t` *)
    set_status_register arg
  | AL -> (* here `t` is refined to a completely different type `int` *)
    set_al arg
  | AH ->
    (* here it is also `int` but in modern OCaml
       you can't yet unify AL and AH branch *)
    set_ah arg
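For a version that can be compiled and run as is, here is a self-contained sketch of the same dispatch. It replaces the low-level setters with plain `ref` cells (`status_cell`, `al_cell`, `ah_cell`), which are stand-ins invented for this example, and adds a matching `get` that is likewise not part of the original code.

```ocaml
module Status : sig
  type t = private int
  val s1 : t
  val s2 : t
end = struct
  type t = int
  let s1 = 0b01
  let s2 = 0b10
end

(* Each register is indexed by the type of the datum it stores. *)
type _ reg =
  | Status : Status.t reg
  | AL : int reg
  | AH : int reg

(* Hypothetical backing store standing in for the real hardware. *)
let status_cell = ref Status.s1
let al_cell = ref 0
let ah_cell = ref 0

(* No payload is boxed: the register tag and the value are passed as
   two separate arguments, and the tag fixes the type of the value. *)
let set : type t. t reg -> t -> unit = fun reg arg ->
  match reg with
  | Status -> status_cell := arg
  | AL -> al_cell := arg
  | AH -> ah_cell := arg

let get : type t. t reg -> t = function
  | Status -> !status_cell
  | AL -> !al_cell
  | AH -> !ah_cell
```

With these definitions, `set AL 42` type-checks, while `set Status 42` is rejected at compile time, because a bare `int` is not a `Status.t`.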
Well, in general it is the user’s responsibility to provide the input buffer, so I’m not sure why this problem occurs. Consider `readv` and `writev` in POSIX: they let the user decide on the allocation policy. So your interface should basically have the same shape: it should accept user data, along with a descriptor, and the data representation should be protected using abstractions, if possible.
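A caller-provided-buffer interface could look roughly like the sketch below. Everything in it is hypothetical: `device` is an in-memory stand-in for a real descriptor, and `recv_into` is a name invented here, loosely modelled on the argument order of `Unix.read`.

```ocaml
(* Hypothetical device: a queue of pending frames held in memory. *)
type device = { mutable pending : string list }

(* Copy the next frame into the caller's buffer, starting at [pos] and
   writing at most [len] bytes. Returns the number of bytes written,
   or 0 if nothing is pending. The callee never allocates a buffer.
   In this simplified sketch, the part of a frame that does not fit
   is dropped. *)
let recv_into dev buf ~pos ~len =
  if pos < 0 || len < 0 || pos > Bytes.length buf - len then
    invalid_arg "recv_into";
  match dev.pending with
  | [] -> 0
  | frame :: rest ->
    let n = min len (String.length frame) in
    Bytes.blit_string frame 0 buf pos n;
    dev.pending <- rest;
    n
```

The caller decides whether to allocate one buffer per call, reuse a single buffer in a loop, or carve slices out of a larger pool, which is the same freedom `readv` and `writev` give.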