[Newbie] Confused by OCaml module system

Hello all,

I’m trying to split a program of mine into separate modules, and it seems I just don’t get how OCaml’s module system works.

Here is a very simplified version of what I’m trying to do, starting with everything in a single file bar.ml:

external _foo: string -> int * int = "caml_foo"

type my_record_type = {
    x: int;
    y: int
}

let foo s = (let x, y = (_foo s) in { x=x; y=y });;

let
    z = foo "hello, world!"
in
    Printf.printf "z.x = %d - z.y = %d" z.x z.y;
    print_endline ""

I also have a file foo_stubs.c containing the implementation for the C function caml_foo.
If I compile everything with ocamlopt, it works: ocamlopt -o bar bar.ml foo_stubs.c produces the bar executable, which runs as expected.

~

But now, I’d like to isolate the first part in a separate module. So after reading the manual’s part about how modules & the file system work together, here is what I tried:
I created a file foo.ml, with the following code:

external _foo: string -> int * int = "caml_foo"

type my_record_type = {
    x: int;
    y: int
}

let foo s = (let x, y = (_foo s) in { x=x; y=y })

I changed the contents of the file bar.ml to:

open Foo;;

let
    z = foo "hello, world!"
in
    Printf.printf "z.x = %d - z.y = %d" z.x z.y;
    print_endline ""

The contents of the file foo_stubs.c is unchanged.
But now, when I try to compile everything together, it no longer works:

> ocamlopt -o bar bar.ml foo.ml foo_stubs.c
File "foo_stubs.c", line 1:
Error: No implementations provided for the following modules:
         Foo referenced from bar.cmx

The strange thing is that individual compilations do work: none of ocamlopt -c bar.ml, ocamlopt -c foo.ml, or ocamlopt -c foo_stubs.c produces any error, and the .o files are generated. Even weirder, if I change something in bar.ml that makes it incompatible with the contents of foo.ml (e.g. replacing one of the %d with a %s in the Printf.printf line), I actually get an error message:

> ocamlopt -c bar.ml
File "bar.ml", line 7, characters 44-47:
Error: This expression has type int but an expression was expected of type
         string

So the compiler does read the contents of foo.ml; it just doesn’t seem to recognize it as a proper module.

~

I tried quite a few other things, such as creating a file foo.mli with the contents:

type my_record_type = {
    x: int;
    y: int
}

val foo: string -> my_record_type

and with foo.ml containing only:

external _foo: string -> int * int = "caml_foo"

let foo s = (let x, y = (_foo s) in { x=x; y=y })

But it’s even worse: now compiling foo.ml won’t even work:

> ocamlopt -c foo.ml
File "foo.ml", line 3, characters 38-39:
Error: Unbound record field x

I also tried to put everything in a module declaration in a single file foo.ml with the sig part being the contents of foo.mli and the former contents of the foo.ml file in the struct part:

module Foo:
  sig
    type my_record_type = {
        x: int;
        y: int
    }

    val foo: string -> my_record_type
  end
= struct
  external _foo: string -> int * int = "caml_foo"

  let foo s = (let x, y = (_foo s) in { x=x; y=y })
end

But I’m getting the same kind of error:

> ocamlopt -c foo.ml 
File "foo.ml", line 13, characters 40-41:
Error: Unbound record field x

And now I’m stuck. I just can’t figure out how to make this work.

I’m obviously doing something wrong, but I can’t figure out what. Any hint to put me in the right direction?

Thanks!

The order in which you pass the modules on the command-line is meaningful: ocamlopt -o bar foo.ml bar.ml foo_stubs.c should work (modules should come “after” their dependencies).

If you use -c to compile separately, you should compile modules in the same order (i.e. first foo.ml, then bar.ml) to avoid problems.

You could also try using dune which takes care of all this.

Cheers,
Nicolas

*facepalm* This is the one thing that I just didn’t think about. And of course, that’s the issue: when I put foo.ml first in the compilation command, it works without problem.

Thank you very much!

An additional remark about using an interface: you still need to repeat the type definition in the module implementation (the .ml file, or the struct part). That’s why you get the “Unbound record field x” error: the type has not been defined at that point in the implementation. The interface only restricts the module signature afterwards.
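
As a minimal sketch of that point, here is the single-file variant from the question with the type definition repeated inside the struct. To keep it runnable without the C stub, raw_foo is a hypothetical pure-OCaml stand-in for the external _foo; it is not part of the original program.

```ocaml
module Foo : sig
  type my_record_type = { x : int; y : int }

  val foo : string -> my_record_type
end = struct
  (* The definition must be repeated here: without it, the fields
     x and y are unbound when the compiler type-checks [foo]. *)
  type my_record_type = { x : int; y : int }

  (* Hypothetical stand-in for:
     external _foo : string -> int * int = "caml_foo" *)
  let raw_foo s = (String.length s, 0)

  let foo s =
    let x, y = raw_foo s in
    { x; y }
end

let () =
  let z = Foo.foo "hello, world!" in
  Printf.printf "z.x = %d - z.y = %d\n" z.Foo.x z.Foo.y
```

The same applies to the two-file layout: foo.mli holds the sig part and foo.ml holds the struct part, including the type definition.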

Thanks for that. I did notice it was working when I inserted the type definition in the .ml too, but that seemed a bit counterintuitive, so I was wondering if I was doing anything wrong about that too.

Isn’t there a simpler way to say what should be exported from a module? Having to repeat the signatures of everything in a .mli feels a bit too much, especially since OCaml does a very good job at figuring these out by itself… Isn’t there a way to just list the names of the items that should be visible from the outside of the module and let the compiler figure out the signatures?

If you want to export everything in a module, you can do that by just omitting a signature (which means not having an .mli file, if your module is a file).

But for narrowing down your interface, you need to specify the types. This is because the type you expose outside of a module can be narrower than the type visible inside the module. This is what allows the module system to be used to create abstract types and tightly specified APIs.

E.g., in the interface you could expose my_record_type as

type my_record_type

This leaves the type abstract, which ensures that nothing outside of the module can directly construct or inspect its contents.
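
A small sketch of what that could look like. The functions make, x and y are hypothetical names added for illustration; the thread only mentions the abstract type itself.

```ocaml
module Foo : sig
  type my_record_type (* abstract: the fields are hidden from outside code *)

  val make : int -> int -> my_record_type
  val x : my_record_type -> int
  val y : my_record_type -> int
end = struct
  type my_record_type = { x : int; y : int }

  let make x y = { x; y }

  (* Accessors: functions and record fields live in separate
     namespaces, so reusing the names x and y is allowed. *)
  let x r = r.x
  let y r = r.y
end

let () =
  let z = Foo.make 1 2 in
  (* Writing z.x here would be rejected: outside Foo the type has no
     visible fields, so the accessors are the only way in. *)
  Printf.printf "z.x = %d - z.y = %d\n" (Foo.x z) (Foo.y z)
```

Because callers can only go through make, x and y, the record layout can change later without breaking code outside the module.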

Having a fully specified .mli is really helpful for reading and reasoning about your code. When you want to use a module (in your code or someone else’s) you can just peek at the mli file and immediately see the available api at a glance. This is one of the many cases where OCaml optimizes for long term readability, modularity, and stability at a slight added cost at the time of writing the code. I miss this in every other language I work in.

That said, there are techniques that allow you to avoid having to repeat types: https://stackoverflow.com/a/28829891/1187277
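
One such technique, sketched here under my own assumptions and not necessarily the one described in the linked answer, is to put shared type definitions in a module that has no signature of its own, and refer to it from both the interface and the implementation. Foo_types and raw_foo are hypothetical names; with files, Foo_types would be a foo_types.ml without a matching .mli.

```ocaml
(* Types only, no signature: everything in here is exported as is. *)
module Foo_types = struct
  type my_record_type = { x : int; y : int }
end

module Foo : sig
  (* The interface refers to the shared type instead of repeating it. *)
  val foo : string -> Foo_types.my_record_type
end = struct
  open Foo_types

  (* Hypothetical stand-in for the C stub used in the question. *)
  let raw_foo s = (String.length s, 0)

  let foo s =
    let x, y = raw_foo s in
    { x; y }
end

let () =
  let z = Foo.foo "hello" in
  Printf.printf "z.x = %d - z.y = %d\n" z.Foo_types.x z.Foo_types.y
```

The trade-off is that the record is fully public, so this does not help when the goal is to keep the type abstract.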

Think of the repetition as like a double-entry accounting system. The interface and implementation are checked against each other for errors.

OK, that makes sense. The possibility of generating a first .mli from the .ml with ocamlc -i is nice though. I didn’t know about it, and it would have been very handy.
Thanks!

Well, usually, if a language requires you to repeat yourself to make sure you know what you’re doing, it means that its compiler doesn’t know what it’s doing. :wink: I’m kind of trying to move away from that, actually. But thanks to shonfeder, I understand the rationale behind the .mli now.

Well, in this case, it means the compiler knows what it’s doing, but wants to make sure you know what you’re doing :wink: