Bad object file from `ocamlopt -output-obj -o foo.o foo.ml`

If I compile

ocamlopt -output-obj -o foo.o foo.ml 

inspecting the object foo.o with nm shows

                 U camlFoo__code_begin
                 U camlFoo__code_end
                 U camlFoo__data_begin
                 U camlFoo__data_end
                 U camlFoo__entry
                 U camlFoo__frametable
                 U camlFoo__gc_roots

and consequently foo.o fails to link with the main program in C.

If the name of the output object file is changed to bar.o

ocamlopt -output-obj -o bar.o foo.ml 

the symbols are correctly defined and bar.o links correctly with C.

Is this a bug or a feature, and where is this behaviour documented?

I don’t know if it is a bug or a feature. But what happens is that ocamlopt always creates foo.o when compiling foo.ml, irrespective of -output-obj. So if you give the output file that same name, the two files clash and you do not get the object you asked for.
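
To make the clash concrete, here is a minimal sketch. It assumes the implicit object file is simply the source path with .ml replaced by .o, and clobbers_implicit_obj is a hypothetical helper name, not part of any OCaml tooling. It only compares the literal names, so foo.o and ./foo.o would not be recognised as the same file.

```shell
# Hypothetical helper, not part of the OCaml toolchain.
# Succeeds (exit status 0) when the requested -o target has the same name as
# the object file that ocamlopt writes as a side effect of compiling SRC.
# Assumption: that implicit object is SRC with its .ml suffix replaced by .o.
clobbers_implicit_obj() {
    src=$1
    out=$2
    implicit="${src%.ml}.o"
    [ "$out" = "$implicit" ]
}

for out in foo.o bar.o; do
    if clobbers_implicit_obj foo.ml "$out"; then
        echo "$out: same name as the object generated for foo.ml, pick another"
    else
        echo "$out: no clash"
    fi
done
```

A check like this could sit in a build script in front of the ocamlopt -output-obj call, so the name clash is caught before anything is compiled.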


The unexpected behaviour reported above was observed on Linux with GNU cc 9.3.0 and GNU ld 2.34. Namely,

ocamlopt -output-obj foo.ml

outputs foo.o with the symbols defined, as nm foo.o shows:

0000000000000058 D _camlFoo
0000000000000000 T _camlFoo__code_begin
0000000000000025 T _camlFoo__code_end
0000000000000050 D _camlFoo__data_begin
0000000000000078 D _camlFoo__data_end
0000000000000010 T _camlFoo__entry
0000000000000080 D _camlFoo__frametable
0000000000000060 D _camlFoo__gc_roots
0000000000000040 s _caml_absf_mask
0000000000000030 s _caml_negf_mask

but ocamlopt -output-obj foo.ml -o foo.o outputs a bad foo.o in which the above symbols are undefined:

nm foo.o | grep camlFoo
                 U camlFoo__code_begin
                 U camlFoo__code_end
                 U camlFoo__data_begin
                 U camlFoo__data_end
                 U camlFoo__entry
                 U camlFoo__frametable
                 U camlFoo__gc_roots

However, on Darwin with cc Apple clang version 13.0.0 (clang-1300.0.29.3) and ld Apple TAPI version 13.0.0 (tapi-1300.0.6.5) the behaviour of ocamlopt -output-obj foo.ml -o foo.o is correct:

nm foo.o | grep camlFoo
00000000000005d8 D _camlFoo
00000000000000c0 T _camlFoo__code_begin
00000000000000e5 T _camlFoo__code_end
00000000000005d0 D _camlFoo__data_begin
00000000000005f8 D _camlFoo__data_end
00000000000000d0 T _camlFoo__entry
0000000000000600 D _camlFoo__frametable
00000000000005e0 D _camlFoo__gc_roots

My conclusions.

  1. The result of ocamlopt -output-obj foo.ml -o foo.o differs between Linux GNU cc / ld and Darwin clang cc / ld for ocamlopt 4.12.1.

  2. This difference in behaviour is not documented.

  3. The behaviour on Darwin clang cc / ld is what is expected from the user's perspective and from the documentation: ocamlopt -output-obj foo.ml -o foo.o is supposed to build a C object file foo.o that can be linked to external C programs. On Darwin clang cc / ld this holds, but on Linux GNU cc / ld it does not.

The reason for this bug is probably that with GNU cc / ld the linker overwrites the compiler's output object file foo.o with its own output object file foo.o before reading the compiler's foo.o!

But on Darwin clang cc / ld the problem doesn’t happen. Should we report a bug, or at least raise awareness of this problematic behaviour of ocamlopt with GNU cc / ld?

I think that creating an issue on the OCaml bug tracker would be reasonable.

I would strongly advise against using ocamlopt -output-obj -o foo.o foo.ml.
As you noticed, it produces two different files with the name foo.o, so the fact that it sometimes works is a pure miracle (it also makes the generated foo.cmx unlinkable).
The solution is either to use a different name for the output of -output-obj (or for the source file), or to first compile foo.ml under a different name and then link the result (ocamlopt -c foo.ml -o foo_ml.cmx && ocamlopt -output-obj -o foo.o foo_ml.cmx).
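
The first option can be sketched as a small shell script. This is only a sketch under assumptions: foo_caml.o is a hypothetical output name, the one-line foo.ml is a stand-in for the real source, and the script inspects the symbol table with nm instead of linking against a C main program.

```shell
#!/bin/sh
# Sketch: give the -output-obj target a name that differs from the foo.o
# that ocamlopt writes implicitly when it compiles foo.ml.
src=foo.ml
out_obj=foo_caml.o   # hypothetical name; anything other than foo.o

# The parentheses run the function body in a subshell, so cd and exit stay local.
build_and_check() (
    # Work in a scratch directory so that no existing files are touched.
    workdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$workdir"' EXIT
    cd "$workdir" || exit 1

    # Stand-in source file for the illustration.
    printf 'let () = print_endline "hello from OCaml"\n' > "$src"

    # foo.o (together with foo.cmi and foo.cmx) is still produced here;
    # the object meant for the C side ends up in $out_obj.
    ocamlopt -output-obj -o "$out_obj" "$src" || exit 1

    # Print any camlFoo symbol that is still undefined (type U); none expected.
    if nm "$out_obj" | grep 'U .*camlFoo'; then
        echo "undefined camlFoo symbols in $out_obj"
        exit 1
    fi
    echo "no undefined camlFoo symbols in $out_obj"
)

if command -v ocamlopt >/dev/null 2>&1; then
    build_and_check || echo "build or check failed"
else
    echo "ocamlopt not found; nothing built"
fi
```

The resulting foo_caml.o would then be linked with the C main program and the OCaml native runtime in the usual way for -output-obj; the exact linker flags depend on the OCaml version and platform.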