Ocamlc bytecode file vs dune bytecode *.bc file

Hello, everybody!
This may seem like a bunch of silly questions, but I am very new to OCaml, so here they come:

We know that the OCaml bytecode compiler “ocamlc” compiles OCaml source files to bytecode object files and links these object files to produce standalone bytecode executable files.
Now, dune can also be set to build bytecode executables with the *.bc extension.

So, which tool(s) does dune use internally to achieve this? Does it use “ocamlc” internally?

Are there any tools, apart from hex editors, for peeking inside the bytecode files produced by “ocamlc” and dune? I mean tools that can show the inner structure of those files.

Is the internal format of OCaml bytecode and native files documented anywhere, similar to, say, the PE file format?
Thanks in advance!

If you haven’t already, you should look at the Real World OCaml chapter about OCaml bytecode: “The Compiler Backend: Bytecode and Native code”.

That page mentions you can use ocamlc -dinstr to view the generated bytecode instructions of an .ml file.

You can also open a REPL with ocaml -dinstr, which will print the bytecode instructions for each expression the REPL evaluates.
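
For something concrete to try this on, here is a minimal sketch. The file name square.ml and the function in it are made up for illustration; the instruction listing you get will depend on your compiler version.

```ocaml
(* square.ml: hypothetical file name, used only for illustration.
   Assumed invocation, as described above:
     ocamlc -dinstr square.ml
   The exact instructions printed depend on the compiler version. *)

(* A one-line function keeps the instruction listing short. *)
let square x = x * x

(* Running the compiled program prints 49. *)
let () = Printf.printf "%d\n" (square 7)
```

Pasting the same definition into a REPL started with ocaml -dinstr should show the instructions for that single phrase.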

The best references I’ve found are the following PDFs:
File & data formats: https://cadmium.x9c.fr/distrib/caml-formats.pdf
Instruction set: https://cadmium.x9c.fr/distrib/caml-instructions.pdf

The .bc extension is just a naming convention: the executable is renamed to distinguish the one produced by the bytecode compiler ocamlc from the one produced by the native-code compiler ocamlopt (which gets the file extension .exe). You can disassemble the .bc file, but the result would only be the assembly code of the OCaml bytecode runtime functions; the bytecode itself is stored in the executable as data.

It’s the same file that ocamlc produces, just renamed.
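
Since the bytecode sits in the file as data, you can peek at its section table with a few lines of OCaml. This is only a sketch under an assumed layout: the file ends with a table of 8-byte entries (a 4-character section name plus a 4-byte big-endian length), then a 4-byte big-endian section count, then a 12-byte magic string starting with Caml1999X. Please check that layout against the file formats PDF linked above before relying on it. The names parse_trailer and read_file are my own.

```ocaml
(* Read a 4-byte big-endian unsigned integer from [s] at offset [pos]. *)
let read_u32_be s pos =
  (Char.code s.[pos] lsl 24)
  lor (Char.code s.[pos + 1] lsl 16)
  lor (Char.code s.[pos + 2] lsl 8)
  lor Char.code s.[pos + 3]

(* Parse the trailer of a bytecode executable held in memory.
   ASSUMED layout: [section table][4-byte count][12-byte magic],
   each table entry being a 4-char name plus a 4-byte length.
   Returns the magic string and the (name, length) list, or None. *)
let parse_trailer (s : string) : (string * (string * int) list) option =
  let len = String.length s in
  if len < 16 then None
  else
    let magic = String.sub s (len - 12) 12 in
    let count = read_u32_be s (len - 16) in
    let table_start = len - 16 - (8 * count) in
    if String.sub magic 0 9 <> "Caml1999X" || table_start < 0 then None
    else
      let entry i =
        let pos = table_start + (8 * i) in
        (String.sub s pos 4, read_u32_be s (pos + 4))
      in
      Some (magic, List.init count entry)

(* Load a whole file as a string; the path is supplied by the caller. *)
let read_file path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* Print the section table of the file named on the command line, if any. *)
let () =
  if Array.length Sys.argv > 1 then
    match parse_trailer (read_file Sys.argv.(1)) with
    | None -> print_endline "no bytecode trailer found"
    | Some (magic, sections) ->
        Printf.printf "magic: %s\n" magic;
        List.iter
          (fun (name, size) -> Printf.printf "%s %d bytes\n" name size)
          sections
```

Compiled and pointed at a.out or a dune *.bc file, it should list the sections with their sizes if the assumed layout holds for your compiler version, and report that no trailer was found otherwise.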

In addition to what @mnxn mentioned,

The bytecode object file .cmo, and others such as .cmi and .cmx, are produced with OCaml’s Marshal module, which means the binary format can vary depending on the OCaml compiler version. The layouts, defined as OCaml data types, are available in ocaml/file_formats. In general you won’t get very interesting results from these files; for debugging or profiling you would instead instruct the compiler to produce a text representation of the bytecode and so on.
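
To illustrate why Marshal output is tied to the data types of the program that wrote it, here is a small sketch. The record type unit_info is made up and is not the compiler’s real type; the point is only that the bytes carry no type description, so the reader has to know the type in advance.

```ocaml
(* A made-up record standing in for the metadata a compiled file carries.
   This is NOT the compiler's actual type. *)
type unit_info = { name : string; imports : string list }

(* Serialize to an opaque byte string; no type information is stored. *)
let save (info : unit_info) : string = Marshal.to_string info []

(* The reader must already know the type: the annotation is trusted blindly,
   which is why a layout change between versions breaks old files. *)
let load (s : string) : unit_info = Marshal.from_string s 0

let () =
  let info = { name = "Main"; imports = [ "Stdlib"; "Printf" ] } in
  let back = load (save info) in
  Printf.printf "round trip ok: %b\n" (back = info)
```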

The native-code file .o is a regular object file and can be disassembled with the compiler toolchain available on your system.

Thank you so much dear friends for the clear and detailed explanations!
This is truly helpful and is very much appreciated!

And just one more:
What is the easiest way to set up source-level OCaml debugging?
So far I have tried Visual Studio Code + the earlybird plugin from hackwaly, on Ubuntu.
It works, but there are still questions:
Hackwaly says that his “Example launch configuration” is “used to debug utop examples.”
That is where I am lost.
It seems that only bytecode can be debugged, not native code. Which is fine: we produce a *.bc executable.
But when does utop come into play when it comes to debugging? Because hackwaly’s code creates a utop instance.
Can bytecode be executed only in a utop environment? Because it needs an interpreter…
In other words, what is the relationship (if any) between utop, bytecode and debugging?
It is obvious I am confused, so please shed some light!
My final goal is to enter OCaml code and be able to step through it as I am used to doing in C.
Thanks in advance!!!

For source-level debugging, ocamldebug is the debugger provided with the OCaml compiler. It only works for bytecode programs and has limited availability on Windows.

https://ocaml.org/learn/tutorials/debug.html#The-OCaml-debugger

utop is just a wrapper around the OCaml toolchain that makes users more comfortable by providing editor integration, color, completion, etc., so it’s entirely possible to develop OCaml programs without utop.

Great! Thanks for quick reply!
I use Ubuntu 9.x, so Windows limitations are not an issue.
I wonder if anybody has actually gotten hackwaly’s debugging plugin “earlybird” from GitHub, installed it, and tried to use it from within Visual Studio Code? Overall it looks very promising, but he brings utop into play and that’s where he loses me.
Or maybe utop is not the problem at all and there is something else that I do not completely understand. When I add my own .ml source file to the folder structure that came from GitHub, compile it into a bytecode .bc file, then right-click the resulting .bc file and select “Start an Ocaml Debug Session”, the IDE still opens “test-program.bc” and starts debugging it.
Either it is marked as the default or something else mysterious happens, but that is the problem that I tried (erroneously, it would seem) to attribute to the utop factor.

To summarise, my question is how to make the Visual Studio Code IDE + earlybird OCaml debug plugin open and step through a file of my own making, not the files that come from GitHub as part of the earlybird package.
I apologize for the vague and nebulous style, but it is the result of my very limited knowledge in this area.
Thanks again!

Since you have identified that your problem is specific to earlybird, maybe you can try contacting the author for support?

@hackwaly

If your source code is just one main.ml file, you can simply run ocamlc -g main.ml to produce an a.out file. It’s a bytecode executable that earlybird can debug.
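
As a starting point, a main.ml along these lines gives the debugger a few calls to step through. The contents are made up for illustration, and the ocamldebug commands named in the comment are just the ones I would expect to try first.

```ocaml
(* main.ml: file name taken from the post above; the contents are made up.
   Assumed workflow:
     ocamlc -g main.ml    (produces a.out with debug information)
     ocamldebug a.out     (then try commands such as break, run, step, print) *)

(* Sum a list with an explicit accumulator, so each call is a step to watch. *)
let rec sum acc = function
  | [] -> acc
  | x :: rest -> sum (acc + x) rest

(* Running the compiled program prints: total = 10 *)
let () =
  let total = sum 0 [ 1; 2; 3; 4 ] in
  Printf.printf "total = %d\n" total
```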

Thanks, dear friends!
After some digging, I have figured out that my problems were of my own making and stemmed from my very limited knowledge of the subject. I am coming from C/C++ and x86 ASM programming, and both OCaml and Visual Studio Code are very new to me… But OCaml is definitely well worth learning!
Now, I pretty much did what hackwaly suggested and it is working just fine!
So, thanks again everybody for your help and patience with a savage like myself!