"external" declarations of arity 0

Currently, the compiler does not support declarations such as

external my_string : string = "my_cpp_string"

Is this some sort of fundamental limitation or do we merely want to prevent people from accidentally defining values?

I currently have some C++ that defines external values (successfully), and it would be nice to be able to access them directly.
e.g.

template<uint8_t p_tag, size_t p_size>
struct __attribute__((packed))
StaticCamlValue
: StaticCamlValueBase
{
  static constexpr const auto tag = p_tag;
  static constexpr const auto size = p_size;
  constexpr StaticCamlValue() : StaticCamlValueBase(Caml_out_of_heap_header(size, tag)) {}

};

constexpr size_t caml_string_wosize(size_t len)
{
  return (len + sizeof(value))/sizeof(value);
}

template<auto s>
struct __attribute__((packed))
  StaticCamlString : StaticCamlValue<String_tag, caml_string_wosize(s.size() - 1)>
{
  static constexpr auto size_no_null = s.size() - 1;
  static constexpr auto wosize = caml_string_wosize(size_no_null);

  const std::array<char, s.size()> string_value = to_array(s);

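  // Zero bytes between the NUL terminator and the final byte.
  // Only valid when size_no_null % sizeof(value) != sizeof(value) - 1.
  const std::array<uint8_t, wosize * sizeof(value) - size_no_null - 2>
    padding = {};

  // Last byte of the block: offset of the last byte minus the string length.
  const uint8_t final_char = sizeof(value) - 1 - (size_no_null % sizeof(value));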
};

extern "C" constexpr const StaticCamlString<to_array("hello")> testCamlString;
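
For anyone checking the layout arithmetic, here is a minimal sketch that does not need the OCaml headers. It assumes `value` is a pointer-sized integer (the `std::intptr_t` alias below is a stand-in, not the real typedef from caml/mlvalues.h), and the helper names `string_wosize`, `final_pad_byte` and `string_length` are hypothetical. It relies on the runtime's convention that the last byte of a string block stores the offset of that byte minus the string length.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for OCaml's `value` type (assumption: pointer-sized integer).
using value = std::intptr_t;

// Number of words needed for a string of `len` bytes, leaving room for
// at least one trailing byte after the contents.
constexpr std::size_t string_wosize(std::size_t len) {
  return (len + sizeof(value)) / sizeof(value);
}

// Value stored in the very last byte of the block.
constexpr std::size_t final_pad_byte(std::size_t len) {
  return string_wosize(len) * sizeof(value) - 1 - len;
}

// How the length is recovered from the block size and the last byte.
constexpr std::size_t string_length(std::size_t wosize, std::size_t last_byte) {
  return wosize * sizeof(value) - 1 - last_byte;
}

// Round trip for "hello" (5 bytes, not counting the NUL).
static_assert(string_length(string_wosize(5), final_pad_byte(5)) == 5);

// When the contents fill all but one byte of the last word, the final
// byte is 0 and doubles as the NUL terminator.
static_assert(final_pad_byte(sizeof(value) - 1) == 0);
```

The second `static_assert` is the edge case a fixed "terminator, padding, final byte" struct layout has to treat specially.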

Why would this be necessary? Just have a function that takes unit as argument and returns the value, then a let right after that calls the function and binds the result to a variable.

I found out the hard way that some languages and platforms don’t like, for example, DLLs defining such static values. From memory, either .NET or Java doesn’t like it - the values simply can’t be accessed. I ended up having to break compatibility for customers and redefine them as void functions.

A function has much more overhead than a value (which is just a memory address).
If we used a function, we would need to switch from the OCaml stack to the C stack and back, just to get a constant.

Right, but that’s only once, to initialize the OCaml variable.

Just a guess: the OCaml GC may move values around.

There are workarounds, like declaring its address as a “root” from C, but that is one more thing to be aware of.

@vrotaru:
In this case, I’m using Caml_out_of_heap_header to build a header that tells the GC not to trace the values. (See ocaml/ocaml Pull Request #9564, “Add a macro for out-of-heap block header”, by kayceesrk.)

FWIW, you can use this to create your own out-of-heap values, but like you mentioned, you need to be careful regarding the lifetime of values.
You can see an approach to allocating (rather than static) out-of-heap values in smuenzel/ooh (OCaml Out-of-heap) on GitHub, which uses existential types to track the lifetime of the allocation pool.
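
To make the out-of-heap header mentioned above a bit more concrete, here is a rough sketch of how such a header word is put together. It assumes the default header layout (tag in the low 8 bits, two color bits above it, word size from bit 10 upward, no profiling info in the header). The names `out_of_heap_header`, `header_wosize`, `header_color` and `header_tag` are hypothetical stand-ins; the authoritative definition is the macro in caml/mlvalues.h.

```cpp
#include <cstdint>

// Stand-in for OCaml's header_t (assumption: pointer-sized unsigned).
using header_t = std::uintptr_t;

constexpr unsigned kStringTag = 252;                   // String_tag
constexpr header_t kOutOfHeapColor = header_t{3} << 8; // both color bits set

// Sketch of what Caml_out_of_heap_header(wosize, tag) computes.
constexpr header_t out_of_heap_header(header_t wosize, unsigned tag) {
  return (wosize << 10) + kOutOfHeapColor + tag;
}

// Decoders, to show the three fields can be read back.
constexpr header_t header_wosize(header_t hd) { return hd >> 10; }
constexpr unsigned header_color(header_t hd) {
  return static_cast<unsigned>((hd >> 8) & 3);
}
constexpr unsigned header_tag(header_t hd) {
  return static_cast<unsigned>(hd & 0xFF);
}

// A one-word string block, as used for "hello" on a 64-bit target.
constexpr header_t hello_header = out_of_heap_header(1, kStringTag);
static_assert(header_wosize(hello_header) == 1);
static_assert(header_color(hello_header) == 3);
static_assert(header_tag(hello_header) == kStringTag);
```

With both color bits set, the collector treats the block as something it should neither mark nor sweep, which is why a static object can carry such a header.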

Quite interesting. Thank you.

As for your original question, I think the answer is not technical but economic in nature. The team did not want to complicate the compiler for something that would only rarely be used.

As I understand it, an external declaration is just a way of saying to the typechecker, “Bro, believe me!”, and it does not touch other parts of the compiler or runtime.