Question about a OCaml grammar


#1

Hello!
I am reading a project whose source code written by OCaml. There is a question which is puzzling me, the code is following, there are many details I can’t understand, not only the grammatical aspect, but also the logical aspect. I can not think out why exists such a obscure language! This project had made me headache several days. I hope that there are some masters in our forum, maybe you can teach me, thanks.

(* Make Map of subtrees of t and their corresponding flow table locations *)
let flow_table_subtrees (layout : flow_layout) (t : t) : flow_subtrees =
  let rec subtrees_for_table (table : Field.t list) (t : t)
    (subtrees : t list) : t list =
    match FDD.unget t with
    | Leaf _ -> subtrees
    | Branch ((field, _), tru, fls) ->
      if (List.mem table field) then
        t :: subtrees
      else
        subtrees_for_table table tru subtrees
        |> subtrees_for_table table fls
  in
  let meta_id = ref 0 in
  List.map layout ~f:(fun table -> subtrees_for_table table t [])
  |> List.filter ~f:(fun subtrees -> subtrees <> []) (* ? *)
  |> List.foldi ~init:Map.Poly.empty ~f:(fun tbl_id accum subtrees ->
      List.fold_right subtrees ~init:accum ~f:(fun t accum ->
        Map.add accum ~key:t ~data:(tbl_id,(post meta_id))))

#2

You forget to ask a question.
If you need a general introduction to OCaml you can consult Real World Ocaml or the introduction part of the reference manual.


#3

Actually I can’t describe my question well now. Maybe I need more time to digest what I have learnt. Thank you!


#4

If you are putting code here please don’t forget to put it into backticks so it can be distinguishable from regular text (e.g. some text) and with the original indentation so it can be readable.


#5

Melvin, I have read that the book OCaml from the very beginning is more introductory than Real World OCaml; you may want to look at that as well. (I learned OCaml from RWO and liked it a lot–I am still learning from it–but I already had a lot of experience with functional programming, and had used a language that is similar to OCaml many years ago. Everyone is different.)


#6

Thank you very much, what you had said is very correct, I have not read the book OCaml from the very begining, maybe I should have a try. I feel nervous a lot now, because the project I am reading needs to be refactored, the first step needed is to understand it fully. I am used to java’s coding style, so the process of thinking transition may take some time. I will keep on studying this stranger and obnoxious language, thanks.


#7

Please use code section feature for better readability:

let square x = x * x

let rec fact x =
if x <= 1 then 1 else x * fact (x - 1)


#8

Yes, the functional programming style is very different from object-oriented styles. It takes some getting used to, but there are a lot of nice things about functional style once it becomes familiar. There are some sorts of problems and bugs that are less common in functional programming, for example. OCaml also has its own particular version of functional programming styles, in part because it is a member of the ML family of languages.

Apart from the newness of functional programming, one way in which OCaml differs from Java is that in OCaml it’s not necessary to specify types as often. However, you can discover what types an expression uses by experimenting at the utop prompt, or by using the Merlin editor extension package, or by reading .mli files. If the original authors of the code didn’t write their own .mli files, versions of these files will be generated when the code is compiled. You can find these files somewhere in the build directories.

Another thing you might want to consider would be to convert your code to ReasonML syntax using the refmt tool. This might not be a good idea; I’m not sure. There are no books on ReasonML yet. It is a new OCaml syntax that is still under development, and I think it is still a bit “bleeding edge”. The traditional syntax will never go away, and most OCaml code is written in the traditional syntax, so it might be simpler to focus on learning one syntax: the traditional syntax in which your code was written. However, I think that some people who are unfamiliar with OCaml might find ReasonML syntax more intuitive, so you might find it useful to look at the ReasonML translation of your code. If you find it easier to read, then you can go back to the original code to compare it with the ReasonML translation to help you understand the original code.

However, either way you will need to develop a new functional-programming mindset in order to understand the code. This is a positive thing, though! (Among other things, it can help you think about Java coding in new, useful ways.)


#9

Thank you very much for your sincere reply. When I read your reply, there are some concepts that I haven’t heard of yet, such as ReansonMlL syntax which might be a nice thing. As I said before, the most important thing for me now is to understand the details of the implementation of the project code. I decided to do it in three steps, first by looking at the code to get the logic out of it, then learning the OCaml language itself systematically, finally reading the code carefully to understand the details of implementation.

Anyway, learning a new programming paradigm is certainly useful. I will learn it with a positive attitude and try my best to do it. Thank you.


#10

I’ll make a stab at breaking this up as I know how you feel! First, here’s the code formatted for easier reading (use three ```ocaml at start of the line and end with ``` on a newline).

(* Make Map of subtrees of t and their corresponding flow table locations *)
let flow_table_subtrees (layout : flow_layout) (t : t) : flow_subtrees =
  let rec subtrees_for_table 
    (table : Field.t list)
    (t : t)
    (subtrees : t list) : t list
  =
  match FDD.unget t with
  | Leaf _ -> subtrees
  | Branch ((field, _), tru, fls) ->
    if (List.mem table field) then
      t :: subtrees
    else
      subtrees_for_table table tru subtrees
      |> subtrees_for_table table fls
in
let meta_id = ref 0 in
List.map layout ~f:(fun table -> subtrees_for_table table t [])
|> List.filter ~f:(fun subtrees -> subtrees <> [])
|> List.foldi ~init:Map.Poly.empty ~f:(fun tbl_id accum subtrees ->
  List.fold_right subtrees ~init:accum ~f:(fun t accum ->
    Map.add accum ~key:t ~data:(tbl_id,(post meta_id))))

The first line (surrounded by (* *)) is a comment. The next line

let flow_table_subtrees (layout : flow_layout) (t : t) : flow_subtrees =

is the start of the function definition of flow_table_subtrees which has two parameters: layout and t. Unusually for OCaml the author has type annotated the parameters. So, (layout : flow_layout) means that there is a parameter layout that has type flow_layout and (t : t) means t has type t.

In addition, the return type of the function is defined (after the last :) as flow_subtrees. This level of type annotation is not common in OCaml as the compiler can almost always infer types.

Next the function subtrees_for_table is defined to be recursive (the rec keyword, by default OCaml functions are non-recursive).

  let rec subtrees_for_table 
    (table : Field.t list)
    (t : t)
    (subtrees : t list) : t list
  =

This function has 3 parameters: table, tand subtrees and all have been type annotated so table is a Field.t list and so on. The return type is defined as a t list. Note that as subtrees_for_table is defined inside the function it will only be visible within this function.

Now we get to the body of subtrees_for_table:

  match FDD.unget t with
  | Leaf _ -> subtrees
  | Branch ((field, _), tru, fls) ->

This matches the return value from FDD.unget t (the module FDD is defined outside of this code) with two possible variants: Leaf and Branch (ignoring the bits after the variant name). The variant type being match here has a declaration something like:

type tree = Leaf of ? | Branch of ((Field.t * ?) * (t * t)

The '?'s are not part of the language and just mean I don’t know the type. Checkout variant types for more detail.

In the case of Leaf it has one parameter but we aren’t interested in the value so we use _ to ignore it (and stop the compiler compaining about unused values). When the return value of FDD.unget t matches Leaf _ the value of the match statement becomes the value of the parameters subtrees.

The case of Branch is a little more complex. The Branch match

  | Branch ((field, _), tru, fls) ->

matches the variant Branch and the variables field, tru and fls have their values set to the correspondence values of the variant. The body of the Branch variant is then executed:

    if (List.mem table field) then
      t :: subtrees
    else
      subtrees_for_table table tru subtrees
      |> subtrees_for_table table fls

What this says is: If field is a member of the table list then the value of the if is t pre-pended to subtrees and this will be the value of the match statement

The else case is a little more complex: The first step is to recursively execute subtrees_for_table table tru subtrees which itself returns a t list.

Before going on the ‘|>’ operator is worth looking at: let ( |> ) x f = f x. so let f a = a + 1 in 5 |> f will return 6. It gets more interesting with the partial application of functions. In the above subtrees_for_table table fls is partially applied and is in fact a function that takes 1 argument, a t list. Check out currying for more info.

So, now subtrees_for_table table tru subtrees |> subtrees_for_table table fls actually means (subtrees_for_table table fls) (subtrees_for_table table tru subtrees), which is a t list and is the value of the if…else and as a result the match statement.

In either case the value of the match becomes the value of the function subtrees_for_table.

For the rest, check out lists in real world ocaml as well as the ref cell section in the imperative programming chapter.


#11

Hello, thank you for your valuable time to solve puzzles for me. I have several questions here. The first point is, when the function is defined, if the last function parameter is separated by colon and not expanded by parentheses, is this parameter the return value of the function? As shown below, the bolded section:

let flow_table_subtrees (layout : flow_layout) (t : t) : flow_subtrees =

Second, as far as the usage of List.mem this function I’ve searched online is different from what you’ve explained, the first argument is the element that is judged to be contained, and the second argument is a list. This list is the list that needs to be determined whether or not it contains the preceding elements.

if (List.mem table field) then
  t :: subtrees
else
  subtrees_for_table table tru subtrees
  |> subtrees_for_table table fls

the example I found from the Internet:

List.mem 12 my_list;;
– : bool = false

I am currently learning the OCaml language, using a Cornell University video, but because my native language is not English and the video has no subtitles, it is hard to learn. And the videos are old.
So I think it is necessary for those who are committed to the development of OCaml to make some fine instructional videos to help beginners get through the rudimentary difficulties.

Thank you very much, I sincerely wish you a happy life, you are a kind person.


#12

The flow_subtrees in the first question is indeed the type of the return value.

You are right about the List.mem function parameter order, the implementations I have seen are normally val mem : 'a -> 'a List.t -> bool but the value table is a Fields.t list and so the first parameter in this version must be a list, or the code doesn’t compile! See if you can find the definition of List.mem or figure out which library you are using


#13

OK, I see what you mean. My current version of OCaml is 4.02.3.


#14

Judging by the fact that the functions use ~f as the parameter to specify a processing function I am pretty sure that the code uses Jane Street Core or if it is very new then maybe Base.