Ocaml Yaml Extracting Value using a Key

Hi. I am new to OCaml and need help parsing a YAML file. I am using the ocaml-yaml library, installed with opam install yaml.

Let's say I have a YAML file called testyaml.yml with the following content:

    api: 222
    secret: "hello"

I am able to read the file into utop like this:

#require "yaml.unix";;
let my_yaml = Yaml_unix.of_file Fpath.(v "testyaml.yml");;

Here is where I am stuck. How do I extract the value 222 using the key “api”?

Please advise.
Thanks.

Hi Phage. As far as I know, ocaml-yaml doesn’t come with any built-in functionality for extracting values from the Yaml.value type, so you have to do this yourself because the structure of whatever you read in is very dependent on your data.

So for this example you could write a small find function which works for the `O of (string * value) list type (see here for those types).

Simplest Solution

let find key (yaml : Yaml.value) = match yaml with 
  | `O assoc -> List.assoc_opt key assoc
  | _ -> None

Here we pattern-match to extract the Yaml.value we want (anything else returns None) and use the standard library’s assoc_opt to look for the correct entry in our “association list”. This function returns a Yaml.value option.

Here is an example usage of that function:

let () = 
  let yaml = Yaml_unix.of_file_exn (Fpath.v "testyaml.yml") in 
    match find "api" yaml with 
      | Some yaml_value -> Yaml.pp Format.std_formatter yaml_value
      | None -> print_endline "Didn't find anything :/"

All going well this should print 222 to standard output. Hope this helps.

Experimental Solution

I’m using ocaml-yaml quite a lot and I’m making a ppx that transforms OCaml types to YAML types (and back again). Note this is very much a WIP, but I’m more than happy to fix any bugs or help you with using it if you wish to do so :).

This means if you know the structure of your YAML ahead of time you can describe it using OCaml types and let the ppx generate functions to transform it. The example above would become:

type t = {api: int; secret: string} [@@deriving yaml]

let () = 
  let yaml = Yaml_unix.of_file_exn (Fpath.v "testyaml.yml") in 
    match of_yaml yaml with 
      | Ok v -> print_int v.api
      | Error (`Msg m) -> failwith m 
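
For comparison, here is a rough hand-written sketch of the kind of decoder such a ppx saves you from writing. This is not the code the ppx generates. The value type below is a local stand-in with the same shape as Yaml.value so the snippet runs without the library, and of_value is a hypothetical name.

```ocaml
(* Local stand-in with the same shape as Yaml.value. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

type t = { api : int; secret : string }

(* Hypothetical hand-written decoder; the ppx-generated code may differ. *)
let of_value (v : value) : (t, [ `Msg of string ]) result =
  match v with
  | `O assoc -> (
      match (List.assoc_opt "api" assoc, List.assoc_opt "secret" assoc) with
      | Some (`Float f), Some (`String s) ->
          (* Numbers are represented as floats, so convert for the int field. *)
          Ok { api = int_of_float f; secret = s }
      | _ -> Error (`Msg "expected a numeric api and a string secret"))
  | _ -> Error (`Msg "expected a mapping at the top level")

(* In-memory equivalent of the example file. *)
let sample : value = `O [ ("api", `Float 222.); ("secret", `String "hello") ]

let () =
  match of_value sample with
  | Ok v -> Printf.printf "api = %d, secret = %s\n" v.api v.secret
  | Error (`Msg m) -> prerr_endline m
```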

Thank you so much for such a clear and excellent explanation. Now I understand how to proceed. Regarding your ppx project, I have not learned that concept just yet. Perhaps I can get back to you once I make further progress with OCaml. I have been a software engineer for 10 years and wish I had started with OCaml sooner.

If you don’t mind, I have a follow-up question. I have gotten further along based on the “Simple Solution” you provided. What I am trying to accomplish is to extract the int and string values using the keys and use these values in this function:

let headers = Header.init ()
  |> fun h -> Header.add_list h 
              [("API-KEY-ID", 222); 
               ("API-SECRET-KEY", "hello")]

However, I ran into my next obstacle. To get to the final primitive int and string values, I built this series of functions to extract them:

let get_list  (yaml: Yaml.value) = match yaml with
 | `O assoc -> Some assoc
 | _ -> None

let unbox opt = match opt with
  | Some x -> x
  | None -> [("null", `String "No key found")]

let my_value s:Yaml.value = match s with
  | `String x -> x
  | _ -> ""

let () =
  let yaml = Yaml_unix.of_file_exn (Fpath.v "testyaml.yml") in
  let x1 = get_list yaml in
  let x2 = unbox x1 in
  let x3 = List.assoc "api" x2 in 
  let x4 = my_value x3 in
  print_endline (x4)

The problem is that the data that I need is stuck in the Yaml.value constructor and any attempt to get them out has been unsuccessful. Please advise. Thanks.

Hi Phage, I don’t mind at all.

First of all, with your headers function you are supplying a list with different types. The type-checker will complain that "hello" is a string when it is expecting an int because of the 222. OCaml lists are “homogeneous” i.e. all elements must have the same type.
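
One way around this is to convert every value to a string before building the list. This is a minimal sketch, assuming Header here is Cohttp's Header module, whose add_list takes a (string * string) list. The names api_key_id, api_secret and header_pairs are made up for illustration.

```ocaml
(* Hypothetical values standing in for what was read from the YAML file. *)
let api_key_id = 222
let api_secret = "hello"

(* Every element is now a (string * string) pair, so the list type-checks. *)
let header_pairs : (string * string) list =
  [ ("API-KEY-ID", string_of_int api_key_id); ("API-SECRET-KEY", api_secret) ]

let () = List.iter (fun (k, v) -> Printf.printf "%s: %s\n" k v) header_pairs
```

A list like header_pairs is what you would then hand to Header.add_list.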

Secondly (and this may just be a typo in the markdown), your type annotation for the function my_value needs parentheses. You want to restrict s to Yaml.value only. The way it is written right now, the annotation applies to the return type and forces it to be a Yaml.value.

let my_value (s : Yaml.value) = match s with
  | `String x -> x
  | _ -> ""

Though not strictly necessary, when working with polymorphic variants (i.e. `String s) the type-checker will say your function has type [> `String of string ] -> string meaning it could take more than just the Yaml.value variants as a parameter (the >). By adding the type annotation we restrict that.
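
A small sketch of the difference, using a local stand-in type with the same shape as Yaml.value so it runs without the library. The names loose and strict are hypothetical, and the unparenthesised annotation is ": string" here (rather than ": Yaml.value") so that the example compiles.

```ocaml
(* Local stand-in with the same shape as Yaml.value. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* No parentheses: ": string" annotates the function's result, not s.
   s keeps the open type [> `String of string ]. *)
let loose s : string = match s with `String x -> x | _ -> ""

(* Parentheses: s itself is constrained to value. *)
let strict (s : value) = match s with `String x -> x | _ -> ""

let () =
  (* `Int is not part of value, yet loose accepts it. *)
  print_endline (loose (`Int 3));
  print_endline (strict (`String "hello"))
  (* strict (`Int 3) would be rejected by the type checker. *)
```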

OCaml-yaml does some type-inference for us when parsing yaml files. Whenever we parse

api: 222
secret: "hello"

ocaml-yaml makes 222 a float (wrapping it in `Float) and "hello" becomes a string (wrapped in `String; note the quotation marks wouldn’t be necessary for this). As far as I know, there’s no way to write a single function that extracts the primitive types without effectively re-wrapping the values, because what type would that function have? It would need to return floats, strings, lists, etc. So right now, once the above is fixed, x3 is really `Float 222. and when it is passed to my_value it hits the catch-all case and "" is printed.
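
To make that concrete, here is a sketch with a local stand-in for Yaml.value (same shape, no library needed). scalar_to_string is a hypothetical helper that adds a `Float case next to the `String one.

```ocaml
(* Local stand-in with the same shape as Yaml.value. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* The function from the question: only `String is handled. *)
let my_value (s : value) = match s with `String x -> x | _ -> ""

(* Hypothetical variant that also handles numbers. *)
let scalar_to_string (s : value) =
  match s with
  | `String x -> x
  | `Float f -> Printf.sprintf "%g" f
  | _ -> ""

let () =
  let x3 : value = `Float 222. in
  (* my_value falls through to the catch-all; scalar_to_string does not. *)
  Printf.printf "my_value: %S\n" (my_value x3);
  Printf.printf "scalar_to_string: %S\n" (scalar_to_string x3)
```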

Solutions

So there are two possible solutions - (a) convert all Yaml values to one type (say a string) or (b) write “unboxing” like functions for each Yaml value type and apply them based on your knowledge of what things should be. (a) is straightforward as ocaml-yaml can do it for you.

(* Raise exception version *)
let my_value = Yaml.to_string_exn

(* Provide a default version *)
let my_value s = match Yaml.to_string s with 
  | Ok s -> s 
  | _ -> ""

Note that by convention functions ending in _exn raise an exception, and without this ending they usually return an 'a t for some t (Result.t or option are common, ocaml-yaml uses the former).

The second option is to use an unboxing function for each Yaml.value type and either provide defaults for the _ catch-all case or raise your own error. Here are two examples:

let unbox_float (f : Yaml.value) = match f with 
  | `Float f -> f 
  | _ -> 0.

let unbox_array (a : Yaml.value) = match a with 
  | `A f -> f 
  | _ -> []

But now there’s a problem, the list returned by unbox_array contains “boxed” elements :confused: Unfortunately without reboxing them to have a single type or through GADT magic (which I know little about) to make heterogeneous lists you just have to put up with this (which I don’t think is a bad thing).
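
The examples above return defaults. The "raise your own error" variant could look roughly like this. It is a sketch against a local stand-in for Yaml.value, and the exception Unexpected_type and the *_exn names are made up for illustration.

```ocaml
(* Local stand-in with the same shape as Yaml.value. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* Hypothetical exception carrying a description of what was expected. *)
exception Unexpected_type of string

let float_exn (v : value) =
  match v with
  | `Float f -> f
  | _ -> raise (Unexpected_type "expected a float")

let string_exn (v : value) =
  match v with
  | `String s -> s
  | _ -> raise (Unexpected_type "expected a string")

let () =
  let api : value = `Float 222. in
  Printf.printf "api = %g\n" (float_exn api);
  (* Using the wrong unboxer fails loudly instead of returning a default. *)
  match string_exn api with
  | s -> print_endline s
  | exception Unexpected_type msg -> Printf.printf "error: %s\n" msg
```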

Hope this helps – I feel like this is a great example of how OCaml’s type system feels a little “restricting” but is actually guiding you in writing safer and less error-prone code by having to handle the types and provide cases for everything that you have to.

Thank you once again. The error you pointed out in the function my_value was not a typo. I did not realize that annotating a function parameter without parentheses restricts the return type to Yaml.value. That was what was causing me all these headaches.

The second solution provided was much better! I knew there was a simpler way to extract but could not see it. Thanks again for your help!!!

My next challenge may be to try to build a module that can extract a value using the key in a YAML file by encapsulating this series of steps. Do you think such a module already exists? If not, do you think it would be a good exercise, as I want to learn more about modules and functors?

Thanks again.

Briefly interjecting here: there are quite a few such cases that can be surprising to beginners. The OCaml parser is generally very permissive when it comes to whitespace. I strongly encourage you to try using ocamlformat, since it tends to format files in a way that makes it much clearer how OCaml will interpret them.

Welcome to OCaml :slightly_smiling_face:

No worries – once you have the type checker on your side, it can really help speed up development :slight_smile:.

I haven’t seen much in the way of libraries surrounding the ocaml-yaml library and I’m of the opinion all practice is good practice. I think my first module and functor was something with a type t and print function and then I made a MakePrintableList functor to have type t list and a print_list function using the original print. Simple, but I found it useful.
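
Reconstructed from memory, that exercise could look something like the sketch below. PRINTABLE, PrintableInt and IntList are names I'm making up here.

```ocaml
(* A module with a type t and a print function. *)
module type PRINTABLE = sig
  type t

  val print : t -> unit
end

(* The functor: given a PRINTABLE, build a module for lists of its t. *)
module MakePrintableList (P : PRINTABLE) = struct
  type t = P.t list

  (* print_list reuses the original print for each element. *)
  let print_list (l : t) = List.iter P.print l
end

module PrintableInt = struct
  type t = int

  let print i = print_endline (string_of_int i)
end

module IntList = MakePrintableList (PrintableInt)

let () = IntList.print_list [ 1; 2; 3 ]
```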

One thing I would just add is that as of the most recent versions, the Yaml.value type is identical to the Ezjsonm.value type (see this source code). Ezjsonm is a wrapper around the jsonm library, but it provides some of the functionality you might be looking for. For example:

val get_list : (Ezjsonm.value -> 'a) -> Ezjsonm.value -> 'a list

Here the get_list function expects you to provide the “unboxing” function for the list elements so it can use it to build a list of those types (in reality you could transform the Yaml.value into anything you like, but unboxing them is probably the most common). Perhaps a good exercise is to (w/out copying from ezjsonm :wink: ) build the same utility functions for Yaml.value? Just an idea, good luck and happy OCaml-ing :camel:
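
To give a feel for the shape of such a function, here is a minimal sketch written against a local stand-in for Yaml.value. It is not Ezjsonm's implementation. Not_a_list and get_float are hypothetical names, and the error handling is my own choice.

```ocaml
(* Local stand-in with the same shape as Yaml.value. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* Hypothetical exception for values of the wrong shape. *)
exception Not_a_list

(* Hypothetical element unboxer with a default, as earlier in the thread. *)
let get_float (v : value) = match v with `Float f -> f | _ -> 0.

(* The caller supplies the unboxing function for the elements. *)
let get_list (unbox : value -> 'a) (v : value) : 'a list =
  match v with
  | `A items -> List.map unbox items
  | _ -> raise Not_a_list

let () =
  let v : value = `A [ `Float 1.; `Float 2.5 ] in
  List.iter (Printf.printf "%g\n") (get_list get_float v)
```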

Thanks for all the excellent questions @Phage, and for the explanations @patricoferris. When I wrote ocaml-yaml, it was initially a rather low-level set of bindings to handle (the enormously complex) Yaml format.

These days, it’s becoming apparent that the vast majority of “real world Yaml” is essentially the JSON subset. So it might be a good time to write a few yaml/ezjsonm combinators that work on both Yaml and JSON. The ones in Yojson.Util are a pretty good start, and contributions like this would be welcome in OCaml-Yaml :slight_smile:

This is a great idea @avsm :)) - I have opened an issue on ocaml-yaml: https://github.com/avsm/ocaml-yaml/issues/40

@Phage this could be a great way to learn OCaml and contribute to open-source. If you would like to, feel free to DM me and we could build it together or ping me for a review of a PR. If not, no worries, let me know either way :slight_smile:

I definitely would love to collaborate with you, but I hope that you will be patient with me. I may ask too many dumb questions. But I will definitely not pass up this opportunity.