Immutable approach to record composition?

Say I have a text file where each line describes a property given as a key/value pair which I’m trying to use to instantiate a typed record. E.g.

name, Emir
age, 40
location, EU

type person = {name:string; age:int; location:string}

A natural fit for this problem is to instantiate a record and then mutate the fields as lines in the file are read. However, I’d like to understand what I’m supposed to do in these cases if I want a purely functional (immutable) solution. Again, I’d like to end up with a typed record with statically defined fields, but I don’t want to use mutation.

I’m very grateful for your help!

What you can do is read the lines into a map first and then construct the record in one go from the map.

That will also force you to think about what to do when lines/fields are missing.
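
To make that concrete, here is a minimal sketch using only the standard library. It assumes that the key and value on each line are separated by the first comma; the helper names parse_line, map_of_lines and person_of_map are made up for illustration, and returning an option is just one possible way to handle missing fields.

```ocaml
module StringMap = Map.Make (String)

type person = { name : string; age : int; location : string }

(* Split "key, value" on the first comma; None if there is no comma. *)
let parse_line line =
  match String.index_opt line ',' with
  | None -> None
  | Some i ->
    let key = String.trim (String.sub line 0 i) in
    let value =
      String.trim (String.sub line (i + 1) (String.length line - i - 1))
    in
    Some (key, value)

(* Fold the lines into an immutable map; malformed lines are skipped. *)
let map_of_lines lines =
  List.fold_left
    (fun m line ->
      match parse_line line with
      | Some (k, v) -> StringMap.add k v m
      | None -> m)
    StringMap.empty lines

(* Build the record in one go; None if a field is missing
   or if the age is not an integer. *)
let person_of_map m =
  match
    ( StringMap.find_opt "name" m,
      Option.bind (StringMap.find_opt "age" m) int_of_string_opt,
      StringMap.find_opt "location" m )
  with
  | Some name, Some age, Some location -> Some { name; age; location }
  | _ -> None

let example =
  person_of_map (map_of_lines [ "name, Emir"; "age, 40"; "location, EU" ])
```

Nothing is mutated here: every StringMap.add returns a new map, and the record is created once, when all fields are known.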

I see, so a Map is an immutable structure which I could conceivably transform in a recursive function to hold all the key/value pairs in the file, and then I can statically map the properties of a record to functions which read from the map?

Presumably a get operation on a map will throw exceptions if it doesn’t contain a key so I can use the try / with syntax to act on missing fields.

Thank you!
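
On the exception point, a small sketch with the standard library map: find does raise Not_found for a missing key, but find_opt returns an option instead, which avoids try / with altogether. The keys and values below are only illustrative.

```ocaml
module StringMap = Map.Make (String)

let m = StringMap.add "name" "Emir" StringMap.empty

(* find raises Not_found when the key is absent. *)
let age_or_default =
  try StringMap.find "age" m with Not_found -> "unknown"

(* find_opt returns an option, so no exception handling is needed. *)
let location = StringMap.find_opt "location" m
let name = StringMap.find_opt "name" m
```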

I’m trying to understand how to actually make a map, and it doesn’t appear to be straightforward. The OCaml and “Real World” guides both seem to involve making a module. From a noob’s perspective that strikes me as much ado about what feels like it ought to be trivial.

In this ocaml.org post the author simply asserts, “To make a map I can do…” and then drops 50 lines of code.

Any chance of a bare bones example of how a map is made and a paragraph about how it works?

There is only one line of code in the example that you linked from ocaml.org, the rest is the signature of the generated module.

I’m just learning at the moment so could you elaborate on what that means? Module generated by what? Why do I have to generate a module for what is a={} in Python for example? Typing the following into utop returns an error:

# module MyUsers = Map.Make(String);;

Are there some bare-bones instructions on how to make a map?

Defining an OCaml map requires both a key type and a comparison function for that type. Writing

module MyUsers = Map.Make(String)

(which should work; what error do you get? Are you opening Core or Base by any chance?) allows you to define the type of the keys and how the keys are ordered at once.

You can read the rest of the tutorial that you linked to see how to use such a map. There are examples given on how to build a map or add an element.
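
For a bare-bones picture, a minimal sketch with the standard library (not Base or Core) could look like this. The user names and ages are invented for the example.

```ocaml
module MyUsers = Map.Make (String)

(* Each add returns a new map; the previous one is left unchanged. *)
let empty = MyUsers.empty
let one = MyUsers.add "emir" 40 empty
let two = MyUsers.add "ada" 36 one

(* Print every binding; iteration follows the key order. *)
let () =
  MyUsers.iter (fun name age -> Printf.printf "%s is %d\n" name age) two
```

The one line that creates the module plays the role of declaring "a dictionary keyed by strings"; after that, MyUsers.empty is roughly the counterpart of a={} in Python, except that it is never modified in place.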

Since it seems likely that you are using Base, your issue is that Base replaces the maps and sets of the OCaml standard library with an alternative implementation; see the “Maps and Hash Tables” chapter of Real World OCaml.

Lol, Re2 strikes again. Try this:

#require "base";;
#require "re2";;
open Base
open Re2
module MyUsers = Map.Make(String)

Error: Unbound module Map.Make

This is not Re2 but the base library.

Hi,

I will try to sum up what I have learned about Map in OCaml so far.

The Stdlib.Map module provides the functor Map.Make (a functor is a function from modules to modules). It takes as its argument a module describing the key type of your map: for example, Map.Make (String) returns the implementation of a map with string keys, but you can use any module whose signature has a type t and val compare : t -> t -> int. You can define a map with strings as keys:

module StringMap = Map.Make (String)

or map with a custom type as key:

module CustomMap = Map.Make (struct
   type t = custom_type

   let compare c1 c2 = Stdlib.compare c1 c2 (* placeholder: substitute your own ordering on custom_type *)
end)

And, as with lists, you can specify the type of the map values:

type map_string_to_int = int StringMap.t

let map : map_string_to_int = StringMap.singleton "key1" 10
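
As a concrete instance of the custom-key case, here is a small sketch. The point type and its ordering (x first, then y) are hypothetical and chosen only for illustration.

```ocaml
(* Hypothetical key type, for illustration only. *)
type point = { x : int; y : int }

module PointMap = Map.Make (struct
  type t = point

  (* Order by x first, then by y. *)
  let compare a b =
    match Int.compare a.x b.x with
    | 0 -> Int.compare a.y b.y
    | c -> c
end)

(* A map from points to string labels. *)
let labels : string PointMap.t =
  PointMap.empty
  |> PointMap.add { x = 0; y = 0 } "origin"
  |> PointMap.add { x = 1; y = 0 } "east"
```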

It’s actually both. Just the following will cause the same problem, maybe because Re2 imports Base? Either way, a regular expression module breaking the Map interface is a bit surprising for a noob.

#require "re2";;
open Re2
module MyUsers = Map.Make(String)

Indeed, I missed the fact that the Re2 module defines its own Map module. In other words, the module is really not designed to be opened globally.
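
To illustrate the shadowing without depending on either library, here is a sketch in which Some_lib is a made-up stand-in for a library that ships its own Map module without a Make functor. Qualifying the name with Stdlib reaches the standard functor again. I would expect the same qualification to help after open Base or open Re2, but treat that as an assumption to check against the versions you have installed.

```ocaml
(* Hypothetical library that defines its own Map module
   without a Make functor. *)
module Some_lib = struct
  module Map = struct
    let description = "not the standard library map"
  end
end

open Some_lib

(* Map now refers to Some_lib.Map, so Map.Make would be unbound here.
   The fully qualified path still points at the standard functor. *)
module MyUsers = Stdlib.Map.Make (String)

let users = MyUsers.singleton "emir" 40
```

The other way around the problem is simply not to open such modules at top level and to write the qualified names instead.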

I am not sure exactly how you intend to represent your data, but if you intend to look up the data by reference to a person’s name then possibly what you want is a map with the relevant person’s name as the key, together with a record or tuple containing the age and location for that name as the value for the key.

If you are used to mutable maps then using immutable ones can initially be a bit of a puzzle, and I think from your posting you may also be finding that problematic. The way it is normally done is to start with an empty map and then to use recursive iteration to add to it until you have built the final map that you want. As an example, if you are using a unix-like OS then this will provide a set containing the regular files in your current directory (this uses the standard library, not Base):

module FileSet = Set.Make(String)

let get_files () =
  let open Unix in
  let dir = opendir "." in
  let rec loop set =
    try
      let entry = readdir dir in
      match (stat entry).st_kind with
        S_REG -> loop (FileSet.add entry set)
      | _ -> loop set
    with End_of_file -> closedir dir ; set in
  loop FileSet.empty

let () =
  let files = get_files () in
  FileSet.iter (fun f -> print_endline f) files

If you are in the REPL you will need to #load "unix.cma" before running it.
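
The same "start empty and keep adding" pattern applies to maps. As a sketch that needs no file system access, the pairs below are hard-coded (taken from the example at the top of the thread), and the name build is my own. The second definition relies on of_seq, which, as far as I know, is available in the standard library from OCaml 4.07.

```ocaml
module StringMap = Map.Make (String)

let pairs = [ ("name", "Emir"); ("age", "40"); ("location", "EU") ]

(* Explicit recursion: the map is threaded through as an accumulator. *)
let rec build acc = function
  | [] -> acc
  | (k, v) :: rest -> build (StringMap.add k v acc) rest

let by_recursion = build StringMap.empty pairs

(* The same result, built from a sequence of pairs in one call. *)
let by_of_seq = StringMap.of_seq (List.to_seq pairs)
```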

Thanks. It was straightforward once I found out immutable maps exist. The snippet from my code looks like this:

module StringMap = Map.Make (String)

let rec to_map ?m f =
  let m = match m with
    | None -> StringMap.empty
    | Some v -> v in
  let l = f() in
  match l with
  | None -> to_map ~m:m f
  | Some (k,v) when k = "Game" -> Some (StringMap.add k v m)
  | Some (k,v) -> to_map ~m:(StringMap.add k v m) f

It’s my first day with OCaml, and I’m just trying to make something useful with it to help me learn. The functional bits I’m used to, now I’m trying to find some sort of proactive usefulness for the type system in my use cases (mostly data science).

You can simplify a bit:

-| Some (k,v) when k = "Game" -> Some (StringMap.add k v m)
+| Some ("Game" as k, v) -> Some (StringMap.add k v m)

EDIT: one more tip: OCaml supports default values for optional arguments, and punning:

let rec to_map ?(m=StringMap.empty) f =
  match f () with
  | None -> to_map ~m f
  | Some ("Game" as k, v) -> Some (StringMap.add k v m)
  | Some (k, v) -> to_map ~m:(StringMap.add k v m) f
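
For anyone who wants to try this without a file, here is a sketch that drives to_map from a list. The function is repeated so the block is self-contained; producer_of_list is a hypothetical helper of mine, and its ref merely stands in for the file position that a real reader would keep.

```ocaml
module StringMap = Map.Make (String)

(* to_map as above, repeated so that the sketch is self-contained. *)
let rec to_map ?(m = StringMap.empty) f =
  match f () with
  | None -> to_map ~m f
  | Some ("Game" as k, v) -> Some (StringMap.add k v m)
  | Some (k, v) -> to_map ~m:(StringMap.add k v m) f

(* Hypothetical producer: hands out pre-parsed lines one at a time. *)
let producer_of_list items =
  let remaining = ref items in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest ->
      remaining := rest;
      x

let next =
  producer_of_list
    [ Some ("White", "BFG9k"); None; Some ("Game", "1. e4 e6") ]

(* The first call omits ~m, so the default empty map is used. *)
let game = to_map next
```

One thing to watch: as written, to_map keeps calling f while it returns None, so a producer that returns None forever (for example at the end of the input, with no "Game" entry left) would make it loop without terminating.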

Oooo that last edit is nice. Much shorter. Thanks.

Could I ask you to do your magic on this function too? It looks and feels ugly – like it could be written more elegantly.

let to_pairs x =
  let re = Re2.create_exn "([A-Za-z]+) \"([^\"]+)" in
  try
    let match_ = Re2.first_match_exn re x in
    let k = Re2.Match.get match_ ~sub:(`Index 1) in
    let v = Re2.Match.get match_ ~sub:(`Index 2) in
    Some ((Option.value k ~default:""),(Option.value v ~default:""))
  with
    _ -> let re = Re2.create_exn "^(.+)$" in
         try Some ("Game", (Re2.find_first_exn re x))
         with _ -> None

The function above parses lines from a file and turns every line into a pair. Here is a sample of what’s in the file:

[Event "Rated Classical game"]
[Site "https://lichess.org/j1dkb5dw"]
[White "BFG9k"]
[Black "mamalak"]
[Result "1-0"]
[UTCDate "2012.12.31"]
[UTCTime "23:01:03"]
[WhiteElo "1639"]
[BlackElo "1403"]
[WhiteRatingDiff "+5"]
[BlackRatingDiff "-8"]
[ECO "C00"]
[Opening "French Defense: Normal Variation"]
[TimeControl "600+8"]
[Termination "Normal"]

1. e4 e6 2. d4 b6 3. a3 Bb7 4. Nc3 Nh6 5. Bxh6 gxh6 6. Be2 Qg5 7. Bg4 h5 8. Nf3 Qg6 9. Nh4 Qg5 10. Bxh5 Qxh4 11. Qf3 Kd8 12. Qxf7 Nc6 13. Qe8# 1-0
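
One possible direction, offered as a sketch rather than a drop-in replacement: since the tag lines have a fixed shape, plain string functions from the standard library may be enough, with no regular expressions and no exceptions. This assumes that every tag line looks exactly like [Key "Value"] with a single space after the key. Unlike the Re2 version it does not check that the key is purely alphabetic or that the value is non-empty. The name parse_tag is mine.

```ocaml
(* Parse a tag line of the form [Key "Value"];
   None if the line has any other shape. *)
let parse_tag line =
  let n = String.length line in
  if n < 2 || line.[0] <> '[' || line.[n - 1] <> ']' then None
  else
    let body = String.sub line 1 (n - 2) in
    match String.index_opt body ' ' with
    | None -> None
    | Some i ->
      let key = String.sub body 0 i in
      let quoted = String.sub body (i + 1) (String.length body - i - 1) in
      let q = String.length quoted in
      if q >= 2 && quoted.[0] = '"' && quoted.[q - 1] = '"' then
        Some (key, String.sub quoted 1 (q - 2))
      else None

(* Tag lines become (key, value); any other non-empty line is treated
   as the move list under the key "Game"; empty lines give None. *)
let to_pairs line =
  match parse_tag line with
  | Some _ as pair -> pair
  | None -> if line = "" then None else Some ("Game", line)
```

Splitting the tag parsing into its own function keeps the fallback to "Game" in one small match, which is most of what made the original feel nested.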