Examples using compiler-libs Lexer

Does anyone have example code using the compiler-libs Lexer module, e.g. just to read in an OCaml *.ml file?

What I’m actually trying to do is lex some OCaml source files to discover “long” multi-line strings that might be converted into quoted literals (i.e. {|…|}). So I want something that does the lexing step (but not parsing) and then lets me call my own function on each string it finds, so I can decide whether to convert it or not.
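To make the "decide if I want to convert it" step concrete, here is a minimal sketch of such a callback. The names `contains_substring` and `should_convert`, and the `min_newlines` threshold, are hypothetical and not from any library. The only hard constraint it encodes is that a `{|…|}` literal cannot contain the closing delimiter `|}`; everything else is an arbitrary policy you would tune.

```ocaml
(* True if [needle] occurs anywhere in [hay]. *)
let contains_substring (hay : string) (needle : string) : bool =
  let n = String.length hay and m = String.length needle in
  let rec go i =
    i + m <= n && (String.sub hay i m = needle || go (i + 1))
  in
  go 0

(* Hypothetical policy: a string is worth converting when it has at
   least [min_newlines] newline characters and does not contain "|}",
   which would terminate a {|...|} literal early. *)
let should_convert ?(min_newlines = 2) (s : string) : bool =
  let newlines = List.length (String.split_on_char '\n' s) - 1 in
  newlines >= min_newlines && not (contains_substring s "|}")
```

A string containing `|}` could still be converted with a tagged delimiter such as `{foo|…|foo}`, but this sketch simply skips those cases.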

(Something like this commit but more automatically and across several large code bases, and yes I did try to write an OCaml lexer in Perl first, but that didn’t exactly go well!).

let () =
  Lexer.init ();
  let fp = open_in Sys.argv.(1) in
  let lexbuf = Lexing.from_channel fp in
  let rec loop () =
    let token = Lexer.token lexbuf in
    (match token with
     | Parser.STRING (s, loc, sopt) ->
        Printf.printf "string = %s\n" s;
     | _ -> ()
    );
    loop ()
  in
  loop ()

(* ocamlfind opt -package compiler-libs.common find_strings.ml  -o find_strings.opt -linkpkg *)

This is my initial attempt, but it just spins using 100% CPU.

It is spinning because the OCaml lexer returns a dummy EOF token at end of file instead of raising an exception, and it keeps returning that token on every later call. You should end your recursive loop when you come across it. Otherwise, the code looks OK to me.

Cheers,
Nicolas


Excellent, it works! For reference the final code is:

let () =
  Lexer.init ();
  let fp = open_in Sys.argv.(1) in
  let lexbuf = Lexing.from_channel fp in
  let rec loop () =
    match Lexer.token lexbuf with
    | Parser.STRING (s, _loc, _delim) ->
        (* %S prints the string as an escaped OCaml literal. *)
        Printf.printf "string = %S\n" s;
        loop ()
    | Parser.EOF -> ()  (* stop here: the lexer returns EOF forever after *)
    | _ -> loop ()
  in
  loop ();
  close_in fp
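Since the end goal is rewriting the strings, it probably helps to know where each one sits in the file. The following is a sketch only: `string_literals` is a hypothetical helper, it assumes the same three-argument `Parser.STRING` constructor used above, and it assumes that the lexbuf's start position after a STRING token points at the beginning of the literal. It lexes from an in-memory string so it is easy to try out, and needs the same `compiler-libs.common` package to build.

```ocaml
(* Return every string literal in [src] together with the line number
   on which the literal starts. *)
let string_literals (src : string) : (int * string) list =
  Lexer.init ();
  let lexbuf = Lexing.from_string src in
  let rec loop acc =
    match Lexer.token lexbuf with
    | Parser.STRING (s, _loc, _delim) ->
        (* Assumption: the lexeme start position is the opening quote. *)
        let line = (Lexing.lexeme_start_p lexbuf).Lexing.pos_lnum in
        loop ((line, s) :: acc)
    | Parser.EOF -> List.rev acc  (* the only exit from the loop *)
    | _ -> loop acc
  in
  loop []
```

Returning a list rather than printing makes it straightforward to pass each string to your own decision function afterwards.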
