I’m working on a tool for ATD. I want to read the ‘version’ field of a given JSON document and, depending on the version, use the right read_ parser from ATD. To do this, I have parameters of type Yojson.lexer_state and Lexing.lexbuf as input. If I use this lexer_state and lexbuf just to read the version of the JSON file, or just to parse it, it works; but if I do both, it fails, I guess because they end up in an invalid state.
For Yojson.lexer_state I can just create a fresh one with Yojson.init_lexer but what about Lexing.lexbuf? Is there a way to reuse it on the same file/channel? Should I duplicate it? How?
The easiest way is simply to recreate the lexbuf. If it is based on a channel, you will need to either open a new channel for the new lexbuf or rewind the existing one.
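As a rough sketch of the "rewind the existing channel" option, assuming the channel comes from a regular file (seek_in raises Sys_error on pipes, sockets and usually stdin): the helper name lexbuf_from_start is made up for this illustration, and it only uses the standard library.

```ocaml
(* Hypothetical helper: rewind a seekable channel and build a brand-new
   lexbuf on top of it. Any lexbuf created earlier from this channel
   must be discarded, since it may still hold buffered input. *)
let lexbuf_from_start (ic : in_channel) : Lexing.lexbuf =
  seek_in ic 0;              (* raises Sys_error if [ic] is not seekable *)
  Lexing.from_channel ic     (* lazy: nothing is read until the lexer asks *)
```

The first pass (reading the version) and the second pass (the real parse) would each call lexbuf_from_start on the same channel, so neither sees the other's consumed input.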
Yeah, that’s what I thought, but I would like to keep the same signature as the ATD read_ functions. So, given a Yojson.lexer_state and a Lexing.lexbuf, how do I create a new lexbuf? If it’s from a file, I guess I can get the file name from fname in Yojson.lexer_state, but otherwise, how can I know which channel it was opened from?
How can I rewind an existing lexbuf?
If you’re willing to do a little surgery, I’ll bet you can:
(1) force-read enough into the buffer to ensure that the first round of reading will not incur a file read: say, a couple of kilobytes
(2) deep-copy enough of the lexbuf and state that you can reuse the copy.
That is to say, do the cloning before using either clone.
val read_foo :
Yojson.Safe.lexer_state -> Lexing.lexbuf -> foo
This is rather low-level. Both lexer_state and lexbuf are mutable, and there are no documented guarantees on what’s valid besides starting from a fresh lexer_state and a fresh lexbuf.
Like @nojb suggested, I think I would keep it simple and do a full re-read of the input file or string, and let the library initialize lexer_state and lexbuf. The file should be cached by the OS, so it’s probably not terribly expensive to re-open it. If the input is from stdin, then read it into a string and use that string as input.
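A minimal sketch of the "read it into a string" approach, using only the standard library. The names read_all and fresh_lexbuf are invented for this example; on OCaml 4.14 or later, In_channel.input_all could replace the manual loop.

```ocaml
(* Read everything remaining on a channel (file, pipe or stdin)
   into a single string. *)
let read_all (ic : in_channel) : string =
  let buf = Buffer.create 4096 in
  let chunk = Bytes.create 4096 in
  let rec loop () =
    let n = input ic chunk 0 (Bytes.length chunk) in
    if n > 0 then begin
      Buffer.add_subbytes buf chunk 0 n;
      loop ()
    end
  in
  loop ();
  Buffer.contents buf

(* Each call returns an independent lexbuf over the same data, so one
   can be used to peek at the version and another for the full parse. *)
let fresh_lexbuf (data : string) : Lexing.lexbuf =
  Lexing.from_string data
```

Since the string is immutable, both passes start from a clean state, and a fresh lexer_state can be created alongside each lexbuf.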
Reading a foo from a string is done with foo_of_string data, and reading from a file is done with Atdgen_runtime.Util.Json.from_file read_foo path.
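To illustrate how the version check and the string-based readers could fit together, here is a hedged sketch. It requires the yojson library and assumes the version is an integer in a top-level "version" field. The type foo and the functions foo_v1_of_string and foo_v2_of_string are placeholders standing in for whatever atdgen generates for each schema version; they are not real generated code.

```ocaml
(* First pass: parse the document generically and look up the top-level
   "version" field. Returns None if it is absent or not an integer.
   Yojson.Safe.from_string raises Yojson.Json_error on invalid JSON. *)
let version_of_string (data : string) : int option =
  match Yojson.Safe.from_string data with
  | `Assoc fields ->
      (match List.assoc_opt "version" fields with
       | Some (`Int n) -> Some n
       | _ -> None)
  | _ -> None

(* Placeholders for atdgen-generated readers (hypothetical). *)
type foo = V1 of string | V2 of string
let foo_v1_of_string (s : string) : foo = V1 s
let foo_v2_of_string (s : string) : foo = V2 s

(* Second pass: re-read the same string with the reader that matches
   the version. Each reader starts from its own fresh lexer state. *)
let read_foo_versioned (data : string) : foo =
  match version_of_string data with
  | Some 1 -> foo_v1_of_string data
  | Some 2 -> foo_v2_of_string data
  | Some n -> failwith (Printf.sprintf "unsupported version %d" n)
  | None -> failwith "missing or non-integer \"version\" field"
```

The document is parsed twice, which trades a little speed for never having to share a lexbuf between the two passes.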
OK, so I’d be better off recreating the read functions from Atdgen_runtime.Util.Json with functions that re-read the channel, rather than trying at all costs to recreate a Lexing.lexbuf?