Sedlex (why not ocamllex?)

  1. Still under development (more accurately: it’s raw code, no docs, just starting off), and currently I’m getting the test suite to work. Also, this isn’t for YAML, but for what I’m provisionally calling “Block Style for JSON” (BS4J). That is to say, it drops much of what makes YAML hard to parse, as well as the bits that have no JSON counterpart:
    • anchors
    • tags
    • complex keys
    • special characters in unquoted strings (they aren’t allowed)
    It also adds C++-style “raw string literals” for multiline scalars.
  2. I assume you mean “how to handle Unicode when parsing with ocamllex”. It’s been nearly 20 years since I did that hack, and the code’s been lost to the sands of time.
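For anyone who hasn’t met the raw string literals mentioned in the first point, here is a minimal sketch of the feature as it exists in C++ itself. It shows only the C++ construct being borrowed; how BS4J actually spells it isn’t described in this post, and the function names and the `EOS` delimiter are made up for the example.

```cpp
#include <string>

// Ordinary literal: newlines, quotes and backslashes must be escaped by hand.
inline std::string escaped_scalar() {
    return "first line\n  say \"hi\"\nC:\\temp";
}

// Raw literal: everything between R"EOS( and )EOS" is taken verbatim,
// so embedded newlines, quotes and backslashes need no escaping.
// "EOS" is an arbitrary delimiter chosen for this example.
inline std::string raw_scalar() {
    return R"EOS(first line
  say "hi"
C:\temp)EOS";
}
```

The two functions return the same string. The appeal for multiline scalars is that the lexer only has to find the closing delimiter, with no escape processing in between.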

P.S. I think the current YAML spec is a complete mess. Nobody designing a “language” would design it at the level of characters; rather, one would define the lexemes of the language, and then the grammar, as is done for JSON. Y’know, the way we define all other languages.
