I’m receiving chunks of data that contain messages, possibly with garbage between them. Each message has the form
<message> ::= <constant prefix> <port> <length> <body of length>
So I have to skip the garbage, find a prefix and a length, cut out the body, and repeat that until the end of the chunk. A message can begin in one chunk and end in the next, so I have to keep the tail of the previous chunk. I’ve written a simple Angstrom parser that does the job, but unfortunately it’s about 4 times slower than a straightforward, naive handwritten parser. Here is the Angstrom version:
    let prefix = 0x44BB

    let msg_parser =
      let open Angstrom in
      let prefix_parser = BE.uint16 prefix in
      let len_parser =
        any_uint8 >>= fun port' ->
        any_uint8 >>= fun length ->
        let parity = port' land 0x10 > 0 in
        let port = port' land 0xF in
        let len = (length * 2) - (if parity then 1 else 0) in
        return (port, len)
      in
      let header_parser =
        prefix_parser *> len_parser
      in
      let msg_parser' =
        header_parser >>= fun (port, len) ->
        take_bigstring len >>= fun msg ->
        return (port, Cstruct.of_bigarray msg)
      in
      many @@ fix (fun p ->
        msg_parser' <|> (any_char *> p))
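For comparison, the handwritten approach described above (skip garbage, match the prefix, read the header, cut out the body, keep the unfinished tail) could look roughly like the sketch below. It is not the exact code that was benchmarked, only an illustration under some assumptions: the buffer is a plain `string` rather than a bigarray, the names `scan`, `header_len` and `finish` are made up for this example, and a negative computed length is clamped to 0.

```ocaml
(* Header layout as in the Angstrom parser: 0x44 0xBB, port byte, length byte. *)
let header_len = 4

(* [scan buf] returns the complete messages found in [buf], in order, plus the
   unconsumed tail that should be prepended to the next chunk.
   Hypothetical helper, stdlib only. *)
let scan (buf : string) : (int * string) list * string =
  let n = String.length buf in
  (* Keep everything from [pos] onwards for the next chunk. *)
  let finish acc pos = (List.rev acc, String.sub buf pos (n - pos)) in
  let rec go pos acc =
    (* Jump straight to the next candidate for the first prefix byte. *)
    match String.index_from_opt buf pos '\x44' with
    | None -> (List.rev acc, "") (* only garbage left, nothing to keep *)
    | Some i when i + 1 >= n -> finish acc i (* prefix may continue in next chunk *)
    | Some i when buf.[i + 1] <> '\xBB' -> go (i + 1) acc (* false start *)
    | Some i when i + header_len > n -> finish acc i (* header incomplete *)
    | Some i ->
      let port' = Char.code buf.[i + 2] in
      let length = Char.code buf.[i + 3] in
      let parity = port' land 0x10 > 0 in
      let port = port' land 0xF in
      (* Assumption: clamp so that length = 0 with parity set gives an empty body. *)
      let len = max 0 ((length * 2) - (if parity then 1 else 0)) in
      let body = i + header_len in
      if body + len > n then finish acc i (* body incomplete *)
      else go (body + len) ((port, String.sub buf body len) :: acc)
  in
  go 0 []

(* Usage: a message split across two chunks, with garbage around it. *)
let () =
  let msgs1, rest = scan "zz\x44\xBB\x03\x02ab" in
  let msgs2, rest' = scan (rest ^ "cd!!") in
  assert (msgs1 = []);
  assert (msgs2 = [ (3, "abcd") ]);
  assert (rest' = "")
```

The caller concatenates the returned tail with the next chunk before calling `scan` again, which is the "keep the tail of the previous chunk" step from the description.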
Is there a more efficient way to extract messages that have garbage between them?