How to decode a GBK string?

I’m trying to fetch some content from a local website that might not use UTF-8. After a lot of Googling, I still couldn’t find a way to decode a GBK string in OCaml.

Meanwhile, it’s easy in Python with web_source.content.decode('gbk'), and in Go:

	utf8Reader := transform.NewReader(resp.Body, simplifiedchinese.GBK.NewDecoder())
	body, err := ioutil.ReadAll(utf8Reader)

Seems like Camomile can do it: https://github.com/yoriyuki/Camomile/blob/d7d8843c88fae774f513610f8e09a613778e64b3/Camomile/public/charEncoding.mli#L89

E.g.

utop # #require "camomile";;
utop # module Camomile = CamomileLibraryDefault.Camomile;;
utop # module Enc = Camomile.CharEncoding;;
utop # let gbk = Enc.of_name "GBK";;
val gbk : Enc.t = <abstr>
utop # Enc.recode_string ~in_enc:Enc.utf8 ~out_enc:gbk "Hello";;
- : string = "Hello"

Of course, in your case you want to go in the opposite direction, so swap the ‘in’ and ‘out’ encodings: pass gbk as ~in_enc and Enc.utf8 as ~out_enc. Check your installed packages; you may already have Camomile as a transitive dependency.
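For the decoding direction, here is a minimal sketch. It assumes the same Camomile module layout as the utop session above (CamomileLibraryDefault.Camomile, as in Camomile 1.x; newer releases may lay out the modules differently). The helper name utf8_of_gbk is made up for illustration and is not part of Camomile.

```ocaml
(* Build with the camomile package, e.g. ocamlfind ... -package camomile *)
module Camomile = CamomileLibraryDefault.Camomile
module Enc = Camomile.CharEncoding

(* Hypothetical helper: take raw GBK bytes and return a UTF-8 string. *)
let utf8_of_gbk (s : string) : string =
  let gbk = Enc.of_name "GBK" in
  Enc.recode_string ~in_enc:gbk ~out_enc:Enc.utf8 s

let () =
  (* "\xc4\xe3\xba\xc3" is the GBK byte sequence for "你好". *)
  print_endline (utf8_of_gbk "\xc4\xe3\xba\xc3")
```

In your case the input would be the response body you fetched, passed to the helper as a plain string of bytes.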
