Web scraping in OCaml


#1

What are the preferred tools for doing web scraping in OCaml? I’m interested both in what’s available for automating interaction (like the Mechanize packages for Ruby and Python) and in the HTML parsing side.


#2

For the first part, there is mechaml (by @yannham) and for the second part, you can use lambdasoup (by @antron)!
I’m sure both authors would be very happy to have feedback. :wink:


#3

The URL for lambdasoup is https://github.com/aantron/lambda-soup :wink:


#4

I can give a big thumbs up to Lambdasoup, which is an absolute pleasure to work with. It’s worked on all the random HTML I’ve thrown at it so far… thanks @antron for releasing it!
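
For a sense of what the parsing side looks like, here is a minimal sketch that pulls the link targets out of a page with Lambda Soup. The HTML snippet and the `links` helper are made up for illustration; it assumes the `lambdasoup` opam package is installed and only uses `parse`, `$$`, `to_list` and `R.attribute` from the `Soup` module.

```ocaml
(* Assumes: opam install lambdasoup, and linking against the lambdasoup library. *)
open Soup

(* Hypothetical sample page, used only for this illustration. *)
let html = {|<html><body>
  <a href="https://ocaml.org">OCaml</a>
  <a href="https://opam.ocaml.org">opam</a>
  <a>no link here</a>
</body></html>|}

(* Illustrative helper: collect the href of every <a> element that has one.
   The CSS selector "a[href]" skips anchors without an href attribute, so
   R.attribute (which raises if the attribute is missing) is safe here. *)
let links html =
  parse html $$ "a[href]"
  |> to_list
  |> List.map (R.attribute "href")

let () = List.iter print_endline (links html)
```

Fetching the page itself is a separate step (this sketch just parses a string), so it should combine with whatever HTTP client or automation library you end up using.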


#5

perl4caml is another possibility here: it lets you call Perl libraries from OCaml code, so you can use Perl’s scraping libraries.