I’m trying to use Async to write a mini-crawler. When I send a request to target site A with get_body, it returns the body as a string Deferred.t.
I wrote another function, fn_on_soup, to extract the wikilink from it:
    let get_bd_wiki_abst (keyword:string) =
      let%map body = get_body keyword in
      let soup = parse body in
      let wikilink = fn_on_soup soup in
      wikilink
The wikilink just redirects to another site with a 302 response, so here’s what I did to get the real link from the Location header:
    let get_302_hdr (url:string) =
      let uri = Uri.of_string url in
      let%map resp_head = Cohttp_async.Client.head uri in
      let hdrs = Response.headers resp_head in
      match Header.get hdrs "location" with
      | Some location -> location
      | None -> "No redir"
Now I combine these two functions as follows:
    let get_real_link =
      let%map wikilink = get_bd_wiki_abst "soga" in
      let%map real_link = get_302_hdr wikilink in
      real_link
It turns out that get_real_link has type string Deferred.t Deferred.t. How can I make it return just a single Deferred.t?
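For reference, here is a minimal sketch of one common way to avoid the nesting: use let%bind instead of let%map whenever the body already produces a Deferred.t. The two helper functions below are hypothetical stand-ins that return immediately (they do no networking and are not the real get_bd_wiki_abst or get_302_hdr), and get_real_link is written here as a function taking unit rather than a plain value. The sketch assumes the async library and the ppx_let preprocessor are available.

```ocaml
(* Assumes the async library and the ppx_let preprocessor. *)
open Async

(* Hypothetical stand-in for the question's get_bd_wiki_abst:
   resolves immediately with a made-up link. *)
let get_bd_wiki_abst (keyword : string) : string Deferred.t =
  return ("http://example.com/wiki/" ^ keyword)

(* Hypothetical stand-in for the question's get_302_hdr:
   resolves immediately with a made-up redirect target. *)
let get_302_hdr (url : string) : string Deferred.t =
  return (url ^ "?redirected")

(* let%map wraps the body's result in a Deferred.t, so a body that is
   itself a Deferred.t ends up nested. let%bind expects the body to be
   a Deferred.t and does not add another layer. *)
let get_real_link () : string Deferred.t =
  let%bind wikilink = get_bd_wiki_abst "soga" in
  get_302_hdr wikilink
```

The type annotation on get_real_link makes the compiler check that the result is a single string Deferred.t. If a nested value already exists, Deferred.join, which has type 'a Deferred.t Deferred.t -> 'a Deferred.t, should collapse it as well.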