Question

Here is what I am going to do:

I have a list of tasks, and I need to run all of them once every hour (scheduling).

All of the tasks are similar. For example, for one task I need to download some data from a server (over HTTP, which takes 5 to 8 seconds) and then run a computation on the data (which takes 1 to 5 seconds).


I think I can use Lwt to achieve this, but I can't figure out the most efficient way to do it.


For the task scheduling part, I can do something like this (see "How to schedule a task in OCaml?"):

open Lwt  (* brings >>= into scope *)

(* Print a message once per second, forever. *)
let rec start () =
  Lwt_unix.sleep 1. >>= fun () -> print_endline "Hello, world !"; start ()

let _ = Lwt_main.run (start ())

My questions are about the actual do_task part.

A task involves an HTTP download and a computation.

The HTTP download has to wait for 5 to 8 seconds. If I execute the tasks one by one, bandwidth is wasted, so I want the downloads of all tasks to run in parallel. Should I put the download part in Lwt? And will Lwt handle all the downloads in parallel?

In code, should I do it like this?

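(* Extract the body as a string, or "" if there was no response. *)
let content = function
  | Some (_, body) -> Cohttp_lwt_unix.Body.string_of_body body
  | _ -> return ""

(* Client.get expects a Uri.t, so the URL string is converted first. *)
let download task =
  Cohttp_lwt_unix.Client.get (Uri.of_string ("http://dataserver/task?name=" ^ task.name))

(* content already returns an Lwt thread, so bind to it directly. *)
let get_data task =
  download task >>= content

(* calculate is plain, non-Lwt code; its result is discarded here so that
   do_task has type unit Lwt.t. *)
let do_task task =
  get_data task >>= fun data -> ignore (calculate data); return_unit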

So, with the code above, will all tasks be executed in parallel, at least for the HTTP download part?

For the calculation part, will all calculations be executed in sequence?

Furthermore, can anyone briefly describe the mechanism of Lwt? Internally, what is the logic of a lightweight thread? Why can it handle I/O in parallel?


Solution

To run the tasks concurrently with Lwt, take a look at the Lwt_list module, and especially iter_p.

val iter_p : ('a -> unit Lwt.t) -> 'a list -> unit Lwt.t

iter_p f l calls the function f on each element of l, then waits for all the threads to terminate. For your purpose, it would look like this:

let do_tasks tasks = Lwt_list.iter_p do_task tasks

This assumes that tasks is a list of tasks.
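
To see the effect without a real server, here is a minimal sketch. It assumes the lwt and lwt.unix libraries are available, and everything in it is a stand-in rather than your actual code: the task record, the download that only sleeps for 0.2 seconds, the trivial calculate, and the every helper are all hypothetical names made up for illustration.

```ocaml
(* Sketch only: build against the lwt and lwt.unix libraries. *)
open Lwt.Infix

(* Hypothetical task type; only the name is used here. *)
type task = { name : string }

(* Stand-in for the HTTP download. Lwt_unix.sleep yields to the Lwt
   scheduler, much as a non-blocking request does while it waits. *)
let download task =
  Lwt_unix.sleep 0.2 >|= fun () -> "data for " ^ task.name

(* Stand-in for the computation. It is plain OCaml code that never
   yields, so two computations cannot overlap each other. *)
let calculate data = String.length data

(* Counts finished tasks so the run can be checked afterwards. *)
let completed = ref 0

let do_task task =
  download task >|= fun data ->
  ignore (calculate data);
  incr completed

(* Start every task, then wait for all of them to finish. *)
let do_tasks tasks = Lwt_list.iter_p do_task tasks

(* Hypothetical helper: run [job], sleep [period] seconds once it has
   finished, and repeat forever. *)
let rec every period job =
  job () >>= fun () ->
  Lwt_unix.sleep period >>= fun () ->
  every period job

(* Time one round of five tasks. *)
let elapsed =
  let tasks = List.init 5 (fun i -> { name = string_of_int i }) in
  let t0 = Unix.gettimeofday () in
  Lwt_main.run (do_tasks tasks);
  Unix.gettimeofday () -. t0

let () = Printf.printf "%d tasks in %.2fs\n" !completed elapsed
```

With five tasks, the elapsed time should be close to one sleep (about 0.2 seconds) rather than five, because the waits overlap. The computations still run one after another: Lwt threads are cooperative and only switch at points where a thread waits, so a long calculate holds up everything else until it returns. For the hourly schedule, something like Lwt_main.run (every 3600. (fun () -> do_tasks tasks)) should work under the same assumptions.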

Licensed under: CC-BY-SA with attribution