I would set up a pipeline using TPL Dataflow. You post the addresses to it, and the actors are:
- Web page download
- Processing
- Add to DB
Use `async` wherever you can (as long as the operation is truly asynchronous) and set a high `MaxDegreeOfParallelism`. That value is only an upper limit, so TPL can choose the actual degree of parallelism by itself.
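
A minimal sketch of such a pipeline, to show how the three stages might be wired together. The class name `CrawlPipeline`, the method `RunAsync`, and the three delegates (`downloadAsync`, `process`, `saveAsync`) are hypothetical placeholders for your own download, processing, and database code. It assumes the `System.Threading.Tasks.Dataflow` NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

public static class CrawlPipeline
{
    // Hypothetical helper: builds download -> process -> save, feeds it the
    // addresses, and completes when every item has gone through the last stage.
    public static async Task RunAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> downloadAsync, // assumed: e.g. an HttpClient call
        Func<string, string> process,             // assumed: your parsing logic
        Func<string, Task> saveAsync,             // assumed: your DB insert
        int maxParallelism)
    {
        // MaxDegreeOfParallelism is an upper bound on concurrent invocations per block.
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxParallelism
        };

        var download = new TransformBlock<string, string>(downloadAsync, options);
        var processing = new TransformBlock<string, string>(process, options);
        var save = new ActionBlock<string>(saveAsync, options);

        // PropagateCompletion lets Complete() on the first block flow down the chain.
        var link = new DataflowLinkOptions { PropagateCompletion = true };
        download.LinkTo(processing, link);
        processing.LinkTo(save, link);

        foreach (var url in urls)
        {
            // With the default unbounded capacity, Post accepts every item.
            download.Post(url);
        }

        download.Complete();   // no more input
        await save.Completion; // wait until the last stage has drained
    }
}
```

You would call it with something like `await CrawlPipeline.RunAsync(urls, url => httpClient.GetStringAsync(url), Parse, SaveToDbAsync, maxParallelism: 50);`, where `Parse` and `SaveToDbAsync` are again your own methods. If the input can outpace the slowest stage, setting `BoundedCapacity` on the options and using `SendAsync` instead of `Post` is one way to add backpressure.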