Question

Please take a look at my TPL Dataflow network scheme below. There is a list of URLs, a number of Load blocks, and a Parse block. The Load blocks download HTML pages through different proxy servers, and all of them are linked to the Parse block, where the CPU-bound work happens. If an exception occurs while a page is loading, the URL is added back to the list.

I post URLs to the Load blocks with a hand-made loop (shown in the picture). My question: is there a block type that can choose which Load block to post a URL to, instead of my hand-made loop? For example, it would post a URL to the first Load block with .InputCount <= 2.

One more question. A proxy server can become unavailable while the Dataflow network is running. I think that if I replace the URL list with a BufferBlock, I could dynamically unlink the Load blocks with dead proxies from that BufferBlock, if that is possible. So, is there a way to dynamically unlink blocks from the network?

Dataflow network scheme

Solution

Is there a block type that can choose which Load block to post a URL to, instead of my hand-made loop? For example, it would post a URL to the first Load block with .InputCount <= 2.

What you can do is link a single BufferBlock to all of the Load blocks and set the BoundedCapacity of each Load block to something like 3 (1 item being processed plus 2 waiting in the input and output queues). With this setup, items wait in the BufferBlock until space becomes available in one of the Load blocks.
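
The setup described above could look roughly like the sketch below. It is only an illustration: LoadAsync, the proxy names, and the RunAsync helper are hypothetical stand-ins, and the real Load blocks would download pages through actual proxies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class LoadBalancingSketch
{
    // Hypothetical stand-in for downloading a page through a given proxy.
    static async Task<string> LoadAsync(string url, string proxy)
    {
        await Task.Delay(50); // simulate network latency
        return $"{url} via {proxy}";
    }

    // Runs the network over the given URLs and returns the "parsed" pages.
    public static async Task<List<string>> RunAsync(IEnumerable<string> urls, string[] proxies)
    {
        var parsed = new ConcurrentBag<string>();
        var source = new BufferBlock<string>();
        var parse = new ActionBlock<string>(page => parsed.Add(page));

        // 1 item being processed + 2 waiting, as suggested above.
        var loadOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 3 };
        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };

        var loaders = new List<TransformBlock<string, string>>();
        foreach (var proxy in proxies)
        {
            var load = new TransformBlock<string, string>(url => LoadAsync(url, proxy), loadOptions);
            source.LinkTo(load, linkOptions); // a full loader postpones; the item stays in the buffer
            load.LinkTo(parse);               // no completion propagation: parse has several sources
            loaders.Add(load);
        }

        foreach (var url in urls) source.Post(url);
        source.Complete();

        // Complete the parse block only after every loader has finished.
        await Task.WhenAll(loaders.Select(l => l.Completion));
        parse.Complete();
        await parse.Completion;
        return parsed.ToList();
    }

    public static async Task Main()
    {
        var urls = Enumerable.Range(1, 10).Select(i => $"http://example.com/{i}");
        var pages = await RunAsync(urls, new[] { "proxy-a", "proxy-b" });
        Console.WriteLine($"Parsed {pages.Count} pages");
    }
}
```

The BufferBlock offers each item to its linked targets in the order the links were created, so the first Load block with free capacity takes it. Completion is not propagated from the Load blocks to the Parse block here, because the first Load block to finish would otherwise complete the Parse block while the others are still working.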

Is there a way to dynamically unlink blocks from the network?

Yes, LinkTo() returns an IDisposable which can be used to destroy that link (by calling Dispose()).
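
As a minimal sketch of that, assuming two hypothetical Load blocks (loaderA and loaderB, simplified here to ActionBlocks that just record the URLs they receive), disposing the link to loaderA stops the BufferBlock from offering it any further items:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class UnlinkSketch
{
    // Returns the URLs each hypothetical loader received.
    public static async Task<(List<string> A, List<string> B)> RunAsync()
    {
        var source = new BufferBlock<string>();
        var seenA = new List<string>();
        var seenB = new List<string>();
        var loaderA = new ActionBlock<string>(url => seenA.Add(url));
        var loaderB = new ActionBlock<string>(url => seenB.Add(url));

        // Keep the IDisposable returned by LinkTo so the link can be removed later.
        IDisposable linkA = source.LinkTo(loaderA);
        source.LinkTo(loaderB);

        // loaderA was linked first and is unbounded, so it takes this item.
        source.Post("url-1");
        while (source.Count > 0) await Task.Delay(10); // wait until the item has left the buffer

        // Pretend loaderA's proxy died: remove only its link.
        linkA.Dispose();

        // From now on the buffer can only hand items to loaderB.
        source.Post("url-2");
        source.Post("url-3");
        source.Complete();
        await source.Completion;

        loaderA.Complete();
        loaderB.Complete();
        await Task.WhenAll(loaderA.Completion, loaderB.Completion);
        return (seenA, seenB);
    }

    public static async Task Main()
    {
        var (a, b) = await RunAsync();
        Console.WriteLine($"A: {string.Join(", ", a)}");
        Console.WriteLine($"B: {string.Join(", ", b)}");
    }
}
```

Disposing the link only affects future offers. Items that a block accepted before it was unlinked stay in that block and are still processed by it.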

Licensed under: CC-BY-SA with attribution