Question

I have a reasonable number of records in an Azure Table that I'm attempting to do some one time data encryption on. I thought that I could speed things up by using a Parallel.ForEach. Also because there are more than 1K records and I don't want to mess around with continuation tokens myself I'm using a CloudTableQuery to get my enumerator.

My problem is that some of my records have been double encrypted and I realised that I'm not sure how thread safe the enumerator returned by CloudTableQuery.Execute() is. Has anyone else out there had any experience with this combination?

Was it helpful?

Solution 2

Despite my best efforts I've been unable to replicate my original problem. My conclusion is therefore that it is perfectly OK to use Parallel.ForEach loops with CloudTableQuery.Execute().

OTHER TIPS

I would be willing to bet the answer to Execute returning a thread-safe IEnumerator implementation is highly unlikely. That said, this sounds like yet another case for the producer-consumer pattern.

In your specific scenario I would have the original thread that called Execute read the results off sequentially and stuff them into a BlockingCollection<T>. Before you start doing that though, you want to start a separate Task that will control the consumption of those items using Parallel::ForEach. Now, you will probably also want to look into using the GetConsumingPartitioner method of the ParallelExtensions library in order to be most efficient since the default partitioner will create more overhead than you want in this case. You can read more about this from this blog post.

An added bonus of using BlockingCollection<T> over a raw ConcurrentQueueu<T> is that it offers the ability to set bounds which can help block the producer from adding more items to the collection than the consumers can keep up with. You will of course need to do some performance testing to find the sweet spot for your application.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top