Question

I have a table that many entities are written to. Each entity holds the information a worker role needs to write some of that data to a particular partition of another table. Entities in the same partition of the table the worker role reads from can be batched together, because they will all be written to the same destination partition. I don't know the partition keys in the table the worker role reads from. Is there a way to do a full table scan that never returns entities from different partitions at the same time?

Ex:

Partition Key| Row Key  
========================
1            |  X       
1            |  Y       
1            |  Z       
2            |  X       
2            |  Y       
2            |  Z       

I want to scan the table above and get every entity in the first partition the scan encounters, without bleeding into the next partition. The worker role would then do some work based on what was read, move on to the next partition, and repeat. If Table storage never returns entities from different partitions together during a table scan, then I think I can use the paging token's next partition property to tell whether the next read will start a new partition.


Solution

A single table query can return entities with different partition keys. To get only the entities with a single partition key, first determine the next partition key with a query for PartitionKey > lastKnownPartitionKey that returns only a single result ($top=1). Then do a partition scan (PartitionKey = currentPartitionKey) to get all of the entities with that partition key.
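The two-query loop described above can be sketched roughly as follows. This is only an illustration against an in-memory list, not real Table storage client code: `TABLE`, `next_partition_key`, `scan_partition`, and `iter_partitions` are hypothetical names, and the two helpers merely stand in for the `$top=1` query and the partition scan. A real client would also have to follow continuation tokens on both queries.

```python
from typing import Dict, Iterator, List, Optional

Entity = Dict[str, str]

# Hypothetical in-memory stand-in for the table from the question,
# kept in (PartitionKey, RowKey) order.
TABLE: List[Entity] = [
    {"PartitionKey": pk, "RowKey": rk}
    for pk in ("1", "2")
    for rk in ("X", "Y", "Z")
]


def next_partition_key(last_key: Optional[str]) -> Optional[str]:
    """Stand-in for the query PartitionKey > last_key with $top=1."""
    for entity in TABLE:
        if last_key is None or entity["PartitionKey"] > last_key:
            return entity["PartitionKey"]
    return None  # no partitions left


def scan_partition(key: str) -> List[Entity]:
    """Stand-in for the partition scan PartitionKey = key."""
    return [e for e in TABLE if e["PartitionKey"] == key]


def iter_partitions() -> Iterator[List[Entity]]:
    """Yield one batch per partition, never mixing partition keys."""
    last_key: Optional[str] = None
    while True:
        key = next_partition_key(last_key)
        if key is None:
            return
        yield scan_partition(key)
        last_key = key


if __name__ == "__main__":
    for batch in iter_partitions():
        print(batch[0]["PartitionKey"], [e["RowKey"] for e in batch])
```

Each batch yielded here contains a single partition key, so the worker role could process it as one unit before asking for the next partition.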

If you typically have fewer than, say, ten entities per partition key, consider optimizing by raising the first request to ten entities and discarding any that don't have the partition key you are looking for.
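One possible reading of this optimization, again sketched against a hypothetical in-memory table rather than a real client: fetch the first `top` entities past the last known partition key, keep those that share the first entity's key, and fall back to a full partition scan only when the whole page belongs to one partition (so more entities may follow). The fallback rule and all names here (`TABLE`, `top_after`, `scan_partition`, `iter_partitions_prefetch`) are assumptions for illustration; they are not spelled out in the answer.

```python
from typing import Dict, Iterator, List, Optional

Entity = Dict[str, str]

# Hypothetical in-memory table, kept in (PartitionKey, RowKey) order.
TABLE: List[Entity] = [
    {"PartitionKey": pk, "RowKey": rk}
    for pk, rks in (("1", "XYZ"), ("2", "X"), ("3", "WXYZ"))
    for rk in rks
]


def top_after(last_key: Optional[str], top: int) -> List[Entity]:
    """Stand-in for the query PartitionKey > last_key with $top=top."""
    hits = [e for e in TABLE if last_key is None or e["PartitionKey"] > last_key]
    return hits[:top]


def scan_partition(key: str) -> List[Entity]:
    """Stand-in for the partition scan PartitionKey = key."""
    return [e for e in TABLE if e["PartitionKey"] == key]


def iter_partitions_prefetch(top: int = 10) -> Iterator[List[Entity]]:
    """Yield one batch per partition, prefetching up to `top` entities."""
    last_key: Optional[str] = None
    while True:
        page = top_after(last_key, top)
        if not page:
            return
        key = page[0]["PartitionKey"]
        # Discard prefetched entities that belong to later partitions.
        batch = [e for e in page if e["PartitionKey"] == key]
        # If the whole page shares one key, the partition may continue
        # past the page, so fall back to a full partition scan.
        if len(batch) == top:
            batch = scan_partition(key)
        yield batch
        last_key = key
```

When partitions are usually smaller than `top`, most partitions cost one request instead of two; the discarded entities are simply fetched again on the next iteration.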

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow