Question

How good an idea is it to query for a record (or records) by RowKey alone? Does it make any sense to have queries that check only the RowKey? I know you can combine PartitionKey + RowKey to get a specific record, or get all records for a specific PartitionKey (so all related data is returned fast).

I guess that if you query by RowKey only, performance will go down on a table with a large number of items, since it has to look through all of them.

Can you tell me a case where querying by RowKey alone makes sense? I don't mean filtering the results once retrieved, but the query sent to Azure Storage to return the items.

Solution

Azure Table Storage (as of now) indexes two properties to make lookups fast: the PartitionKey and the RowKey. Querying by the RowKey only would make sense if you had one partition (or very few partitions). If you have a lot of partitions and you specify just the RowKey, it has to look in all of them.

For example, say you stored social security numbers in table storage. Let's look at two scenarios...

A good partition strategy might be to use the state as the PartitionKey. If your query passes PartitionKey = 'CA' and RowKey = '123456789', Azure Table Storage knows which partition to go to and the exact row in that partition. If your query was just RowKey = '123456789', Azure Table Storage has to scan all the partitions (50 states) to find the matching RowKey.
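As a rough illustration of the two queries in the state-partitioned scenario, here is a small Python sketch that builds the filter strings in the OData style Azure Table Storage accepts (`eq`, `and`). The helper names `quote` and `build_filter` are made up for this example and are not part of any Azure SDK.

```python
from typing import Optional


def quote(value: str) -> str:
    """Wrap a value as an OData string literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filter(row_key: str, partition_key: Optional[str] = None) -> str:
    """Build a filter on RowKey, optionally narrowed to one partition.

    Hypothetical helper for illustration only.
    """
    clauses = []
    if partition_key is not None:
        clauses.append(f"PartitionKey eq {quote(partition_key)}")
    clauses.append(f"RowKey eq {quote(row_key)}")
    return " and ".join(clauses)


# Point query: the service can go straight to one partition and one row.
point_query = build_filter("123456789", partition_key="CA")

# RowKey-only query: no partition is named, so every partition is a candidate.
row_key_only = build_filter("123456789")

print(point_query)
print(row_key_only)
```

The first string names both keys, so it identifies a single entity; the second leaves the partition open, which is what forces the service to look across partitions.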

Another strategy might be one huge single partition with the social security numbers as RowKeys. If your query is RowKey = '123456789', Azure Table Storage can use the index on the RowKey to look up the value pretty quickly. Since there is only one partition, leaving the PartitionKey out of the query won't slow it down (or at least should not).
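The difference between the two scenarios above can be sketched with a toy in-memory model. This is only an assumption-laden illustration of the reasoning in this answer, not how the service is implemented: a table is modelled as a dict of partitions, each partition as a dict keyed by RowKey, and the functions count how many partitions they had to visit. All names here are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Toy model: PartitionKey -> (RowKey -> entity)
Table = Dict[str, Dict[str, dict]]


def point_lookup(table: Table, partition_key: str, row_key: str) -> Tuple[Optional[dict], int]:
    """Look up one entity by both keys; exactly one partition is visited."""
    partition = table.get(partition_key, {})
    return partition.get(row_key), 1


def row_key_only_lookup(table: Table, row_key: str) -> Tuple[List[dict], int]:
    """Look up by RowKey alone; every partition has to be checked."""
    matches = []
    visited = 0
    for partition in table.values():
        visited += 1
        entity = partition.get(row_key)
        if entity is not None:
            matches.append(entity)
    return matches, visited


# Scenario 1: one partition per state (only four states here, for brevity).
by_state: Table = {state: {} for state in ["CA", "NY", "TX", "WA"]}
by_state["CA"]["123456789"] = {"PartitionKey": "CA", "RowKey": "123456789"}

# Scenario 2: one huge partition holding every row.
single: Table = {"all": {"123456789": {"PartitionKey": "all", "RowKey": "123456789"}}}

_, visited_point = point_lookup(by_state, "CA", "123456789")
_, visited_scan = row_key_only_lookup(by_state, "123456789")
_, visited_single = row_key_only_lookup(single, "123456789")

print(visited_point, visited_scan, visited_single)
```

In this model the RowKey-only lookup costs one visit per partition, so it is cheap with a single partition and grows with the number of partitions, which is the trade-off described above.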

Also remember that Azure Table Storage can internally place partitions on different drives to optimize for heavy usage. So specifying the PartitionKey is ideal for large tables with lots of partitions.

OTHER TIPS

As Bart Czernicki also mentioned, specifying only the RowKey in the query leads to a full table scan, because the server needs to go through all partitions in the table. You can find more on this subject in the "How to get most out of Windows Azure Tables" article (specifically the "Partitions" section).

Licensed under: CC-BY-SA with attribution