Azure Table Storage - Indexes?

https://stackoverflow.com/questions/21938265

14-10-2022
Question

I have a table of entities, "stories" for example. It will contain a large list of "stories" that people can vote on.

The main feature of my application will be users reading the "top" stories, which have the most votes (and might eventually have other algorithms going on).

My first thought for the structure of the Azure table is:

RowKey = unique id
PartitionKey = ??? (maybe User Id, because you can view a User's list of stories)
Title
Description
User Id
Url

How can I efficiently query for the stories considered the "top" stories? Most of the traffic will be queries for the top stories, and I don't otherwise need to pull out ranges of stories. What I want is a way to index the top stories, but indexes are not a feature of Table storage. I thought about keeping a second table, but that could get hairy if the user updates the story in the other table.

This is my first hangup using Azure Table Storage, the rest of the app is going to work great. I'd hate to upgrade to using full SQL Azure because of this one issue.

PS - I'm open to storing the "top" stories in another place besides an Azure table if it makes sense. My server will be running C# web api, but probably makes no difference.

Solution 2

Azure Table storage is a flat, non-relational data store. As such, the way you store and model data is dramatically different from a relational database. A common pattern is to model two different data stores for different types of access: one table for "most recent", say, and another that is kept updated for "most liked".
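To make the two-table idea concrete, here is a minimal sketch in Python. It does not call the Azure SDK: plain dicts keyed by (PartitionKey, RowKey) stand in for the two tables, and every name in it (`stories`, `top_index`, `MAX_VOTES`, the "top" partition) is an assumption made for illustration. It relies on the fact that Table storage returns entities sorted by PartitionKey and then RowKey, so a RowKey built from an inverted, zero-padded vote count makes the most-voted stories come back first.

```python
# In-memory stand-ins for two tables; keys are (PartitionKey, RowKey).
MAX_VOTES = 9_999_999_999  # assumed upper bound, used to invert the sort order

stories = {}    # main table: (user_id, story_id) -> entity
top_index = {}  # index table: ("top", inverted votes + story id) -> entity


def index_row_key(votes, story_id):
    # Fewer remaining "votes to the maximum" sorts first, so more votes sort first.
    return f"{MAX_VOTES - votes:010d}_{story_id}"


def add_story(user_id, story_id, title):
    stories[(user_id, story_id)] = {"Title": title, "Votes": 0}
    top_index[("top", index_row_key(0, story_id))] = {
        "UserId": user_id, "StoryId": story_id, "Title": title, "Votes": 0,
    }


def vote(user_id, story_id):
    entity = stories[(user_id, story_id)]
    old_key = ("top", index_row_key(entity["Votes"], story_id))
    entity["Votes"] += 1
    # A RowKey cannot be changed in place: remove the old index row, add a new one.
    row = top_index.pop(old_key)
    row["Votes"] = entity["Votes"]
    top_index[("top", index_row_key(entity["Votes"], story_id))] = row


def top_stories(n):
    # Mimics reading the first n entities of the "top" partition in key order.
    keys = sorted(k for k in top_index if k[0] == "top")[:n]
    return [top_index[k]["Title"] for k in keys]


add_story("alice", "a", "Story A")
add_story("bob", "b", "Story B")
add_story("bob", "c", "Story C")
vote("bob", "b")
vote("bob", "b")
vote("bob", "c")
```

With this layout, `top_stories(2)` reads only the index table. The cost is that every vote (and every title edit) has to write to both tables, which is the bookkeeping the question was worried about.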

OTHER TIPS

The Azure Storage Table Design Guide walks you through different approaches for creating your own secondary indices. It also provides principles to consider when designing NoSQL databases, along with implementation guidance.

You should first think about what "top stories" really means. Do you mean the 10 highest-rated recent stories, or all stories above a specific rating value?

You could use the rating value as the partition key (e.g. Rate_1, Rate_2, Rate_3, Rate_4, Rate_5). But you have to round the values to integers, so a story rated 4.1 will be placed in partition Rate_4.
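A small sketch of that bucketing, in Python. The function name and the 1 to 5 range are assumptions, and it rounds down (so 4.9 also lands in Rate_4) rather than rounding to nearest; pick whichever rule matches what "top" means for you.

```python
import math


def rating_partition_key(rating):
    # Round down to an integer bucket and clamp to the assumed 1..5 range.
    bucket = max(1, min(5, math.floor(rating)))
    return f"Rate_{bucket}"


def partition_filter(bucket):
    # OData-style filter string selecting a single rating partition.
    return f"PartitionKey eq 'Rate_{bucket}'"
```

Keep in mind that a story whose rating crosses a bucket boundary has to be deleted from the old partition and inserted into the new one, because the PartitionKey of an existing entity cannot be updated.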

Alternatively you can use just 2 partitions: "TopStories" and "OtherStories".
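Roughly what the two-partition variant could look like, again as an in-memory Python sketch rather than real SDK calls. The threshold of 100 votes and all function names are made up for the example. The point it shows is that promoting or demoting a story means removing it from one partition and inserting it into the other.

```python
TOP_THRESHOLD = 100  # hypothetical cut-off for what counts as a "top" story

table = {}  # (PartitionKey, RowKey) -> entity


def partition_for(votes):
    return "TopStories" if votes >= TOP_THRESHOLD else "OtherStories"


def upsert_story(story_id, title, votes):
    # Drop any existing copy first: an entity cannot change partition in place.
    for pk in ("TopStories", "OtherStories"):
        table.pop((pk, story_id), None)
    table[(partition_for(votes), story_id)] = {"Title": title, "Votes": votes}


def query_partition(pk):
    # Mimics a single-partition query, returned in RowKey order.
    items = sorted(table.items(), key=lambda kv: kv[0])
    return [entity for (p, _), entity in items if p == pk]


upsert_story("s1", "First story", 5)
upsert_story("s2", "Second story", 250)
upsert_story("s1", "First story", 150)  # s1 crosses the threshold and moves
```

Reading the front page is then one query against the "TopStories" partition, although ordering within that partition still follows the RowKey, not the vote count.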

Given that

your top-story algorithm may evolve over time,
the top-story list is summary-like information,
and it can be aged out,

I would stay away from Table storage for this, and instead model it in a relational database for flexibility of querying.
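For comparison, a minimal sketch of the relational version, using Python's built-in sqlite3 as a stand-in for SQL Azure. The table layout, column names, and sample rows are assumptions for illustration only. The ranking rule lives in the ORDER BY and WHERE clauses, so changing the algorithm or aging out old stories means changing the query, not the storage layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stories (
        id         INTEGER PRIMARY KEY,
        user_id    TEXT    NOT NULL,
        title      TEXT    NOT NULL,
        votes      INTEGER NOT NULL DEFAULT 0,
        created_at TEXT    NOT NULL  -- ISO date, so string comparison works
    )
    """
)
# An ordinary secondary index: the thing Table storage does not offer.
conn.execute("CREATE INDEX ix_stories_votes ON stories (votes DESC)")

conn.executemany(
    "INSERT INTO stories (user_id, title, votes, created_at) VALUES (?, ?, ?, ?)",
    [
        ("alice", "Old favourite", 900, "2021-03-01"),
        ("bob", "Fresh hit", 120, "2022-09-15"),
        ("bob", "Quiet one", 3, "2022-10-01"),
        ("carol", "Rising", 45, "2022-10-10"),
    ],
)


def top_stories(conn, limit, since):
    # "Top" here means most votes among stories created on or after `since`.
    rows = conn.execute(
        "SELECT title FROM stories WHERE created_at >= ? "
        "ORDER BY votes DESC, id ASC LIMIT ?",
        (since, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `top_stories(conn, 2, "2022-01-01")` ignores the older story regardless of its vote count, which is the kind of aging-out rule that is awkward to express with partition and row keys alone.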

Licensed under: CC-BY-SA with attribution