Designing a Windows Azure Tables database for storing checkbox states

Question 1

First, you don't need to coerce Azure tables into a relational structure. They're very fast and very cheap, and designed so you can dump blocks of data in and worry about the structure when you retrieve it.

Second, correctly identifying and structuring your partition keys makes retrieval even faster.

Third, Azure tables don't have to have uniform structures. You can store different kinds of data within one table, even with the same partition keys. This opens up possibilities not available to an RDBMS.

So how are you planning to retrieve the data? What are the use cases?

Let's say your primary use case is to retrieve the data by time, like an audit log. In that case, I would suggest this approach:

Put your procedures, executions, and actions all within the same table.
Create a new table for each unit of time that gives you tens of thousands to hundreds of thousands of rows per table, or some other unit that makes sense. (For one project I've done recently, the application's event log uses one table per month, with each table growing to around 100,000 rows.)
Create a partition key that gives you hundreds to thousands of rows per partition. (We use hours remaining until DateTimeOffset.MaxValue. When you query an Azure table without using a partition key, you see the lowest partitions first. This descending-hourly scheme means the most recent hour's entries are at the top of the results pane in our Azure tool.)
Structure your row keys to be human-readable. Remember they need to be unique within their partition (the partition key and row key together identify a row). So a row key like Procedure_Bob_ID12345_20140514-134630Z_unique, where unique is a counter or hash, would work.
When you query for data, pull back the entire partition--remember, it's just a few hundred rows--and filter the results in memory, where it's faster.
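To make the key scheme above concrete, here is a minimal sketch in Python. The function names and the EventLog table prefix are made up for illustration, and Python's datetime.max stands in for .NET's DateTimeOffset.MaxValue, so treat it as one possible layout rather than our exact code. The partition key is zero-padded because keys are compared as strings.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for .NET's DateTimeOffset.MaxValue (assumption: close enough for hourly buckets).
MAX_TIME = datetime.max.replace(tzinfo=timezone.utc)


def table_name(ts):
    """One table per month, e.g. EventLog201405 (hypothetical prefix)."""
    return f"EventLog{ts:%Y%m}"


def partition_key(ts):
    """Whole hours remaining until MAX_TIME, zero-padded so that string
    order matches numeric order. Newer timestamps give smaller keys."""
    hours_left = (MAX_TIME - ts) // timedelta(hours=1)
    return f"{hours_left:010d}"


def row_key(kind, user, entity_id, ts, unique):
    """Human-readable row key; `unique` is a counter or hash supplied by the caller."""
    return f"{kind}_{user}_ID{entity_id}_{ts:%Y%m%d-%H%M%S}Z_{unique}"


def filter_in_memory(rows, kind):
    """After pulling back a whole partition, filter locally by row-key prefix."""
    return [r for r in rows if r["RowKey"].startswith(kind + "_")]
```

All timestamps passed in are assumed to be timezone-aware UTC values.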

Say you have a second use case where you need to retrieve data by user name. Simple: within the same table, add a second row containing the same data but with a partition key based on the user name (bob_execution_20140514).
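As a rough sketch of that duplication, with entities modelled as plain dictionaries (the fan_out helper and its parameters are hypothetical, not part of any SDK):

```python
def fan_out(entity, time_pk, user, kind, day):
    """Return two copies of the same entity: one keyed by time, one keyed by user.

    `entity` is a dict that already contains a RowKey and the data properties.
    """
    by_time = dict(entity, PartitionKey=time_pk)
    by_user = dict(entity, PartitionKey=f"{user}_{kind}_{day}")
    return [by_time, by_user]


rows = fan_out(
    {"RowKey": "Execution_Bob_ID12345_20140514-134630Z_0001", "Status": "Done"},
    time_pk="0070000000",  # placeholder for the hourly partition key
    user="bob",
    kind="execution",
    day="20140514",
)
```

Because the two copies live in different partitions, they are written as two separate inserts rather than one atomic batch, so the writing code should be prepared to retry the second one.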

Another thing to consider is storing entire object graphs (a procedure with its executions and actions, for example) in the table. Getting back to our logging example, a log entry might have detailed information, so we just plop an entire block of JSON right in the table. (We're usually retrieving it in an Azure cloud service, so the network throughput isn't a meaningful constraint as Azure-to-Azure speeds within the same region are gigabits per second.)
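A minimal sketch of the JSON approach, again with entities as plain dictionaries. The Payload property name and both helpers are made up for illustration. The size check reflects the 64 KiB limit on a single string property, as I understand it; a larger graph would need to be split across properties or stored elsewhere.

```python
import json

MAX_STRING_PROPERTY_BYTES = 64 * 1024  # string properties are stored as UTF-16


def to_entity(pk, rk, graph):
    """Serialize an object graph into one string property of an entity."""
    payload = json.dumps(graph)
    if len(payload.encode("utf-16-le")) > MAX_STRING_PROPERTY_BYTES:
        raise ValueError("payload too large for a single string property")
    return {"PartitionKey": pk, "RowKey": rk, "Payload": payload}


def from_entity(entity):
    """Rebuild the object graph from the stored JSON."""
    return json.loads(entity["Payload"])
```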

Question 2

Depending on how you use the data, partition on either the Procedure ID or a ProcedureID-ExecutionID combination. Don't worry about building a quasi-relational model; just choose the partition key based on how you are most likely to create or consume the data in the majority of cases. For example, will you care more about procedures, executions, assignees, or steps in the longer term, and how would you retrieve all items related to a single entity, such as a procedure, in a single query?
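To illustrate the two options, here is a small sketch. The key formats (P, E and S prefixes) and helper names are assumptions for the example; the filter string uses the OData syntax that table queries accept, with single quotes escaped by doubling.

```python
def keys_by_procedure(procedure_id, execution_id):
    """Option 1: one partition per procedure, one row per execution."""
    return {"PartitionKey": f"P{procedure_id}", "RowKey": f"E{execution_id}"}


def keys_by_execution(procedure_id, execution_id, step_no):
    """Option 2: one partition per procedure/execution pair, one row per step."""
    return {
        "PartitionKey": f"P{procedure_id}-E{execution_id}",
        "RowKey": f"S{step_no:04d}",  # zero-padded so steps sort in order
    }


def partition_filter(pk):
    """OData filter that fetches everything in one partition."""
    return "PartitionKey eq '{}'".format(pk.replace("'", "''"))
```

With option 1, everything about a procedure comes back from a single partition query; with option 2, the same is true for a single execution.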

Depending on the number of steps in a procedure, you might not need to track each step value separately: an integer or enum whose values are combined with bitwise operators may be enough. See "Most common C# bitwise operations on enums".
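The bitwise idea might look something like this. It is sketched in Python with enum.IntFlag instead of a C# [Flags] enum, and the step names are placeholders. The whole set of checkbox states collapses into one integer property on the entity.

```python
from enum import IntFlag


class Steps(IntFlag):
    NONE = 0
    STEP_1 = 1
    STEP_2 = 2
    STEP_3 = 4
    STEP_4 = 8


def check(state, step):
    """Tick a checkbox."""
    return state | step


def uncheck(state, step):
    """Untick a checkbox."""
    return state & ~step


def is_checked(state, step):
    return bool(state & step)


state = check(check(Steps.NONE, Steps.STEP_1), Steps.STEP_3)
stored = int(state)        # the value written to the table as an integer property
restored = Steps(stored)   # rebuilt after reading the entity back
```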

Question 3

The selection of PK, RK and other table properties depends on how you are going to use the data, your dominant query and application behavior. The storage team blog (http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/06/how-to-get-most-out-of-windows-azure-tables.aspx) has guidance on this for common scenarios.