First, you don't need to coerce Azure tables into a relational structure. They're very fast and very cheap, and designed so you can dump blocks of data in and worry about the structure when you retrieve it.
Second, correctly identifying and structuring your partition keys makes retrieval even faster.
Third, Azure tables don't have to have uniform structures. You can store different kinds of data within one table, even with the same partition keys. This opens up possibilities not available to an RDBMS.
So how are you planning to retrieve the data? What are the use cases?
Let's say your primary use case is to retrieve the data by time, like an audit log. In that case, I would suggest this approach:
- Put your procedures, executions, and actions all within the same table.
- Create a new table for each unit of time that gives you tens of thousands to hundreds of thousands of rows per table, or some other unit that makes sense. (For one project I've done recently, the application's event log uses one table per month, with each table growing to around 100,000 rows.)
- Create a partition key that gives you hundreds to thousands of rows per partition. (We use the number of hours remaining until `DateTimeOffset.MaxValue`. When you query an Azure table without specifying a partition key, the lowest partition keys come back first, so this descending-hourly scheme puts the most recent hour's entries at the top of the results pane in our Azure tool.)
- Structure your row keys to be human-readable, and remember they need to be unique within the table. So a row key like `Procedure_Bob_ID12345_20140514-134630Z_unique`, where `unique` is a counter or hash, would work.
- When you query for data, pull back the entire partition (remember, it's just a few hundred rows) and filter the results in memory, where it's faster.
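The table-per-month, descending-hours, human-readable-row-key scheme above can be sketched like this. This is a minimal illustration, not the Azure SDK itself; the `AuditLog` table prefix, the 10-digit zero-padding, and the `row_key` parameter list are my assumptions, and `MAX_VALUE` approximates .NET's `DateTimeOffset.MaxValue` to the second (Python's `datetime` also tops out at year 9999):

```python
from datetime import datetime, timezone

# Approximation of .NET's DateTimeOffset.MaxValue, to the second.
MAX_VALUE = datetime(9999, 12, 31, 23, 59, 59, tzinfo=timezone.utc)

def table_name(entry_time: datetime) -> str:
    """One table per month, e.g. 'AuditLog201405' (hypothetical prefix)."""
    return f"AuditLog{entry_time:%Y%m}"

def partition_key(entry_time: datetime) -> str:
    """Hours remaining until MAX_VALUE, zero-padded so lexicographic
    (string) order matches numeric order. More recent entries have
    FEWER hours remaining, so they sort first."""
    hours_remaining = int((MAX_VALUE - entry_time).total_seconds() // 3600)
    return f"{hours_remaining:010d}"

def row_key(kind: str, user: str, entity_id: str,
            entry_time: datetime, unique: str) -> str:
    """Human-readable and unique within the table; the caller supplies
    `unique` (a counter or hash) to disambiguate same-second entries."""
    return f"{kind}_{user}_{entity_id}_{entry_time:%Y%m%d-%H%M%S}Z_{unique}"
```

For example, `row_key("Procedure", "Bob", "ID12345", t, "0001")` yields a key in the `Procedure_Bob_ID12345_20140514-134630Z_0001` shape described above, and partition keys generated for later times sort before earlier ones.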
Say you have a second use case where you need to retrieve data by user name. Simple: within the same table, add a second row containing the same data but with a partition key based on the user name (e.g. `bob_execution_20140514`).
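This duplicate-row pattern amounts to writing the same logical entity twice under two partition keys. A minimal sketch, assuming the key formats above (the `index_rows` name, the entity dict shape, and the exact user-key format are my inventions; only the `bob_execution_20140514` example comes from the text):

```python
from datetime import datetime, timezone

def index_rows(entity: dict, entry_time: datetime) -> list[dict]:
    """Return one logical entity as two physical rows in the same
    table: one keyed for time-ordered queries (descending hours until
    the max date), one keyed by user name."""
    max_value = datetime(9999, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
    hours_remaining = int((max_value - entry_time).total_seconds() // 3600)
    time_pk = f"{hours_remaining:010d}"
    user_pk = f"{entity['user'].lower()}_{entity['kind']}_{entry_time:%Y%m%d}"
    base_rk = f"{entity['kind']}_{entity['user']}_{entity['id']}_{entry_time:%Y%m%d-%H%M%S}Z"
    return [
        {**entity, "PartitionKey": time_pk, "RowKey": base_rk},
        {**entity, "PartitionKey": user_pk, "RowKey": base_rk},
    ]
```

Note that Azure's uniqueness constraint is on the (PartitionKey, RowKey) pair, so the two copies can safely share a row key.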
Another thing to consider is storing the entire object graph (procedure, executions, actions, and so on) in the table. Getting back to our logging example, a log entry might have detailed information, so we just plop an entire block of JSON right in the table. (We're usually retrieving it from an Azure cloud service, so network throughput isn't a meaningful constraint: Azure-to-Azure speeds within the same region are gigabits per second.)
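Concretely, that just means serializing the graph into one string property next to whatever scalar columns you still want to query on. A sketch under my own assumptions (the `detail_entity` name, the `User`/`Detail` property names, and the entity shape are all hypothetical):

```python
import json

def detail_entity(partition_key: str, row_key: str, detail: dict) -> dict:
    """Store the whole object graph as a single JSON string property,
    alongside a couple of scalar columns kept queryable."""
    return {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "User": detail.get("user", ""),      # queryable scalar
        "Detail": json.dumps(detail),        # full graph, opaque to the table
    }
```

One caveat to keep in mind: a single string property in Azure Table storage is capped at 64 KB (and a whole entity at 1 MB), so very large graphs need to be split or moved to blob storage.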