Question

I've been given the task of building a prototype for an app. I don't have any code yet, as the solution concepts I've come up with seem stinky at best...

The problem:

The solution consists of various Azure projects that operate on lots of data stored in Azure SQL databases. Almost every action that happens creates a gzipped log file in blob storage, so that's one .gz file per log entry.

There should also be a small desktop (WPF) app that is able to read, filter, and sort these log files.

I have absolutely zero influence on how the logging is done, so this is something that cannot be changed to solve this problem.

Possible solutions that I've come up with (conceptually):

1:

  • connect to the blob storage
  • open the container
  • read/download blobs (with the filter applied)
  • decompress the .gz files
  • read and display (a rough sketch of this follows below)
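Conceptually, I picture the client side of this looking something like the following untested sketch (using the classic Microsoft.WindowsAzure.StorageClient library; the container name and the date+severity blob prefix are just my assumptions):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class LogViewerSketch
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<storage connection string>");
        var container = account.CreateCloudBlobClient().GetContainerReference("logs");

        // "Filtering" = only listing blobs under a date+severity prefix,
        // e.g. 20121107-error/ (this naming convention is an assumption).
        var blobs = container
            .ListBlobs(new BlobRequestOptions { UseFlatBlobListing = true })
            .OfType<CloudBlob>()
            .Where(b => b.Name.StartsWith("20121107-error/"));

        foreach (var blob in blobs)
        {
            using (var compressed = new MemoryStream())
            {
                blob.DownloadToStream(compressed);          // one round-trip per log entry
                compressed.Position = 0;

                using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
                using (var reader = new StreamReader(gzip))
                {
                    Console.WriteLine(reader.ReadToEnd());  // parse/display instead
                }
            }
        }
    }
}
```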

The problem with this is that, depending on the filter, it could mean a whole lot of data to download (which is slow) and process (which will also not be very snappy). I really can't see this being a usable application.

2:

  • create a web role which will run a WCF or REST service
  • the service will take the filter params and other stuff and return a single xml/json file with the data, the processing will be done on the cloud
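Again just conceptually, the service could be as simple as this untested ASP.NET Web API sketch, where LogEntry and the LogBlobReader helper are made-up placeholders:

```csharp
using System;
using System.Collections.Generic;
using System.Web.Http;

public class LogEntry
{
    public DateTime Timestamp { get; set; }
    public string Severity { get; set; }
    public string Message { get; set; }
}

public static class LogBlobReader
{
    // Hypothetical helper: would run the option-1 steps against blob storage.
    public static IEnumerable<LogEntry> Read(DateTime date, string severity)
    {
        return new List<LogEntry>();
    }
}

public class LogsController : ApiController
{
    // GET api/logs?date=2012-11-07&severity=error
    public IEnumerable<LogEntry> Get(DateTime date, string severity)
    {
        // The role does the listing/downloading/gunzipping server-side
        // (the same steps as in option 1) and returns plain JSON/XML.
        return LogBlobReader.Read(date, severity);
    }
}
```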

With this approach, will I run into problems decompressing these files if there are a lot of them (will it take up extra space on the storage/compute instance where the service is running)?

EDIT: What I mean by filter is limiting the results by date and severity (info, warning, error). The .gz files are saved in a structure that makes this quite easy, and I will not be filtering by looking into the files themselves.

3:

  • some other elegant and simple solution that I don't know of

I'd also need some way of making the app update the displayed logs in real time, which I suppose would need to be done with repeated requests to the blob storage/service.
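I imagine in WPF that would just be a timer re-running the current filter, something like this rough sketch (the interval is picked arbitrarily):

```csharp
using System;
using System.Windows;
using System.Windows.Threading;

public partial class MainWindow : Window
{
    private readonly DispatcherTimer _timer;

    public MainWindow()
    {
        InitializeComponent();

        // Poll every 10 seconds; the interval is a guess.
        _timer = new DispatcherTimer { Interval = TimeSpan.FromSeconds(10) };
        _timer.Tick += (s, e) => RefreshLogs();
        _timer.Start();
    }

    private void RefreshLogs()
    {
        // Re-run the current filter against the service/blob storage
        // and rebind the results.
    }
}
```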


This is not one of those "give me code" questions; I am looking for advice on best practices, or similar solutions that worked for similar problems. I also know this could be one of those "no one right answer" questions, as people have different approaches to problems. But I have some time to build a prototype, so I will be trying out different things, and I will select as the right answer the one that showed a solution that worked, or the one that steered me in the right direction, even if it takes some time before I actually build something and test it out.


Solution

As I understand it, you have a set of log files in Azure Blob storage that are formatted in a particular way (gzipped) and you want to display them.

How big are these files? Are you displaying every single piece of information in the log file?

Assuming that since these are log files, they are static and historical... meaning that once the log/gzip file is created it cannot be changed (you are not updating the gzip file once it is out on Blob storage); only new files can be created.

One Solution


Why not create a worker role/job process that periodically goes out, scans the blob storage, and builds a persisted "database" that you can display from? The nice thing about this is that you are not putting the unzipping/business logic for extracting the log files into the WPF app or UI.

1. I would have the worker role scan the log files in Azure Blob storage.
2. Have some kind of mechanism to track which ones were processed and a current "state", maybe the UTC date of the last gzip file.
3. Do all the unzipping/extracting of the log files in the worker role.
4. Have the worker role place the content in a SQL database, Azure Table Storage, or a distributed cache for access.
5. Access can be done by a REST service (ASP.NET Web API/Node.js etc.).
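A rough, untested sketch of that worker role loop, using the classic StorageClient library (the container name, the checkpoint mechanism, and the persistence helpers are placeholders):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

public class LogImporter : RoleEntryPoint
{
    public override void Run()
    {
        var account = CloudStorageAccount.Parse("<storage connection string>");
        var container = account.CreateCloudBlobClient().GetContainerReference("logs");

        while (true)
        {
            DateTime checkpoint = LoadCheckpoint(); // step 2: last processed UTC date

            // Step 1: scan blob storage; skip files we've already imported.
            var newBlobs = container
                .ListBlobs(new BlobRequestOptions { UseFlatBlobListing = true })
                .OfType<CloudBlob>()
                .Where(b => b.Properties.LastModifiedUtc > checkpoint);

            foreach (var blob in newBlobs)
            {
                using (var ms = new MemoryStream())
                {
                    blob.DownloadToStream(ms);
                    ms.Position = 0;

                    // Step 3: unzip/extract in the worker role, not in the UI.
                    using (var gzip = new GZipStream(ms, CompressionMode.Decompress))
                    using (var reader = new StreamReader(gzip))
                    {
                        SaveEntry(blob.Name, reader.ReadToEnd()); // step 4
                    }
                }
            }

            SaveCheckpoint(DateTime.UtcNow);
            Thread.Sleep(TimeSpan.FromMinutes(1)); // "periodically"
        }
    }

    // Hypothetical persistence helpers: SQL, Table Storage, cache, ...
    private static DateTime LoadCheckpoint() { return DateTime.MinValue; }
    private static void SaveCheckpoint(DateTime utc) { /* persist the state */ }
    private static void SaveEntry(string blobName, string content) { /* insert */ }
}
```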

You can add more things if you need to scale this out, for example run this as a job that re-does all of the log files from a given time (refresh all). I don't know the size of your data, so I am not sure whether that is feasible.

The nice thing about this is that if you need to scale your job (overnight), you can spin up 2, 3, or 6 worker roles, extract the content, and pass the results to a Service Bus or Storage Queue that feeds the inserts into SQL, a cache, etc. for access; a sketch follows below.
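For example, a minimal untested sketch of the hand-off with an Azure Storage Queue (the queue name and message format are assumptions):

```csharp
using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class QueueFanOutSketch
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<storage connection string>");
        var queue = account.CreateCloudQueueClient().GetQueueReference("log-blobs");
        queue.CreateIfNotExist();

        // Producer: one role lists the blobs and enqueues each name once.
        queue.AddMessage(new CloudQueueMessage("20121107-error/entry-123.gz"));

        // Consumer: each of the 2, 3, or 6 worker roles drains the queue.
        var msg = queue.GetMessage();
        if (msg != null)
        {
            ProcessBlob(msg.AsString);  // download + gunzip + insert, as above
            queue.DeleteMessage(msg);   // delete only after the insert succeeded
        }
    }

    private static void ProcessBlob(string blobName) { /* hypothetical */ }
}
```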

Other Tips

Simply storing the blobs isn't sufficient. The metadata you want to filter on should be stored somewhere else where it's easy to filter and retrieve all the metadata. So I think you should split this into two problems:

A. How do I efficiently list all "gzips" with their metadata, and how can I apply a filter on these gzips in order to show them in my client application?

Solutions

  • Blobs: Listing blobs is slow and filtering is not possible (you could group in a container per month or week or user or ... but that's not filtering).
  • Table Storage: Very fast, but searching is slow (only the PartitionKey and RowKey are indexed)
  • SQL Azure: You could create a table with a list of "gzips" together with some other metadata (like user that created the gzip, when, total size, ...). Using a stored procedure with a few good indexes you can make search very fast, but SQL Azure isn't the most scalable solution
  • Lucene.NET: There's an AzureDirectory for Windows Azure which makes it possible to use Lucene.NET in your application. This is a super fast search engine that allows you to index your 'documents' (metadata) and this would be perfect to filter and return a list of "gzips"

Update: Since you only filter on date and severity, you should review the Blob and Table options:

  • Blobs: You can create a container per date+severity (20121107-low, 20121107-medium, 20121107-high ...). Assuming you don't have too many blobs per date+severity, you can simply list the blobs directly from the container. The only issue you might have here is that a user will want to see all items with a high severity from the last week (7 days). This means you'll need to list the blobs in 7 containers.
  • Tables: Even though you say Table Storage or a DB isn't an option, do consider Table Storage. Using partition and row keys you can easily filter in a very scalable way (you can also use CompareTo to get a range of items, for example all records between 1 and 7 November); see the sketch after this list. Duplicating data is perfectly acceptable in Table Storage. You could include some data from the gzip in the Table Storage entity in order to show it in your WPF application (the most essential information you want to show after filtering). This means you'll only need to process the blob when the user opens/double-clicks the record in the WPF application.
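To make the Tables option concrete, here is an untested sketch against the classic StorageClient library; the table name, the entity's properties, and the key layout are my assumptions, not something prescribed by Table Storage:

```csharp
using System;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// Severity as PartitionKey, a sortable timestamp as RowKey, so a date range
// becomes a RowKey range query via CompareTo.
public class LogIndexEntity : TableServiceEntity
{
    public LogIndexEntity() { }

    public LogIndexEntity(string severity, DateTime utc, string blobName)
    {
        PartitionKey = severity; // "info" / "warning" / "error"
        RowKey = utc.ToString("yyyyMMddHHmmssfff") + "-" + Guid.NewGuid();
        BlobName = blobName;
    }

    public string BlobName { get; set; }
    public string Snippet { get; set; }  // the duplicated "most essential information"
}

class TableIndexSketch
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<storage connection string>");
        var tables = account.CreateCloudTableClient();
        tables.CreateTableIfNotExist("LogIndex");
        var ctx = tables.GetDataServiceContext();

        // Index one gzip (done by whatever scans/imports the blobs).
        ctx.AddObject("LogIndex", new LogIndexEntity(
            "error", DateTime.UtcNow, "20121107-error/entry-123.gz"));
        ctx.SaveChanges();

        // All "error" records between 1 and 7 November 2012:
        var results = ctx.CreateQuery<LogIndexEntity>("LogIndex")
            .Where(e => e.PartitionKey == "error"
                     && e.RowKey.CompareTo("20121101") >= 0
                     && e.RowKey.CompareTo("20121108") < 0)
            .ToList();

        Console.WriteLine(results.Count);
    }
}
```

Because severity is the partition key, the date-range scan stays inside a single partition, which is what keeps this filter fast and scalable.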

B. How do I display a "gzip" in my application (after double-clicking a search result, for example)?

Solutions

  • Connect to the storage account from the WPF application, download the file, unzip it and display it. This means that you'll need to store the storage account credentials in the WPF application (or use a SAS or a container policy; see the sketch after this list), and if you decide to change something in how files are stored in the backend, you'll also need to change the WPF application.
  • Connect to a Web Role. This Web Role gets the blob from blob storage, unzips it and sends it over the wire (or sends it compressed in order to speed up the transfer). If something changes in how you store files, you only need to update the Web Role.
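For the first option, issuing a short-lived read-only SAS could look like this untested sketch (classic StorageClient library; blob and container names assumed), so the account key never ships with the WPF app:

```csharp
using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class SasSketch
{
    static void Main()
    {
        var blob = CloudStorageAccount.Parse("<storage connection string>")
            .CreateCloudBlobClient()
            .GetContainerReference("logs")
            .GetBlobReference("20121107-error/entry-123.gz");

        // Read-only, valid for 10 minutes; the storage key stays on the server.
        string sas = blob.GetSharedAccessSignature(new SharedAccessPolicy
        {
            Permissions = SharedAccessPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(10)
        });

        // Hand this URL to the WPF app; it downloads and gunzips locally.
        Console.WriteLine(blob.Uri.AbsoluteUri + sas);
    }
}
```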