Question

I have a WCF service hosted on Windows Azure as a "cloud service." When the service starts, it needs to populate data from files on disk into its memory so it can be accessed fast (cached, in other words). Right now I'm using something like a C:\Documents\Filestoprocess folder: the WCF service reads that folder and loads the data from the files in it into memory. I have about 5,000 small files. How do I do this in Azure? Is there a folder path I can use within the WCF service so that it can open each of these files and load their data? I'm not really looking for complicated blob access over the network that uses bandwidth. I'm looking for simple disk I/O access to these files from the WCF "cloud service" that is running on its own public web address.

Solution

Blob access is not complex. In fact, you could do a single download of a zip file from blob storage to local disk, unzip it, and then prime your WCF service from those 5,000 small files.

Check out the MSDN page documenting CloudBlob.DownloadToFile(). The essential parts:

    // Create the blob client from your account endpoint and credentials.
    CloudBlobClient blobClient =
        new CloudBlobClient(blobEndpoint, new StorageCredentialsAccountAndKey(accountName, accountKey));

    // Return a reference to the blob.
    CloudBlob blob = blobClient.GetBlobReference("mycontainer/myblob.txt");

    // Download the blob to a local file.
    blob.DownloadToFile("c:\\mylocalblob.txt");

Now: I don't agree with saving to the root folder on C:. Rather, you should grab some local storage (easily configurable). Once you configure local storage in your role configuration, just ask the role environment for it, and ask for its root path:

    // Look up the local resource declared in the role's service definition.
    var localResource = RoleEnvironment.GetLocalResource("mylocalstorage");
    var rootPath = localResource.RootPath;
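Putting the two snippets together, here's a minimal sketch of the whole startup step (using System.IO and System.Collections.Generic): download a single zip of the 5,000 files into local storage, extract it, and prime an in-memory cache. The blob name, folder names, and the use of .NET 4.5's ZipFile (System.IO.Compression.FileSystem) are all assumptions; swap in your own names or another unzip library as needed.

    // Sketch only; blobClient comes from the earlier snippet, and the
    // zip/folder names are placeholders.
    var localResource = RoleEnvironment.GetLocalResource("mylocalstorage");
    string zipPath = Path.Combine(localResource.RootPath, "files.zip");
    string extractPath = Path.Combine(localResource.RootPath, "Filestoprocess");

    // Pull the zipped file set down from blob storage and unpack it locally.
    CloudBlob zipBlob = blobClient.GetBlobReference("mycontainer/files.zip");
    zipBlob.DownloadToFile(zipPath);
    System.IO.Compression.ZipFile.ExtractToDirectory(zipPath, extractPath);

    // Prime the WCF service's in-memory cache from the extracted files.
    var cache = new Dictionary<string, string>();
    foreach (string file in Directory.EnumerateFiles(extractPath))
        cache[Path.GetFileName(file)] = File.ReadAllText(file);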

Note: as @KingPancake mentioned, you could use an Azure Drive. However, remember that an Azure Drive can only be writable by one instance; you'd need to make additional snapshots for your other instances. I think it's much simpler for you to go with a simple blob, copy your files down (either as a single zip or as individual files), and go from there.

You mentioned concern about network and bandwidth. You don't pay for bandwidth within the same data center, and it's extremely fast: roughly 100 Mbps per core. So even with a Small instance, you'll have your files copied down very quickly, and more so when you move to larger instance sizes.
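To put rough numbers on it (the file size here is purely an assumption for illustration): if your 5,000 files average 10 KB each, that's about 50 MB, or 400 megabits, so at 100 Mbps the copy takes on the order of 4-5 seconds, especially if they come down as one zip rather than 5,000 individual blob requests.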

One last thought: the only other ways to gain access to your 5,000 files, without using blob storage or Azure Drives (which are mounted as VHDs in blob storage), would be to either download the files from an external source or bundle them with your Windows Azure package (they'd then show up in your app's folder, under whatever subfolder you put them in; see the short sketch after this list). Bundling has two downsides:

  • Longer time to upload your deployment package due to added size
  • Inability to change any of the individual files without redeploying the package.
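If you do go the bundling route, priming the cache becomes plain local disk I/O against the app folder. A short sketch, where the "Filestoprocess" subfolder name is hypothetical:

    // Bundled content lands under the role's application directory
    // (for an IIS-hosted web role, the site root).
    string folder = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Filestoprocess");

    var cache = new Dictionary<string, string>();
    foreach (string file in Directory.EnumerateFiles(folder))
        cache[Path.GetFileName(file)] = File.ReadAllText(file);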

By storing the files in a blob, you can easily change one (or all) of them without redeploying your code; you'd just need to signal your service to re-read from blob storage, or restart the instances so they download the new files automatically.
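Here's a hedged sketch of the "re-read from blob storage" option using the v1.x storage client: poll the (hypothetical) zip blob's last-modified timestamp and re-download when it changes.

    // lastSeen is a DateTime field your service keeps between polls;
    // blobClient, zipPath, etc. carry over from the sketches above.
    CloudBlob zipBlob = blobClient.GetBlobReference("mycontainer/files.zip");
    zipBlob.FetchAttributes();                        // refresh blob metadata
    if (zipBlob.Properties.LastModifiedUtc > lastSeen)
    {
        zipBlob.DownloadToFile(zipPath);              // fetch the new archive
        lastSeen = zipBlob.Properties.LastModifiedUtc;
        // ... re-extract and rebuild the in-memory cache here
    }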

OTHER TIPS

You should use a cloud storage service to store data, because anything you write to the local file system can be destroyed when the service restarts or is recycled.

You can look into the Azure Drive service, which is like creating a disk drive; it sits on top of blob storage.

But if you really want to read and write data on the local file system, check out this blog post: http://blog.codingoutloud.com/2011/06/12/azure-faq-can-i-write-to-the-file-system-on-windows-azure/

It talks about setting up your service definition to allow writing to the local file system.

Depending on the size of your instances, you'll get a non-persistent disk where you can store this kind of temporary data. The minimum is 20 GB for an Extra Small instance. You shouldn't access the disk directly; instead, use a local resource, which you can configure in your service definition file or in Visual Studio (double-click your web/worker role).
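For reference, declaring such a local resource in ServiceDefinition.csdef looks roughly like this; the role name, resource name, and size are placeholders:

    <WebRole name="MyWcfRole">
      <LocalResources>
        <!-- cleanOnRoleRecycle="false" keeps the files across role recycles
             on the same VM, but not across the loss scenarios described below -->
        <LocalStorage name="mylocalstorage" sizeInMB="1024" cleanOnRoleRecycle="false" />
      </LocalResources>
    </WebRole>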

This storage is non-persistent: if you delete your deployment, decrease the number of instances, run into hardware problems, and so on, you lose all data saved there. If you want to persist your files, you should use blob storage instead. But in your case, where you need the files as a kind of caching mechanism, local resources are perfect.

And if your goal is to cache data, you might want to take a look at the caching features included in Windows Azure: Caching in Windows Azure.
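For a taste of that API, here's a sketch using the Windows Azure (AppFabric) Caching client; it assumes the cache client is already configured in web.config/app.config, and the key and value are placeholders:

    using Microsoft.ApplicationServer.Caching;

    // Get the default cache from the configured cache client.
    DataCacheFactory factory = new DataCacheFactory();
    DataCache cache = factory.GetDefaultCache();

    cache.Put("file:sample.txt", "contents of one small file");  // add or overwrite
    string cached = (string)cache.Get("file:sample.txt");        // null if missing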

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow