Question

I have an application that ingests data from clients on a daily/weekly basis (two different datasets, one daily and the other weekly) into an Azure SQL Database. Each client's data source depends on what software they use, so it varies from client to client. I currently have two methods of integration, depending on the client:

  1. Using Azure Data Factory and a Self-Hosted Integration Runtime. In this method, the client is required to provide, within their network, a VM on which I set up the Integration Runtime, plus a SQL Server database with just two tables into which they dump the two datasets as required. In ADF, I create pipelines that pull the data directly from their SQL Server into my Azure SQL Database and then run the necessary import procedures.
  2. Using Azure Data Factory and Blob Storage. In this method, I provide the client with a set of PowerShell scripts, run on a schedule via Windows Task Scheduler, that copy their exported files (.CSV) to our Blob storage. ADF pipelines then copy the data from Blob storage into the Azure SQL Database and run the necessary import procedures. (A simplified sketch of such an upload script follows this list.)

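To give an idea, the sort of script involved looks roughly like this. It is a simplified sketch assuming the Az.Storage module on the client machine and a container-scoped SAS token; the account, container, and folder names are placeholders rather than real values.

```powershell
# Simplified sketch of the client-side upload script (placeholder names throughout).
# Assumes the Az.Storage module is installed and the client has been given a
# container-scoped SAS token for their dedicated container.
param(
    [string]$ExportFolder   = 'C:\Exports',        # where the client's software drops the CSV exports
    [string]$StorageAccount = 'mystorageaccount',  # placeholder storage account name
    [string]$Container      = 'client-contoso',    # placeholder per-client container
    [string]$SasToken       = '?sv=...'            # container-scoped SAS supplied to the client
)

Import-Module Az.Storage

# Authenticate with the SAS token only; no account keys are shared with the client.
$ctx = New-AzStorageContext -StorageAccountName $StorageAccount -SasToken $SasToken

# Upload every exported CSV, overwriting any blob with the same name.
Get-ChildItem -Path $ExportFolder -Filter '*.csv' | ForEach-Object {
    Set-AzStorageBlobContent -File $_.FullName `
                             -Container $Container `
                             -Blob $_.Name `
                             -Context $ctx `
                             -Force
}
```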
The first method is much simpler, but in terms of infrastructure at the client end it seems like overkill to set up a mostly blank Windows VM and a database containing just a couple of data-dump tables. Obviously, this can also be costly if the client is cloud-hosted themselves: spinning up an additional VM is not cheap, which could make them think twice about using our product.

The second method requires me to set up a storage container for each client, which I feel could make administration difficult as we scale up. Providing scripts for the client to run via Windows Task Scheduler also doesn't feel particularly elegant.

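For a sense of the per-client administration involved, the container and its SAS token can be provisioned with a short script along these lines (again a simplified sketch with placeholder names, assuming the Az.Storage module and access to the storage account key):

```powershell
# Sketch of the per-client provisioning step (placeholder names throughout).
# Assumes the Az.Storage module and the storage account key are available.
param(
    [string]$StorageAccount = 'mystorageaccount',
    [string]$AccountKey     = '<account-key>',
    [string]$ClientName     = 'contoso'
)

Import-Module Az.Storage

$ctx = New-AzStorageContext -StorageAccountName $StorageAccount -StorageAccountKey $AccountKey

# One private container per client.
$container = "client-$ClientName"
New-AzStorageContainer -Name $container -Context $ctx -Permission Off

# Write-only, time-limited SAS token for the client's upload script.
New-AzStorageContainerSASToken -Name $container `
                               -Context $ctx `
                               -Permission 'acw' `
                               -ExpiryTime (Get-Date).AddMonths(12)
```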
Does anybody have any alternative solutions to this scenario? Or am I on the right track?

Any insights would be greatly appreciated. Thanks.

Solution

Update: I found no better solution.

Licensed under: CC-BY-SA with attribution