Handlers for File Removal

Question 1

You could override

.CSVSource.onFileAction(IFileWatcher watcher, Collection<String> added, Collection<String> modified, Collection<String> deleted)

and call super.onFileAction(...), which will process the added and modified files, then add your own logic to handle the deleted files.

This can be done by updating the facts that a deleted file contributed, which you can identify through a field holding the path of the file each fact came from. Such a field can be filled automatically by adding the FILEPATH metadata in your LoadInstructions.csv file:

Format,FilePattern,FilePath,MetaData
FormatName,formatRegex.csv,someFolder,FILEPATH=N/A

and having a field like:

<field name="FILEPATH" type="string" indexation="dictionary" nullable="true" defaultValue="N/A" />
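
To make the idea concrete, here is a minimal, self-contained sketch of the pattern in plain Java. It does not use the real ActivePivot classes: BaseSource, DeletionAwareSource and Fact are hypothetical stand-ins, the IFileWatcher parameter is left out, and the facts of a deleted file are simply removed, which is only one possible way of "updating" them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FileRemovalSketch {

    /** Hypothetical fact; filePath plays the role of the FILEPATH field. */
    static class Fact {
        final String id;
        final String filePath;

        Fact(String id, String filePath) {
            this.id = id;
            this.filePath = filePath;
        }
    }

    /** Hypothetical stand-in for the parent source. It is NOT the real CSVSource. */
    static class BaseSource {
        protected final List<Fact> facts = new ArrayList<>();

        /** Loads one placeholder fact per added or modified file and ignores deletions. */
        void onFileAction(Collection<String> added, Collection<String> modified,
                          Collection<String> deleted) {
            Set<String> reloaded = new HashSet<>(modified);
            // Drop the stale facts of modified files before reloading them.
            facts.removeIf(f -> reloaded.contains(f.filePath));
            for (String path : added) {
                facts.add(new Fact("fact@" + path, path));
            }
            for (String path : modified) {
                facts.add(new Fact("fact@" + path, path));
            }
        }
    }

    /** Adds deletion handling on top of the parent behaviour. */
    static class DeletionAwareSource extends BaseSource {
        @Override
        void onFileAction(Collection<String> added, Collection<String> modified,
                          Collection<String> deleted) {
            // Let the parent process the added and modified files first.
            super.onFileAction(added, modified, deleted);
            // Then remove every fact whose file path matches a deleted file.
            Set<String> gone = new HashSet<>(deleted);
            facts.removeIf(f -> gone.contains(f.filePath));
        }

        /** Returns the file path of every fact currently held. */
        List<String> loadedPaths() {
            List<String> paths = new ArrayList<>();
            for (Fact f : facts) {
                paths.add(f.filePath);
            }
            return paths;
        }
    }

    public static void main(String[] args) {
        DeletionAwareSource source = new DeletionAwareSource();
        List<String> none = Collections.emptyList();
        source.onFileAction(Arrays.asList("a.csv", "b.csv"), none, none);
        source.onFileAction(none, none, Arrays.asList("a.csv"));
        System.out.println(source.loadedPaths()); // prints [b.csv]
    }
}
```

In a real project the removal step would go through the datastore rather than an in-memory list, with the condition expressed on the FILEPATH field.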

Question 2

If we understand correctly, and to simplify the use case, your dataset has two measures, A and B. For the same records, one file brings measure 'A' and another file brings measure 'B', and you want to update or delete the data for measure A or B independently.

There are several ways you can achieve this.

First, you could decouple the measures: instead of records that bear both an A and a B field, you would have two records with a generic "value" field and a "measure type" field to distinguish between the two measure types. This design is flexible because you can introduce a new measure 'C' later, itself fed from another file.
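
As a rough illustration of this decoupled layout, here is a small sketch in plain Java. MeasureRecord and MeasureStore are hypothetical names used only for this example and are not ActivePivot types; the point is that deleting or reloading one measure never touches the records of the other.

```java
import java.util.ArrayList;
import java.util.List;

class GenericMeasureSketch {

    /** One record per (record key, measure type) pair, with a generic value field. */
    static class MeasureRecord {
        final String recordKey;
        final String measureType;
        final double value;

        MeasureRecord(String recordKey, String measureType, double value) {
            this.recordKey = recordKey;
            this.measureType = measureType;
            this.value = value;
        }
    }

    /** Hypothetical in-memory store standing in for the real datastore. */
    static class MeasureStore {
        private final List<MeasureRecord> records = new ArrayList<>();

        /** Inserts the value, replacing any previous value for the same key and measure type. */
        void put(String recordKey, String measureType, double value) {
            records.removeIf(r -> r.recordKey.equals(recordKey)
                    && r.measureType.equals(measureType));
            records.add(new MeasureRecord(recordKey, measureType, value));
        }

        /** Deletes every record of one measure type and returns how many were removed. */
        int deleteMeasure(String measureType) {
            int before = records.size();
            records.removeIf(r -> r.measureType.equals(measureType));
            return before - records.size();
        }

        /** Sums the values of one measure type. */
        double sum(String measureType) {
            double total = 0.0;
            for (MeasureRecord r : records) {
                if (r.measureType.equals(measureType)) {
                    total += r.value;
                }
            }
            return total;
        }
    }

    public static void main(String[] args) {
        MeasureStore store = new MeasureStore();
        store.put("trade-1", "A", 10.0);
        store.put("trade-1", "B", 20.0);
        store.put("trade-2", "A", 5.0);
        store.deleteMeasure("A"); // measure B is left untouched
        System.out.println(store.sum("A")); // prints 0.0
        System.out.println(store.sum("B")); // prints 20.0
    }
}
```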

The most elegant option is probably to use the ActivePivot Distributed Architecture, with Polymorphic Distribution. You would set up two independent cubes, one holding only the 'A' measure and another holding the 'B' measure, then join the cubes together with polymorphic distribution. ActivePivot will merge them on the fly and present both measures as if they belonged to the same (virtual) cube.

Finally, the quick and dirty solution: configure your measures as 'nullable' fields in ActivePivot. This way, when you want to erase measure 'A', you actually write 'null' to the 'A' field of your records, and the 'B' field stays as it is.
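
A minimal sketch of that last approach, again in plain Java with hypothetical names (Row, eraseA, sumA and sumB are not ActivePivot APIs): both measures live on the same record as nullable fields, erasing 'A' means writing null, and any aggregation has to skip the nulls.

```java
import java.util.ArrayList;
import java.util.List;

class NullableMeasureSketch {

    /** One record carrying both measures; null means "no value loaded". */
    static class Row {
        final String key;
        Double a;
        Double b;

        Row(String key, Double a, Double b) {
            this.key = key;
            this.a = a;
            this.b = b;
        }
    }

    /** Erases measure A by writing null, leaving measure B as it is. */
    static void eraseA(List<Row> rows) {
        for (Row row : rows) {
            row.a = null;
        }
    }

    /** Sums measure A, skipping rows where it is null. */
    static double sumA(List<Row> rows) {
        double total = 0.0;
        for (Row row : rows) {
            if (row.a != null) {
                total += row.a;
            }
        }
        return total;
    }

    /** Sums measure B, skipping rows where it is null. */
    static double sumB(List<Row> rows) {
        double total = 0.0;
        for (Row row : rows) {
            if (row.b != null) {
                total += row.b;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("trade-1", 10.0, 20.0));
        rows.add(new Row("trade-2", 5.0, null));
        eraseA(rows);
        System.out.println(sumA(rows)); // prints 0.0
        System.out.println(sumB(rows)); // prints 20.0
    }
}
```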