Question

I use the FileDataModel as the DataModel for Recommendations in Mahout. I first generate the base file (e.g. prefs.txt). From time to time, there are some changes, which are written to update files (prefs.1.txt, prefs.2.txt, ...).

Am I allowed to delete the old update files after loading them into the model? When I try deleting them (on Windows), Explorer says that the file is currently in use by Java. Why can't I delete the original file? I believe the data is now held in memory, so Mahout shouldn't need the file anymore.

Solution

It doesn't re-read the old files, only the new update data. However, it does expect the "main" data file to always be there, because it keeps checking that file's last-modified time.

The general idea is that you sometimes push a full copy of the data file, and in between, more frequently, push small update files. If you do that it should work as expected, and, yes, you can delete update files once they've been read.

(Of course, if you rebooted your server, it would have to start over from the last main data file and whatever update files were left, which could be incomplete or inconsistent. I'd only delete updates after you've pushed a new main data file.)
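As a rough sketch of that cleanup rule, the helper below deletes only those update files that are older than the current main data file. It uses plain java.nio and does not touch Mahout at all. The class name UpdateFileCleaner, the method name, and the assumed naming scheme (prefs.txt for the main file, prefs.1.txt, prefs.2.txt, ... for updates, taken from the question) are illustrative assumptions, and comparing last-modified times is just one possible way to decide that an update has been superseded.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mahout.
final class UpdateFileCleaner {

    // Deletes update files (e.g. prefs.1.txt) whose last-modified time is
    // older than that of the main file (e.g. prefs.txt), and returns them.
    static List<Path> deleteSupersededUpdates(Path mainFile) throws IOException {
        String mainName = mainFile.getFileName().toString();
        int dot = mainName.indexOf('.');
        String prefix = dot < 0 ? mainName : mainName.substring(0, dot); // "prefs"
        String suffix = dot < 0 ? "" : mainName.substring(dot);          // ".txt"

        // Assumed naming scheme: <prefix>.<number><suffix>
        Pattern updateName = Pattern.compile(
                Pattern.quote(prefix) + "\\.\\d+" + Pattern.quote(suffix));

        FileTime mainTime = Files.getLastModifiedTime(mainFile);
        Path dir = mainFile.toAbsolutePath().getParent();

        // Collect candidates first so nothing is deleted while the
        // directory is still being iterated.
        List<Path> superseded = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path candidate : stream) {
                String name = candidate.getFileName().toString();
                if (updateName.matcher(name).matches()
                        && Files.getLastModifiedTime(candidate).compareTo(mainTime) < 0) {
                    superseded.add(candidate);
                }
            }
        }
        for (Path p : superseded) {
            Files.delete(p);
        }
        return superseded;
    }
}
```

You would call UpdateFileCleaner.deleteSupersededUpdates(Paths.get("prefs.txt")) right after pushing a new full copy of the main file. Update files written after that push are newer than the main file and are left alone.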

I don't know why you can't delete them as they are never held open. Maybe it's a weird Windows thing.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow