Mahout - JPA integration. Do I need a CSV file?
30-10-2019
Question
I have an existing data model using OpenJPA, and I am trying to integrate a collaborative filtering (CF) system using Mahout.
Forgive me if this is a boneheaded question, but I just started researching Mahout. Mahout in Action is in the mail, so I should be up to speed soon.
My question is how to integrate Mahout with an existing JPA model. Do I need to provide a CSV file to the DataModel class, or can I extend DataModel to read directly from my existing data source? I realize it wouldn't be very complicated to generate a CSV file from my data, but that seems like an unnecessary intermediate step.
I am very new to the "large data set" world, so forgive my ignorance. But do most systems that use Mahout use a CSV data set? Somehow this seems strange to me.
Thanks.
Edit:
So I am reading the preview Amazon provides of Mahout in Action. It seems that you can have Mahout interface directly with your DB, but you do so at the cost of performance. I can't wait to get my hands on this book. Any comments or tips about this would still be very much appreciated.
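For anyone landing here with the same question, here is a minimal sketch of the no-CSV route, under some assumptions. Mahout's Taste API ships an in-memory `GenericDataModel` alongside `FileDataModel`, as well as JDBC-backed models such as `MySQLJDBCDataModel`, so a CSV file is not the only input. The sketch below uses only the JDK and shows the intermediate step: grouping rating rows (as a JPA query might return them) by user, which is the per-user shape an in-memory model is built from. The `Rating` class, its fields, and `groupByUser` are hypothetical names for illustration, not part of Mahout or OpenJPA.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PreferenceGrouping {

    /** Hypothetical rating row, e.g. the result of a JPA query over an existing entity. */
    static final class Rating {
        final long userId;
        final long itemId;
        final float value;

        Rating(long userId, long itemId, float value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    /**
     * Groups rating rows by user ID, preserving first-seen user order.
     * Each map entry holds all of one user's ratings; this is the shape
     * that would then be copied into a per-user preference array for an
     * in-memory data model (that copy step is not shown here).
     */
    static Map<Long, List<Rating>> groupByUser(List<Rating> rows) {
        Map<Long, List<Rating>> byUser = new LinkedHashMap<>();
        for (Rating r : rows) {
            byUser.computeIfAbsent(r.userId, k -> new ArrayList<>()).add(r);
        }
        return byUser;
    }

    public static void main(String[] args) {
        // Stand-in for rows loaded through the existing persistence layer.
        List<Rating> rows = Arrays.asList(
                new Rating(1L, 10L, 4.0f),
                new Rating(2L, 10L, 3.0f),
                new Rating(1L, 11L, 5.0f));

        for (Map.Entry<Long, List<Rating>> e : groupByUser(rows).entrySet()) {
            System.out.println("user " + e.getKey() + " has " + e.getValue().size() + " rating(s)");
        }
    }
}
```

Loading everything into memory once avoids a database round trip per lookup, which is the performance cost mentioned above, but it only works when the rating data fits in the heap and it has to be reloaded to pick up new ratings.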
No correct solution