Question

I want to implement Amazon-like recommendations in Alfresco.

For instance, if an employee searches for "financial reports 2007", the search UI should also show related documents, such as documents that were downloaded or viewed by users who previously ran the same search.

These might include documents that Lucene (which Alfresco uses) would not have found on its own.
Has anyone integrated Alfresco with Apache Mahout or pysuggest?

Solution

We've integrated Mahout into Alfresco to provide content recommendations based on similar content users have viewed and on how users have rated content. The Alfresco Mahout integration code is available at

https://github.com/zaizi/alfresco-recommendations

This provides Amazon-style content recommendation services. It can be extended to recommend similar search phrases.
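
To make the "users who viewed this also viewed" idea concrete, here is a minimal sketch in plain Java. It is not Mahout's API and not the code from the repository above; the class name `CoViewRecommender` and the string document ids are hypothetical. It simply counts how many users viewed two documents together and ranks by that count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Mahout or Alfresco. */
final class CoViewRecommender {
    // docId -> (other docId -> number of users who viewed both)
    private final Map<String, Map<String, Integer>> coViews = new HashMap<>();

    /** Builds co-view counts from a map of user -> set of viewed document ids. */
    CoViewRecommender(Map<String, Set<String>> viewsByUser) {
        for (Set<String> viewed : viewsByUser.values()) {
            for (String a : viewed) {
                for (String b : viewed) {
                    if (!a.equals(b)) {
                        coViews.computeIfAbsent(a, k -> new HashMap<>())
                               .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
    }

    /** Returns up to max documents most often viewed together with docId. */
    List<String> recommend(String docId, int max) {
        final Map<String, Integer> counts =
                coViews.getOrDefault(docId, Collections.<String, Integer>emptyMap());
        List<String> ids = new ArrayList<>(counts.keySet());
        // Highest co-view count first; ties are broken alphabetically
        // so that the output is deterministic.
        ids.sort((x, y) -> {
            int byCount = Integer.compare(counts.get(y), counts.get(x));
            return byCount != 0 ? byCount : x.compareTo(y);
        });
        return new ArrayList<>(ids.subList(0, Math.min(max, ids.size())));
    }
}
```

A real deployment would more likely rely on a library such as Mahout for similarity measures and scaling; the sketch only shows the shape of the data involved.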

OTHER TIPS

The good thing is that Alfresco supports references (associations) by default, so you can define many useful relations between documents. For example:

Document->User => viewed-by

Document->User => searched-by

Document->User => downloaded-by

Document->Document => related-to

Document->Document => same-year

...

You can catch most of these events using Alfresco policies/behaviours (http://wiki.alfresco.com/wiki/Policy_Component). For example, when an onCreate event occurs (a document is created), search for documents with the same author and link the new document to them by adding associations.
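
The linking step in that example can be sketched as follows. This is an in-memory stand-in written in plain Java, not Alfresco's policy API: the class `AuthorLinker`, its `onCreate` method and the string ids are all hypothetical. In a real behaviour the lookup would presumably go through the repository's search service and the links would be stored as associations on the nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for an onCreate behaviour. */
final class AuthorLinker {
    // author -> ids of documents created so far by that author
    private final Map<String, List<String>> idsByAuthor = new HashMap<>();
    // docId -> ids of documents it is "related-to"
    private final Map<String, Set<String>> relatedTo = new HashMap<>();

    /** Call once per created document; links it to earlier documents by the same author. */
    void onCreate(String docId, String author) {
        List<String> sameAuthor = idsByAuthor.computeIfAbsent(author, k -> new ArrayList<>());
        for (String other : sameAuthor) {
            // Store the relation in both directions so either document finds the other.
            link(docId, other);
            link(other, docId);
        }
        sameAuthor.add(docId);
    }

    private void link(String from, String to) {
        relatedTo.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
    }

    /** Returns the ids related to docId, or an empty set if there are none. */
    Set<String> related(String docId) {
        return relatedTo.getOrDefault(docId, Collections.<String>emptySet());
    }
}
```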

Then you can implement a custom search (perhaps as a web script) that returns the results and, for each result, its references (associations) as well.

The only thing that worries me is that some events are probably only accessible via the audit log, and I have no idea how to capture those programmatically in Java.

In the end, you can feed these associations and events to your recommendation engine as training data.
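
As a rough illustration of that last step, the sketch below turns captured events into `userId,docId,score` lines, which is the comma-separated shape that file-based recommender inputs such as Mahout's typically expect. Everything here is an assumption made for the example: the class `PreferenceExporter`, the numeric ids (Alfresco node references would first need mapping to numbers) and the event weights.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical exporter; names and weights are illustrative only. */
final class PreferenceExporter {
    enum EventType { SEARCHED, VIEWED, DOWNLOADED }

    static final class Event {
        final long userId;
        final long docId;
        final EventType type;

        Event(long userId, long docId, EventType type) {
            this.userId = userId;
            this.docId = docId;
            this.type = type;
        }
    }

    /** Assumed weights: a download counts more than a view, a view more than a search hit. */
    static double weight(EventType type) {
        switch (type) {
            case DOWNLOADED: return 3.0;
            case VIEWED:     return 2.0;
            default:         return 1.0;
        }
    }

    /** Sums the weights per (user, document) pair and returns one CSV line per pair. */
    static List<String> toCsvLines(List<Event> events) {
        // LinkedHashMap keeps pairs in first-seen order.
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Event e : events) {
            scores.merge(e.userId + "," + e.docId, weight(e.type), Double::sum);
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            lines.add(entry.getKey() + ","
                    + String.format(Locale.ROOT, "%.1f", entry.getValue()));
        }
        return lines;
    }
}
```

Summing weights is only one possible choice; taking the maximum per pair, or decaying old events, would be equally reasonable starting points.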

Interesting topic! I recently read about Mahout in the context of Lucene/Solr. Some people at Lucid Imagination are deeply involved in Mahout, see:

Since Lucene/Solr is part of Alfresco, you could think about integrating at the search-engine level. You could also ask Canoo (Basel, Switzerland). In the past they offered us an interesting solution: a multi-platform related-document engine they developed based on Solr.

Licensed under: CC-BY-SA with attribution