I want to implement Amazon-like recommendations in Alfresco.
For instance, if an employee searches for "financial reports 2007", the search UI would also show related documents, such as documents that were downloaded or viewed by users who previously ran the same search.
This could surface documents that Lucene (which Alfresco uses) would not have found.
Has anyone integrated Alfresco with Apache Mahout or pysuggest, for example?
The good thing is that Alfresco supports references (associations) by default, so you can define many useful relations between documents. For example:
Document->User => viewed-by
Document->User => searched-by
Document->User => downloaded-by
Document->Document => Related-to
Document->Document => Same-year
...
You can catch/implement most of the events using Alfresco policies/behaviours (http://wiki.alfresco.com/wiki/Policy_Component). For example, when an onCreate event occurs (a document is created), search for documents with the same author and link the new document to them by adding associations.
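The same-author linking rule above can be sketched outside Alfresco. This is a plain Python in-memory model, not Alfresco's policy API: `DocumentStore`, `on_create` and the `related-to` association name are hypothetical and only illustrate the logic. In Alfresco itself the equivalent code would live in a Java behaviour.

```python
from collections import defaultdict


class DocumentStore:
    """Hypothetical in-memory stand-in for a repository with associations."""

    def __init__(self):
        self.authors = {}                     # doc id -> author
        self.associations = defaultdict(set)  # (doc id, assoc type) -> target doc ids

    def on_create(self, doc_id, author):
        """Simulate an onCreate hook: link the new document to same-author documents."""
        same_author = [d for d, a in self.authors.items() if a == author]
        self.authors[doc_id] = author
        for other in same_author:
            # Store the association in both directions so either side can be queried.
            self.associations[(doc_id, "related-to")].add(other)
            self.associations[(other, "related-to")].add(doc_id)


store = DocumentStore()
store.on_create("report-2006.pdf", "alice")
store.on_create("budget.xls", "bob")
store.on_create("report-2007.pdf", "alice")
print(sorted(store.associations[("report-2007.pdf", "related-to")]))
```

Storing the link in both directions is a design choice for this sketch. A repository with real associations may let you traverse them from either end without duplicating them.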
Then you can implement a custom search (perhaps as a web script) that returns the results and, for each result, its references (associations).
The only thing that worries me is that some events would probably be accessible only via the audit log, and I have no idea how to capture those programmatically in Java.
In the end you can feed this data to your recommendation engine, which will learn from it.
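As a rough illustration of what such a search could return, here is a minimal Python sketch. It is not a web script and does not use any Alfresco API. The function `search_with_associations`, the naive title matching and the sample data are all assumptions made for the example.

```python
def search_with_associations(documents, associations, query):
    """Return matching documents, each with its associations grouped by type."""
    terms = query.lower().split()
    results = []
    for doc_id, title in documents.items():
        # Naive match: every query term must appear in the title.
        if all(term in title.lower() for term in terms):
            results.append({
                "id": doc_id,
                "title": title,
                "associations": associations.get(doc_id, {}),
            })
    return results


# Hypothetical sample data: doc id -> title, and doc id -> {assoc type -> targets}.
documents = {
    "d1": "Financial Reports 2007",
    "d2": "Budget Forecast 2007",
    "d3": "Holiday Plan",
}
associations = {"d1": {"related-to": ["d2"], "viewed-by": ["alice", "bob"]}}

hits = search_with_associations(documents, associations, "financial reports")
for hit in hits:
    print(hit["id"], hit["associations"])
```

The UI can then render the `related-to` targets next to each hit, even when those documents did not match the query themselves.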
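To make the "feed it to an engine" step concrete, here is a minimal sketch that uses simple co-occurrence counting in plain Python. It is a stand-in for a real engine such as Mahout, not an example of Mahout's API. The event tuples `(user, query, doc_id)` and the function names `build_query_index` and `recommend` are hypothetical and assume you have already collected the search/view/download events.

```python
from collections import Counter, defaultdict


def build_query_index(events):
    """Count, per normalized query, how often each document was viewed or downloaded."""
    index = defaultdict(Counter)
    for _user, query, doc_id in events:
        index[query.strip().lower()][doc_id] += 1
    return index


def recommend(index, query, exclude=(), top_n=3):
    """Return the documents most often opened after the same query."""
    counts = index.get(query.strip().lower(), Counter())
    ranked = [doc for doc, _ in counts.most_common() if doc not in exclude]
    return ranked[:top_n]


# Hypothetical event log: (user, search query, document viewed or downloaded).
events = [
    ("alice", "financial reports 2007", "annual-summary.pdf"),
    ("bob", "Financial Reports 2007", "annual-summary.pdf"),
    ("bob", "financial reports 2007", "q4-results.xls"),
    ("carol", "holiday plan", "vacation.doc"),
]
index = build_query_index(events)
print(recommend(index, "financial reports 2007"))
```

Passing the ids that the regular search already returned as `exclude` leaves only the documents the full-text search missed.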
Interesting topic! I recently read about Mahout in the context of Lucene/Solr. Some people at Lucid Imagination are deeply involved in Mahout, see:
Since Lucene/Solr is part of Alfresco, you could think about integrating it at the search-engine level. You could also ask Canoo (Basel, Switzerland). In the past they offered us an interesting solution: a multi-platform related-document engine they developed based on Solr.