I am curious what are the methods / approaches to overcome the "cold start" problem where when a new user or an item enters the system, due to lack of info about this new entity, making recommendation is a problem.
I can think of doing some prediction based recommendation (like gender, nationality and so on).
You can cold start a recommendation system.
There are two type of recommendation systems; collaborative filtering and content-based. Content based systems use meta data about the things you are recommending. The question is then what meta data is important? The second approach is collaborative filtering which doesn't care about the meta data, it just uses what people did or said about an item to make a recommendation. With collaborative filtering you don't have to worry about what terms in the meta data are important. In fact you don't need any meta data to make the recommendation. The problem with collaborative filtering is that you need data. Before you have enough data you can use content-based recommendations. You can provide recommendations that are based on both methods, and at the beginning have 100% content-based, then as you get more data start to mix in collaborative filtering based. That is the method I have used in the past.
Another common technique is to treat the content-based portion as a simple search problem. You just put in meta data as the text or body of your document then index your documents. You can do this with Lucene & Solr without writing any code.
If you want to know how basic collaborative filtering works, check out Chapter 2 of "Programming Collective Intelligence" by Toby Segaran
Maybe there are times you just shouldn't make a recommendation? "Insufficient data" should qualify as one of those times.
I just don't see how prediction recommendations based on "gender, nationality and so on" will amount to more than stereotyping.
IIRC, places such as Amazon built up their databases for a while before rolling out recommendations. It's not the kind of thing you want to get wrong; there are lots of stories out there about inappropriate recommendations based on insufficient data.
Working on this problem myself, but this paper from microsoft on Boltzmann machines looks worthwhile: http://research.microsoft.com/pubs/81783/gunawardana09__unified_approac_build_hybrid_recom_system.pdf
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With