 

Neural Network based ranking of documents

I'm planning to implement a document ranker that uses neural networks. How can one rate a document by taking into consideration the ratings of similar articles? Are there any good Python libraries for doing this? Can anyone recommend a good book on AI with Python code?

EDIT

I'm planning to build a recommendation engine that makes recommendations based on similar users as well as on data clustered by tags. Users would be given the chance to vote for articles. There will be about a hundred thousand articles. Documents would be clustered based on their tags. Given a keyword, articles would be fetched based on their tags and passed through a neural network for ranking.

jvc asked Sep 26 '11 12:09




1 Answer

The problem you are trying to solve is called "collaborative filtering".
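To make the idea concrete, here is a minimal sketch of user-based collaborative filtering: a missing rating is predicted as a similarity-weighted average of the ratings given by the most similar users. The user names, document ids, vote values, and the helper names `cosine_similarity` and `predict_rating` are all made up for illustration, and cosine similarity is only one of several reasonable similarity measures.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated items count as 0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item, k=2):
    """Similarity-weighted average over the k most similar users who rated item."""
    neighbours = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim > 0:
            neighbours.append((sim, other_ratings[item]))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return None  # nobody similar has rated this item
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical votes on a 1-5 scale: user -> {document: rating}
ratings = {
    "alice": {"doc1": 5, "doc2": 4, "doc3": 1},
    "bob": {"doc1": 5, "doc2": 5, "doc4": 4},
    "carol": {"doc1": 1, "doc3": 5, "doc4": 2},
    "dave": {"doc1": 4, "doc2": 4},
}
prediction = predict_rating(ratings, "dave", "doc4")
print(prediction)
```

Here "dave" votes much like "bob", so the predicted rating for "doc4" lands closer to bob's vote than to carol's. With a hundred thousand articles you would not loop over every user like this; it only shows the principle.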

Neural Networks

Deep Belief Networks and Restricted Boltzmann Machines are among the state-of-the-art neural network methods. For a fast Python implementation for a GPU (CUDA), see here. Another option is PyBrain.
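For a feel of what a Restricted Boltzmann Machine does, here is a rough pure-Python sketch of a binary RBM trained with one step of contrastive divergence (CD-1). It is a simplification and an assumption on my part, not what the libraries above implement: votes are reduced to liked (1) or not liked/unseen (0), whereas RBMs built for collaborative filtering usually model the full rating scale and handle missing votes explicitly. The class name `TinyRBM` and the toy data are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyRBM:
    """Binary RBM: visible units are articles, hidden units are latent tastes."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_h[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_h))]

    def visible_probs(self, h):
        return [sigmoid(self.b_v[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_v))]

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update: positive phase on the data, negative phase on a reconstruction."""
        h0_p = self.hidden_probs(v0)
        h0 = [1.0 if self.rng.random() < p else 0.0 for p in h0_p]  # sample hidden states
        v1_p = self.visible_probs(h0)
        h1_p = self.hidden_probs(v1_p)
        for i in range(len(self.b_v)):
            for j in range(len(self.b_h)):
                self.W[i][j] += lr * (v0[i] * h0_p[j] - v1_p[i] * h1_p[j])
            self.b_v[i] += lr * (v0[i] - v1_p[i])
        for j in range(len(self.b_h)):
            self.b_h[j] += lr * (h0_p[j] - h1_p[j])

    def reconstruct(self, v):
        """Deterministic up-down pass; the outputs can be read as 'would like' scores."""
        return self.visible_probs(self.hidden_probs(v))

# Hypothetical data: one row per user, one column per article (1 = liked).
data = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
]
rbm = TinyRBM(n_visible=6, n_hidden=2)
for _ in range(500):
    for row in data:
        rbm.cd1_step(row)

scores = rbm.reconstruct([1, 1, 0, 0, 0, 0])
print([round(s, 2) for s in scores])
```

The scores for articles the user has not voted on can then be sorted to rank candidates. A real implementation would use vectorised or GPU code rather than Python loops.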

Academic papers on your specific problem:

  • This is probably the state-of-the-art of neural networks and collaborative filtering (of movies):

    Salakhutdinov, R., Mnih, A., and Hinton, G. Restricted Boltzmann Machines for Collaborative Filtering. In Proceedings of the 24th International Conference on Machine Learning, 2007. PDF

  • A Hopfield network implemented in Python:

    Huang, Z., Chen, H., and Zeng, D. Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1):116-142, 2004. PDF

  • A thesis on collaborative filtering with Restricted Boltzmann Machines (they say Python is not practical for the job):

    G. Louppe. Collaborative filtering: Scalable approaches using restricted Boltzmann machines. Master's thesis, Université de Liège, 2010.
    PDF

Neural networks are not currently the state of the art in collaborative filtering, nor are they the simplest or most widespread solution. Regarding your comment that the reason for using NNs is having too little data: neural networks have no inherent advantage or disadvantage in that case. You might therefore want to consider simpler machine learning approaches.

Other Machine Learning Techniques

The best methods today mix k-Nearest Neighbors and Matrix Factorization.
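As a rough sketch of the matrix factorization half, the snippet below learns a small vector of latent factors per user and per article with stochastic gradient descent, so that their dot product approximates the known votes. The data, the hyperparameters, and the function names `factorize`, `rmse` and `dot` are illustrative assumptions; production systems typically add bias terms and tune the regularization.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rmse(ratings, P, Q):
    """Root mean squared error on the given (user, item, rating) triples."""
    return math.sqrt(sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings) / len(ratings))

def factorize(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
              epochs=1000, seed=0):
    """Learn user factors P and item factors Q by SGD on squared error with L2 penalty."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical votes: (user index, article index, rating on a 1-5 scale)
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 5), (1, 1, 5), (1, 3, 4),
    (2, 0, 1), (2, 2, 5), (2, 3, 2),
    (3, 0, 4), (3, 1, 4),
]
P, Q = factorize(ratings, n_users=4, n_items=4)
print("training RMSE:", rmse(ratings, P, Q))

# Predicted rating of user 3 for article 3, which that user never voted on
predicted = dot(P[3], Q[3])
print("prediction:", predicted)
```

Ranking the articles fetched for a keyword then amounts to sorting them by the predicted rating for the current user. A neighborhood (k-NN) model can be blended with these predictions, for example by averaging the two scores.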

If you are committed to Python, take a look at pysuggest (a Python wrapper for the SUGGEST recommendation engine) and PyRSVD (primarily aimed at applications in collaborative filtering, in particular the Netflix competition).

If you are open to trying other open source technologies, look at Open Source collaborative filtering frameworks and http://www.infoanarchy.org/en/Collaborative_Filtering.

cyborg answered Nov 06 '22 03:11
