I've been reading papers on pairwise ranking, and this is what I don't get:
What is the difference in the training/testing data between pointwise and pairwise ranking? This is the paper I have been reading: http://www.cs.cornell.edu/people/tj/publications/joachims_02c.pdf
In there, it says that a data point in pairwise ranking is an inequality between two links:
[line] .=. [inequality between two links, which is the target] qid:[qid] [[feature of both link 1 and 2]:[value of 1 and 2]] # [info]
RankLib, however, supports pairwise rankers such as RankNet and RankBoost, yet the data point format it uses is the pointwise one:
[line] .=. [absolute ranking, which is the target] qid:[qid] [feature1]:[value1] [feature2]:[value2] ... # [info]
Is there something I am missing?
Pointwise ranking is analogous to regression. Each point has an associated rank score, and you want to predict that rank score. So your labeled data set will have, for a given query, a feature vector and an associated rank score for each document.
E.g.: {d1, r1} {d2, r2} {d3, r3} {d4, r4}
where r1 > r2 > r3 > r4
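To make the pointwise case concrete, here is a minimal sketch that reads one line in the pointwise layout quoted in the question (`[target] qid:[qid] [feature]:[value] ... # [info]`) into a record. The function name `parse_line` and the sample line are made up for illustration and are not part of RankLib or any other tool.

```python
def parse_line(line):
    """Parse '<target> qid:<qid> <feature>:<value> ... # <info>' into a dict."""
    # Everything after '#' is free-form info; everything before is data.
    body, _, info = line.partition("#")
    tokens = body.split()

    label = float(tokens[0])              # the rank score to predict
    qid = tokens[1].split(":", 1)[1]      # the query this document belongs to

    features = {}
    for token in tokens[2:]:
        feature_id, value = token.split(":", 1)
        features[int(feature_id)] = float(value)

    return {"label": label, "qid": qid, "features": features, "info": info.strip()}


# Hypothetical sample line, not taken from a real data set.
record = parse_line("3 qid:1 1:0.5 2:0.1 # docA")
```

Each record is one {d, r} item: a feature vector plus the score a pointwise model would be trained to predict.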
Pairwise ranking is analogous to classification. Each training example is a pair of data points for the same query, and the goal is to learn a classifier that predicts which of the two is more relevant to that query.
E.g.: {d1 > d2} {d2 > d3} {d3 > d4}
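Pairwise examples can be derived from pointwise labels, which is probably why a tool can accept the pointwise format and still train a pairwise ranker: within each query, every two documents with different labels give one preference. The sketch below shows that conversion under this assumption; the function name `pairwise_preferences` and the toy records are hypothetical, and it does not claim to reproduce what RankLib does internally.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_preferences(records):
    """Turn (qid, doc_id, label) records into (qid, preferred, other) pairs."""
    # Pairs are only meaningful between documents of the same query.
    by_query = defaultdict(list)
    for qid, doc_id, label in records:
        by_query[qid].append((doc_id, label))

    pairs = []
    for qid, docs in by_query.items():
        for (doc_a, label_a), (doc_b, label_b) in combinations(docs, 2):
            if label_a > label_b:
                pairs.append((qid, doc_a, doc_b))
            elif label_b > label_a:
                pairs.append((qid, doc_b, doc_a))
            # Equal labels express no preference, so no pair is emitted.
    return pairs


# Toy data: q1 has three distinct labels, q2 has only ties.
records = [
    ("q1", "d1", 3), ("q1", "d2", 2), ("q1", "d3", 1),
    ("q2", "d4", 1), ("q2", "d5", 1),
]
pairs = pairwise_preferences(records)
```

For the toy data this gives d1 > d2, d1 > d3 and d2 > d3 for q1, and nothing for q2. Documents from different queries are never compared, which is why the `qid` field matters in both formats.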