Paper Reading for aMMAI: [Summary] Learning to Rank: Rrom Pairwise Approach to Listwise Approach

Topic: Learning to Rank: Rrom Pairwise Approach to Listwise Approach

Author: Cao

Summary:

The paper is concerned with learning to rank, which is to construct a model or a function for ranking objects. Several methods for learning to rank have been proposed, which take object pairs as ‘instances’ in learning. We refer to them as the pairwise approach in this paper. However, it ignores the fact that ranking is a prediction task on list of objects. The paper postulates that learning to rank should adopt the listwise approach in which lists of objects are used as ‘instances’ in learning.

Listwise Approach
Create a ranking function f ; for each feature vector x(i)j (corresponding to document d(i)j ) it outputs a score f (x(i)j ). For the list of feature vectors x(i) we obtain a list of scores z(i)=(f(x(i)1),..., f(x(i)n(i))). The objective of learning is formalized as minimization of the total losses with respect to the training data.
In ranking, when a new query q(i') and its associated documents d(i') are given, we construct feature vectors x(i') from them and use the trained ranking function to assign scores to the documents d(i'). Finally we rank the documents d(i') in descending order of the scores. We call the learning problem described above as the listwise approach to learning to rank.

Probability Models
Proposed probability models to calculate the listwise loss function
Permutation Probability
Suppose that the set of objects to be ranked are identified with the numbers 1, 2, ..., n. A permutation pi on the objects is defined as a bijection from {1, 2, ..., n} to itself. Suppose that there is a ranking function which assigns scores to the n objects. We use s to denote the list of scores s = (s1, s2, ..., sn), where sj is the score of the j-th object.

Theorem 3 indicates that for any ranking list based on the given ranking function, if we exchange the position of an object with higher score and the position of an object with lower score, we obtain a ranking list with lower permutation probability. Theorem 4 indicates given a ranking function, the list of objects sorted based on the ranking function has the highest permutation probability, while the list of objects sorted in the inverse order has the lowest permutation probability.

Given two lists of scores, we can first calculate two permutation probability distributions from them, and then calculate the distance between the two distributions as the listwise loss function.

Top One Probability
The top one probability of an object represents the probability of its being ranked on the top, given the scores of all the objects.

The top one probability of object j equals the sum of the permutation probabilities of permutations in which object j is ranked on the top.

Learning Method: ListNet
We employ a new learning method for optimizing the listwise loss function based on top one probability, with Neural Network as model and Gradient Descent as optimization algorithm. For simplicity, we define phi in Definition 1 as an exponential function.

With Cross Entropy as metric, the loss for query q(i) becomes

With some derivation (please refer to Appendix), we can get the gradient of L(y(i), z(i)(fw)) with respect to the parameter w as follow

Notice that ListNet is similar to RankNet. The only major di erence lies in that the former uses document lists as instances while the latter uses document pairs as instances; the former utilizes a listwise loss function while the latter utilizes a pairwise loss function. Interestingly, when there are only two documents for each query, i.e., n(i) = 2, then the listwise loss function in ListNet becomes equivalent to the pairwise loss function in RankNet.

Comment:
The method is interesting and the performance is also outperformed to other method. The transformation between cross entropy and KL-divergence plus entropy makes the method can be processed in another way. Furthermore, the paper is much easier to read than the one in the last week.

Paper Reading for aMMAI

2013年4月18日星期四

[Summary] Learning to Rank: Rrom Pairwise Approach to Listwise Approach

沒有留言:

張貼留言

2013年4月18日 星期四

[Summary] Learning to Rank: Rrom Pairwise Approach to Listwise Approach

沒有留言:

張貼留言

2013年4月18日星期四