Paper Reading for aMMAI: [Summary] Aggregating local descriptors into a compact image representation

Topic: Aggregating local descriptors into a compact image representation

Author: Hervé Jégou et al.

Summary:

Accuracy, efficiency, and memory usage are three constraints that must be considered jointly in large-scale image search.

The proposed approach consists of three main parts, tuned to optimize accuracy:

1. The representation: how to aggregate local image descriptors into a single vector

The proposed representation, VLAD (vector of locally aggregated descriptors), accumulates, for each visual word ci, the differences x - ci of the descriptors x assigned to ci. It can be seen as a simplification of the Fisher kernel that keeps only the "mean" component. Figure 1 of the paper shows that similar images have similar VLAD descriptors.
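The aggregation step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `vlad`, the brute-force nearest-centroid assignment, and the final L2 normalization (which this summary does not mention) are assumptions made for the example. The centroids are assumed to come from a k-means codebook trained beforehand.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Aggregate local descriptors (n, d) into one vector of size k * d.

    `centroids` is a (k, d) array of visual words, assumed to be
    learned beforehand (for example with k-means).
    """
    k, d = centroids.shape
    # Squared distance from every descriptor to every centroid: (n, k)
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # index of the visual word for each x

    v = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            # Accumulate the residuals x - ci for visual word ci
            v[i] = (assigned - centroids[i]).sum(axis=0)

    v = v.ravel()
    # Assumption: L2-normalize the result (skipped for an all-zero vector)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

With k visual words and d-dimensional local descriptors, the output always has k * d components, regardless of how many descriptors the image contains.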

2. Dimensionality reduction of the vectors

The paper uses principal component analysis (PCA) for dimensionality reduction: the eigenvectors associated with the D' largest eigenvalues of the covariance matrix form the rows of a matrix M that maps a D-dimensional vector x to the D'-dimensional vector x' = Mx. The projected vector x' is then encoded as q(x') by a quantizer q and searched with the ADC (asymmetric distance computation) approach, in which the database vectors are quantized but the query is not.
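The PCA step can be sketched as follows. The helper names `fit_pca` and `project` are hypothetical, and subtracting the training mean before projecting is an assumption added here (the summary only states x' = Mx); treat this as a minimal illustration rather than the paper's implementation.

```python
import numpy as np

def fit_pca(X, d_out):
    """Return the training mean and a (d_out, D) projection matrix M.

    The rows of M are the eigenvectors of the covariance matrix of X
    that have the d_out largest eigenvalues.
    """
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:d_out]
    M = eigvecs[:, order].T
    return mean, M

def project(x, mean, M):
    """Map a D-dimensional vector x to the D'-dimensional x' = M (x - mean)."""
    return M @ (x - mean)
```

Because the rows of M are orthonormal, `M.T @ project(x, mean, M) + mean` gives the best rank-D' linear reconstruction of x in the least-squares sense on the training data.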

3. The indexing algorithm

Datasets: Holidays, UKB, Flickr

The choice of D' is constrained by the structure of ADC, which requires D' to be a multiple of m, the number of subvectors into which the quantizer splits x'.
The optimization is based solely on the mean squared error (MSE) quantization criterion.
There is a tradeoff in choosing D'. If D' is large, the projection error is small but a large quantization error is introduced; if D' is small, the opposite holds.
The paper reports that the proposed approach significantly outperforms the state of the art.
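To make the ADC idea and the "D' must be a multiple of m" constraint concrete, here is a rough sketch of a product quantizer with asymmetric distance computation. Everything in it is an assumption for illustration: the function names `train_pq`, `encode`, and `adc_distances`, the parameter `ks` (centroids per subvector), and the plain k-means loop are not taken from the paper.

```python
import numpy as np

def train_pq(X, m, ks, iters=10, seed=0):
    """Learn one codebook of ks centroids for each of the m subvectors."""
    n, d = X.shape
    assert d % m == 0, "D' must be a multiple of m"
    rng = np.random.default_rng(seed)
    sub = d // m
    codebooks = []
    for j in range(m):
        Xj = X[:, j * sub:(j + 1) * sub]
        # Initialize centroids from random training points, then run k-means
        C = Xj[rng.choice(n, ks, replace=False)].copy()
        for _ in range(iters):
            labels = ((Xj[:, None] - C[None]) ** 2).sum(-1).argmin(1)
            for c in range(ks):
                pts = Xj[labels == c]
                if len(pts):
                    C[c] = pts.mean(axis=0)
        codebooks.append(C)
    return np.stack(codebooks)  # shape (m, ks, sub)

def encode(X, codebooks):
    """Quantize each subvector to the index of its nearest centroid."""
    m, ks, sub = codebooks.shape
    codes = np.empty((X.shape[0], m), dtype=np.int64)
    for j in range(m):
        Xj = X[:, j * sub:(j + 1) * sub]
        codes[:, j] = ((Xj[:, None] - codebooks[j][None]) ** 2).sum(-1).argmin(1)
    return codes

def adc_distances(query, codes, codebooks):
    """Squared distances from an unquantized query to the encoded vectors."""
    m, ks, sub = codebooks.shape
    # Lookup table: query subvector j versus every centroid of codebook j, (m, ks)
    table = ((query.reshape(m, 1, sub) - codebooks) ** 2).sum(-1)
    # Sum the m table entries selected by each database code
    return table[np.arange(m), codes].sum(axis=1)
```

Each database vector is stored as m small integers, and a query costs one lookup table plus m additions per database vector, which is where the memory and speed savings come from in this kind of scheme.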

Comments:
Simplifying the GMM-based Fisher kernel into a new method is a terrific idea! However, the paper doesn't explain why the weights and variances of the GMM are not considered. What if they were included? Do they really have no impact on the results?


Thursday, March 14, 2013
