Empowering Visual Categorization with the GPU

The paper “Empowering Visual Categorization with the GPU” by Koen E. A. van de Sande, Theo Gevers, and Cees G. M. Snoek is now officially published in IEEE Transactions on Multimedia. In this paper, we analyze the computational cost of the bag-of-words model for visual categorization, the most powerful method in the literature, and identify two major bottlenecks: the quantization step and the classification step. We address these bottlenecks by proposing two efficient algorithms, one for quantization and one for classification, that exploit GPU hardware and the CUDA parallel programming model. The algorithms are designed to 1) keep categorization accuracy intact, 2) decompose the problem, and 3) give the same numerical results. In experiments on large-scale datasets, we show that a parallel implementation on the GeForce GTX260 GPU classifies unseen images 4.8 times faster than a quad-core CPU version on the Core i7 920, while giving exactly the same numerical results. In addition, we show how the algorithms can be generalized to other applications, such as text retrieval and video retrieval. Moreover, when the obtained speedup is used to process extra video frames in a video retrieval benchmark, the accuracy of visual categorization improves by 29%.
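To give a feel for what the quantization step computes, here is a minimal sketch in plain Python. It assigns each local descriptor to its nearest codeword and counts the assignments into a bag-of-words histogram. The function name `quantize` and the toy data are made up for illustration, and this is a CPU sketch rather than the GPU implementation from the paper. It uses one common way to decompose the nearest-codeword search into dot products, which is the kind of structure that maps well onto parallel hardware; the paper's exact formulation may differ.

```python
def quantize(descriptors, codebook):
    """Return a bag-of-words histogram: one count per codeword.

    Hypothetical helper for illustration only (not the paper's GPU code).
    """
    # Squared norms of the codewords are computed once and reused
    # for every descriptor.
    codeword_sq_norms = [sum(x * x for x in word) for word in codebook]
    histogram = [0] * len(codebook)
    for d in descriptors:
        # ||d - c||^2 = ||d||^2 - 2 d.c + ||c||^2. The term ||d||^2 is the
        # same for every codeword, so minimizing ||c||^2 - 2 d.c selects
        # the same nearest codeword using only dot products.
        scores = [
            sq_norm - 2.0 * sum(a * b for a, b in zip(d, word))
            for word, sq_norm in zip(codebook, codeword_sq_norms)
        ]
        nearest = min(range(len(codebook)), key=scores.__getitem__)
        histogram[nearest] += 1
    return histogram


# Toy example: three 2-D codewords and four descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [3.5, 0.2], [1.1, 0.8]]
histogram = quantize(descriptors, codebook)
print(histogram)  # [1, 2, 1]
```

Each descriptor is handled independently of the others, and the dot products between all descriptors and all codewords amount to a single matrix multiplication, which is why this step is a natural candidate for parallel execution.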