Published May 20th, 2013
The paper “Bootstrapping Visual Categorization With Relevant Negatives” by Xirong Li, Cees Snoek, Marcel Worring, Dennis Koelma, and Arnold Smeulders appears in the current issue of IEEE Transactions on Multimedia. Learning classifiers for many visual concepts is important for image categorization and retrieval. As a classifier tends to misclassify negative examples which are visually similar to positive ones, inclusion of such misclassified and thus relevant negatives should be stressed during learning. User-tagged images are abundant online, but which images are the relevant negatives remains unclear. Sampling negatives at random is the de facto standard in the literature. In this paper, we go beyond random sampling by proposing Negative Bootstrap. Given a visual concept and a few positive examples, the new algorithm iteratively finds relevant negatives. Per iteration, we learn from a small proportion of many user-tagged images, yielding an ensemble of meta classifiers. For efficient classification, we introduce Model Compression such that the classification time is independent of the ensemble size. Compared with the state of the art, we obtain relative gains of 14% and 18% on two present-day benchmarks in terms of mean average precision. For concept search in one million images, Model Compression reduces the search time from over 20 hours to approximately 6 minutes. The effectiveness and efficiency, without the need of manually labeling any negatives, make Negative Bootstrap appealing for learning better visual concept classifiers.
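The iterative loop described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names, sampling sizes, and the toy nearest-mean base learner are assumptions standing in for the stronger per-iteration classifiers (and the Model Compression step) of the actual paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_meta_classifier(pos, neg):
    # Toy linear base learner (nearest class mean); a stand-in for the
    # stronger classifiers a real system would train per iteration.
    mu_pos, mu_neg = pos.mean(axis=0), neg.mean(axis=0)
    w = mu_pos - mu_neg
    b = -0.5 * (mu_pos + mu_neg) @ w
    return w, b

def predict(ensemble, x):
    # Average the scores of all meta classifiers in the ensemble.
    return np.mean([x @ w + b for w, b in ensemble], axis=0)

def negative_bootstrap(pos, unlabeled, iterations=5, sample_size=100, top_k=20):
    """Iteratively collect 'relevant negatives': unlabeled examples that the
    current ensemble misclassifies most confidently as positive."""
    ensemble = []
    for _ in range(iterations):
        # Sample a small proportion of the many user-tagged candidates.
        idx = rng.choice(len(unlabeled), size=sample_size, replace=False)
        candidates = unlabeled[idx]
        if ensemble:
            scores = predict(ensemble, candidates)
        else:
            scores = rng.random(sample_size)  # no model yet: pick at random
        # The highest-scoring candidates are the most relevant negatives.
        relevant = candidates[np.argsort(scores)[-top_k:]]
        ensemble.append(train_meta_classifier(pos, relevant))
    return ensemble
```

In the paper the resulting ensemble is additionally compressed into a single model, which is what keeps classification time independent of the number of iterations.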
Published April 17th, 2013
The ICMR2013 paper ‘Recommendations for Video Event Recognition Using Concept Vocabularies’ by Amirhossein Habibian, Koen van de Sande and Cees Snoek is now available. Representing videos using vocabularies composed of concept detectors appears promising for event recognition. While many have recently shown the benefits of concept vocabularies for recognition, the important question of what concepts to include in the vocabulary is often ignored. In this paper, we study how to create an effective vocabulary for arbitrary event recognition in web video. We consider four research questions related to the number, the type, the specificity, and the quality of the detectors in concept vocabularies. A rigorous experimental protocol using a pool of 1,346 concept detectors trained on publicly available annotations, a dataset containing 13,274 web videos from the Multimedia Event Detection benchmark, 25 event ground-truth definitions, and a state-of-the-art event recognition pipeline allows us to analyze the performance of various concept vocabulary definitions. From the analysis we arrive at the recommendation that for effective event recognition the concept vocabulary should i) contain more than 200 concepts, ii) be diverse by covering object, action, scene, people, animal and attribute concepts, iii) include both general and specific concepts, and iv) favor increasing the number of concepts over improving the quality of the individual detectors. We consider these recommendations for video event recognition using concept vocabularies the most important contribution of the paper, as they provide guidelines for future work.
Published April 17th, 2013
The ICMR2013 paper ‘Searching Informative Concept Banks for Video Event Detection’ by Masoud Mazloom, Efstratios Gavves, Koen van de Sande and Cees Snoek is now available. An emerging trend in video event detection is to learn an event from a bank of concept detector scores. Different from existing work, which simply relies on a bank containing all available detectors, we propose in this paper an algorithm that learns from examples which concepts in a bank are most informative per event. We model finding this bank of informative concepts out of a large set of concept detectors as a rare event search. Our approximate solution finds the optimal concept bank using cross-entropy optimization. We study the behavior of video event detection based on a bank of informative concepts by performing three experiments on more than 1,000 hours of arbitrary internet video from the TRECVID multimedia event detection task. Starting from a concept bank of 1,346 detectors, we show that 1) some concept banks are more informative than others for specific events, 2) event detection using an automatically obtained informative concept bank is more robust than using all available concepts, 3) even for small amounts of training examples an informative concept bank outperforms a full bank and a bag-of-words event representation, and 4) qualitatively, the informative concept banks make sense for the events of interest, without being programmed to do so. We conclude that for concept banks it pays to be informative.
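The cross-entropy optimization mentioned in the abstract can be illustrated generically: maintain a per-detector inclusion probability, sample candidate concept banks, score each with a fitness function, and shift the probabilities toward the best-scoring ("elite") samples. The parameter values, names, and toy fitness below are assumptions for illustration; the paper's exact formulation and its event-detection fitness are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def cross_entropy_select(fitness, n_concepts, bank_size=10, n_samples=200,
                         elite_frac=0.1, iterations=20, smoothing=0.7):
    """Generic cross-entropy method for subset selection (a sketch)."""
    # Start with uniform inclusion probabilities targeting the bank size.
    p = np.full(n_concepts, bank_size / n_concepts)
    for _ in range(iterations):
        # Sample candidate banks as independent Bernoulli draws per detector.
        samples = rng.random((n_samples, n_concepts)) < p
        scores = np.array([fitness(s) for s in samples])
        # Keep the elite fraction with the highest fitness.
        n_elite = int(n_samples * elite_frac)
        elite = samples[np.argsort(scores)[-n_elite:]]
        # Move the probabilities toward the elite samples' inclusion rates.
        p = smoothing * elite.mean(axis=0) + (1 - smoothing) * p
    return p

def toy_fitness(mask):
    # Hypothetical stand-in for event-detection accuracy: reward the first
    # five detectors, mildly penalize bank size.
    return mask[:5].sum() - 0.1 * mask.sum()
```

Thresholding the returned probabilities yields the selected concept bank; in the paper's setting, the fitness would be the event-detection performance obtained with the sampled bank.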
Published December 21st, 2012
Two positions of POSTDOCTORAL RESEARCH FELLOW in Video Search are open in the Informatics Institute of the University of Amsterdam, starting Spring 2013.
The positions are part of a 5-year Personal VIDI Grant funded by the Dutch Organization for Scientific Research and headed by Dr. Cees Snoek. The successful candidates will participate in a frontier research project on video recognition and explanation, and will work in a stimulating environment of a leading and highly-active research team including 1 faculty member and 6 Ph.D. students. The team has repeatedly won the major visual search competitions, including NIST TRECVID, PASCAL Visual Object Challenge, ImageCLEF, and the ImageNet Large Scale Visual Recognition Challenge.
Details on requirements, appointment and application are now available: http://www.uva.nl/en/about-the-uva/working-at-the-uva/vacancies/item/13-007.html
Published November 8th, 2012
As part of the Dutch Prize for ICT Research, a beautiful poster has been created by NWO and Smidswater, which will be distributed to high schools in the Netherlands. You may also download it here.