2005
Frank J Seinstra, Cees G M Snoek, Dennis C Koelma, Jan-Mark Geusebroek, Marcel Worring: User Transparent Parallel Processing of the 2004 NIST TRECVID Data Set. In: IPDPS, pp. 90–97, Denver, USA, 2005.

@inproceedings{SeinstraIPDPS05,
title = {User Transparent Parallel Processing of the 2004 NIST TRECVID Data Set},
author = {Frank J Seinstra and Cees G M Snoek and Dennis C Koelma and Jan-Mark Geusebroek and Marcel Worring},
url = {http://staff.science.uva.nl/~fjseins/Papers/Conferences/ipdps2005.pdf},
year = {2005},
date = {2005-04-01},
booktitle = {IPDPS},
pages = {90--97},
address = {Denver, USA},
abstract = {The Parallel-Horus framework, developed at the University of Amsterdam, is a unique software architecture that allows non-expert parallel programmers to develop fully sequential multimedia applications for efficient execution on homogeneous Beowulf-type commodity clusters. Previously obtained results for realistic, but relatively small-sized applications have shown the feasibility of the Parallel-Horus approach, with parallel performance consistently being found to be optimal with respect to the abstraction level of message passing programs. In this paper we discuss the most serious challenge Parallel-Horus has had to deal with so far: the processing of over 184 hours of video included in the 2004 NIST TRECVID evaluation, i.e. the de facto international standard benchmark for content-based video retrieval. Our results and experiences confirm that Parallel-Horus is a very powerful support tool for state-of-the-art research and applications in multimedia processing.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring: Multimedia Pattern Recognition in Soccer Video using Time Intervals. In: Classification the Ubiquitous Challenge, Proceedings of the 28th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Dortmund, March 9-11, 2004, pp. 97–108, Springer-Verlag, Berlin, Germany, 2005.

@inproceedings{SnoekGFKL05,
title = {Multimedia Pattern Recognition in Soccer Video using Time Intervals},
author = {Cees G M Snoek and Marcel Worring},
year = {2005},
date = {2005-01-01},
booktitle = {Classification the Ubiquitous Challenge, Proceedings of the 28th Annual Conference of the Gesellschaft f\"{u}r Klassifikation e.V., University of Dortmund, March 9-11, 2004},
pages = {97--108},
publisher = {Springer-Verlag},
address = {Berlin, Germany},
series = {Studies in Classification, Data Analysis, and Knowledge Organization},
abstract = {In this paper we propose the Time Interval Multimedia Event (TIME) framework as a robust approach for recognition of multimedia patterns, e.g. highlight events, in soccer video. The representation used in TIME extends the Allen temporal interval relations and allows for proper inclusion of context and synchronization of the heterogeneous information sources involved in multimedia pattern recognition. For automatic classification of highlights in soccer video, we compare three different machine learning techniques, i.c. C4.5 decision tree, Maximum Entropy, and Support Vector Machine. It was found that by using the TIME framework the amount of video a user has to watch in order to see almost all highlights can be reduced considerably, especially in combination with a Support Vector Machine.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring: Multimodal Video Indexing: A Review of the State-of-the-art. In: Multimedia Tools and Applications, vol. 25, no. 1, pp. 5–35, 2005.

@article{SnoekMMTA05,
title = {Multimodal Video Indexing: A Review of the State-of-the-art},
author = {Cees G M Snoek and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/snoek-review-mmta.pdf},
year = {2005},
date = {2005-01-01},
journal = {Multimedia Tools and Applications},
volume = {25},
number = {1},
pages = {5--35},
abstract = {Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. In this paper we survey several methods aiming at automating this time and resource consuming process. Good reviews on single modality based video indexing have appeared in literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in collaborative fashion. Therefore, instead of separately treating the different information sources involved, and their specific algorithms, we focus on the similarities and differences between the modalities. To that end we put forward a unifying and multimodal framework, which views a video document from the perspective of its author. This framework forms the guiding principle for identifying index types, for which automatic methods are found in literature. It furthermore forms the basis for categorizing these different methods.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2004
Cees G M Snoek, Marcel Worring, Jan-Mark Geusebroek, Dennis C Koelma, Frank J Seinstra: The MediaMill TRECVID 2004 Semantic Video Search Engine. In: TRECVID, Gaithersburg, USA, 2004.

@inproceedings{SnoekTRECVID04,
title = {The MediaMill TRECVID 2004 Semantic Video Search Engine},
author = {Cees G M Snoek and Marcel Worring and Jan-Mark Geusebroek and Dennis C Koelma and Frank J Seinstra},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/UvA-MM_TRECVID2004.pdf},
year = {2004},
date = {2004-11-01},
booktitle = {TRECVID},
address = {Gaithersburg, USA},
abstract = {This year the UvA-MediaMill team participated in the Feature Extraction and Search Task. We developed a generic approach for semantic concept classification using the semantic value chain. The semantic value chain extracts concepts from video documents based on three consecutive analysis links, named the content link, the style link, and the context link. Various experiments within the analysis links were performed, showing amongst others the merit of processing beyond key frames, the value of style elements, and the importance of learning semantic context. For all experiments a lexicon of 32 concepts was exploited, 10 of which are part of the Feature Extraction Task. Top three system-based ranking in 8 out of the 10 benchmark concepts indicates that our approach is very promising. Apart from this, the lexicon of 32 concepts proved very useful in an interactive search scenario with our semantic video search engine, where we obtained the highest mean average precision of all participants.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring, Alexander G Hauptmann: Detection of TV News Monologues by Style Analysis. In: ICME, Taipei, Taiwan, 2004.

@inproceedings{SnoekICME04,
title = {Detection of TV News Monologues by Style Analysis},
author = {Cees G M Snoek and Marcel Worring and Alexander G Hauptmann},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/snoek-style-icme2004.pdf},
year = {2004},
date = {2004-06-01},
booktitle = {ICME},
address = {Taipei, Taiwan},
abstract = {We propose a method for detection of semantic concepts in produced video based on style analysis. Recognition of concepts is done by applying a classifier ensemble to the detected style elements. As a case study we present a method for detecting the concept of news subject monologues. Our approach had the best average precision performance amongst 26 submissions in the 2003 TRECVID benchmark.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2003
Alexander Hauptmann, Robert V Baron, Ming-Yu Chen, Michael Christel, Pinar Duygulu, Chang Huang, Rong Jin, Wei-Hao Lin, Dorbin Ng, Neema Moraveji, Norman Papernick, Cees G M Snoek, George Tzanetakis, Jun Yang, Rong Yan, Howard D Wactlar: Informedia at TRECVID 2003: Analyzing and Searching Broadcast News Video. In: TRECVID, Gaithersburg, USA, 2003.

@inproceedings{HauptmannTRECVID03,
title = {Informedia at TRECVID 2003: Analyzing and Searching Broadcast News Video},
author = {Alexander Hauptmann and Robert V Baron and Ming-Yu Chen and Michael Christel and Pinar Duygulu and Chang Huang and Rong Jin and Wei-Hao Lin and Dorbin Ng and Neema Moraveji and Norman Papernick and Cees G M Snoek and George Tzanetakis and Jun Yang and Rong Yan and Howard D Wactlar},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/TREC03Informedia.pdf},
year = {2003},
date = {2003-11-01},
booktitle = {TRECVID},
address = {Gaithersburg, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring: Time Interval Maximum Entropy based Event Indexing in Soccer Video. In: ICME, pp. 481–484, Baltimore, USA, 2003.

@inproceedings{SnoekICME03a,
title = {Time Interval Maximum Entropy based Event Indexing in Soccer Video},
author = {Cees G M Snoek and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/icme2003.pdf},
year = {2003},
date = {2003-07-01},
booktitle = {ICME},
pages = {481--484},
address = {Baltimore, USA},
abstract = {Multimodal indexing of events in video documents poses problems with respect to representation, inclusion of contextual information, and synchronization of the heterogeneous information sources involved. In this paper we present the Time Interval Maximum Entropy (TIME) framework that tackles aforementioned problems. To demonstrate the viability of TIME for event classification in multimodal video, an evaluation was performed on the domain of soccer broadcasts. It was found that by applying TIME, the amount of video a user has to watch in order to see almost all highlights can be reduced considerably.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring: Goalgle: A Soccer Video Search Engine. In: ICME, Baltimore, USA, 2003.

@inproceedings{SnoekICME03b,
title = {Goalgle: A Soccer Video Search Engine},
author = {Cees G M Snoek and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/goalgle.pdf},
year = {2003},
date = {2003-07-01},
booktitle = {ICME},
address = {Baltimore, USA},
abstract = {Goalgle is a prototype search engine for soccer video. Browsing and retrieval functionality is provided by means of a web based interface. This interface allows users to jump to video segments from a collection of prerecorded and analyzed soccer matches based on queries on specific players, events, matches, and/or text. In this contribution we discuss the system architecture and functionality of the Goalgle soccer video search engine.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2002
Jeroen Vendrig, Jurgen den Hartog, David van Leeuwen, Ioannis Patras, Stephan Raaijmakers, Jeroen van Rest, Cees G M Snoek, Marcel Worring: TREC Feature Extraction by Active Learning. In: TREC, Gaithersburg, USA, 2002.

@inproceedings{VendrigTREC02,
title = {TREC Feature Extraction by Active Learning},
author = {Jeroen Vendrig and Jurgen den Hartog and David van Leeuwen and Ioannis Patras and Stephan Raaijmakers and Jeroen van Rest and Cees G M Snoek and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/trec2002.pdf},
year = {2002},
date = {2002-11-01},
booktitle = {TREC},
address = {Gaithersburg, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cees G M Snoek, Marcel Worring: A Review on Multimodal Video Indexing. In: ICME, pp. 21–24, Lausanne, Switzerland, 2002.

@inproceedings{SnoekICME02,
title = {A Review on Multimodal Video Indexing},
author = {Cees G M Snoek and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/icme2002.pdf},
year = {2002},
date = {2002-08-01},
booktitle = {ICME},
volume = {2},
pages = {21--24},
address = {Lausanne, Switzerland},
abstract = {Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. Efficient, single modality based, video indexing methods have appeared in literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in collaborative fashion. In this paper we present a framework for multimodal video indexing, which views a video document from the perspective of its author. The framework serves as a blueprint for a generic and flexible multimodal video indexing system, and generalizes different state-of-the-art video indexing methods. It furthermore forms the basis for categorizing these different methods.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Marcel Worring, Andrew Bagdanov, Jan C van Gemert, Jan-Mark Geusebroek, Minh Hoang, Guus Schreiber, Cees G M Snoek, Jeroen Vendrig, Jan Wielemaker, Arnold W M Smeulders: Interactive Indexing and Retrieval of Multimedia Content. In: Proceedings of the Annual Conference on Current Trends in Theory and Practice of Informatics, pp. 135–148, Springer-Verlag, Milovy, Czech Republic, 2002.

@inproceedings{WorringSOFSEM02,
title = {Interactive Indexing and Retrieval of Multimedia Content},
author = {Marcel Worring and Andrew Bagdanov and Jan C van Gemert and Jan-Mark Geusebroek and Minh Hoang and Guus Schreiber and Cees G M Snoek and Jeroen Vendrig and Jan Wielemaker and Arnold W M Smeulders},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/sofsem2002.pdf},
year = {2002},
date = {2002-01-01},
booktitle = {Proceedings of the Annual Conference on Current Trends in Theory and Practice of Informatics},
volume = {2540},
pages = {135-148},
publisher = {Springer-Verlag},
address = {Milovy, Czech Republic},
series = {Lecture Notes in Computer Science},
abstract = {The indexing and retrieval of multimedia items is difficult due to the semantic gap between the user's perception of the data and the descriptions we can derive automatically from the data using computer vision, speech recognition, and natural language processing. In this contribution we consider the nature of the semantic gap in more detail and show examples of methods that help in limiting the gap. These methods can be automatic, but in general the indexing and retrieval of multimedia items should be a collaborative process between the system and the user. We show how to employ the user's interaction for limiting the semantic gap.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2001
Jan Baan, Alex van Ballegooij, Jan-Mark Geusebroek, Djoerd Hiemstra, Jurgen den Hartog, Johan List, Cees G M Snoek, Ioannis Patras, Stephan Raaijmakers, Leon Todoran, Jeroen Vendrig, Arjen de Vries, Thijs Westerveld, Marcel Worring: Lazy Users and Automatic Video Retrieval Tools in (the) Lowlands. In: TREC, Gaithersburg, USA, 2001.

@inproceedings{BaanTREC01,
title = {Lazy Users and Automatic Video Retrieval Tools in (the) Lowlands},
author = {Jan Baan and Alex van Ballegooij and Jan-Mark Geusebroek and Djoerd Hiemstra and Jurgen den Hartog and Johan List and Cees G M Snoek and Ioannis Patras and Stephan Raaijmakers and Leon Todoran and Jeroen Vendrig and Arjen de Vries and Thijs Westerveld and Marcel Worring},
url = {http://isis-data.science.uva.nl/cgmsnoek/pub/lowlands01.pdf},
year = {2001},
date = {2001-11-01},
booktitle = {TREC},
address = {Gaithersburg, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}