337 entries « 1 of 7 »

2026

Wenfang Sun, Hao Chen, Yingjun Du, Yefeng Zheng, Cees G M Snoek: RegionReasoner: Region-Grounded Multi-Round Visual Reasoning. In: ICLR, 2026. (Type: Proceedings Article)
Răzvan-Andrei Matişan, Vincent Tao Hu, Grigory Bartosh, Björn Ommer, Cees G M Snoek, Max Welling, Jan-Willem van de Meent, Mohammad Mahdi Derakhshani, Floor Eijkelboom: Purrception: Variational Flow Matching for Vector-Quantized Image Generation. In: ICLR, 2026. (Type: Proceedings Article)
Aritra Bhowmik, Denis Korzhenkov, Cees G M Snoek, Amirhossein Habibian, Mohsen Ghafoorian: MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models. In: ICLR, 2026. (Type: Proceedings Article)
Filipe Laitenberger, Dawid Jan Kopiczko, Cees G M Snoek, Yuki M Asano: What Layers When: Learning to Skip Compute in LLMs with Residual Gates. In: ICLR, 2026. (Type: Proceedings Article)
Haohui Liang, Runlin Huang, Yingjun Du, Yujia Hu, Weifeng Su, Cees G M Snoek: Prompt-Robust Vision-Language Models via Meta-Finetuning. In: ICLR, 2026. (Type: Proceedings Article)
Matteo Nulli, Orshulevich Vladimir, Tala Bazazo, Christian Herold, Michael Kozielski, Marcin Mazur, Szymon Tuzel, Cees G M Snoek, Seyyed Hadi Hashemi, Omar Javed, Yannick Versley, Shahram Khadivi: Adapting Vision-Language Models for E-Commerce Understanding at Scale. In: EACL, 2026. (Type: Proceedings Article)
Wenfang Sun, Yingjun Du, Gaowen Liu, Cees G M Snoek: QUOTA: Quantifying Objects with Text-to-Image Models for Any Domain. In: WACV, 2026. (Type: Proceedings Article)
Jie Ou, Shuaihong Jiang, Yingjun Du, Cees G M Snoek: GateRA: Token-Aware Modulation for Parameter-Efficient Fine-Tuning. In: AAAI, 2026. (Type: Proceedings Article)
Fida Mohammad Thoker, Letian Jiang, Chen Zhao, Piyush Bagad, Hazel Doughty, Bernard Ghanem, Cees G M Snoek: SEVERE++: Evaluating Benchmark Sensitivity in Generalization of Video Representation Learning. In: International Journal of Computer Vision, 2026, (Submitted). (Type: Journal Article)
Piyush Bagad, Makarand Tapaswi, Cees G M Snoek, Andrew Zisserman: The Sound of Water: Inferring Physical Properties from Pouring Liquids. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026, (Pending minor revision). (Type: Journal Article)
Lei Zhang, Yongqiu Huang, Yingjun Du, Fang Lei, Zhiying Yang, Cees G M Snoek, Yehui Wang: LoTeR: Localized text prompt refinement for zero-shot referring image segmentation. In: Computer Vision and Image Understanding, vol. 263, iss. January, no. 104596, 2026. (Type: Journal Article)

2025

Martin Sedlacek, Pavlo Yefanov, Georgy Ponimatkin, Jai Bardhan, Simon Pilc, Mederic Fourmy, Evangelos Kazakos, Cees G M Snoek, Josef Sivic, Vladimir Petrik: REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation. arXiv:2512.19562, 2025. (Type: Unpublished)
Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort, Cees G M Snoek, Yuki M Asano: Elastic ViTs from Pretrained Models without Retraining. In: NeurIPS, 2025. (Type: Proceedings Article)
Tim Veenboer, George Yiasemis, Eric Marcus, Vivien van Veldhuizen, Cees G M Snoek, Jonas Teuwen, Kevin B. W. Groot Lipman: TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models. arXiv:2512.00872, 2025. (Type: Unpublished)
Daniel Cores, Michael Dorkenwald, Manuel Mucientes, Cees G M Snoek, Yuki M Asano: Lost in Time: A New Temporal Benchmark for VideoLLMs. In: BMVC, 2025. (Type: Proceedings Article)
Aritra Bhowmik, Mohammad Mahdi Derakhshani, Dennis Koelma, Yuki M Asano, Martin R Oswald, Cees G M Snoek: TWIST & SCOUT: Grounding Multimodal LLM-Experts by Forget-Free Tuning. In: ICCV, 2025. (Type: Proceedings Article)
Mohammadreza Salehi, Shashanka Venkataramanan, Ioana Simion, Efstratios Gavves, Cees G M Snoek, Yuki M Asano: MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised Learning. In: ICCV, 2025. (Type: Proceedings Article)
Vladimir Yugay, Duy-Kien Nguyen, Theo Gevers, Cees G M Snoek, Martin R Oswald: Visual Odometry with Transformers. arXiv:2510.03348, 2025. (Type: Unpublished)
Ana Manzano Rodriguez, Cees G M Snoek, Marlies P Schijven: Bridging the Gap: Exposing the Hidden Challenges Towards Adoption of Artificial Intelligence in Surgery. In: BJS, vol. 112, iss. 11, 2025. (Type: Journal Article)
Max Belitsky, Dawid J Kopiczko, Michael Dorkenwald, M. Jehanzeb Mirza, Cees G M Snoek, Yuki M Asano: KV Cache Steering for Controlling Frozen LLMs. arXiv:2507.08799, 2025. (Type: Unpublished)
Mohammad Mahdi Derakhshani, Dheeraj Varghese, Marzieh Fadaee, Cees G M Snoek: NeoBabel: A Multilingual Open Tower for Visual Generation. arXiv:2507.06137, 2025. (Type: Unpublished)
Melika Ayoughi, Mina Ghadimi Atigh, Mohammad Mahdi Derakhshani, Cees G M Snoek, Pascal Mettes, Paul Groth: Continual Hyperbolic Learning of Instances and Classes. arXiv:2506.10710, 2025. (Type: Unpublished)
Huabin Liu, Filip Ilievski, Cees G M Snoek: Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning. In: CVPR, 2025. (Type: Proceedings Article)
Vivien van Veldhuizen, Vanessa Botha, Chunyao Lu, Melis Erdal Cesur, Kevin Groot Lipman, Edwin D de Jong, Hugo Horlings, Clárisa I Sanchez, Cees G M Snoek, Lodewyk Wessels, Ritse Mann, Eric Marcus, Jonas Teuwen: Foundation Models in Medical Imaging -- A Review and Outlook. arXiv:2506.09095, 2025. (Type: Unpublished)
Aritra Bhowmik, Pascal Mettes, Martin R Oswald, Cees G M Snoek: Union-over-Intersections: Object Detection beyond Winner-Takes-All. In: ICLR, 2025, (Spotlight presentation). (Type: Proceedings Article)
Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain, Martin R Oswald, Cees G M Snoek, Xinlei Chen: An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels. In: ICLR, 2025. (Type: Proceedings Article)
Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki M Asano, Nanne van Noord, Marcel Worring, Cees G M Snoek: TULIP: Token-length Upgraded CLIP. In: ICLR, 2025. (Type: Proceedings Article)
Christina Sartzetaki, Gemma Roig, Cees G M Snoek, Iris I A Groen: One Hundred Neural Networks and Brains Watching Videos: Lessons from Alignment. In: ICLR, 2025. (Type: Proceedings Article)
Jie Liu, Pan Zhou, Yingjun Du, Ah-Hwee Tan, Cees G M Snoek, Jan-Jakob Sonke, Efstratios Gavves: CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation. In: ICLR, 2025. (Type: Proceedings Article)
Zehao Xiao, Shilin Yan, Jack Hong, Jiayin Cai, Xiaolong Jiang, Yao Hu, Jiayi Shen, Cheems Wang, Cees G M Snoek: DynaPrompt: Dynamic Test-Time Prompt Tuning. In: ICLR, 2025. (Type: Proceedings Article)
Lasse Suonperä Liebst, Wim Bernasco, Peter Ejbye-Ernst, Nigel van Herwijnen, Thomas van der Veen, Dennis Koelma, Cees G M Snoek, Marie Rosenkrantz Lindegaard: Association Between Social Distancing Compliance and Public Place Crowding During the COVID-19 Pandemic: Cross-Sectional Observational Study Using Computer Vision to Analyze Surveillance Footage. In: JMIR Public Health and Surveillance, 2025, ISSN: 2369-2960, (In press). (Type: Journal Article)
Alireza Salehi, Mohammadreza Salehi, Reshad Hosseini, Cees G M Snoek, Makoto Yamada, Mohammad Sabokrou: Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections. arXiv:2504.11055, 2025. (Type: Unpublished)
Piyush Bagad, Makarand Tapaswi, Cees G M Snoek, Andrew Zisserman: The Sound of Water: Inferring Physical Properties from Pouring Liquids. In: ICASSP, 2025. (Type: Proceedings Article)
Yunhua Zhang, Hazel Doughty, Cees G M Snoek: Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight. In: International Journal of Computer Vision, vol. 133, iss. 4, pp. 2136-2157, 2025. (Type: Journal Article)
Aritra Bhowmik, Fida Mohammad Thoker, Carlos Hinojosa, Bernard Ghanem, Cees G M Snoek: Structured-Noise Masked Modeling for Video, Audio and Beyond. arXiv:2503.16311, 2025. (Type: Unpublished)
Sameer Ambekar, Zehao Xiao, Xiantong Zhen, Cees G M Snoek: GeneralizeFormer: Layer-Adaptive Model Generation across Test-Time Distribution Shifts. In: WACV, 2025. (Type: Proceedings Article)
Wenzhe Yin, Zehao Xiao, Jiayi Shen, Yunlu Chen, Cees G M Snoek, Jan-Jakob Sonke, Efstratios Gavves: Geometric Neural Process Fields. In: Transactions on Machine Learning Research, 2025, (Submitted). (Type: Journal Article)
Duy-Kien Nguyen, Martin R Oswald, Cees G M Snoek: SimPLR: A Simple and Plain Transformer for Scaling-Efficient Object Detection and Segmentation. In: Transactions on Machine Learning Research, 2025, ISSN: 2835-8856. (Type: Journal Article)

2024

Yingjun Du, Wenfang Sun, Cees G M Snoek: IPO: Interpretable Prompt Optimization for Vision-Language Models. In: NeurIPS, 2024. (Type: Proceedings Article)
Mohammadreza Salehi, Nikolaos Apostolikas, Efstratios Gavves, Cees G M Snoek, Yuki M Asano: Redefining Normal: A Novel Object-Level Approach for Multi-Object Novelty Detection. In: ACCV, 2024, (Oral presentation). (Type: Proceedings Article)
Aozhu Chen, Hazel Doughty, Xirong Li, Cees G M Snoek: Beyond Coarse-Grained Matching in Video-Text Retrieval. In: ACCV, 2024, (Oral presentation). (Type: Proceedings Article)
Hazel Doughty, Fida Mohammad Thoker, Cees G M Snoek: LocoMotion: Learning Motion-Focused Video-Language Representations. In: ACCV, 2024, (Oral presentation). (Type: Proceedings Article)
Zehao Xiao, Cees G M Snoek: Beyond Model Adaptation at Test Time: A Survey. arXiv:2411.03687, 2024. (Type: Unpublished)
Yingjun Du, Gaowen Liu, Yuzhang Shang, Yuguang Yao, Ramana Kompella, Cees G M Snoek: Prompt Diffusion Robustifies Any-Modality Prompt Learning. arXiv:2410.20164, 2024. (Type: Unpublished)
Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker, Efstratios Gavves, Cees G M Snoek, Yuki M Asano: SIGMA: Sinkhorn-Guided Masked Video Modeling. In: ECCV, 2024. (Type: Proceedings Article)
Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano, Hazel Doughty, Cees G M Snoek: SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery. In: ECCV, 2024. (Type: Proceedings Article)
Luc Sträter, Mohammadreza Salehi, Efstratios Gavves, Cees G M Snoek, Yuki M Asano: GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features. In: ECCV, 2024. (Type: Proceedings Article)
Sameer Ambekar, Zehao Xiao, Jiayi Shen, Xiantong Zhen, Cees G M Snoek: Probabilistic Test-Time Generalization by Variational Neighbor-Labeling. In: CoLLAs, 2024. (Type: Proceedings Article)
Zenglin Shi, Pascal Mettes, Cees G M Snoek: Focus for Free in Density-Based Counting. In: International Journal of Computer Vision, vol. 132, iss. 7, pp. 2600-2617, 2024. (Type: Journal Article)
Yunhua Zhang, Hazel Doughty, Cees G M Snoek: Low-Resource Vision Challenges for Foundation Models. In: CVPR, 2024, (Best paper, FGVC 2024 workshop). (Type: Proceedings Article)
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.