Course Summary

This graduate course is intended primarily for Ph.D. students who have basic familiarity with computer vision, image processing, and machine learning and want to bring their knowledge and toolkit up to the state of the art, with direct utility in their own research.

The focus of the course is computer vision by learning. We address the theoretical foundations of computer vision in conjunction with machine learning and present algorithms that achieve state-of-the-art performance while maintaining efficient execution with minimal supervision. This year we emphasize computer vision by deep learning, including challenges such as 3D object detection, fine-grained recognition, geometric deep learning, self-supervised representation learning, and video understanding. We give an overview of the latest developments and future trends in the field on the basis of several recent challenges, and we indicate how to obtain improvements in the near future.

Course Registration

Course registration is handled by the ASCI research school, via this form. Note that the number of seats for this course is limited.

Lab requirements: bring your own device

For the lab, you are expected to bring your own device: either a laptop with a good GPU or a laptop that can connect to a workstation with a good GPU. If you cannot connect to a GPU, you should create a Google Colab account and make sure you can run a GPU-powered notebook. You can turn the GPU on via Edit -> Notebook settings -> Hardware accelerator -> GPU. The lab assignments are detailed on a separate page.
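As a quick sanity check before the lab, the following minimal sketch tests whether a GPU is visible from a notebook or terminal. It assumes an NVIDIA GPU with the `nvidia-smi` driver tool installed, which the course page does not specify; the helper name `gpu_visible` is hypothetical, and the check does not verify that any particular deep learning framework can use the GPU.

```python
import shutil
import subprocess


def gpu_visible(tool="nvidia-smi"):
    """Return True if the driver tool exists and lists at least one GPU.

    Assumption: an NVIDIA GPU whose driver ships the `nvidia-smi` tool.
    """
    path = shutil.which(tool)
    if path is None:
        # Tool not on PATH: no NVIDIA driver, or the GPU runtime is off.
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            [path, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

If this prints `GPU visible: False` in a hosted notebook, the hardware accelerator setting described above is the first thing to check.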

Course Schedule

Monday May 9, 2022: Fundamentals

Time       Room  Topic                                       Lecturer
0900-0930  Casa  Welcome with coffee and tea
0930-1010  Casa  Introduction and vision by learning basics  Cees Snoek
1010-1020        Short break
1020-1100  Casa  Vision in the deep learning era             Efstratios Gavves
1130-1215  Casa  Deep learning beyond classification         Efstratios Gavves
1215-1330        Lunch break (included)
1330-1700  Casa  Lab session - day 1: MLP / CNN

Tuesday May 10, 2022: Computer vision by deep learning

Time       Room  Topic                                   Lecturer
0900-0930  Casa  Welcome with coffee and tea
0930-1010  Casa  Transformers                            Cees Snoek
1010-1020        Short break
1020-1100  Casa  Learning from little data               Subhransu Maji
1130-1215  Casa  3D representation learning              Martin Oswald
1215-1330        Lunch break (included)
1330-1700  Casa  Lab session - day 2: CNN / Transformer

Wednesday May 11, 2022: Machine learning for computer vision

Time       Room  Topic                                         Lecturer
0900-0930  Casa  Welcome with coffee and tea
0930-1030  Casa  Group equivariant deep learning               Erik Bekkers
1100-1200  Casa  Learning of time and dynamics                 Efstratios Gavves
1200-1330        Lunch break (included)
1330-1700  Casa  Lab session - day 3: Geometric deep learning

Thursday May 12, 2022: Computer video by learning

Time       Room  Topic                                         Lecturer
0900-0930  Casa  Welcome with coffee and tea
0930-1010  Casa  Self-supervised learning                      Yuki Asano
1010-1020        Short break
1020-1100  Casa  Action understanding in video                 Hazel Doughty
1130-1215  Casa  Beyond spatial classification                 Cees Snoek
1215-1330        Lunch break (included)
1330-1700  Casa  Lab session - day 4: Self-supervised learning
1700-1800  Casa  Closing borrel with drinks and snacks

Friday May 13, 2022: Invited tutorial by Serge Belongie

Time       Room             Topic                                                    Lecturer
0900-0930  Startup Village  Welcome with coffee and tea
0930-1045  Startup Village  Fine-grained visual analysis                             Serge Belongie
1115-1215  Startup Village  Representation learning for narratives in social media  Serge Belongie

Invited tutorial

  • Serge Belongie

    is a professor of Computer Science at the University of Copenhagen, where he also serves as the head of the Danish Pioneer Centre for Artificial Intelligence. Previously, he was the Andrew H. and Ann R. Tisch Professor of Computer Science at Cornell Tech, where he also served as Associate Dean. He has also been a member of the Visiting Faculty program at Google.

Lecturers

  • Cees Snoek

    is a full professor in computer science at the University of Amsterdam, where he heads the Video & Image Sense Lab. He is also a director of three public-private AI research labs: QUVA Lab with Qualcomm, Atlas Lab with TomTom, and AIM Lab with the Inception Institute of Artificial Intelligence. He was a visiting scientist at Carnegie Mellon University, Pittsburgh and the University of California, Berkeley. His research interest is video and image understanding by computer vision and machine learning.

  • Efstratios Gavves

    is an Associate Professor at the University of Amsterdam in the Netherlands. He received his Ph.D. in 2014 at the University of Amsterdam and was a post-doctoral researcher at KU Leuven from 2014 to 2015. He has authored several papers in major computer vision and machine learning conferences and journals. He is a recipient of the ERC Career Starting Grant 2020 and the NWO VIDI grant 2020 for research on the Computational Learning of Temporality for spatiotemporal sequences.

Guest Lecturers

  • Yuki M. Asano

    is an assistant professor for computer vision and machine learning at the QUVA lab at the University of Amsterdam. He did his PhD at the Visual Geometry Group at the University of Oxford. He is interested in computer vision, self-supervised learning and multi-modal learning.

  • Erik Bekkers

    is an assistant professor in Geometric Deep Learning at the University of Amsterdam. Before this he did a post-doc in applied differential geometry at the Technical University Eindhoven. Erik is a recipient of a MICCAI Young Scientist Award 2018, a Philips Impact Award, and a personal VENI research grant.

  • Hazel Doughty

    is a postdoctoral researcher at the University of Amsterdam. She received her PhD in 2020 at the University of Bristol and was a visiting researcher at INRIA Willow (Paris) in 2019. Her area of interest is video understanding.

  • Subhransu Maji

    is an Associate Professor at the University of Massachusetts, Amherst and the co-director of the Computer Vision Lab. Prior to this he spent three years as a Research Assistant Professor at TTI Chicago.

  • Martin Oswald

    is an assistant professor at the Atlas lab of the University of Amsterdam. He was previously at the Computer Vision and Geometry Group at ETH Zurich. He obtained his PhD at Technische Universität München.