CVPR 2024 Tutorial:

Diffusion-based Video Generative Models

Date: Tuesday, June 18, 2024
Time: 2:00 PM–5:00 PM (PST)
Location: Summit 437-439, Seattle Convention Center

Links to previous tutorial: [Slides] [Paper List] [Recording]


The introduction of diffusion models has had a profound impact on video creation, democratizing a wide range of applications, sparking startups, and leading to innovative products. This tutorial offers an in-depth exploration of diffusion-based video generative models, a field that stands at the forefront of creativity. We expect a wide range of attendees. For students, researchers, and practitioners eager to enter and contribute to this domain, we will help them gain the necessary knowledge, understand the challenges, and choose a promising research direction. Our tutorial is also open to video creators and enthusiasts, helping them harness the power of video diffusion models to craft visually stunning and innovative videos.


Mike Zheng Shou
National U. of Singapore
Jay Zhangjie Wu
National U. of Singapore


Title | Speaker | Time (PST)
Diffusion models, video foundation models, pre-training, etc. | Mike Zheng Shou | 2:00 PM–3:00 PM
Fine-tuning, editing, controls, personalization, motion customization, etc. | Jay Zhangjie Wu | 3:00 PM–4:00 PM
Evaluation & Safety: Benchmark, metrics, attack, watermark, copyright protection, etc. | Deepti Ghadiyaram | 4:00 PM–5:00 PM

About Us

Mike Zheng Shou
National U. of Singapore

Prof. Shou is a tenure-track Assistant Professor at the National University of Singapore. He was previously a Research Scientist at Facebook AI in the Bay Area. He obtained his Ph.D. degree at Columbia University in the City of New York, working with Prof. Shih-Fu Chang. He was awarded the Wei Family Private Foundation Fellowship. He was a best paper finalist at CVPR'22 and received a best student paper nomination at CVPR'17. His team won first place in international challenges including ActivityNet 2017, EPIC-Kitchens 2022, and Ego4D 2022 & 2023. He regularly serves as Area Chair for top-tier artificial intelligence conferences including CVPR, ECCV, ICCV, and ACM MM. He is a Fellow of the National Research Foundation (NRF) Singapore. He is on the Forbes 30 Under 30 Asia list.

Jay Zhangjie Wu
National U. of Singapore

Jay is a PhD student at Show Lab, National University of Singapore, advised by Prof. Mike Zheng Shou. He was previously an intern at Tencent ARC Lab, working with Yixiao Ge and Xintao Wang. He obtained his Bachelor's degree in Computer Science from Shen Yuan Honors College at Beihang University. His research focuses on generative AI for video content creation. His representative works include Tune-A-Video, Show-1, and MotionDirector.

Deepti Ghadiyaram

Deepti is an incoming Assistant Professor at Boston University, starting July 2024. She is currently a Staff Research Scientist and Tech Lead at Runway, where she works on improving the quality and safety of generative models. Previously, she was a Senior Research Scientist and Tech Lead at Fundamental AI Research (FAIR) at Meta AI, where she worked on a broad variety of topics in computer vision, machine learning, and image and video processing. Her research interests include image and video understanding models, fair and inclusive computer vision models, ML explainability, and perceptual image and video quality. She has served her professional community in many ways, including as area chair and program committee member at major machine learning and AI conferences.