[SIST Seminar] Open-world visual categorization, reconstruction and generation

ON: 2024-06-04
TAG: ShanghaiTech University
CATEGORY: Lecture

Topic: Open-world visual categorization, reconstruction and generation

Speaker: Assistant Professor Han Kai, Department of Statistics & Actuarial Science, University of Hong Kong (HKU)

Date and time: June 5, 14:00

Venue: Room A200, #1 Building of SIST

Host: He Xuming

Abstract:

In this talk, I will introduce our recent work on open-world visual categorization, reconstruction, and generation. First, I will discuss our recent studies on open-world learning, including generalized category discovery on images and open-vocabulary action recognition on videos, both leveraging foundation models. Next, I will present our recent work on generalizable visual SLAM, focusing on the development of a feed-forward SLAM system that eliminates the need for per-scene optimization. We propose an image-based depth fusion framework to achieve this goal. Finally, I will discuss our recent work on 3D human modeling, from both the reconstruction and generation perspectives.

Biography:

Han Kai is an assistant professor in the Department of Statistics and Actuarial Science at the University of Hong Kong, where he directs the Visual AI Lab. His research interests lie in computer vision, machine learning, and artificial intelligence. His current research focuses on open-world learning, 3D vision, generative AI, foundation models, and related fields. Previously, he was a researcher at Google Research, an assistant professor in the Department of Computer Science at the University of Bristol, and a postdoctoral researcher in the Visual Geometry Group (VGG) at the University of Oxford. He received his PhD from the Department of Computer Science at the University of Hong Kong. During his PhD studies, he also worked with the WILLOW team of Inria Paris and École Normale Supérieure (ENS) in Paris. He serves as Area Chair for CVPR 2024 and ECCV 2024.