Muhammad Uzair Khattak
PhD Candidate, EPFL, Switzerland - MSc from MBZUAI, Abu Dhabi - BSc from SEECS, NUST, Pakistan.
Lausanne, Switzerland
Hi, I am Muhammad Uzair, a PhD candidate at VILAB at EPFL supervised by Prof. Amir Zamir and PD. Dr. Federico Tombari. Previously, I completed my MSc in Computer Vision at the IVAL lab at Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI), where I was kindly supervised by Dr. Salman Khan and Dr. Fahad Khan. I am also grateful to be co-supervised and mentored by Dr. Muzammal Naseer.
My research focuses on adapting foundational multi-modal models to vision tasks, including image recognition, object detection, and video action recognition. The goal is to steer these models toward downstream tasks with limited data (few-/zero-shot) while preserving their pre-trained generalization to novel tasks.
Currently, I am focusing on scaling up multi-task foundational models and on complex video reasoning using Large Multi-modal Models (LMMs).
Email / Google Scholar / GitHub / Twitter / CV
News
Date | Update |
---|---|
Sep 1, 2024 | I have started my PhD studies at EPFL, Switzerland. |
May 9, 2024 | We have released CVRR-ES: Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs. More details on the project page. |
Feb 22, 2024 | Invited talk on Multi-modal learning @ Amazon Prime Video. |
Feb 5, 2024 | Invited talk on our recent ProText work at Cohere For AI. (Slides / Recording) |
Jan 5, 2024 | We have released ProText, a novel framework to adapt Vision-Language models with text-only data. More details on the project page! |
Dec 16, 2023 | Invited talk on our recent PromptSRC work at Computer Vision Talks. |
Dec 15, 2023 | Invited talk at WADLA 2023 Deep Learning Workshop. (Recording) |
Nov 4, 2023 | Paper and Code for our NeurIPS’23 work PromptAlign are released! |
Selected publications
* denotes joint first authors
2024
2023
2022
- Investigating and Improving Common Loop Closure Failures in Visual SLAM. Autonomous Robots, Oct 2022