Hao Shao     邵昊

I am a first-year PhD student in the Multimedia Laboratory at the Chinese University of Hong Kong, supervised by Prof. Hongsheng Li and Prof. Xiaogang Wang.

Before that, I received my Master's degree from Tsinghua University in 2022 and my Bachelor's degree from the University of Electronic Science and Technology of China in 2019.

My research interests lie in Autonomous Driving and Computer Vision. Specifically, I'm interested in end-to-end autonomous driving, trajectory prediction, and video understanding.

Email / Google Scholar / GitHub

Mount Gongga, Sichuan

Aug. 2023 - Present, Department of Electronic Engineering, the Chinese University of Hong Kong

PhD Student


Sept. 2019 - Jun. 2022, School of Software Engineering, Tsinghua University



Sept. 2015 - Jun. 2019, School of Software Engineering, University of Electronic Science and Technology of China

Bachelor GPA: 3.98/4


Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models
Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li
Technical Report, 2024
[paper] [code]

We propose Visual CoT, comprising a new pipeline, dataset, and benchmark, which enhances the interpretability of MLLMs by incorporating visual Chain-of-Thought reasoning and is optimized for complex visual inputs.


LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hao Shao, Yuxuan Hu, Letian Wang, Guanglu Song, Steven L. Waslander, Yu Liu, Hongsheng Li
Computer Vision and Pattern Recognition (CVPR), 2024
[website] [paper] [code]

We propose a novel end-to-end, closed-loop, language-based autonomous driving framework, LMDrive, which interacts with the dynamic environment via multi-modal multi-view sensor data and natural language instructions.


SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction
Yang Zhou*, Hao Shao*, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu
Computer Vision and Pattern Recognition (CVPR), 2024
[paper] [code]

We introduce a novel scenario-adaptive refinement strategy to refine trajectory prediction with minimal additional computation.


Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors
Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu, Steven L. Waslander
Robotics: Science and Systems (RSS), 2023
[paper] [code]

We present an efficient reinforcement learning method (ASAP-RL) that simultaneously leverages parameterized motion skills and expert priors, enabling autonomous vehicles to navigate complex, dense traffic.


ReasonNet: End-to-End Driving with Temporal and Global Reasoning
Hao Shao, Letian Wang, Ruobing Chen, Steven L. Waslander, Hongsheng Li, Yu Liu
Computer Vision and Pattern Recognition (CVPR), 2023
[paper] [code]

We present ReasonNet, a novel end-to-end driving framework that extensively exploits both temporal and global information of the driving scene.


Safety-enhanced autonomous driving using interpretable sensor fusion transformer
Hao Shao, Letian Wang, Ruobing Chen, Hongsheng Li, Yu Liu
Conference on Robot Learning (CoRL), 2022
[paper] [code]

We propose a safety-enhanced autonomous driving framework that fully processes and fuses information from multi-modal multi-view sensors to achieve comprehensive scene understanding and adversarial event detection.


Blending anti-aliasing into vision transformer
Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia
Advances in Neural Information Processing Systems (NeurIPS), 2021

We propose a plug-and-play Aliasing-Reduction Module (ARM) to alleviate the problem of aliasing in vision transformers.


Temporal interlacing network
Hao Shao, Shengju Qian, Yu Liu
AAAI Conference on Artificial Intelligence (AAAI), 2020
[paper] [code]

We present a simple yet powerful operator, the temporal interlacing network (TIN). TIN fuses spatial and temporal information by interlacing spatial representations from the past to the future, and vice versa.

Industry Experience

Apr 2019 - Now, XLab

Researcher (intern). Beijing, China


Sep 2018 - Apr 2019, Computer Vision

Research intern. Shenzhen, China


Jul 2017 - May 2018, Recommender System

Research intern. Beijing, China

Honors and Awards

  • X-Temporal: Easily implement SOTA video understanding methods with PyTorch on multiple machines and GPUs

  • Awesome End-to-End Autonomous Driving: Paper list about end-to-end autonomous driving

  • DI-drive: Decision Intelligence Platform for Autonomous Driving simulation

  • Fast Jieba: Fast Chinese word segmentation library that rewrites Jieba's core functions (138K downloads in total)