End-to-end Autonomous Driving: Challenges and Frontiers

February 2024

tl;dr: Very good high-level overview of end-to-end autonomous driving, with a focus on planning.

Overall impression

This can be seen as an extensive background review for UniAD. The review of imitation learning (IL) and reinforcement learning (RL), and the section on closed-loop evaluation, are well written.

End-to-end (E2E) systems, also referred to as visuomotor systems (i.e. vision in, motor control out), contrast with modular pipelines that involve rule-based design. E2E systems are fully differentiable programs that take raw sensor data as input and produce a plan or low-level control actions as output.

E2E does not necessarily mean one black box with only planning/control outputs. The E2E systems discussed in this paper still mostly follow a modular design, but every component, and the compound system as a whole, is differentiable. There is another E2E archetype that completely eliminates the notions of perception, prediction and planning, and it is gaining popularity in the embodied AI (robotics) field.
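To make "modular but differentiable" concrete, here is a toy sketch, not the paper's architecture. It assumes two hypothetical scalar modules, `perception` and `planner`, and shows that the gradient of a planning loss reaches the perception weight through the chain rule. All names and numbers are illustrative assumptions.

```python
# Toy sketch (assumption: each module is a scalar linear map, purely for illustration).

def perception(w_p, x):
    """Hypothetical perception module: raw input -> intermediate feature."""
    return w_p * x

def planner(w_q, feat):
    """Hypothetical planning module: feature -> planned control value."""
    return w_q * feat

def planning_loss(w_p, w_q, x, target):
    """Squared error on the final plan only; no perception labels are used."""
    return (planner(w_q, perception(w_p, x)) - target) ** 2

def grad_wrt_perception(w_p, w_q, x, target):
    """Chain rule: dL/dw_p = 2 * (y - target) * w_q * x."""
    y = planner(w_q, perception(w_p, x))
    return 2.0 * (y - target) * w_q * x

def finite_difference(w_p, w_q, x, target, eps=1e-6):
    """Central-difference check of the analytic gradient."""
    hi = planning_loss(w_p + eps, w_q, x, target)
    lo = planning_loss(w_p - eps, w_q, x, target)
    return (hi - lo) / (2.0 * eps)

# Illustrative values, not taken from the paper.
w_p, w_q, x, target = 0.5, 2.0, 1.5, 3.0
loss_before = planning_loss(w_p, w_q, x, target)
g = grad_wrt_perception(w_p, w_q, x, target)

# One gradient step on the perception weight, driven only by the planning loss.
w_p_new = w_p - 0.01 * g
loss_after = planning_loss(w_p_new, w_q, x, target)
```

The point of the sketch is that the perception weight is updated without any perception-level supervision, which is what differentiability across module boundaries buys. A real system would rely on an autodiff framework rather than hand-written gradients.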

Conventional CV tasks are typically dense prediction tasks (object detection, semantic segmentation). Autonomous driving, by contrast, predicts very sparse signals (as does computer-aided diagnosis in medical imaging). This means supervision from the sparse signal alone cannot guarantee good representation learning. This does not necessarily exclude the possibility of a black-box design.
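A back-of-the-envelope comparison can illustrate the gap in supervision density. The resolution and waypoint count below are assumptions chosen for illustration, not figures from the paper.

```python
# Rough count of supervised scalar targets per frame (all numbers are assumed).

def dense_targets_per_frame(height, width):
    """Semantic segmentation: one label per pixel."""
    return height * width

def sparse_targets_per_frame(num_waypoints, dims_per_waypoint=2):
    """Planning: a few (x, y) waypoints per frame."""
    return num_waypoints * dims_per_waypoint

dense = dense_targets_per_frame(256, 256)   # assumed 256x256 label map
sparse = sparse_targets_per_frame(6)        # assumed 6 future waypoints
ratio = dense / sparse

print(f"dense: {dense}, sparse: {sparse}, ratio: {ratio:.0f}x")
```

Under these assumptions the dense task supplies thousands of times more targets per frame than the planning task, which is one way to read the concern about representation learning above.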

Key ideas

Technical details