
CaP: Code as Policies: Language Model Programs for Embodied Control

July 2023

tl;dr: Code generation for robotic control policies. LMPs (language model generated programs) can represent robot policies, including reactive policies and waypoint-based policies. This is one way to “ground” LLMs in the real world.

Overall impression

This is one way to “ground” LLMs in the real world. VoxPoser provides another, more generalized way to generate control. CaP still relies on manually designed motion primitives, while VoxPoser circumvents this via automatic value map generation.

CaP is a first step for LLMs to move beyond orchestrating a fixed sequence of skills. Code-writing LLMs can orchestrate planning, policy logic, and control. CaP interprets language commands to generate LMPs that represent reactive low-level policies (e.g., PD or impedance controllers) and waypoint-based policies (for vision-based pick-and-place, or trajectory-based control).
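To make the reactive-policy case concrete, below is a minimal sketch of the kind of LMP an LLM might emit: a PD servo loop written against the robot's perception and control APIs. The API names (`get_obj_pos`, `get_ee_pos`, `set_ee_velocity`) are hypothetical stand-ins for the primitives CaP assumes, not the paper's exact interface, and the stubs exist only to make the sketch runnable.

```python
import numpy as np

# Hypothetical stand-ins for the perception/control APIs an LMP would call.
_obj_pos = {"red block": np.array([0.40, 0.10, 0.05])}
_ee_pos = np.zeros(3)

def get_obj_pos(name):              # perception API (assumed)
    return _obj_pos[name]

def get_ee_pos():                   # proprioception API (assumed)
    return _ee_pos.copy()

def set_ee_velocity(vel, dt=0.02):  # low-level control API (assumed); the stub just integrates the command
    global _ee_pos
    _ee_pos = _ee_pos + vel * dt

# The kind of reactive LMP an LLM might write for "follow the red block":
def follow_object(obj_name, kp=1.5, kd=0.2, dt=0.02, steps=300):
    prev_err = get_obj_pos(obj_name) - get_ee_pos()
    for _ in range(steps):
        err = get_obj_pos(obj_name) - get_ee_pos()
        vel = kp * err + kd * (err - prev_err) / dt   # PD law on Cartesian error
        set_ee_velocity(vel, dt)
        prev_err = err

follow_object("red block")
```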

CaP alleviates the need to collect data and train a fixed set of predefined skills or language-conditioned policies. However, CaP still relies on low-level control APIs.

CaP generalizes at a specific layer in the robot stack: interpreting natural language instructions, processing perception outputs, and then parameterizing low-dimensional inputs to control primitives. This fits into systems with factorized perception and control.
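A sketch of that layer, under assumed interfaces: the LMP consumes detections from a factorized perception module and reduces them to the low-dimensional arguments of a control primitive. `detect_objects` and `pick_place` are hypothetical stand-ins, not the paper's API.

```python
from typing import List, NamedTuple

class Detection(NamedTuple):
    name: str
    pos: tuple                                   # (x, y) tabletop position from perception

def detect_objects() -> List[Detection]:        # perception module (assumed)
    return [Detection("red block", (0.40, 0.10)),
            Detection("blue bowl", (0.55, -0.20)),
            Detection("green block", (0.35, 0.25))]

def pick_place(pick_pos, place_pos):            # manually designed motion primitive (assumed)
    print(f"pick at {pick_pos} -> place at {place_pos}")

# The kind of code an LLM might emit for "put the blocks in the blue bowl":
def put_blocks_in_blue_bowl():
    dets = detect_objects()
    bowl = next(d for d in dets if d.name == "blue bowl")
    for d in dets:
        if "block" in d.name:
            pick_place(d.pos, bowl.pos)         # parameterize the low-level primitive

put_blocks_in_blue_bowl()
```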

Despite great progress, how to form a closed data loop to continuously improve CaP remains unclear. CaP is also limited to the handful of named parameters exposed by the control primitive APIs.

The concurrent work is ProgPrompt. A comparison can be found on its project page.

Key ideas

Technical details

Notes