Codex: Evaluating Large Language Models Trained on Code

February 2023

tl;dr: Improved version of GPT-3 trained on code; the model behind GitHub Copilot.

Overall impression

This paper introduces Codex, the model behind GitHub Copilot. It can synthesize programs from docstrings (and generate docstrings from programs). Performance improves further by generating multiple samples and picking the best one.
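The "generate many samples, pick the best" evaluation is formalized in the paper as the unbiased pass@k estimator: given n generated samples of which c pass the unit tests, it estimates the probability that at least one of k samples is correct. A minimal sketch of that formula:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper.

    n: total samples generated per problem
    c: number of those samples that pass the unit tests
    k: budget of samples we are allowed to submit
    Returns the estimated probability that at least one of k
    samples is correct: 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer incorrect samples than the budget: success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Computing the complement (probability that all k picks are wrong) avoids the numerical instability of summing many small terms directly.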

Codex has the same architecture as GPT-3, but is finetuned on code data. It developed chain-of-thought (CoT) capabilities and significantly boosted performance on solving code problems.
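Picking the best of multiple samples requires a selection rule. When unit tests are available, filtering by test results works; without them, the paper ranks candidates by mean per-token log-probability. A hedged sketch of that heuristic (the `(text, token_logprobs)` sample shape is an assumption for illustration):

```python
def rank_by_mean_logprob(samples):
    """Rank candidate completions by mean per-token log-probability,
    a selection heuristic discussed in the Codex paper for the case
    where no unit tests are available.

    samples: list of (text, token_logprobs) pairs -- hypothetical shape,
    where token_logprobs is the list of log-probs of each generated token.
    Returns the samples sorted best-first.
    """
    def mean_logprob(item):
        _, logprobs = item
        return sum(logprobs) / len(logprobs)

    return sorted(samples, key=mean_logprob, reverse=True)
```

Averaging (rather than summing) the log-probabilities avoids systematically penalizing longer completions.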

As a technical report from OpenAI, the paper also includes a lengthy discussion of limitations, risks, and mitigation plans, reflecting their commitment to responsible and safe AI.

Key ideas

Technical details