GPT-4 Technical Report

March 2023

tl;dr: Multimodal GPT with significantly improved, near human-level performance on various academic and professional benchmarks.

Overall impression

GPT-4 is a large multimodal model capable of processing image and text inputs and producing text outputs.

Compared with GPT-3.5, it is much more capable (through pretraining at a larger scale), more reliable and factual (through RLHF), and less likely to generate toxic output (through a model-assisted safety pipeline that adds safety-relevant RLHF training prompts and rule-based reward models, RBRMs).

The technical report does not disclose the model size, citing the competitive landscape and safety implications. We know that it uses publicly available data and data licensed from third parties. It also uses post-training RLHF as in previous work, such as InstructGPT.

The paper presents three major aspects of GPT-4: the capabilities, the limitations, and the risks. This is actually a very general framework for analyzing AI systems. For example, for an object detector, this roughly translates to TP (capabilities), FN (limitations), and FP (risks).
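To make the object detector analogy concrete, here is a minimal sketch of how TP, FN, and FP could be counted for one image. It is not from the GPT-4 report. The function names (`iou`, `count_outcomes`), the greedy matching strategy, and the 0.5 IoU threshold are all assumptions chosen for illustration; real benchmarks typically also sort predictions by confidence score.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_outcomes(
    preds: List[Box], gts: List[Box], thresh: float = 0.5
) -> Tuple[int, int, int]:
    """Greedily match predictions to ground truth; return (TP, FP, FN)."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thresh
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1  # capability: a real object was found
    fp = len(preds) - tp  # risk: the system asserted something that is not there
    fn = len(gts) - tp    # limitation: a real object was missed
    return tp, fp, fn


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
    print(count_outcomes(preds, gts))
```

In this toy case, one prediction overlaps a ground-truth box well enough to count as a hit, one prediction matches nothing, and one ground-truth box goes undetected, so each of the three categories gets one entry.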

It is exciting to see that “GPT-4 was used for help with wording, formatting, and styling throughout this [technical report].”

Key ideas

Technical details