Learning-Deep-Learning

Fast-dLLM v2: Efficient Block-Diffusion LLM

January 2026

tl;dr: Adapt an AR model into a block diffusion model (a hybrid of blockwise AR + intra-block diffusion), and run intra-block inference with Fast-dLLM.

Overall impression

The combination of block diffusion + Fast-dLLM v1.

Both v1 and v2 enforce a left-to-right blockwise AR dependency, but v2 also adapts the training of the model so that it aligns better with inference (training-inference consistency).

Note that Fast-dLLM v2 is a hybrid model (block diffusion model) and is NOT a Full Diffusion Model.
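One way to picture the hybrid structure is as an attention pattern: tokens see each other freely inside a block (the diffusion part), while blocks only see earlier blocks (the AR part). The snippet below is a minimal sketch of that dependency pattern only, not the paper's implementation. The function name `block_causal_mask` and the values `seq_len=6` and `block_size=2` are assumptions chosen for illustration.

```python
def block_causal_mask(seq_len, block_size):
    """Return a seq_len x seq_len boolean mask (hypothetical helper).

    mask[q][k] is True when query position q may attend to key position k:
    - same block: always allowed (bidirectional, intra-block diffusion)
    - earlier block: allowed (blockwise AR, left-to-right)
    - later block: not allowed
    """
    block_of = [i // block_size for i in range(seq_len)]
    return [
        [block_of[q] >= block_of[k] for k in range(seq_len)]
        for q in range(seq_len)
    ]


# Illustrative sizes only; these are not taken from the paper.
mask = block_causal_mask(seq_len=6, block_size=2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

With these toy sizes the mask comes out as 2x2 squares of ones stepping down the diagonal, with everything to the left of each square filled in. A full diffusion model would instead allow every position to attend to every other position, which is the distinction the note above draws.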

Key ideas

Technical details

Notes