The Transformer architecture comes in several variants: encoder-only, decoder-only, and encoder-decoder. Decoder-only models have gained popularity because of their simplicity, scalability, efficiency, and ease of parallelization.
Diagram taken from the textbook “Hands-On Generative AI with Transformers and Diffusion Models”.
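
To make the parallelization point concrete, here is a minimal sketch of the causal masking that decoder-only models apply inside self-attention. It assumes the raw attention scores are already computed, and it is not taken from the textbook: the function name `causal_attention_weights` and the toy scores are illustrative. Each row depends only on its own scores, so in practice all positions of a sequence can be processed together as one matrix operation during training.

```python
import math

def causal_attention_weights(scores):
    """Apply a causal mask and a row-wise softmax to a square score matrix."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Position i may attend only to positions 0..i (itself and earlier tokens).
        visible = scores[i][: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Future positions get weight 0.0, as if their scores were -infinity.
        weights.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return weights

# Toy scores for a 3-token sequence (made-up numbers, not from a real model).
scores = [[0.0, 1.0, 2.0],
          [1.0, 1.0, 5.0],
          [0.5, 0.5, 0.5]]
weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

The first token can only attend to itself, the second splits its attention over the first two tokens, and only the last token sees the whole sequence. Real implementations do the same thing with batched tensor operations rather than a Python loop.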
