Transformers are more than meets the eye ... such as the GPT family, are decoder-only. Encoder-decoder models combine both components, making them useful for translation and other sequence ...